Split Function Python

Python Split String [ Delimiter, Space, Character, Regex, Multiple Delimiters ]

Splitting a string into separate words is one of the most common operations performed on a string. We can use the split() method of the str class to perform the split operation. In this Python Split String article, we will learn how to split a string in Python based on a delimiter, comma, space, character, regex, and multiple delimiters.

Python Split String

Every developer has run into the situation where a complete string needs to be split into separate words.

Suppose we have a string of usernames separated by commas. We need to split it into individual usernames so that we can perform operations on them, such as counting the number of users available.

>>> usernames = "Tim, Bob, Bill, Tom, Sam"

We can use the split() method to split the users based on the comma delimiter. Let’s understand the split() method first before getting to the solution.

String split() method

The split() method returns a list of the substrings of the string, separated by the delimiter passed. The syntax of the split() method is

str.split(separator, maxsplit)

  • The separator acts as the delimiter, and the string is split wherever the separator occurs. If no separator is specified, the split() method splits the string on whitespace.
  • The maxsplit number determines the maximum number of splits. If maxsplit is not specified, it defaults to -1, meaning no limit.
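The two parameters described above can be tried out side by side. This is a minimal sketch; the sample string and the variable names are made up for illustration.

```python
text = "a b  c,d"

# No separator: split on runs of whitespace.
default_parts = text.split()

# Explicit separator: split on every comma.
comma_parts = text.split(",")

# maxsplit=1 with the default separator: at most one split, from the left.
limited_parts = text.split(None, 1)

# maxsplit=-1 is the default and means "no limit".
unlimited_parts = text.split(",", -1)

print(default_parts)    # ['a', 'b', 'c,d']
print(comma_parts)      # ['a b  c', 'd']
print(limited_parts)    # ['a', 'b  c,d']
print(unlimited_parts)  # ['a b  c', 'd']
```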

As a first step, let’s solve the problem stated above.

>>> usernames = "Tim, Bob, Bill, Tom, Sam"
>>> listOfUsers = usernames.split(',')
>>> type(listOfUsers)
<class 'list'>
>>> print("Number of Users available are "+str(len(listOfUsers)))
Number of Users available are 5

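Now listOfUsers holds each user as a separate list element. Note that every name after the first keeps the space that followed its comma, because we split on ',' alone.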
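Because the names in this string are separated by a comma followed by a space, splitting on ',' leaves a leading space on most entries. Two possible ways to get clean names are sketched below; the variable names are illustrative only.

```python
usernames = "Tim, Bob, Bill, Tom, Sam"

# Splitting on "," alone keeps the space that follows each comma.
raw_users = usernames.split(",")

# Option 1: strip surrounding whitespace from every piece.
clean_users = [name.strip() for name in raw_users]

# Option 2: split on the full two-character separator ", ".
clean_users_alt = usernames.split(", ")

print(raw_users)        # ['Tim', ' Bob', ' Bill', ' Tom', ' Sam']
print(clean_users)      # ['Tim', 'Bob', 'Bill', 'Tom', 'Sam']
print(clean_users_alt)  # ['Tim', 'Bob', 'Bill', 'Tom', 'Sam']
```

Option 1 is more forgiving when the spacing after the commas is inconsistent.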

Let’s pass the maxsplit number as 2

>>> listOfUsers = usernames.split(',',2)
>>> listOfUsers
['Tim', ' Bob', ' Bill, Tom, Sam']

We can see that only 2 splits have happened: ‘Tim’ and ‘Bob’ are separated, and the remaining users [Bill, Tom, Sam] are left as a single unsplit string.

Let’s try maxsplit as 3 this time

>>> listOfUsers = usernames.split(',',3)
>>> listOfUsers
['Tim', ' Bob', ' Bill', ' Tom, Sam']

Now 3 splits have happened: ‘Tim’, ‘Bob’, and ‘Bill’ are separated.

Split by Space

Whenever the delimiter is not specified or is None, the string is split using whitespace as the delimiter.

>>> msg = "Welcome to Java Interview Point"
>>> words = msg.split()
>>> words
['Welcome', 'to', 'Java', 'Interview', 'Point']

Multiple consecutive whitespace characters are treated as a single separator. In the snippet below, we have multiple spaces between some of the words.

>>> msg = "Welcome      to     Java Interview Point"
>>> words = msg.split()
>>> words
['Welcome', 'to', 'Java', 'Interview', 'Point']

Whitespace includes newline (\n) and tab (\t) characters as well. So if the string contains \n or \t, it is treated just like a space.

>>> msg = "Welcome\nto\tJava Interview Point"
>>> words = msg.split()
>>> words
['Welcome', 'to', 'Java', 'Interview', 'Point']
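The whitespace handling described in this section applies only when no separator is passed. As a small illustration (the sample string is made up), passing a single space explicitly behaves differently: every space counts as its own delimiter, so empty strings appear in the result.

```python
msg = "  Welcome   to Python "

# No separator: runs of whitespace collapse, and leading/trailing
# whitespace is ignored.
words_default = msg.split()

# Explicit " " separator: each space is a delimiter, so consecutive,
# leading, and trailing spaces produce empty strings.
words_space = msg.split(" ")

print(words_default)  # ['Welcome', 'to', 'Python']
print(words_space)    # ['', '', 'Welcome', '', '', 'to', 'Python', '']
```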

Python Split String by Character

There are three different ways to split a string into a list of characters.

  1. Using List slice assignment
  2. By passing the String to the list constructor
  3. With For Loop

1. Using List slice assignment

Slice Assignment is a special syntax for Lists, using which we can alter the contents of the lists. Let’s split the string into characters using List slice assignment

>>> msg = "Welcome"
>>> chars = []
>>> chars[:] = msg
>>> chars
['W', 'e', 'l', 'c', 'o', 'm', 'e']

By specifying chars[:] on the left side of the = operator, we are telling Python to use Slice Assignment.
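To make the effect of slice assignment more concrete, here is a small sketch showing that it replaces the contents of the existing list rather than creating a new one. The name alias is hypothetical and is there only to demonstrate that the list object stays the same.

```python
chars = ["x", "y", "z"]
alias = chars      # a second name for the same list object

# Slice assignment iterates over the string and replaces the whole
# contents of the existing list with its characters.
chars[:] = "Hi"

print(chars)           # ['H', 'i']
print(alias)           # ['H', 'i']
print(alias is chars)  # True
```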

2. By Passing the String to list constructor

The list() constructor takes a single iterable argument, which can be a sequence or any iterator object.

We simply need to pass the string to the list() constructor. Since a string is an iterable sequence of characters, list() breaks it into individual characters.

>>> msg = "Welcome"
>>> chars = list(msg)
>>> chars
['W', 'e', 'l', 'c', 'o', 'm', 'e']

3. With For Loop

This is a manual approach where we take each character of the string and append it to an empty list.

>>> msg = "Welcome"
>>> chars = []
>>> for char in msg:
...     chars.append(char)
...
>>> chars
['W', 'e', 'l', 'c', 'o', 'm', 'e']

Python String Split by regex

We can also use a regular expression to split a string. To do so, we import the re module and use its split() function.

The syntax is

re.split(pattern, string, maxsplit, flags)

For example, let’s take a string separated by the underscore ‘_’. We pass the delimiter inside square brackets [], which form a regex character class that matches any one of the characters listed.

>>> import re
>>> message = "Welcome_to_Javainterview_Point"
>>> words = re.split('[_]', message)
>>> words
['Welcome', 'to', 'Javainterview', 'Point']

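Let’s try with maxsplit as 2. It is passed as a keyword argument here because Python 3.13 and later deprecate passing maxsplit positionally to re.split().

>>> words = re.split('[_]', message, maxsplit=2)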
>>> words
['Welcome', 'to', 'Javainterview_Point']

The flags parameter allows you to modify the way the regular expression works. Each flag can be written with either its long name or its short name.
For example, if we want the regex to ignore case, we can use the flag re.IGNORECASE, or re.I.

>>> numbers = "1234aaaa567BBB890ccc987"
>>> numberList = re.split('[a-c]+', numbers, flags = re.IGNORECASE)
>>> numberList
['1234', '567', '890', '987']
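To see what the flag changes, the same split can be run with and without it. This is a short sketch reusing the string from the example above; the result variable names are illustrative.

```python
import re

numbers = "1234aaaa567BBB890ccc987"

# Without the flag, the upper-case "BBB" is not matched by [a-c]+,
# so it stays inside the middle piece.
case_sensitive = re.split('[a-c]+', numbers)

# The short name re.I refers to the same flag as re.IGNORECASE.
case_insensitive = re.split('[a-c]+', numbers, flags=re.I)

print(case_sensitive)            # ['1234', '567BBB890', '987']
print(case_insensitive)          # ['1234', '567', '890', '987']
print(re.I is re.IGNORECASE)     # True
```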

Splitting with Multiple Delimiters

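We can also pass multiple delimiters to the re.split() method by listing them in a character class. Let’s split the string using the semicolon, comma, and whitespace as delimiters. The pattern is written as a raw string (r'...') so that Python passes \s to the regex engine unchanged.

>>> text = "one,two;three    four,five   six"
>>> numbers = re.split(r'[;,\s]+', text)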
>>> numbers
['one', 'two', 'three', 'four', 'five', 'six']
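When the delimiters include characters that have a special meaning in regular expressions (such as . or |), they need to be escaped. One possible approach is sketched below: split_on_any is a hypothetical helper, not part of the re module, that builds the pattern with re.escape.

```python
import re

def split_on_any(text, delimiters):
    """Hypothetical helper: split text on runs of any given delimiter."""
    # re.escape makes characters such as "." or "|" match literally.
    alternatives = "|".join(re.escape(d) for d in delimiters)
    # A non-capturing group followed by + treats consecutive
    # delimiters as a single separator.
    pattern = f"(?:{alternatives})+"
    return re.split(pattern, text)

parts = split_on_any("one,two;three.four|five", [",", ";", ".", "|"])
print(parts)  # ['one', 'two', 'three', 'four', 'five']
```

The alternation form also accepts delimiters longer than one character, which a character class cannot express.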

rsplit() method – Split from right

rsplit() is similar to the split() method of the str class, except for the fact that it starts splitting the string from the right end.

>>> usernames = "Tim, Bob, Bill, Tom, Sam"
>>> users = usernames.rsplit(',')
>>> users
['Tim', ' Bob', ' Bill', ' Tom', ' Sam']

Looks the same, right? We can see the difference only when we give the maxsplit argument.

>>> users = usernames.rsplit(',', 2)
>>> users
['Tim, Bob, Bill', ' Tom', ' Sam']
>>> users = usernames.rsplit(',', 3)
>>> users
['Tim, Bob', ' Bill', ' Tom', ' Sam']

Now we can see that it starts splitting the words from the right.
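A typical use of splitting from the right is peeling off only the last part of a string. The sketch below uses a made-up file name and compares split() and rsplit() with maxsplit set to 1.

```python
filename = "report.final.txt"

# split() with maxsplit=1 cuts at the first dot from the left.
from_left = filename.split(".", 1)

# rsplit() with maxsplit=1 cuts at the last dot, which separates
# the extension from the rest of the name.
from_right = filename.rsplit(".", 1)

print(from_left)   # ['report', 'final.txt']
print(from_right)  # ['report.final', 'txt']
```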

splitlines() method – Splitting String by line break

The splitlines() method splits the string at line break characters such as \n, \r, and \r\n.

>>> msg = "Welcome\nTo\rJavaInterview\r\nPoint"
>>> words = msg.splitlines()
>>> words
['Welcome', 'To', 'JavaInterview', 'Point']
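The following sketch, using a made-up two-line string, shows how splitlines() differs from split('\n') when the text ends with a line break, and how the optional keepends argument keeps the line break characters in the result.

```python
text = "first\nsecond\n"

# splitlines() does not produce an empty string for the final line break.
lines = text.splitlines()

# split("\n") treats the trailing "\n" as a delimiter followed by
# an empty string.
pieces = text.split("\n")

# keepends=True keeps the line break at the end of each line.
lines_with_ends = text.splitlines(keepends=True)

print(lines)            # ['first', 'second']
print(pieces)           # ['first', 'second', '']
print(lines_with_ends)  # ['first\n', 'second\n']
```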

I hope I have covered most of the ways to split a string in Python. Feel free to leave a comment if you find anything that is missing or should be added.

Happy Learning!!