Complete Python Regex Replace Guide using re.sub()
Regular expressions are a powerful tool for text processing in Python. The re module supports regex find-and-replace operations through its sub() function, which makes it easy to modify the parts of a string that match a pattern. In this article, we'll explore Python regex replace in detail with examples.
Python Regex Replace with re.sub()
In Python, all regex functionality lives in the re module. Simply import the module into your Python file and call re.sub() to replace text in a string.
The re.sub() function provides search-and-replace functionality using regular expressions. You give it a regex pattern to match, a replacement, and the string to modify. Here is the syntax:
re.sub(pattern, repl, string, count=0, flags=0)
Here, pattern, repl (the replacement), and string are required, while count and flags are optional and both default to 0. Let's go through these five parameters one by one:
pattern (required): The regular expression pattern to search for in the input string. Raw strings like r"pattern" are recommended so that backslashes reach the regex engine unchanged.
repl (required): The replacement that each match is substituted with. It can be a string or a callable that receives the match object and returns a string.
string (required): The input text in which pattern matching and substitution are performed.
count (optional): The maximum number of matches to replace. The default of 0 replaces all occurrences. Specify 1 to replace just the first match.
flags (optional): Modifies the regex engine's behavior. The value is a bitmask of re flags such as re.IGNORECASE and re.MULTILINE, which can be combined with |.
re.sub() scans string for non-overlapping matches of pattern, replaces up to count of them with repl while honoring the given flags, and returns the result as a new string. The original string is left unchanged.
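To make the optional parameters concrete, here is a minimal sketch that passes count and flags as keyword arguments. The sample text is made up for illustration.

```python
import re

text = "a: 1\nb: 2\nc: 3"

# Without re.MULTILINE, ^ only anchors at the very start of the string.
start_only = re.sub(r"^\w+: ", "", text)

# With re.MULTILINE, ^ anchors at the start of every line.
every_line = re.sub(r"^\w+: ", "", text, flags=re.MULTILINE)

# count=2 limits the substitution to the first two matches.
first_two = re.sub(r"^\w+: ", "", text, count=2, flags=re.MULTILINE)

print(repr(start_only))  # '1\nb: 2\nc: 3'
print(repr(every_line))  # '1\n2\n3'
print(repr(first_two))   # '1\n2\nc: 3'
```

Passing count and flags by keyword also avoids the common mistake of supplying a flag in the position where count is expected.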
Python Regex Replace Examples
Regex Example to remove hyphen from a string:
In this first example we will remove all the hyphens from a telephone number so that all the digits can occur together in the string. Here’s how we are going to do it:
import re
phone = "412-555-1234"
formatted = re.sub(r"\D", "", phone)  # raw string keeps \D intact for the regex engine
print(formatted) # Output: 4125551234
Here, \D matches any non-digit character, and the empty replacement string removes each match, so every non-numeric symbol is stripped. Remember: \d matches digits, while \D matches non-digits.
Regex Example to remove all whitespaces from a string
In our second example, we will remove all the whitespace that occurs in the string. Here's the code:
import re
text = "Remove Whitespace from this text"
trimmed_text = re.sub(r"\s", "", text)
print(trimmed_text) # Output: RemoveWhitespacefromthistext
The \s pattern matches every whitespace character in the text (spaces, tabs, and newlines), and each match is replaced with the empty string "".
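If you'd rather keep words separated, a small variation collapses each run of whitespace into a single space instead of deleting it. This is only a sketch with a made-up sample string.

```python
import re

text = "Too    many \t spaces\nhere"

# \s+ matches one or more whitespace characters as a single match,
# so each run is replaced by exactly one space.
normalized = re.sub(r"\s+", " ", text).strip()

print(normalized)  # Too many spaces here
```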
Regex Example to extract domain from a url
In case you need to extract the domain from a given URL, you can strip the scheme and the optional www. prefix with this pattern: r"^https?://(www\.)?"
Here's the code for extracting the pypixel.com domain from its URL.
import re
url = "https://www.pypixel.com"
domain = re.sub(r"^https?://(www\.)?", "", url)
print(domain) # Output: pypixel.com
Regex Example to replace a text from a string
This is the simplest and most common example you will come across: replacing a word in your string with another one. Here's how you can do that:
import re
random_string = "This website are PyPixel"
updated_string = re.sub(r"are", "is", random_string)
print(updated_string) # Output: This website is PyPixel
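One caveat: a bare pattern like r"are" also matches inside longer words. Here is a short sketch, with a made-up sentence, showing how the \b word-boundary anchor restricts the replacement to whole words.

```python
import re

sentence = "We are aware that spare parts are rare"

# The bare pattern also hits "aware", "spare", and "rare".
naive = re.sub(r"are", "is", sentence)

# \b anchors the match to word boundaries, so only the standalone word is replaced.
whole_word = re.sub(r"\bare\b", "is", sentence)

print(naive)       # We is awis that spis parts is ris
print(whole_word)  # We is aware that spare parts is rare
```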
Regex Example to remove characters from a string
In this example, we'll remove all punctuation and symbol characters from a string, such as a period (.), an exclamation mark (!), or a semicolon (;).
import re
text = "Hello, world! This has many symbols like . , : ; ?"
cleaned = re.sub(r"[^\w\s]", "", text)
print(cleaned) # Output: Hello world This has many symbols like (plus the trailing spaces that separated the symbols)
The pattern r"[^\w\s]" matches every character that is neither a word character nor whitespace. Replacing each match with the empty string strips those characters from the text.
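Keep in mind that \w counts the underscore as a word character, so underscores survive this pattern. If you want them gone too, one possible tweak (shown here on a made-up string) is to add an alternative for the underscore.

```python
import re

text = "snake_case, kebab-case!"

# Underscore is a word character, so [^\w\s] leaves it in place.
keeps_underscore = re.sub(r"[^\w\s]", "", text)

# Adding |_ removes underscores as well.
drops_underscore = re.sub(r"[^\w\s]|_", "", text)

print(keeps_underscore)  # snake_case kebabcase
print(drops_underscore)  # snakecase kebabcase
```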
Conclusion
In this article, we discussed Python regex replace patterns with some commonly used examples. Python's re module brings regex capabilities into our code, and its re.sub() function covers many find-and-replace use cases. Regex is especially handy when building projects where user input needs to be cleaned up or transformed.
Moreover, there's a great tool called Regex 101 that lets you test your regex patterns online. It's really handy and definitely worth trying when you need quick results.
FAQs
Is regex replace case-sensitive?
Yes, by default regex operations are case-sensitive. Pass flags=re.IGNORECASE to make the matching case-insensitive.
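Here's a quick sketch of the difference, using a made-up string. It also shows the inline (?i) form, which switches on case-insensitive matching from inside the pattern itself.

```python
import re

text = "Python, PYTHON, python"

# Default: only the exact lowercase spelling matches.
case_sensitive = re.sub(r"python", "Py", text)

# re.IGNORECASE matches every capitalization.
case_insensitive = re.sub(r"python", "Py", text, flags=re.IGNORECASE)

# The inline flag (?i) at the start of the pattern has the same effect.
inline_flag = re.sub(r"(?i)python", "Py", text)

print(case_sensitive)    # Python, PYTHON, Py
print(case_insensitive)  # Py, Py, Py
print(inline_flag)       # Py, Py, Py
```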
How to replace only first or nth match occurrence?
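Pass the count= parameter to re.sub() with the number of replacements you need, and it will replace that many matches starting from the first one. Note that count=n replaces the first n matches, not only the n-th one. Replacing just the n-th match requires a callable replacement that keeps track of how many matches it has seen.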
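The following sketch shows both cases. The helper name replace_nth is hypothetical (it is not part of the re module) and is just one way to target a single occurrence.

```python
import re

def replace_nth(pattern, repl, text, n):
    """Hypothetical helper: replace only the n-th (1-based) match of pattern."""
    seen = 0

    def _swap(match):
        nonlocal seen
        seen += 1
        # Return the replacement for the n-th match, the original text otherwise.
        return repl if seen == n else match.group(0)

    return re.sub(pattern, _swap, text)

text = "one-two-three-four"

first = re.sub(r"-", " ", text, count=1)        # built-in: first match only
second_only = replace_nth(r"-", " ", text, 2)   # helper: second match only

print(first)        # one two-three-four
print(second_only)  # one-two three-four
```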
What if I want to access matches and reformat dynamically?
You can pass a callback function as the replacement parameter. It is called with the match object for every match, and whatever string it returns is used as the replacement.
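As a minimal sketch, the callback below reads a captured number from each match and computes the replacement from it. The function name double_price and the sample text are made up for illustration.

```python
import re

def double_price(match):
    # group(1) holds the digits captured by (\d+).
    amount = int(match.group(1))
    return f"${amount * 2}"

text = "Coffee $3, cake $5"
updated = re.sub(r"\$(\d+)", double_price, text)

print(updated)  # Coffee $6, cake $10
```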
Can I match and replace without knowing full string beforehand?
Yes. Use re.compile() to precompile the pattern once, then call the compiled pattern's sub() method on each string as it arrives.
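A short sketch of that idea: the pattern is compiled once and then reused on several strings. The input list is made up for illustration.

```python
import re

# Compile once, reuse many times.
digits = re.compile(r"\d")

inputs = ["card 1234", "pin 42", "no digits"]
masked = [digits.sub("#", s) for s in inputs]

print(masked)  # ['card ####', 'pin ##', 'no digits']
```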
How can I improve regex pattern readability?
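You can simply use verbose mode, either with the re.VERBOSE flag or the inline form r"(?x)pattern". In verbose mode, whitespace in the pattern is ignored and you can add # comments to explain each part.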
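Here's a small sketch of a verbose pattern. The phone-number layout it assumes (3-3-4 digits with optional separators) and the "[phone]" placeholder are chosen for illustration only.

```python
import re

phone_pattern = re.compile(
    r"""
    \d{3}     # area code
    [-.\s]?   # optional separator
    \d{3}     # prefix
    [-.\s]?   # optional separator
    \d{4}     # line number
    """,
    re.VERBOSE,
)

redacted = phone_pattern.sub("[phone]", "Call 412-555-1234 now")

print(redacted)  # Call [phone] now
```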