Regular expressions are a popular topic among system administrators and developers. A regular expression is used to find structured text or strings in one or more files. Its main strength is that we can define precisely which text we want to match. Python provides many functions for regular expressions and related operations. In this tutorial, we will look at these regex functions in detail.
To work with regular expressions in Python, we need to import the regular expression module, which is named re, short for regular expression.
import re
The match function is one of the most popular functions. It applies a regex pattern at the beginning of the given string. We will use the match function with the pattern and string parameters. There is also an optional flags parameter, which can be used to provide flags such as case-insensitive matching or locale-dependent interpretation. If we do not provide flags, there will be no error.
re.match(PATTERN,STRING,FLAG)
In this example, we want to find a word delimited by spaces in the given string. Because match only looks at the beginning of the string, it returns a single match for the first word, which is available as group 0.
- line="This is an example about regular expression"
- matches = re.match(r'\w+',line)
- matches.group(0)
- #'This'
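Since match only tries the pattern at the start of the string, it can return None even when the text occurs later in the line. The following is a minimal sketch of that behaviour; the variable names are only illustrative.

```python
import re

line = "This is an example about regular expression"

# re.match tries the pattern only at the very start of the string.
first_word = re.match(r"\w+", line)
print(first_word.group(0))

# "example" occurs in the line, but not at the start, so match returns None.
no_match = re.match(r"example", line)
print(no_match)

# Check for None before calling .group() to avoid an AttributeError.
result = no_match.group(0) if no_match else "no match at start"
print(result)
```

Checking the result before calling group is a common guard whenever the input is not known in advance.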
In the previous part, we simply printed the first group, whose index is 0, but we may want to capture more than one part of a line. Each parenthesized part of a pattern is called a group in regex. We can capture multiple different sub-patterns in a single match.
In this example, we will match words that start with T and a into two groups.
- line="This is an example about regular expression"
- matches = re.match(r'(T\w+).*example\s(a\w+)',line)
- matches.group(0)
- #'This is an example about'
- matches.group(1)
- #'This'
- matches.group(2)
- #'about'
As we can see, the matched results are assigned to groups. We can get them by providing the index of each group.
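Besides numeric indexes, a match object can return all groups at once, and groups can be given names. The sketch below reuses the pattern from the example; the group names first and second are made up for illustration.

```python
import re

line = "This is an example about regular expression"

# groups() returns every captured group as a tuple (group 0 is not included).
m = re.match(r"(T\w+).*example\s(a\w+)", line)
all_groups = m.groups()
print(all_groups)

# Named groups use the (?P<name>...) syntax; the names here are illustrative.
named = re.match(r"(?P<first>T\w+).*example\s(?P<second>a\w+)", line)
first = named.group("first")
as_dict = named.groupdict()
print(first, as_dict)
```

Named groups tend to be easier to maintain than indexes once a pattern has more than two or three groups.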
Search is similar to the match function. The main difference is that match only checks for the pattern at the beginning of the string, while search scans through the whole string and returns the first location where the pattern matches. Both return a single match object, or None if nothing matches. The syntax of the search function is the same as that of the match function.
re.search(PATTERN,STRING,FLAG)
- line="This is an example about regular expression"
- matches = re.search(r'(T\w+).*example\s(a\w+)',line)
- matches.group(0)
- #'This is an example about'
- matches.group(1)
- #'This'
- matches.group(2)
- #'about'
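The pattern above starts at the beginning of the line, so match and search give the same result. The difference shows up with a pattern that only matches later in the string. This is a small sketch of that case; re.findall is not covered in this tutorial and is included only to show one way to collect every occurrence.

```python
import re

line = "This is an example about regular expression"

# The line does not start with "a", so match finds nothing.
at_start = re.match(r"a\w+", line)
print(at_start)

# search scans forward and returns the first occurrence only.
anywhere = re.search(r"a\w+", line)
print(anywhere.group(0), anywhere.start())

# findall returns all non-overlapping matches; \b restricts it to word starts.
all_words = re.findall(r"\ba\w+", line)
print(all_words)
```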
Python regex functions support finding given text and replacing it with new text. We will use the sub function to replace. The sub function supports the following syntax.
re.sub(PATTERN,NEWTEXT,STRING,count=COUNT,flags=FLAG)
We will replace the word regular with the word unregular in this example.
- line="This is an example about regular expression"
- matches = re.sub('regular','unregular',line)
- print(matches)
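By default sub replaces every occurrence of the pattern. The sketch below, using a made-up sample string, shows the optional count argument for limiting the number of replacements and re.subn for also getting the number of replacements made.

```python
import re

# Sample text invented for this illustration.
text = "regular text, regular pattern, regular expression"

# Without count, every occurrence is replaced.
replaced_all = re.sub(r"regular", "unregular", text)
print(replaced_all)

# count=1 replaces only the first occurrence.
replaced_first = re.sub(r"regular", "unregular", text, count=1)
print(replaced_first)

# subn returns the new string together with the number of replacements.
new_text, n = re.subn(r"regular", "unregular", text)
print(n)
```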
Option flags are generally provided as the last parameter to the related regex functions. They are generally used for case-insensitive matching, locale-dependent interpretation, etc. Here is a list of option flags.
re.I makes the match case-insensitive
re.L makes matching depend on the current locale (in Python 3 it can only be used with bytes patterns)
re.M makes ^ and $ match at the start and end of each line, not only of the whole string
re.S makes . match any character, including newline
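The effect of these flags is easiest to see on a string with more than one line. The following sketch uses a made-up two-line string and passes each flag through the flags keyword; re.L is left out because it only applies to bytes patterns in Python 3.

```python
import re

# Two-line sample text invented for this illustration.
text = "First line\nsecond LINE"

# re.I: ignore case.
ignore_case = re.findall(r"line", text, flags=re.I)

# re.M: $ matches at the end of every line instead of only the end of the string.
end_default = re.findall(r"\w+$", text)
end_multiline = re.findall(r"\w+$", text, flags=re.M)

# re.S: . also matches the newline character.
dot_default = re.search(r"First.*LINE", text)
dot_all = re.search(r"First.*LINE", text, flags=re.S)

# Flags can be combined with the | operator.
line_starts = re.findall(r"^\w+", text, flags=re.I | re.M)

print(ignore_case, end_default, end_multiline, line_starts)
```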
We can use option flags to perform a case-insensitive match or search with a regular expression. We will provide re.I through the flags argument of the relevant function, as shown below.
matches = re.sub('regular','unregular',line,flags=re.I)
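With re.sub the fourth positional parameter is count, not flags, so a flag passed by position would be read as a replacement limit (re.I has the integer value 2). Passing flags by keyword avoids this. Here is a short sketch with a made-up sample string that compares the case-sensitive and case-insensitive results.

```python
import re

# Sample text invented for this illustration.
line = "Regular or regular, both should change"

# Without flags, only the lowercase occurrence is replaced.
case_sensitive = re.sub(r"regular", "unregular", line)
print(case_sensitive)

# flags=re.I makes the pattern match "Regular" as well.
case_insensitive = re.sub(r"regular", "unregular", line, flags=re.I)
print(case_insensitive)

# re.I is an integer-valued flag, which is why it can be mistaken for a count.
print(int(re.I))
```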
Translated from: https://www.poftut.com/python-regular-expression-operations-regex/