$regexFind
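Provides regular expression (regex) pattern matching in aggregation expressions. If a match is found, it returns a document that contains information about the first match. If no match is found, it returns null.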
Prior to MongoDB 4.2, the aggregation pipeline could only use the query operator $regex, and only in the $match stage.
{ $regexFind: { input: <expression> , regex: <expression>, options: <expression> } }
$regexFind takes the following fields:
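input: The string on which to apply the regex pattern. Can be a string or any valid expression that resolves to a string.
regex: The regex pattern to apply. Can be any valid expression that resolves to either a string or a regex pattern /<pattern>/. When you use the /<pattern>/ form, you can also specify the regex options i and m (but not the s or x options). The accepted forms are: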
"pattern"
/<pattern>/
/<pattern>/<options>
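options: Optional. The following <options> are available for use with regular expressions.
i: Case insensitivity, to match both upper and lower case. You can specify this option in the options field or as part of the regex field.
m: For patterns that include anchors (that is, ^ for the start and $ for the end), match at the beginning or end of each line for strings with multiline values. Without this option, these anchors match at the beginning or end of the string. If the pattern contains no anchors, or if the string value has no newline characters (such as \n), the m option has no effect.
x: "Extended" capability to ignore all whitespace characters in the pattern unless they are escaped or included in a character class. It also ignores the characters between an unescaped hash (#) character and the next newline, so that you can include comments in complicated patterns. This applies only to data characters; whitespace characters may never appear within special character sequences in a pattern. The x option does not affect the handling of the VT character (that is, code 11). You can specify this option only in the options field.
s: Allows the dot character (that is, .) to match all characters, including newline characters. You can specify this option only in the options field.
If the operator does not find a match, the result of the operator is null.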
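If the operator finds a match, the result of the operator is a document that contains:
match: the matching string in the input.
idx: the code point index (not the byte index) of the matching string in the input.
captures: an array of strings that correspond to the groups captured by the matching string. Capturing groups are specified with unescaped parentheses () in the regex pattern.
{ "match" : <string>, "idx" : <num>, "captures" : <array of strings> }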
Starting in version 6.1, MongoDB uses the PCRE2 (Perl Compatible Regular Expressions) library to implement regular expression pattern matching.
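$regexFind ignores the collation specified for the collection, for db.collection.aggregate(), and for the index (if one is used).
For example, create a sample collection with collation strength 1 (that is, compare base characters only and ignore other differences such as case and diacritics):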
db.createCollection( "myColl", { collation: { locale: "fr", strength: 1 } } )
Insert the following documents:
db.myColl.insertMany([
{ _id: 1, category: "café" },
{ _id: 2, category: "cafe" },
{ _id: 3, category: "cafE" }
])
Using the collection's collation, the following operation performs a case-insensitive and diacritic-insensitive match:
db.myColl.aggregate( [ { $match: { category: "cafe" } } ] )
The operation returns the following 3 documents:
{ "_id" : 1, "category" : "café" }
{ "_id" : 2, "category" : "cafe" }
{ "_id" : 3, "category" : "cafE" }
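However, the aggregation expression $regexFind ignores collation; that is, the following regex pattern matching examples are case-sensitive and diacritic-sensitive: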
db.myColl.aggregate( [ { $addFields: { resultObject: { $regexFind: { input: "$category", regex: /cafe/ } } } } ] )
db.myColl.aggregate(
[ { $addFields: { resultObject: { $regexFind: { input: "$category", regex: /cafe/ } } } } ],
{ collation: { locale: "fr", strength: 1 } } // Ignored in $regexFind
)
Both operations return the following:
{ "_id" : 1, "category" : "café", "resultObject" : null }
{ "_id" : 2, "category" : "cafe", "resultObject" : { "match" : "cafe", "idx" : 0, "captures" : [ ] } }
{ "_id" : 3, "category" : "cafE", "resultObject" : null }
To perform case-insensitive regex pattern matching, use the i option instead.
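The difference between a collation-based comparison and regex matching can be sketched outside the database. The snippet below is only an analogy: it assumes JavaScript's Intl.Collator with sensitivity "base" as a stand-in for a strength 1 collation, and JavaScript's RegExp as a stand-in for the server's regex engine. Neither is MongoDB's implementation.

```javascript
// Illustrative analogy only: Intl.Collator stands in for a strength-1 collation,
// and JavaScript's RegExp stands in for the server-side regex engine.
const categories = ["caf\u00e9", "cafe", "cafE"]; // "café", "cafe", "cafE"

// Base-letter comparison: case and diacritics are ignored.
const collator = new Intl.Collator("fr", { sensitivity: "base" });
const collationMatches = categories.filter((c) => collator.compare(c, "cafe") === 0);

// Plain regex matching: case and diacritics are significant.
const regexMatches = categories.filter((c) => /cafe/.test(c));

// The i flag relaxes case only; the accented form still does not match.
const regexIgnoreCase = categories.filter((c) => /cafe/i.test(c));

console.log(collationMatches, regexMatches, regexIgnoreCase);
```

In this sketch the collator treats all three strings as equal to "cafe", while the regex matches only the exact spelling unless the i flag is added.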
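If the regex pattern contains capture groups and the pattern finds a match in the input, the captures array in the result corresponds to the groups captured by the matching string. Capture groups are specified with unescaped parentheses () in the regex pattern. The length of the captures array equals the number of capture groups in the pattern, and the order of the array matches the order in which the capture groups appear.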
Use the following script to create the contacts collection:
db.contacts.insertMany([
{ "_id": 1, "fname": "Carol", "lname": "Smith", "phone": "718-555-0113" },
{ "_id": 2, "fname": "Daryl", "lname": "Doe", "phone": "212-555-8832" },
{ "_id": 3, "fname": "Polly", "lname": "Andrews", "phone": "208-555-1932" },
{ "_id": 4, "fname": "Colleen", "lname": "Duncan", "phone": "775-555-0187" },
{ "_id": 5, "fname": "Luna", "lname": "Clarke", "phone": "917-555-4414" }
])
The following pipeline applies the regex pattern /(C(ar)*)ol/ to the fname field:
db.contacts.aggregate([
{
$project: {
returnObject: {
$regexFind: { input: "$fname", regex: /(C(ar)*)ol/ }
}
}
}
])
The regex pattern finds a match for the fname values Carol and Colleen:
{ "_id" : 1, "returnObject" : { "match" : "Carol", "idx" : 0, "captures" : [ "Car", "ar" ] } }
{ "_id" : 2, "returnObject" : null }
{ "_id" : 3, "returnObject" : null }
{ "_id" : 4, "returnObject" : { "match" : "Col", "idx" : 0, "captures" : [ "C", null ] } }
{ "_id" : 5, "returnObject" : null }
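The pattern contains the capture group (C(ar)*), which contains the nested group (ar). The elements in the captures array correspond to these two capture groups. If a group captures nothing in a matching document (for example, Colleen and the group (ar)), $regexFind replaces that group with a null placeholder.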
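The shape of the result document can be imitated in plain JavaScript to make the null placeholder concrete. This is a minimal sketch under stated assumptions: regexFindSketch is a hypothetical helper, and it relies on JavaScript's RegExp, which is a different engine from the one the server uses and reports offsets in UTF-16 code units.

```javascript
// Hypothetical helper: imitates the { match, idx, captures } shape of a
// $regexFind result using JavaScript's own RegExp (not the server's engine).
function regexFindSketch(input, regex) {
  const m = regex.exec(input);
  if (m === null) return null; // no match: the result is null
  return {
    match: m[0],
    // UTF-16 code unit offset; equal to the code point index for these ASCII inputs.
    idx: m.index,
    // Groups that did not participate are undefined in JavaScript; map them to null.
    captures: m.slice(1).map((g) => (g === undefined ? null : g)),
  };
}

const pattern = /(C(ar)*)ol/;
const carol = regexFindSketch("Carol", pattern);
const colleen = regexFindSketch("Colleen", pattern);
const daryl = regexFindSketch("Daryl", pattern);

console.log(JSON.stringify(carol));
console.log(JSON.stringify(colleen));
console.log(JSON.stringify(daryl));
```

The captures array always has one slot per group in the pattern, so a consumer can index it by group position without checking its length first.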
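As the previous example shows, the captures array contains one element for each capture group (with null for groups that captured nothing). The following example searches for phone numbers with New York City area codes by applying a logical or of capture groups to the phone field. Each group represents one New York City area code: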
db.contacts.aggregate([
{
$project: {
nycContacts: {
$regexFind: { input: "$phone", regex: /^(718).*|^(212).*|^(917).*/ }
}
}
}
])
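For documents that match the regex pattern, the captures array contains the group that matched and uses null for every group that captured nothing: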
{ "_id" : 1, "nycContacts" : { "match" : "718-555-0113", "idx" : 0, "captures" : [ "718", null, null ] } }
{ "_id" : 2, "nycContacts" : { "match" : "212-555-8832", "idx" : 0, "captures" : [ null, "212", null ] } }
{ "_id" : 3, "nycContacts" : null }
{ "_id" : 4, "nycContacts" : null }
{ "_id" : 5, "nycContacts" : { "match" : "917-555-4414", "idx" : 0, "captures" : [ null, null, "917" ] } }
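Because only one alternative can match, the position of the non-null entry tells you which area code was found. The sketch below shows one way to read that off, again using JavaScript's RegExp as a stand-in; matchedAreaCode is a hypothetical helper name.

```javascript
// Same alternation as in the pipeline above, evaluated with JavaScript's RegExp.
const nycPattern = /^(718).*|^(212).*|^(917).*/;

// Hypothetical helper: returns the area code captured by whichever alternative
// matched, or null when the phone number matches none of them.
function matchedAreaCode(phone) {
  const m = nycPattern.exec(phone);
  if (m === null) return null;
  // Normalize non-participating groups to null, as in the captures array.
  const captures = m.slice(1).map((g) => (g === undefined ? null : g));
  const hit = captures.find((g) => g !== null);
  return hit === undefined ? null : hit;
}

console.log(matchedAreaCode("212-555-8832"));
console.log(matchedAreaCode("208-555-1932"));
```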
Use the following script to create the products collection:
db.products.insertMany([
{ _id: 1, description: "Single LINE description." },
{ _id: 2, description: "First lines\nsecond line" },
{ _id: 3, description: "Many spaces before     line" },
{ _id: 4, description: "Multiple\nline descriptions" },
{ _id: 5, description: "anchors, links and hyperlinks" },
{ _id: 6, description: "métier work vocation" }
])
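By default, $regexFind performs a case-sensitive match. For example, the following aggregation performs a case-sensitive $regexFind on the description field. The regex pattern /line/ does not specify any grouping: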
db.products.aggregate([
{ $addFields: { returnObject: { $regexFind: { input: "$description", regex: /line/ } } } }
])
The operation returns the following results:
{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : null }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : { "match" : "line", "idx" : 6, "captures" : [ ] } }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : { "match" : "line", "idx" : 23, "captures" : [ ] } }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : { "match" : "line", "idx" : 9, "captures" : [ ] } }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : null }
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : null }
The following regex pattern /lin(e|k)/ specifies the grouping (e|k) in the pattern:
db.products.aggregate([
{ $addFields: { returnObject: { $regexFind: { input: "$description", regex: /lin(e|k)/ } } } }
])
The operation returns the following results:
{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : null }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : { "match" : "line", "idx" : 6, "captures" : [ "e" ] } }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : { "match" : "line", "idx" : 23, "captures" : [ "e" ] } }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : { "match" : "line", "idx" : 9, "captures" : [ "e" ] } }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : { "match" : "link", "idx" : 9, "captures" : [ "k" ] } }
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : null }
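In the returned result, the idx field is the code point index, not the byte index. To illustrate, consider the following example that uses the regex pattern /tier/: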
db.products.aggregate([
{ $addFields: { returnObject: { $regexFind: { input: "$description", regex: /tier/ } } } }
])
The operation returns the following, where only the last document matches the pattern and the returned idx is 2 (it would be 3 if it were a byte index):
{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : null }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : null }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : null }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : null }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : null }
{ "_id" : 6, "description" : "métier work vocation",
"returnObject" : { "match" : "tier", "idx" : 2, "captures" : [ ] } }
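The gap between the two indexes comes from é occupying one code point but two bytes in UTF-8. The following sketch recomputes both values in plain JavaScript; it assumes the é in the sample string is the precomposed character U+00E9 and that a global TextEncoder is available, as in current Node.js and browsers.

```javascript
// "métier work vocation", written with an explicit precomposed é (U+00E9).
const description = "m\u00e9tier work vocation";

// Position of the match in UTF-16 code units.
const unitIndex = description.indexOf("tier");

// Number of code points that precede the match.
const codePointIndex = Array.from(description.slice(0, unitIndex)).length;

// Number of UTF-8 bytes that precede the match.
const byteIndex = new TextEncoder().encode(description.slice(0, unitIndex)).length;

console.log({ codePointIndex, byteIndex });
```

Here the code point index is 2 and the byte index is 3, matching the values discussed above.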
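**Note:** You cannot specify options in both the regex field and the options field.
To perform case-insensitive pattern matching, include the i option as part of the regex field or in the options field:
// Specify i as part of the regex field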
{ $regexFind: { input: "$description", regex: /line/i } }
// Specify i in the options field
{ $regexFind: { input: "$description", regex: /line/, options: "i" } }
{ $regexFind: { input: "$description", regex: "line", options: "i" } }
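For example, the following aggregation performs a case-insensitive $regexFind on the description field. The regex pattern /line/ does not specify any grouping: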
db.products.aggregate([
{ $addFields: { returnObject: { $regexFind: { input: "$description", regex: /line/i } } } }
])
The operation returns the following results:
{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : { "match" : "LINE", "idx" : 7, "captures" : [ ] } }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : { "match" : "line", "idx" : 6, "captures" : [ ] } }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : { "match" : "line", "idx" : 23, "captures" : [ ] } }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : { "match" : "line", "idx" : 9, "captures" : [ ] } }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : null }
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : null }
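To match the specified anchors (for example ^ and $) on each line of a multiline string, include the m option as part of the regex field or in the options field:
// Specify m as part of the regex field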
{ $regexFind: { input: "$description", regex: /line/m } }
// Specify m in the options field
{ $regexFind: { input: "$description", regex: /line/, options: "m" } }
{ $regexFind: { input: "$description", regex: "line", options: "m" } }
The following example includes both the i and m options to match lines that start with the letter s or S in multiline strings:
db.products.aggregate([
{ $addFields: { returnObject: { $regexFind: { input: "$description", regex: /^s/im } } } }
])
The operation returns the following results:
{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : { "match" : "S", "idx" : 0, "captures" : [ ] } }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : { "match" : "s", "idx" : 12, "captures" : [ ] } }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : null }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : null }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : null }
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : null }
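The effect of the m option on the ^ anchor can be checked in isolation. The sketch below uses JavaScript's RegExp, whose i and m flags behave analogously for this pattern; it is an illustration, not the server's engine, and the three strings are copied from the sample data.

```javascript
// Three of the sample descriptions from the products collection.
const descriptions = [
  "Single LINE description.",
  "First lines\nsecond line",
  "Multiple\nline descriptions",
];

// With m, ^ matches at the start of every line; search() returns -1 for no match.
const withM = descriptions.map((d) => d.search(/^s/im));

// Without m, ^ matches only at the very start of the string.
const withoutM = descriptions.map((d) => d.search(/^s/i));

console.log(withM, withoutM);
```

Only the second string changes between the two runs: its match at offset 12 sits at the start of the second line, which ^ can reach only when m is set.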
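To ignore all unescaped whitespace characters and comments (denoted by an unescaped hash # character and the next newline character) in the pattern, include the x option in the options field:
// Specify x in the options field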
{ $regexFind: { input: "$description", regex: /line/, options: "x" } }
{ $regexFind: { input: "$description", regex: "line", options: "x" } }
The following example includes the x option to skip unescaped whitespace and comments:
db.products.aggregate([
{ $addFields: { returnObject: { $regexFind: { input: "$description", regex: /lin(e|k) # matches line or link/, options:"x" } } } }
])
The operation returns the following results:
{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : null }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : { "match" : "line", "idx" : 6, "captures" : [ "e" ] } }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : { "match" : "line", "idx" : 23, "captures" : [ "e" ] } }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : { "match" : "line", "idx" : 9, "captures" : [ "e" ] } }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : { "match" : "link", "idx" : 9, "captures" : [ "k" ] } }
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : null }
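To allow the dot character (that is, .) in the pattern to match all characters, including newline characters, include the s option in the options field:
// Specify s in the options field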
{ $regexFind: { input: "$description", regex: /m.*line/, options: "s" } }
{ $regexFind: { input: "$description", regex: "m.*line", options: "s" } }
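The following example includes the s option to allow the dot character (that is, .) to match all characters, including newline characters, as well as the i option to perform a case-insensitive match: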
db.products.aggregate([
{ $addFields: { returnObject: { $regexFind: { input: "$description", regex:/m.*line/, options: "si" } } } }
])
The operation returns the following results:
{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : null }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : null }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : { "match" : "Many spaces before     line", "idx" : 0, "captures" : [ ] } }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : { "match" : "Multiple\nline", "idx" : 0, "captures" : [ ] } }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : null }
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : null }
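To see why the s option matters for the document with _id 4, the same pattern can be tried with and without it. This is a sketch using JavaScript's RegExp, which also has an s flag with the same meaning for the dot; it stands in for the server's engine only for illustration.

```javascript
// The description of the document with _id 4.
const text = "Multiple\nline descriptions";

// With s, the dot also matches the newline, so the match spans both lines.
const withS = text.match(/m.*line/si);

// Without s, the dot stops at the newline and no match is found.
const withoutS = text.match(/m.*line/i);

console.log(withS && withS[0], withoutS);
```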
Use the following script to create the feedback collection:
db.feedback.insertMany([
{ "_id" : 1, comment: "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com" },
{ "_id" : 2, comment: "I wanted to concatenate a string" },
{ "_id" : 3, comment: "I can't find how to convert a date to string. cam@mongodb.com" },
{ "_id" : 4, comment: "It's just me. I'm testing. fred@MongoDB.com" }
])
The following aggregation uses $regexFind to extract the email address from the comment field (case-insensitive):
db.feedback.aggregate( [
{ $addFields: {
"email": { $regexFind: { input: "$comment", regex: /[a-z0-9_.+-]+@[a-z0-9_.+-]+\.[a-z0-9_.+-]+/i } }
} },
{ $set: { email: "$email.match"} }
] )
The first stage uses $addFields to add a new field, email, to the document. The new field contains the result of performing $regexFind on the comment field:
{ "_id" : 1, "comment" : "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com", "email" : { "match" : "aunt.arc.tica@example.com", "idx" : 38, "captures" : [ ] } }
{ "_id" : 2, "comment" : "I wanted to concatenate a string", "email" : null }
{ "_id" : 3, "comment" : "I can't find how to convert a date to string. cam@mongodb.com", "email" : { "match" : "cam@mongodb.com", "idx" : 46, "captures" : [ ] } }
{ "_id" : 4, "comment" : "It's just me. I'm testing. fred@MongoDB.com", "email" : { "match" : "fred@MongoDB.com", "idx" : 28, "captures" : [ ] } }
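The second stage uses $set to reset email to the current "$email.match" value. If the current value of email is null, "$email.match" has no value to resolve to; in the output below, the document with _id 2 has no email field: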
{ "_id" : 1, "comment" : "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com", "email" : "aunt.arc.tica@example.com" }
{ "_id" : 2, "comment" : "I wanted to concatenate a string" }
{ "_id" : 3, "comment" : "I can't find how to convert a date to string. cam@mongodb.com", "email" : "cam@mongodb.com" }
{ "_id" : 4, "comment" : "It's just me. I'm testing. fred@MongoDB.com", "email" : "fred@MongoDB.com" }
Use the following script to create the contacts collection:
db.contacts.insertMany([
{ "_id" : 1, name: "Aunt Arc Tikka", details: [ "+672-19-9999", "aunt.arc.tica@example.com" ] },
{ "_id" : 2, name: "Belle Gium", details: [ "+32-2-111-11-11", "belle.gium@example.com" ] },
{ "_id" : 3, name: "Cam Bo Dia", details: [ "+855-012-000-0000", "cam.bo.dia@example.com" ] },
{ "_id" : 4, name: "Fred", details: [ "+1-111-222-3333" ] }
])
The following aggregation uses $regexFind to convert the details array into an embedded document with email and phone fields:
db.contacts.aggregate( [
{ $unwind: "$details" },
{ $addFields: {
"regexemail": { $regexFind: { input: "$details", regex: /^[a-z0-9_.+-]+@[a-z0-9_.+-]+\.[a-z0-9_.+-]+$/, options: "i" } },
"regexphone": { $regexFind: { input: "$details", regex: /^[+]{0,1}[0-9]*\-?[0-9_\-]+$/ } }
} },
{ $project: { _id: 1, name: 1, details: { email: "$regexemail.match", phone: "$regexphone.match" } } },
{ $group: { _id: "$_id", name: { $first: "$name" }, details: { $mergeObjects: "$details"} } },
{ $sort: { _id: 1 } }
])
The first stage, $unwind, unwinds the array into separate documents:
{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : "+672-19-9999" }
{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : "aunt.arc.tica@example.com" }
{ "_id" : 2, "name" : "Belle Gium", "details" : "+32-2-111-11-11" }
{ "_id" : 2, "name" : "Belle Gium", "details" : "belle.gium@example.com" }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : "+855-012-000-0000" }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : "cam.bo.dia@example.com" }
{ "_id" : 4, "name" : "Fred", "details" : "+1-111-222-3333" }
The second stage uses $addFields to add new fields to the document that contain the $regexFind results for the phone number and the email address:
{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : "+672-19-9999", "regexemail" : null, "regexphone" : { "match" : "+672-19-9999", "idx" : 0, "captures" : [ ] } }
{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : "aunt.arc.tica@example.com", "regexemail" : { "match" : "aunt.arc.tica@example.com", "idx" : 0, "captures" : [ ] }, "regexphone" : null }
{ "_id" : 2, "name" : "Belle Gium", "details" : "+32-2-111-11-11", "regexemail" : null, "regexphone" : { "match" : "+32-2-111-11-11", "idx" : 0, "captures" : [ ] } }
{ "_id" : 2, "name" : "Belle Gium", "details" : "belle.gium@example.com", "regexemail" : { "match" : "belle.gium@example.com", "idx" : 0, "captures" : [ ] }, "regexphone" : null }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : "+855-012-000-0000", "regexemail" : null, "regexphone" : { "match" : "+855-012-000-0000", "idx" : 0, "captures" : [ ] } }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : "cam.bo.dia@example.com", "regexemail" : { "match" : "cam.bo.dia@example.com", "idx" : 0, "captures" : [ ] }, "regexphone" : null }
{ "_id" : 4, "name" : "Fred", "details" : "+1-111-222-3333", "regexemail" : null, "regexphone" : { "match" : "+1-111-222-3333", "idx" : 0, "captures" : [ ] } }
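The third stage uses $project to output documents with the _id field, the name field, and the details field. The details field is set to a document with email and phone fields, whose values are taken from the regexemail and regexphone fields, respectively: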
{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : { "phone" : "+672-19-9999" } }
{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : { "email" : "aunt.arc.tica@example.com" } }
{ "_id" : 2, "name" : "Belle Gium", "details" : { "phone" : "+32-2-111-11-11" } }
{ "_id" : 2, "name" : "Belle Gium", "details" : { "email" : "belle.gium@example.com" } }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : { "phone" : "+855-012-000-0000" } }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : { "email" : "cam.bo.dia@example.com" } }
{ "_id" : 4, "name" : "Fred", "details" : { "phone" : "+1-111-222-3333" } }
The fourth stage uses $group to group the input documents by their _id value. The stage uses the $mergeObjects expression to merge the details documents:
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : { "phone" : "+855-012-000-0000", "email" : "cam.bo.dia@example.com" } }
{ "_id" : 4, "name" : "Fred", "details" : { "phone" : "+1-111-222-3333" } }
{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : { "phone" : "+672-19-9999", "email" : "aunt.arc.tica@example.com" } }
{ "_id" : 2, "name" : "Belle Gium", "details" : { "phone" : "+32-2-111-11-11", "email" : "belle.gium@example.com" } }
The fifth stage uses $sort to sort the documents by the _id field:
{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : { "phone" : "+672-19-9999", "email" : "aunt.arc.tica@example.com" } }
{ "_id" : 2, "name" : "Belle Gium", "details" : { "phone" : "+32-2-111-11-11", "email" : "belle.gium@example.com" } }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : { "phone" : "+855-012-000-0000", "email" : "cam.bo.dia@example.com" } }
{ "_id" : 4, "name" : "Fred", "details" : { "phone" : "+1-111-222-3333" } }
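The group, merge, and sort steps can be pictured as ordinary object merging. The following is a rough client-side sketch, not how the server executes the pipeline: it assumes a small hand-copied subset of the $project output and uses Object.assign as a stand-in for $mergeObjects.

```javascript
// A subset of the per-element documents produced by the $project stage.
const projected = [
  { _id: 4, name: "Fred", details: { phone: "+1-111-222-3333" } },
  { _id: 1, name: "Aunt Arc Tikka", details: { phone: "+672-19-9999" } },
  { _id: 1, name: "Aunt Arc Tikka", details: { email: "aunt.arc.tica@example.com" } },
];

// Group by _id, keep the first name seen, and merge the partial details objects.
const byId = new Map();
for (const doc of projected) {
  if (!byId.has(doc._id)) {
    byId.set(doc._id, { _id: doc._id, name: doc.name, details: {} });
  }
  Object.assign(byId.get(doc._id).details, doc.details);
}

// Sort by _id, as the final stage does.
const merged = [...byId.values()].sort((a, b) => a._id - b._id);

console.log(JSON.stringify(merged, null, 2));
```

A contact with only a phone number, such as Fred, ends up with a details document that has no email key.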
Use the following script to create the employees collection:
db.employees.insertMany([
{ "_id" : 1, name: "Aunt Arc Tikka", "email" : "aunt.tica@example.com" },
{ "_id" : 2, name: "Belle Gium", "email" : "belle.gium@example.com" },
{ "_id" : 3, name: "Cam Bo Dia", "email" : "cam.dia@example.com" },
{ "_id" : 4, name: "Fred" }
])
Employee email addresses have the format <firstname>.<lastname>@example.com. Using the captures field returned in the $regexFind result, you can parse out each employee's user name:
db.employees.aggregate( [
{ $addFields: {
"username": { $regexFind: { input: "$email", regex: /^([a-z0-9_.+-]+)@[a-z0-9_.+-]+\.[a-z0-9_.+-]+$/, options: "i" } },
} },
{ $set: { username: { $arrayElemAt: [ "$username.captures", 0 ] } } }
] )
The first stage uses $addFields to add a new field, username, to the document. The new field contains the result of performing $regexFind on the email field:
{ "_id" : 1, "name" : "Aunt Arc Tikka", "email" : "aunt.tica@example.com", "username" : { "match" : "aunt.tica@example.com", "idx" : 0, "captures" : [ "aunt.tica" ] } }
{ "_id" : 2, "name" : "Belle Gium", "email" : "belle.gium@example.com", "username" : { "match" : "belle.gium@example.com", "idx" : 0, "captures" : [ "belle.gium" ] } }
{ "_id" : 3, "name" : "Cam Bo Dia", "email" : "cam.dia@example.com", "username" : { "match" : "cam.dia@example.com", "idx" : 0, "captures" : [ "cam.dia" ] } }
{ "_id" : 4, "name" : "Fred", "username" : null }
The second stage uses $set to reset username to the zeroth element of the "$username.captures" array. If the current value of username is null, the new value of username is set to null:
{ "_id" : 1, "name" : "Aunt Arc Tikka", "email" : "aunt.tica@example.com", "username" : "aunt.tica" }
{ "_id" : 2, "name" : "Belle Gium", "email" : "belle.gium@example.com", "username" : "belle.gium" }
{ "_id" : 3, "name" : "Cam Bo Dia", "email" : "cam.dia@example.com", "username" : "cam.dia" }
{ "_id" : 4, "name" : "Fred", "username" : null }