
jsonpath: Processing JSON Data with Python


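25.1 Introduction to JSON

25.1.1 What Is JSON

    JSON stands for JavaScript Object Notation. It is a lightweight data-interchange format based on a subset of ECMAScript, and it stores and represents data as text in a way that is completely independent of any programming language. Its concise, clear hierarchical structure makes JSON an ideal data-interchange language. Its main characteristics are that it is easy for people to read, easy for machines to parse and generate, and compact enough to make network transfer more efficient.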

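25.1.2 The Two Structures of JSON

    Put simply, JSON can be understood as the arrays and objects of JavaScript. Combining these two structures makes it possible to represent all kinds of complex data.

25.1.2.1 Arrays

    In JavaScript, an array is defined with square brackets [ ]. A typical definition looks like this: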

let array=["Surpass","28","Shanghai"];

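    To read a value from an array, use its index. An element can be a number, a string, an array, an object, and so on.

25.1.2.2 Objects

    In JavaScript, an object is defined with curly braces { }. A typical definition looks like this: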

let personInfo={
    name:"Surpass",
    age:28,
    location:"Shanghai"
}

    An object is built from key/value pairs. In JavaScript, reading a value is simple: variable.key is enough. A value can be a number, a string, an array, an object, and so on.
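The same two structures can also be read from Python. The following is a minimal sketch, assuming the array and object above are written as JSON text; note that JSON requires quoted keys, which the JavaScript literal does not. The variable names are illustrative only.

```python
import json

# The JavaScript examples above, rewritten as JSON text.
# Unlike a JavaScript object literal, JSON requires double-quoted keys.
array_text = '["Surpass", "28", "Shanghai"]'
object_text = '{"name": "Surpass", "age": 28, "location": "Shanghai"}'

array = json.loads(array_text)         # JSON array  -> Python list
person_info = json.loads(object_text)  # JSON object -> Python dict

print(array[0])             # arrays are read by index
print(person_info["name"])  # objects are read by key
```

In Python the object becomes a dict, so values are read with person_info["name"] rather than the dotted variable.key form used in JavaScript.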

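25.1.3 Supported Data Types

    JSON supports the following main data types:

  • Arrays: written with square brackets
  • Objects: written with curly braces
  • Integers, floating-point numbers, booleans, and null
  • Strings: must use double quotes; single quotes are not allowed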
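The double-quote rule is easy to check with Python's built-in json module. This is a small sketch with made-up sample strings; it only shows that the standard parser rejects single-quoted text.

```python
import json

valid_text = '{"name": "Surpass"}'    # double-quoted key and value
invalid_text = "{'name': 'Surpass'}"  # single quotes are not valid JSON

print(json.loads(valid_text))

try:
    json.loads(invalid_text)
    single_quotes_accepted = True
except json.JSONDecodeError as error:
    # The parser refuses the text instead of guessing what was meant.
    single_quotes_accepted = False
    print(f"rejected: {error.msg}")
```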

    Multiple values are separated by commas. The mapping between JSON types and Python data types is shown below:

JSON            Python
object          dict
array           list
string          str
number (int)    int
number (real)   float
true            True
false           False
null            None
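The mapping can be observed directly by parsing a small document and printing the resulting Python types. The sample text below is invented for illustration.

```python
import json

text = (
    '{"obj": {"k": "v"}, "arr": [1, 2], "s": "text", '
    '"i": 28, "f": 8.95, "t": true, "fl": false, "n": null}'
)
parsed = json.loads(text)

# Print the Python type that each JSON value was converted to.
for key, value in parsed.items():
    print(f"{key}: {type(value).__name__}")
```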

25.2 JSON Support in Python

25.2.1 Python and JSON Data Types

    In Python, JSON data is mainly handled with the json module. Import it before use, as follows:

import json

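    The json module mainly provides four functions: json.dumps and json.loads convert between Python objects and JSON strings, while json.dump and json.load do the same with file objects.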
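As a minimal sketch of the json module's four core functions (dumps, loads, dump, load), the example below round-trips one dictionary through a string and through a file-like object. io.StringIO is used here only as a stand-in for a real file, and the sample data is illustrative.

```python
import io
import json

person = {"name": "Surpass", "age": 28, "location": "Shanghai"}

# dumps / loads work with strings.
text = json.dumps(person)
from_text = json.loads(text)

# dump / load work with file-like objects; io.StringIO stands in for a real file.
buffer = io.StringIO()
json.dump(person, buffer)
buffer.seek(0)  # rewind so that load() reads from the beginning
from_file = json.load(buffer)

print(text)
print(from_text == person, from_file == person)
```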

    While JSON is being processed, Python's native types and JSON types are converted into each other. The conversion tables are as follows:

  • Python to JSON

Python          JSON
dict            object
list            array
tuple           array
str             string
int             number
float           number
True            true
False           false
None            null

  • JSON to Python

JSON            Python
object          dict
array           list
string          str
number (int)    int
number (real)   float
true            True
false           False
null            None
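The two tables are not mirror images: both list and tuple become a JSON array, and an array always comes back as a list. A short sketch with an invented record shows the effect.

```python
import json

record = {
    "tags": ("a", "b"),   # tuple -> JSON array
    "active": True,       # True  -> true
    "deleted": False,     # False -> false
    "note": None,         # None  -> null
    "price": 8.95,        # float -> number
}

text = json.dumps(record)
print(text)

restored = json.loads(text)
# The round trip is not symmetric: the tuple comes back as a list.
print(type(restored["tags"]).__name__)
```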

25.2.2 Common Methods of the json Module

    For more on Python's built-in json module, see my earlier article: https://www.cnblogs.com/surpassme/p/13034972.html

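25.3 Processing JSON Data with JSONPath

    The built-in json module is easy to use and very convenient for simple JSON data, but it becomes laborious once the data is complex and very large. Here we use a third-party tool called JSONPath to process JSON data.

25.3.1 What Is JSONPath

    JSONPath is an expression language for querying JSON data. It is often used to parse and process deeply nested JSON data, and its usage closely resembles XPath, the expression language for querying XML data.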
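To see what such an expression language saves, here is roughly what a recursive search looks like when written by hand with only the standard library. This is an illustrative sketch, not how any JSONPath library is implemented; find_all is a hypothetical helper and the data is a small made-up document.

```python
from typing import Any, Iterator


def find_all(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key`, at any depth of nested dicts and lists."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            # Keep descending: the value itself may contain further matches.
            yield from find_all(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_all(item, key)


data = {
    "store": {
        "book": [
            {"author": "Nigel Rees", "price": 8.95},
            {"author": "Evelyn Waugh", "price": 12.99},
        ],
        "bicycle": {"color": "red", "price": 19.95},
    }
}

authors = list(find_all(data, "author"))
prices = list(find_all(data, "price"))
print(authors)
print(prices)
```

A JSONPath expression such as $..author expresses the same search in a single string, which is the convenience the rest of this section relies on.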

25.3.2 Installation

    Install it as follows:

# pip install -U jsonpath

25.3.3 JSONPath Syntax

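    JSONPath syntax is very similar to XPath. The two are compared in the table below:

XPath   JSONPath            Description
/       $                   Root node/element
.       @                   Current node/element
/       . or []             Child element
..      n/a                 Parent element
//      ..                  Recursive descent: searches child elements at any depth
*       *                   Wildcard; matches all elements
@       n/a                 Attribute access; JSON data has no attributes
[]      []                  Subscript operator (supports simple iteration inside it, such as array indexes or selecting by content)
|       [,]                 Selects multiple items inside a subscript
n/a     [start:end:step]    Array slice operator
[]      ?()                 Filter expression
n/a     ()                  Script expression evaluation
()      n/a                 Grouping; not supported by JSONPath

For details, see the official documentation: JSONPath - XPath for JSON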

    We use the following sample data to compare the two:

{ "store":
  {
    "book": [
      { "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      { "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      },
      { "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      { "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 19.95
    }
  }
}

XPath                   JSONPath                                  Result
/store/book/author      $.store.book[*].author                    The author of every book under the book node
//author                $..author                                 All authors
/store/*                $.store.*                                 The elements of store, that is, book and bicycle
/store//price           $.store..price                            All prices in store
//book[3]               $..book[2]                                All information about the third book
//book[last()]          $..book[(@.length-1)] or $..book[-1:]     The last book
//book[position()<3]    $..book[0,1] or $..book[:2]               The first two books
//book[isbn]            $..book[?(@.isbn)]                        Books filtered by the presence of isbn
//book[price<10]        $..book[?(@.price<10)]                    Books filtered by price
//*                     $..*                                      All elements

Indexes start at 1 in XPath, but at 0 in JSONPath.
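JSONPath's zero-based indexes, slices, and filters line up closely with ordinary Python list operations. The sketch below is only an analogy on a flattened copy of the book list, not how a JSONPath library evaluates expressions; the matching JSONPath expression is given in each comment.

```python
books = [
    {"title": "Sayings of the Century", "price": 8.95},
    {"title": "Sword of Honour", "price": 12.99},
    {"title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99},
    {"title": "The Lord of the Rings", "isbn": "0-395-19395-8", "price": 22.99},
]

third_book = books[2]                           # $..book[2]  (XPath: //book[3])
last_book = books[-1:]                          # $..book[-1:]
first_two = books[:2]                           # $..book[:2]
with_isbn = [b for b in books if "isbn" in b]   # $..book[?(@.isbn)]
cheap = [b for b in books if b["price"] < 10]   # $..book[?(@.price<10)]

print(third_book["title"])
print([b["title"] for b in cheap])
```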

Online JSONPath practice site: JSONPath Online Evaluator

25.3.4 Using JSONPath

    The basic call form is as follows:

jsonPath(obj, expr [, args])

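    Its parameters are as follows:

  • obj (object|array):

    The JSON data object

  • expr (string):

    The JSONPath expression

  • args (object|undefined):

    Changes the output format, for example whether values or paths are returned.

The output formats available for args.resultType are "VALUE", "PATH", and "IPATH".

  • Return type (array|false):

    An array means the expression matched data; false means nothing matched.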
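Because a failed match yields false rather than an empty array, code that loops over the result needs a guard. The following is a minimal sketch; as_list is a hypothetical helper (not part of any library), and the literal values merely stand in for what a query call would return.

```python
from typing import Any, List, Union


def as_list(result: Union[List[Any], bool]) -> List[Any]:
    """Turn a match result (a list, or False when nothing matched) into a list."""
    return result if isinstance(result, list) else []


# With the jsonpath package installed, usage would look like:
#     authors = as_list(jsonpath(data, "$..author"))
matched = as_list(["Nigel Rees", "Evelyn Waugh"])  # stands in for a successful match
unmatched = as_list(False)                         # stands in for a failed match

for author in matched + unmatched:
    print(author)
```

Normalizing once at the call site keeps the rest of the code free of repeated checks for False.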

25.3.5 Usage in Python

from jsonpath import jsonpath
import json

data = {
    "store":
    {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    }
}

# Get the author of every book under the book node
getAllBookAuthor=jsonpath(data,"$.store.book[*].author")
print(f"getAllBookAuthor is :{json.dumps(getAllBookAuthor,indent=4)}")
# Get all authors, wherever they appear
getAllAuthor=jsonpath(data,"$..author")
print(f"getAllAuthor is {json.dumps(getAllAuthor,indent=4)}")
# Get the elements of store, that is, book and bicycle
getAllStoreElement=jsonpath(data,"$.store.*")
print(f"getAllStoreElement is {json.dumps(getAllStoreElement,indent=4)}")
# Get all prices in store (two equivalent spellings)
getAllStorePriceA=jsonpath(data,"$[store]..price")
getAllStorePriceB=jsonpath(data,"$.store..price")
print(f"getAllStorePrictA is {getAllStorePriceA}\ngetAllStorePriceB is {getAllStorePriceB}")
# Get all information about the third book
getThirdBookInfo=jsonpath(data,"$..book[2]")
print(f"getThirdBookInfo is {json.dumps(getThirdBookInfo,indent=4)}")
# Get the last book
getLastBookInfo=jsonpath(data,"$..book[-1:]")
print(f"getLastBookInfo is {json.dumps(getLastBookInfo,indent=4)}")
# Get the first two books
getFirstAndSecondBookInfo=jsonpath(data,"$..book[:2]")
print(f"getFirstAndSecondBookInfo is {json.dumps(getFirstAndSecondBookInfo,indent=4)}")
# Filter by the presence of isbn
getWithFilterISBN=jsonpath(data,"$..book[?(@.isbn)]")
print(f"getWithFilterISBN is {json.dumps(getWithFilterISBN,indent=4)}")
# Filter by price
getWithFilterPrice=jsonpath(data,"$..book[?(@.price<10)]")
print(f"getWithFilterPrice is {json.dumps(getWithFilterPrice,indent=4)}")
# All elements
getAllElement=jsonpath(data,"$..*")
print(f"getAllElement is {json.dumps(getAllElement,indent=4)}")
# When no element matches
noMatchElement=jsonpath(data,"$..surpass")
print(f"noMatchElement is {noMatchElement}")
# Adjust the output format: return paths instead of values
controlleOutput=jsonpath(data,expr="$..author",result_type="PATH")
print(f"controlleOutput is {json.dumps(controlleOutput,indent=4)}")

    The final output is as follows:

getAllBookAuthor is :[
    "Nigel Rees",
    "Evelyn Waugh",
    "Herman Melville",
    "J. R. R. Tolkien"
]
getAllAuthor is [
    "Nigel Rees",
    "Evelyn Waugh",
    "Herman Melville",
    "J. R. R. Tolkien"
]
getAllStoreElement is [
    [
        {
            "category": "reference",
            "author": "Nigel Rees",
            "title": "Sayings of the Century",
            "price": 8.95
        },
        {
            "category": "fiction",
            "author": "Evelyn Waugh",
            "title": "Sword of Honour",
            "price": 12.99
        },
        {
            "category": "fiction",
            "author": "Herman Melville",
            "title": "Moby Dick",
            "isbn": "0-553-21311-3",
            "price": 8.99
        },
        {
            "category": "fiction",
            "author": "J. R. R. Tolkien",
            "title": "The Lord of the Rings",
            "isbn": "0-395-19395-8",
            "price": 22.99
        }
    ],
    {
        "color": "red",
        "price": 19.95
    }
]
getAllStorePrictA is [8.95, 12.99, 8.99, 22.99, 19.95]
getAllStorePriceB is [8.95, 12.99, 8.99, 22.99, 19.95]
getThirdBookInfo is [
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    }
]
getLastBookInfo is [
    {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
    }
]
getFirstAndSecondBookInfo is [
    {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
    },
    {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
    }
]
getWithFilterISBN is [
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    },
    {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
    }
]
getWithFilterPrice is [
    {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
    },
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    }
]
getAllElement is [
    {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    },
    [
        {
            "category": "reference",
            "author": "Nigel Rees",
            "title": "Sayings of the Century",
            "price": 8.95
        },
        {
            "category": "fiction",
            "author": "Evelyn Waugh",
            "title": "Sword of Honour",
            "price": 12.99
        },
        {
            "category": "fiction",
            "author": "Herman Melville",
            "title": "Moby Dick",
            "isbn": "0-553-21311-3",
            "price": 8.99
        },
        {
            "category": "fiction",
            "author": "J. R. R. Tolkien",
            "title": "The Lord of the Rings",
            "isbn": "0-395-19395-8",
            "price": 22.99
        }
    ],
    {
        "color": "red",
        "price": 19.95
    },
    {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
    },
    {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
    },
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    },
    {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
    },
    "reference",
    "Nigel Rees",
    "Sayings of the Century",
    8.95,
    "fiction",
    "Evelyn Waugh",
    "Sword of Honour",
    12.99,
    "fiction",
    "Herman Melville",
    "Moby Dick",
    "0-553-21311-3",
    8.99,
    "fiction",
    "J. R. R. Tolkien",
    "The Lord of the Rings",
    "0-395-19395-8",
    22.99,
    "red",
    19.95
]
noMatchElement is False
controlleOutput is [
    "$['store']['book'][0]['author']",
    "$['store']['book'][1]['author']",
    "$['store']['book'][2]['author']",
    "$['store']['book'][3]['author']"
]
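
The path strings in the last result can be turned back into values with a little standard-library code. This is a sketch under stated assumptions: path_to_keys and resolve are hypothetical helpers, the path format is taken to be exactly the bracketed form printed above, and keys are assumed to contain no single quotes.

```python
import re
from typing import Any, List, Union


def path_to_keys(path: str) -> List[Union[str, int]]:
    """Split a path such as "$['store']['book'][0]['author']" into its steps."""
    keys: List[Union[str, int]] = []
    # Each step is either ['name'] (a dict key) or [0] (a list index).
    for name, index in re.findall(r"\['([^']*)'\]|\[(\d+)\]", path):
        keys.append(int(index) if index else name)
    return keys


def resolve(data: Any, path: str) -> Any:
    """Follow the steps of a bracketed path through nested dicts and lists."""
    node = data
    for key in path_to_keys(path):
        node = node[key]
    return node


data = {"store": {"book": [{"author": "Nigel Rees"}, {"author": "Evelyn Waugh"}]}}
path = "$['store']['book'][1]['author']"

print(path_to_keys(path))
print(resolve(data, path))
```

Keeping paths rather than values is useful when a match has to be located again later, for example to update it in place.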