
A Python 3 Crawler Example: Scraping Douban Movies with Regular Expressions


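I have recently been learning to write web crawlers in Python 3 and have been reading the book 《Python3网络爬虫开发实战》 (this is not an advertisement). One of its examples scrapes data from X眼电影, so today I tried a hands-on exercise of my own against the Douban Top 250 list.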

I mainly relied on the following references:

1. Book: 《Python3网络爬虫开发实战》

2. Blog: https://blog.csdn.net/skrskr66/article/details/85228193?utm_medium=distribute.pc_relevant.none-task-blog-baidujs-3

If anything here infringes your rights, please contact me. Thank you.

Set up the environment yourself. Without further ado, here is the code.

Development tool: PyCharm

Runtime environment: macOS, Python 3

1. Import the required libraries

    import re        # standard library: regular expressions
    import requests  # third-party: HTTP client

If the import raises an error, first check that the corresponding package is installed in the project's interpreter.
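
One way to run that check from code rather than from the IDE is sketched here. It is not part of the original post: the helper name `missing_packages` is hypothetical, and it only tests whether the current interpreter can locate each top-level package.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The two modules this post imports; `re` ships with Python itself.
    missing = missing_packages(["requests", "re"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All packages are importable.")
```

If `requests` is reported as missing, install it into the same interpreter that PyCharm uses for the project.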

 

2. Set the request headers and handle the response status code

    def get_douban_top(url):
        # Send a browser-like User-Agent; sites often reject the library default.
        headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
        }
        response = requests.get(url, headers=headers)
        # Return the page text only on HTTP 200; otherwise return None.
        if response.status_code == 200:
            return response.text
        return None

 

P.S. The request header can simply be copied from the browser while viewing the page.
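
The function in this step returns None only for a non-200 status, so a network failure would raise an exception, and without a timeout the call can wait indefinitely. The sketch here shows one defensive variant. It is an assumption-laden illustration, not the author's method: it swaps in the standard library's `urllib.request` so that it has no third-party dependency, and the name `fetch_html` and its `opener` parameter are hypothetical.

```python
import urllib.request

# Same User-Agent string as in the post.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36"
)


def fetch_html(url, timeout=10, opener=urllib.request.urlopen):
    """Return the page text, or None if the request fails.

    `opener` is a hypothetical hook added for this sketch so that the
    function can be exercised without network access.
    """
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        with opener(request, timeout=timeout) as response:
            if response.status != 200:
                return None
            return response.read().decode("utf-8")
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return None


if __name__ == "__main__":
    html = fetch_html("https://movie.douban.com/top250")
    print("fetched" if html is not None else "request failed")
```

With `requests`, a comparable effect should be achievable by passing `timeout=` to `requests.get` and catching `requests.RequestException`.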

 

3. Call main and print the result

    def main():
        url = 'https://movie.douban.com/top250'
        html = get_douban_top(url)
        print(html)

    if __name__ == '__main__':
        main()
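
Once the page text is available, a natural next step is to pull fields such as the director out of it with `re`. The sketch here is not from the original post. The pattern and the name `parse_movies` are hypothetical, written against an abridged fragment of the first entry in the page output reproduced later in this post, and would likely need adjusting if the markup changes.

```python
import re

# Abridged fragment of the first movie entry from the page output.
SAMPLE_HTML = """
<li>
<div class="item">
<div class="pic">
<em class="">1</em>
<a href="https://movie.douban.com/subject/1292052/">
</a>
</div>
<div class="info">
<div class="hd">
<a href="https://movie.douban.com/subject/1292052/" class="">
<span class="title">肖申克的救赎</span>
<span class="title">&nbsp;/&nbsp;The Shawshank Redemption</span>
</a>
</div>
<div class="bd">
<p class="">
导演: 弗兰克·德拉邦特 Frank Darabont&nbsp;&nbsp;&nbsp;主演: 蒂姆·罗宾斯 Tim Robbins /...<br>
1994&nbsp;/&nbsp;美国&nbsp;/&nbsp;犯罪 剧情
</p>
<div class="star">
<span class="rating_num" property="v:average">9.7</span>
</div>
</div>
</div>
</div>
</li>
"""

# Hypothetical pattern; re.S lets "." match across line breaks.
MOVIE_PATTERN = re.compile(
    r'<em class="">(\d+)</em>'                         # rank
    r'.*?<span class="title">(.*?)</span>'             # first title span
    r'.*?导演:\s*(.*?)&nbsp;'                           # director, up to the first &nbsp;
    r'.*?<span class="rating_num"[^>]*>(.*?)</span>',  # rating
    re.S,
)


def parse_movies(html):
    """Yield one dict per movie entry found in the HTML text."""
    for rank, title, director, rating in MOVIE_PATTERN.findall(html):
        yield {
            "rank": int(rank),
            "title": title.strip(),
            "director": director.strip(),
            "rating": float(rating),
        }


if __name__ == "__main__":
    for movie in parse_movies(SAMPLE_HTML):
        print(movie)
```

The lazy `.*?` groups keep each match inside a single entry, so the same function could be fed the full text returned by `get_douban_top`. For entries that list several directors, the captured string would contain all of them.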

 

Complete code:

    import re
    import requests


    def get_douban_top(url):
        # Send a browser-like User-Agent; sites often reject the library default.
        headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
        }
        response = requests.get(url, headers=headers)
        # Return the page text only on HTTP 200; otherwise return None.
        if response.status_code == 200:
            return response.text
        return None


    def main():
        url = 'https://movie.douban.com/top250'
        html = get_douban_top(url)
        print(html)


    if __name__ == '__main__':
        main()

 

Output:

  1. /Users/wubaihua/PythonThreewDemo/bin/python /Users/wubaihua/Desktop/Dev/PythonProject/PythonThreewDemo/DouBanTop.py
  2. <!DOCTYPE html>
  3. <html lang="zh-CN" class="ua-mac ua-webkit">
  4. <head>
  5. <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  6. <meta name="renderer" content="webkit">
  7. <meta name="referrer" content="always">
  8. <meta name="google-site-verification" content="ok0wCgT20tBBgo9_zat2iAcimtN4Ftf5ccsh092Xeyw" />
  9. <title>
  10. 豆瓣电影 Top 250
  11. </title>
  12. <meta name="baidu-site-verification" content="cZdR4xxR7RxmM4zE" />
  13. <meta http-equiv="Pragma" content="no-cache">
  14. <meta http-equiv="Expires" content="Sun, 6 Mar 2005 01:00:00 GMT">
  15. <link rel="apple-touch-icon" href="https://img3.doubanio.com/f/movie/d59b2715fdea4968a450ee5f6c95c7d7a2030065/pics/movie/apple-touch-icon.png">
  16. <link href="https://img3.doubanio.com/f/shire/3e5dfc68b0f376484c50cf08a58bbca3700911dc/css/douban.css" rel="stylesheet" type="text/css">
  17. <link href="https://img3.doubanio.com/f/shire/ae3f5a3e3085968370b1fc63afcecb22d3284848/css/separation/_all.css" rel="stylesheet" type="text/css">
  18. <link href="https://img3.doubanio.com/f/movie/8864d3756094f5272d3c93e30ee2e324665855b0/css/movie/base/init.css" rel="stylesheet">
  19. <script type="text/javascript">var _head_start = new Date();</script>
  20. <script type="text/javascript" src="https://img3.doubanio.com/f/movie/0495cb173e298c28593766009c7b0a953246c5b5/js/movie/lib/jquery.js"></script>
  21. <script type="text/javascript" src="https://img3.doubanio.com/f/shire/5ecaf46d6954d5a30bc7d99be86ae34031646e00/js/douban.js"></script>
  22. <script type="text/javascript" src="https://img3.doubanio.com/f/shire/0efdc63b77f895eaf85281fb0e44d435c6239a3f/js/separation/_all.js"></script>
  23. <link href="https://img3.doubanio.com/f/movie/2c95f768ea74284b900c04c0209b0a44f0a0de52/css/movie/top_movies.css" rel="stylesheet" type="text/css" />
  24. <script type="text/javascript" src="https://img3.doubanio.com/f/shire/77323ae72a612bba8b65f845491513ff3329b1bb/js/do.js" data-cfg-autoload="false"></script>
  25. <script type='text/javascript'>
  26. Do.ready(function(){
  27. $("#mine-selector input[type='checkbox']").click(function(){
  28. var val = $(this).is(":checked")?$(this).val():"";
  29. window.location.href = '/top250?filter=' + val;
  30. })
  31. })
  32. </script>
  33. <style type="text/css">
  34. .site-nav-logo img{margin-bottom:0;}
  35. </style>
  36. <style type="text/css">img { max-width: 100%; }</style>
  37. <script type="text/javascript"></script>
  38. <link rel="stylesheet" href="https://img3.doubanio.com/misc/mixed_static/562925b5e3824700.css">
  39. <link rel="shortcut icon" href="https://img3.doubanio.com/favicon.ico" type="image/x-icon">
  40. </head>
  41. <body>
  42. <script type="text/javascript">var _body_start = new Date();</script>
  43. <link href="//img3.doubanio.com/dae/accounts/resources/1f8d15e/shire/bundle.css" rel="stylesheet" type="text/css">
  44. <div id="db-global-nav" class="global-nav">
  45. <div class="bd">
  46. <div class="top-nav-info">
  47. <a href="https://accounts.douban.com/passport/login?source=movie" class="nav-login" rel="nofollow">登录/注册</a>
  48. </div>
  49. <div class="top-nav-doubanapp">
  50. <a href="https://www.douban.com/doubanapp/app?channel=top-nav" class="lnk-doubanapp">下载豆瓣客户端</a>
  51. <div id="doubanapp-tip">
  52. <a href="https://www.douban.com/doubanapp/app?channel=qipao" class="tip-link">豆瓣 <span class="version">6.0</span> 全新发布</a>
  53. <a href="javascript: void 0;" class="tip-close">×</a>
  54. </div>
  55. <div id="top-nav-appintro" class="more-items">
  56. <p class="appintro-title">豆瓣</p>
  57. <p class="qrcode">扫码直接下载</p>
  58. <div class="download">
  59. <a href="https://www.douban.com/doubanapp/redirect?channel=top-nav&direct_dl=1&download=iOS">iPhone</a>
  60. <span>·</span>
  61. <a href="https://www.douban.com/doubanapp/redirect?channel=top-nav&direct_dl=1&download=Android" class="download-android">Android</a>
  62. </div>
  63. </div>
  64. </div>
  65. <div class="global-nav-items">
  66. <ul>
  67. <li class="">
  68. <a href="https://www.douban.com" target="_blank" data-moreurl-dict="{&quot;from&quot;:&quot;top-nav-click-main&quot;,&quot;uid&quot;:&quot;0&quot;}">豆瓣</a>
  69. </li>
  70. <li class="">
  71. <a href="https://book.douban.com" target="_blank" data-moreurl-dict="{&quot;from&quot;:&quot;top-nav-click-book&quot;,&quot;uid&quot;:&quot;0&quot;}">读书</a>
  72. </li>
  73. <li class="on">
  74. <a href="https://movie.douban.com" data-moreurl-dict="{&quot;from&quot;:&quot;top-nav-click-movie&quot;,&quot;uid&quot;:&quot;0&quot;}">电影</a>
  75. </li>
  76. <li class="">
  77. <a href="https://music.douban.com" target="_blank" data-moreurl-dict="{&quot;from&quot;:&quot;top-nav-click-music&quot;,&quot;uid&quot;:&quot;0&quot;}">音乐</a>
  78. </li>
  79. <li class="">
  80. <a href="https://www.douban.com/location" target="_blank" data-moreurl-dict="{&quot;from&quot;:&quot;top-nav-click-location&quot;,&quot;uid&quot;:&quot;0&quot;}">同城</a>
  81. </li>
  82. <li class="">
  83. <a href="https://www.douban.com/group" target="_blank" data-moreurl-dict="{&quot;from&quot;:&quot;top-nav-click-group&quot;,&quot;uid&quot;:&quot;0&quot;}">小组</a>
  84. </li>
  85. <li class="">
  86. <a href="https://read.douban.com&#47;?dcs=top-nav&amp;dcm=douban" target="_blank" data-moreurl-dict="{&quot;from&quot;:&quot;top-nav-click-read&quot;,&quot;uid&quot;:&quot;0&quot;}">阅读</a>
  87. </li>
  88. <li class="">
  89. <a href="https://douban.fm&#47;?from_=shire_top_nav" target="_blank" data-moreurl-dict="{&quot;from&quot;:&quot;top-nav-click-fm&quot;,&quot;uid&quot;:&quot;0&quot;}">FM</a>
  90. </li>
  91. <li class="">
  92. <a href="https://time.douban.com&#47;?dt_time_source=douban-web_top_nav" target="_blank" data-moreurl-dict="{&quot;from&quot;:&quot;top-nav-click-time&quot;,&quot;uid&quot;:&quot;0&quot;}">时间</a>
  93. </li>
  94. <li class="">
  95. <a href="https://market.douban.com&#47;?utm_campaign=douban_top_nav&amp;utm_source=douban&amp;utm_medium=pc_web" target="_blank" data-moreurl-dict="{&quot;from&quot;:&quot;top-nav-click-market&quot;,&quot;uid&quot;:&quot;0&quot;}">豆品</a>
  96. </li>
  97. </ul>
  98. </div>
  99. </div>
  100. </div>
  101. <script>
  102. ;window._GLOBAL_NAV = {
  103. DOUBAN_URL: "https://www.douban.com",
  104. N_NEW_NOTIS: 0,
  105. N_NEW_DOUMAIL: 0
  106. };
  107. </script>
  108. <script src="//img3.doubanio.com/dae/accounts/resources/1f8d15e/shire/bundle.js" defer="defer"></script>
  109. <link href="//img3.doubanio.com/dae/accounts/resources/1f8d15e/movie/bundle.css" rel="stylesheet" type="text/css">
  110. <div id="db-nav-movie" class="nav">
  111. <div class="nav-wrap">
  112. <div class="nav-primary">
  113. <div class="nav-logo">
  114. <a href="https:&#47;&#47;movie.douban.com">豆瓣电影</a>
  115. </div>
  116. <div class="nav-search">
  117. <form action="https:&#47;&#47;search.douban.com&#47;movie/subject_search" method="get">
  118. <fieldset>
  119. <legend>搜索:</legend>
  120. <label for="inp-query">
  121. </label>
  122. <div class="inp"><input id="inp-query" name="search_text" size="22" maxlength="60" placeholder="搜索电影、电视剧、综艺、影人" value=""></div>
  123. <div class="inp-btn"><input type="submit" value="搜索"></div>
  124. <input type="hidden" name="cat" value="1002" />
  125. </fieldset>
  126. </form>
  127. </div>
  128. </div>
  129. </div>
  130. <div class="nav-secondary">
  131. <div class="nav-items">
  132. <ul>
  133. <li ><a href="https://movie.douban.com/cinema/nowplaying/"
  134. >影讯&购票</a>
  135. </li>
  136. <li ><a href="https://movie.douban.com/explore"
  137. >选电影</a>
  138. </li>
  139. <li ><a href="https://movie.douban.com/tv/"
  140. >电视剧</a>
  141. </li>
  142. <li ><a href="https://movie.douban.com/chart"
  143. >排行榜</a>
  144. </li>
  145. <li ><a href="https://movie.douban.com/tag/"
  146. >分类</a>
  147. </li>
  148. <li ><a href="https://movie.douban.com/review/best/"
  149. >影评</a>
  150. </li>
  151. <li ><a href="https://movie.douban.com/annual/2019?source=navigation"
  152. >2019年度榜单</a>
  153. </li>
  154. <li ><a href="https://m.douban.com/standbyme/annual2019?source=navigation"
  155. target="_blank"
  156. >2019书影音报告</a>
  157. </li>
  158. </ul>
  159. </div>
  160. <a href="https://movie.douban.com/annual/2019?source=movie_navigation" class="movieannual"></a>
  161. </div>
  162. </div>
  163. <script id="suggResult" type="text/x-jquery-tmpl">
  164. <li data-link="{{= url}}">
  165. <a href="{{= url}}" onclick="moreurl(this, {from:'movie_search_sugg', query:'{{= keyword }}', subject_id:'{{= id}}', i: '{{= index}}', type: '{{= type}}'})">
  166. <img src="{{= img}}" width="40" />
  167. <p>
  168. <em>{{= title}}</em>
  169. {{if year}}
  170. <span>{{= year}}</span>
  171. {{/if}}
  172. {{if sub_title}}
  173. <br /><span>{{= sub_title}}</span>
  174. {{/if}}
  175. {{if address}}
  176. <br /><span>{{= address}}</span>
  177. {{/if}}
  178. {{if episode}}
  179. {{if episode=="unknow"}}
  180. <br /><span>集数未知</span>
  181. {{else}}
  182. <br /><span>{{= episode}}</span>
  183. {{/if}}
  184. {{/if}}
  185. </p>
  186. </a>
  187. </li>
  188. </script>
  189. <script src="//img3.doubanio.com/dae/accounts/resources/1f8d15e/movie/bundle.js" defer="defer"></script>
  190. <div id="wrapper">
  191. <div id="content">
  192. <h1>豆瓣电影 Top 250</h1>
  193. <div class="grid-16-8 clearfix">
  194. <div class="article">
  195. <div class="opt mod">
  196. <div class="tabs">
  197. </div>
  198. <span id="mine-selector">
  199. <input type="checkbox" value="unwatched">我没看过的
  200. </span>
  201. </div>
  202. <ol class="grid_view">
  203. <li>
  204. <div class="item">
  205. <div class="pic">
  206. <em class="">1</em>
  207. <a href="https://movie.douban.com/subject/1292052/">
  208. <img width="100" alt="肖申克的救赎" src="https://img3.doubanio.com/view/photo/s_ratio_poster/public/p480747492.jpg" class="">
  209. </a>
  210. </div>
  211. <div class="info">
  212. <div class="hd">
  213. <a href="https://movie.douban.com/subject/1292052/" class="">
  214. <span class="title">肖申克的救赎</span>
  215. <span class="title">&nbsp;/&nbsp;The Shawshank Redemption</span>
  216. <span class="other">&nbsp;/&nbsp;月黑高飞(港) / 刺激1995(台)</span>
  217. </a>
  218. <span class="playable">[可播放]</span>
  219. </div>
  220. <div class="bd">
  221. <p class="">
  222. 导演: 弗兰克·德拉邦特 Frank Darabont&nbsp;&nbsp;&nbsp;主演: 蒂姆·罗宾斯 Tim Robbins /...<br>
  223. 1994&nbsp;/&nbsp;美国&nbsp;/&nbsp;犯罪 剧情
  224. </p>
  225. <div class="star">
  226. <span class="rating5-t"></span>
  227. <span class="rating_num" property="v:average">9.7</span>
  228. <span property="v:best" content="10.0"></span>
  229. <span>2028164人评价</span>
  230. </div>
  231. <p class="quote">
  232. <span class="inq">希望让人自由。</span>
  233. </p>
  234. </div>
  235. </div>
  236. </div>
  237. </li>
  238. <li>
  239. <div class="item">
  240. <div class="pic">
  241. <em class="">2</em>
  242. <a href="https://movie.douban.com/subject/1291546/">
  243. <img width="100" alt="霸王别姬" src="https://img3.doubanio.com/view/photo/s_ratio_poster/public/p2561716440.jpg" class="">
  244. </a>
  245. </div>
  246. <div class="info">
  247. <div class="hd">
  248. <a href="https://movie.douban.com/subject/1291546/" class="">
  249. <span class="title">霸王别姬</span>
  250. <span class="other">&nbsp;/&nbsp;再见,我的妾 / Farewell My Concubine</span>
  251. </a>
  252. <span class="playable">[可播放]</span>
  253. </div>
  254. <div class="bd">
  255. <p class="">
  256. 导演: 陈凯歌 Kaige Chen&nbsp;&nbsp;&nbsp;主演: 张国荣 Leslie Cheung / 张丰毅 Fengyi Zha...<br>
  257. 1993&nbsp;/&nbsp;中国大陆 中国香港&nbsp;/&nbsp;剧情 爱情 同性
  258. </p>
  259. <div class="star">
  260. <span class="rating5-t"></span>
  261. <span class="rating_num" property="v:average">9.6</span>
  262. <span property="v:best" content="10.0"></span>
  263. <span>1502834人评价</span>
  264. </div>
  265. <p class="quote">
  266. <span class="inq">风华绝代。</span>
  267. </p>
  268. </div>
  269. </div>
  270. </div>
  271. </li>
  272. <li>
  273. <div class="item">
  274. <div class="pic">
  275. <em class="">3</em>
  276. <a href="https://movie.douban.com/subject/1292720/">
  277. <img width="100" alt="阿甘正传" src="https://img9.doubanio.com/view/photo/s_ratio_poster/public/p1484728154.jpg" class="">
  278. </a>
  279. </div>
  280. <div class="info">
  281. <div class="hd">
  282. <a href="https://movie.douban.com/subject/1292720/" class="">
  283. <span class="title">阿甘正传</span>
  284. <span class="title">&nbsp;/&nbsp;Forrest Gump</span>
  285. <span class="other">&nbsp;/&nbsp;福雷斯特·冈普</span>
  286. </a>
  287. <span class="playable">[可播放]</span>
  288. </div>
  289. <div class="bd">
  290. <p class="">
  291. 导演: 罗伯特·泽米吉斯 Robert Zemeckis&nbsp;&nbsp;&nbsp;主演: 汤姆·汉克斯 Tom Hanks / ...<br>
  292. 1994&nbsp;/&nbsp;美国&nbsp;/&nbsp;剧情 爱情
  293. </p>
  294. <div class="star">
  295. <span class="rating5-t"></span>
  296. <span class="rating_num" property="v:average">9.5</span>
  297. <span property="v:best" content="10.0"></span>
  298. <span>1535358人评价</span>
  299. </div>
  300. <p class="quote">
  301. <span class="inq">一部美国近现代史。</span>
  302. </p>
  303. </div>
  304. </div>
  305. </div>
  306. </li>
  307. <li>
  308. <div class="item">
  309. <div class="pic">
  310. <em class="">4</em>
  311. <a href="https://movie.douban.com/subject/1295644/">
  312. <img width="100" alt="这个杀手不太冷" src="https://img3.doubanio.com/view/photo/s_ratio_poster/public/p511118051.jpg" class="">
  313. </a>
  314. </div>
  315. <div class="info">
  316. <div class="hd">
  317. <a href="https://movie.douban.com/subject/1295644/" class="">
  318. <span class="title">这个杀手不太冷</span>
  319. <span class="title">&nbsp;/&nbsp;Léon</span>
  320. <span class="other">&nbsp;/&nbsp;杀手莱昂 / 终极追杀令(台)</span>
  321. </a>
  322. </div>
  323. <div class="bd">
  324. <p class="">
  325. 导演: 吕克·贝松 Luc Besson&nbsp;&nbsp;&nbsp;主演: 让·雷诺 Jean Reno / 娜塔莉·波特曼 ...<br>
  326. 1994&nbsp;/&nbsp;法国&nbsp;/&nbsp;剧情 动作 犯罪
  327. </p>
  328. <div class="star">
  329. <span class="rating45-t"></span>
  330. <span class="rating_num" property="v:average">9.4</span>
  331. <span property="v:best" content="10.0"></span>
  332. <span>1727521人评价</span>
  333. </div>
  334. <p class="quote">
  335. <span class="inq">怪蜀黍和小萝莉不得不说的故事。</span>
  336. </p>
  337. </div>
  338. </div>
  339. </div>
  340. </li>
  341. <li>
  342. <div class="item">
  343. <div class="pic">
  344. <em class="">5</em>
  345. <a href="https://movie.douban.com/subject/1292063/">
  346. <img width="100" alt="美丽人生" src="https://img3.doubanio.com/view/photo/s_ratio_poster/public/p2578474613.jpg" class="">
  347. </a>
  348. </div>
  349. <div class="info">
  350. <div class="hd">
  351. <a href="https://movie.douban.com/subject/1292063/" class="">
  352. <span class="title">美丽人生</span>
  353. <span class="title">&nbsp;/&nbsp;La vita è bella</span>
  354. <span class="other">&nbsp;/&nbsp;一个快乐的传说(港) / Life Is Beautiful</span>
  355. </a>
  356. <span class="playable">[可播放]</span>
  357. </div>
  358. <div class="bd">
  359. <p class="">
  360. 导演: 罗伯托·贝尼尼 Roberto Benigni&nbsp;&nbsp;&nbsp;主演: 罗伯托·贝尼尼 Roberto Beni...<br>
  361. 1997&nbsp;/&nbsp;意大利&nbsp;/&nbsp;剧情 喜剧 爱情 战争
  362. </p>
  363. <div class="star">
  364. <span class="rating5-t"></span>
  365. <span class="rating_num" property="v:average">9.5</span>
  366. <span property="v:best" content="10.0"></span>
  367. <span>965268人评价</span>
  368. </div>
  369. <p class="quote">
  370. <span class="inq">最美的谎言。</span>
  371. </p>
  372. </div>
  373. </div>
  374. </div>
  375. </li>
  376. <li>
  377. <div class="item">
  378. <div class="pic">
  379. <em class="">6</em>
  380. <a href="https://movie.douban.com/subject/1292722/">
  381. <img width="100" alt="泰坦尼克号" src="https://img9.doubanio.com/view/photo/s_ratio_poster/public/p457760035.jpg" class="">
  382. </a>
  383. </div>
  384. <div class="info">
  385. <div class="hd">
  386. <a href="https://movie.douban.com/subject/1292722/" class="">
  387. <span class="title">泰坦尼克号</span>
  388. <span class="title">&nbsp;/&nbsp;Titanic</span>
  389. <span class="other">&nbsp;/&nbsp;铁达尼号(港 / 台)</span>
  390. </a>
  391. <span class="playable">[可播放]</span>
  392. </div>
  393. <div class="bd">
  394. <p class="">
  395. 导演: 詹姆斯·卡梅隆 James Cameron&nbsp;&nbsp;&nbsp;主演: 莱昂纳多·迪卡普里奥 Leonardo...<br>
  396. 1997&nbsp;/&nbsp;美国&nbsp;/&nbsp;剧情 爱情 灾难
  397. </p>
  398. <div class="star">
  399. <span class="rating45-t"></span>
  400. <span class="rating_num" property="v:average">9.4</span>
  401. <span property="v:best" content="10.0"></span>
  402. <span>1485880人评价</span>
  403. </div>
  404. <p class="quote">
  405. <span class="inq">失去的才是永恒的。 </span>
  406. </p>
  407. </div>
  408. </div>
  409. </div>
  410. </li>
  411. <li>
  412. <div class="item">
  413. <div class="pic">
  414. <em class="">7</em>
  415. <a href="https://movie.douban.com/subject/1291561/">
  416. <img width="100" alt="千与千寻" src="https://img1.doubanio.com/view/photo/s_ratio_poster/public/p2557573348.jpg" class="">
  417. </a>
  418. </div>
  419. <div class="info">
  420. <div class="hd">
  421. <a href="https://movie.douban.com/subject/1291561/" class="">
  422. <span class="title">千与千寻</span>
  423. <span class="title">&nbsp;/&nbsp;千と千尋の神隠し</span>
  424. <span class="other">&nbsp;/&nbsp;神隐少女(台) / 千与千寻的神隐</span>
  425. </a>
  426. <span class="playable">[可播放]</span>
  427. </div>
  428. <div class="bd">
  429. <p class="">
  430. 导演: 宫崎骏 Hayao Miyazaki&nbsp;&nbsp;&nbsp;主演: 柊瑠美 Rumi Hîragi / 入野自由 Miy...<br>
  431. 2001&nbsp;/&nbsp;日本&nbsp;/&nbsp;剧情 动画 奇幻
  432. </p>
  433. <div class="star">
  434. <span class="rating45-t"></span>
  435. <span class="rating_num" property="v:average">9.4</span>
  436. <span property="v:best" content="10.0"></span>
  437. <span>1589709人评价</span>
  438. </div>
  439. <p class="quote">
  440. <span class="inq">最好的宫崎骏,最好的久石让。 </span>
  441. </p>
  442. </div>
  443. </div>
  444. </div>
  445. </li>
  446. <li>
  447. <div class="item">
  448. <div class="pic">
  449. <em class="">8</em>
  450. <a href="https://movie.douban.com/subject/1295124/">
  451. <img width="100" alt="辛德勒的名单" src="https://img3.doubanio.com/view/photo/s_ratio_poster/public/p492406163.jpg" class="">
  452. </a>
  453. </div>
  454. <div class="info">
  455. <div class="hd">
  456. <a href="https://movie.douban.com/subject/1295124/" class="">
  457. <span class="title">辛德勒的名单</span>
  458. <span class="title">&nbsp;/&nbsp;Schindler&#39;s List</span>
  459. <span class="other">&nbsp;/&nbsp;舒特拉的名单(港) / 辛德勒名单</span>
  460. </a>
  461. <span class="playable">[可播放]</span>
  462. </div>
  463. <div class="bd">
  464. <p class="">
  465. 导演: 史蒂文·斯皮尔伯格 Steven Spielberg&nbsp;&nbsp;&nbsp;主演: 连姆·尼森 Liam Neeson...<br>
  466. 1993&nbsp;/&nbsp;美国&nbsp;/&nbsp;剧情 历史 战争
  467. </p>
  468. <div class="star">
  469. <span class="rating5-t"></span>
  470. <span class="rating_num" property="v:average">9.5</span>
  471. <span property="v:best" content="10.0"></span>
  472. <span>782438人评价</span>
  473. </div>
  474. <p class="quote">
  475. <span class="inq">拯救一个人,就是拯救整个世界。</span>
  476. </p>
  477. </div>
  478. </div>
  479. </div>
  480. </li>
  481. <li>
  482. <div class="item">
  483. <div class="pic">
  484. <em class="">9</em>
  485. <a href="https://movie.douban.com/subject/3541415/">
  486. <img width="100" alt="盗梦空间" src="https://img9.doubanio.com/view/photo/s_ratio_poster/public/p513344864.jpg" class="">
  487. </a>
  488. </div>
  489. <div class="info">
  490. <div class="hd">
  491. <a href="https://movie.douban.com/subject/3541415/" class="">
  492. <span class="title">盗梦空间</span>
  493. <span class="title">&nbsp;/&nbsp;Inception</span>
  494. <span class="other">&nbsp;/&nbsp;潜行凶间(港) / 全面启动(台)</span>
  495. </a>
  496. <span class="playable">[可播放]</span>
  497. </div>
  498. <div class="bd">
  499. <p class="">
  500. 导演: 克里斯托弗·诺兰 Christopher Nolan&nbsp;&nbsp;&nbsp;主演: 莱昂纳多·迪卡普里奥 Le...<br>
  501. 2010&nbsp;/&nbsp;美国 英国&nbsp;/&nbsp;剧情 科幻 悬疑 冒险
  502. </p>
  503. <div class="star">
  504. <span class="rating45-t"></span>
  505. <span class="rating_num" property="v:average">9.3</span>
  506. <span property="v:best" content="10.0"></span>
  507. <span>1468131人评价</span>
  508. </div>
  509. <p class="quote">
  510. <span class="inq">诺兰给了我们一场无法盗取的梦。</span>
  511. </p>
  512. </div>
  513. </div>
  514. </div>
  515. </li>
  516. <li>
  517. <div class="item">
  518. <div class="pic">
  519. <em class="">10</em>
  520. <a href="https://movie.douban.com/subject/3011091/">
  521. <img width="100" alt="忠犬八公的故事" src="https://img9.doubanio.com/view/photo/s_ratio_poster/public/p524964016.jpg" class="">
  522. </a>
  523. </div>
  524. <div class="info">
  525. <div class="hd">
  526. <a href="https://movie.douban.com/subject/3011091/" class="">
  527. <span class="title">忠犬八公的故事</span>
  528. <span class="title">&nbsp;/&nbsp;Hachi: A Dog&#39;s Tale</span>
  529. <span class="other">&nbsp;/&nbsp;忠犬小八(台) / 秋田犬八千(港)</span>
  530. </a>
  531. <span class="playable">[可播放]</span>
  532. </div>
  533. <div class="bd">
  534. <p class="">
  535. 导演: 莱塞·霍尔斯道姆 Lasse Hallström&nbsp;&nbsp;&nbsp;主演: 理查·基尔 Richard Ger...<br>
  536. 2009&nbsp;/&nbsp;美国 英国&nbsp;/&nbsp;剧情
  537. </p>
  538. <div class="star">
  539. <span class="rating45-t"></span>
  540. <span class="rating_num" property="v:average">9.4</span>
  541. <span property="v:best" content="10.0"></span>
  542. <span>1019680人评价</span>
  543. </div>
  544. <p class="quote">
  545. <span class="inq">永远都不能忘记你所爱的人。</span>
  546. </p>
  547. </div>
  548. </div>
  549. </div>
  550. </li>
  551. <li>
  552. <div class="item">
  553. <div class="pic">
  554. <em class="">11</em>
  555. <a href="https://movie.douban.com/subject/1292001/">
  556. <img width="100" alt="海上钢琴师" src="https://img9.doubanio.com/view/photo/s_ratio_poster/public/p2574551676.jpg" class="">
  557. </a>
  558. </div>
  559. <div class="info">
  560. <div class="hd">
  561. <a href="https://movie.douban.com/subject/1292001/" class="">
  562. <span class="title">海上钢琴师</span>
  563. <span class="title">&nbsp;/&nbsp;La leggenda del pianista sull&#39;oceano</span>
  564. <span class="other">&nbsp;/&nbsp;声光伴我飞(港) / 一九零零的传奇</span>
  565. </a>
  566. <span class="playable">[可播放]</span>
  567. </div>
  568. <div class="bd">
  569. <p class="">
  570. 导演: 朱塞佩·托纳多雷 Giuseppe Tornatore&nbsp;&nbsp;&nbsp;主演: 蒂姆·罗斯 Tim Roth / ...<br>
  571. 1998&nbsp;/&nbsp;意大利&nbsp;/&nbsp;剧情 音乐
  572. </p>
  573. <div class="star">
  574. <span class="rating45-t"></span>
  575. <span class="rating_num" property="v:average">9.3</span>
  576. <span property="v:best" content="10.0"></span>
  577. <span>1224081人评价</span>
  578. </div>
  579. <p class="quote">
  580. <span class="inq">每个人都要走一条自己坚定了的路,就算是粉身碎骨。 </span>
  581. </p>
  582. </div>
  583. </div>
  584. </div>
  585. </li>
  586. <li>
  587. <div class="item">
  588. <div class="pic">
  589. <em class="">12</em>
  590. <a href="https://movie.douban.com/subject/1292064/">
  591. <img width="100" alt="楚门的世界" src="https://img3.doubanio.com/view/photo/s_ratio_poster/public/p479682972.jpg" class="">
  592. </a>
  593. </div>
  594. <div class="info">
  595. <div class="hd">
  596. <a href="https://movie.douban.com/subject/1292064/" class="">
  597. <span class="title">楚门的世界</span>
  598. <span class="title">&nbsp;/&nbsp;The Truman Show</span>
  599. <span class="other">&nbsp;/&nbsp;真人Show(港) / 真人戏</span>
  600. </a>
  601. <span class="playable">[可播放]</span>
  602. </div>
  603. <div class="bd">
  604. <p class="">
  605. 导演: 彼得·威尔 Peter Weir&nbsp;&nbsp;&nbsp;主演: 金·凯瑞 Jim Carrey / 劳拉·琳妮 Lau...<br>
  606. 1998&nbsp;/&nbsp;美国&nbsp;/&nbsp;剧情 科幻
  607. </p>
  608. <div class="star">
  609. <span class="rating45-t"></span>
  610. <span class="rating_num" property="v:average">9.3</span>
  611. <span property="v:best" content="10.0"></span>
  612. <span>1092363人评价</span>
  613. </div>
  614. <p class="quote">
  615. <span class="inq">如果再也不能见到你,祝你早安,午安,晚安。</span>
  616. </p>
  617. </div>
  618. </div>
  619. </div>
  620. </li>
  621. <li>
  622. <div class="item">
  623. <div class="pic">
  624. <em class="">13</em>
  625. <a href="https://movie.douban.com/subject/3793023/">
  626. <img width="100" alt="三傻大闹宝莱坞" src="https://img3.doubanio.com/view/photo/s_ratio_poster/public/p579729551.jpg" class="">
  627. </a>
  628. </div>
  629. <div class="info">
  630. <div class="hd">
  631. <a href="https://movie.douban.com/subject/3793023/" class="">
  632. <span class="title">三傻大闹宝莱坞</span>
  633. <span class="title">&nbsp;/&nbsp;3 Idiots</span>
  634. <span class="other">&nbsp;/&nbsp;三个傻瓜(台) / 作死不离3兄弟(港)</span>
  635. </a>
  636. <span class="playable">[可播放]</span>
  637. </div>
  638. <div class="bd">
  639. <p class="">
  640. 导演: 拉库马·希拉尼 Rajkumar Hirani&nbsp;&nbsp;&nbsp;主演: 阿米尔·汗 Aamir Khan / 卡...<br>
  641. 2009&nbsp;/&nbsp;印度&nbsp;/&nbsp;剧情 喜剧 爱情 歌舞
  642. </p>
  643. <div class="star">
  644. <span class="rating45-t"></span>
  645. <span class="rating_num" property="v:average">9.2</span>
  646. <span property="v:best" content="10.0"></span>
  647. <span>1360715人评价</span>
  648. </div>
  649. <p class="quote">
  650. <span class="inq">英俊版憨豆,高情商版谢耳朵。</span>
  651. </p>
  652. </div>
  653. </div>
  654. </div>
  655. </li>
  656. <li>
  657. <div class="item">
  658. <div class="pic">
  659. <em class="">14</em>
  660. <a href="https://movie.douban.com/subject/2131459/">
  661. <img width="100" alt="机器人总动员" src="https://img3.doubanio.com/view/photo/s_ratio_poster/public/p1461851991.jpg" class="">
  662. </a>
  663. </div>
  664. <div class="info">
  665. <div class="hd">
  666. <a href="https://movie.douban.com/subject/2131459/" class="">
  667. <span class="title">机器人总动员</span>
  668. <span class="title">&nbsp;/&nbsp;WALL·E</span>
  669. <span class="other">&nbsp;/&nbsp;太空奇兵·威E(港) / 瓦力(台)</span>
  670. </a>
  671. <span class="playable">[可播放]</span>
  672. </div>
  673. <div class="bd">
  674. <p class="">
  675. 导演: 安德鲁·斯坦顿 Andrew Stanton&nbsp;&nbsp;&nbsp;主演: 本·贝尔特 Ben Burtt / 艾丽...<br>
  676. 2008&nbsp;/&nbsp;美国&nbsp;/&nbsp;科幻 动画 冒险
  677. </p>
  678. <div class="star">
  679. <span class="rating45-t"></span>
  680. <span class="rating_num" property="v:average">9.3</span>
  681. <span property="v:best" content="10.0"></span>
  682. <span>967355人评价</span>
  683. </div>
  684. <p class="quote">
  685. <span class="inq">小瓦力,大人生。</span>
  686. </p>
  687. </div>
  688. </div>
  689. </div>
  690. </li>
  691. <li>
  692. <div class="item">
  693. <div class="pic">
  694. <em class="">15</em>
  695. <a href="https://movie.douban.com/subject/1291549/">
  696. <img width="100" alt="放牛班的春天" src="https://img3.doubanio.com/view/photo/s_ratio_poster/public/p1910824951.jpg" class="">
  697. </a>
  698. </div>
  699. <div class="info">
  700. <div class="hd">
  701. <a href="https://movie.douban.com/subject/1291549/" class="">
  702. <span class="title">放牛班的春天</span>
  703. <span class="title">&nbsp;/&nbsp;Les choristes</span>
  704. <span class="other">&nbsp;/&nbsp;歌声伴我心(港) / 唱诗班男孩</span>
  705. </a>
  706. <span class="playable">[可播放]</span>
  707. </div>
  708. <div class="bd">
  709. <p class="">
  710. 导演: 克里斯托夫·巴拉蒂 Christophe Barratier&nbsp;&nbsp;&nbsp;主演: 热拉尔·朱尼奥 Gé...<br>
  711. 2004&nbsp;/&nbsp;法国 瑞士 德国&nbsp;/&nbsp;剧情 音乐
  712. </p>
  713. <div class="star">
  714. <span class="rating45-t"></span>
  715. <span class="rating_num" property="v:average">9.3</span>
  716. <span property="v:best" content="10.0"></span>
  717. <span>946898人评价</span>
  718. </div>
  719. <p class="quote">
  720. <span class="inq">天籁一般的童声,是最接近上帝的存在。 </span>
  721. </p>
  722. </div>
  723. </div>
  724. </div>
  725. </li>
  726. <li>
  727. <div class="item">
  728. <div class="pic">
  729. <em class="">16</em>
  730. <a href="https://movie.douban.com/subject/1889243/">
  731. <img width="100" alt="星际穿越" src="https://img3.doubanio.com/view/photo/s_ratio_poster/public/p2206088801.jpg" class="">
  732. </a>
  733. </div>
  734. <div class="info">
  735. <div class="hd">
  736. <a href="https://movie.douban.com/subject/1889243/" class="">
  737. <span class="title">星际穿越</span>
  738. <span class="title">&nbsp;/&nbsp;Interstellar</span>
  739. <span class="other">&nbsp;/&nbsp;星际启示录(港) / 星际效应(台)</span>
  740. </a>
  741. <span class="playable">[可播放]</span>
  742. </div>
  743. <div class="bd">
  744. <p class="">
  745. 导演: 克里斯托弗·诺兰 Christopher Nolan&nbsp;&nbsp;&nbsp;主演: 马修·麦康纳 Matthew Mc...<br>
  746. 2014&nbsp;/&nbsp;美国 英国 加拿大 冰岛&nbsp;/&nbsp;剧情 科幻 冒险
  747. </p>
  748. <div class="star">
  749. <span class="rating45-t"></span>
  750. <span class="rating_num" property="v:average">9.3</span>
  751. <span property="v:best" content="10.0"></span>
  752. <span>1095565人评价</span>
  753. </div>
  754. <p class="quote">
  755. <span class="inq">爱是一种力量,让我们超越时空感知它的存在。</span>
  756. </p>
  757. </div>
  758. </div>
  759. </div>
  760. </li>
  761. <li>
  762. <div class="item">
  763. <div class="pic">
  764. <em class="">17</em>
  765. <a href="https://movie.douban.com/subject/1292213/">
  766. <img width="100" alt="大话西游之大圣娶亲" src="https://img9.doubanio.com/view/photo/s_ratio_poster/public/p2455050536.jpg" class="">
  767. </a>
  768. </div>
  769. <div class="info">
  770. <div class="hd">
  771. <a href="https://movie.douban.com/subject/1292213/" class="">
  772. <span class="title">大话西游之大圣娶亲</span>
  773. <span class="title">&nbsp;/&nbsp;西遊記大結局之仙履奇緣</span>
  774. <span class="other">&nbsp;/&nbsp;西游记完结篇仙履奇缘 / 齐天大圣西游记</span>
  775. </a>
  776. <span class="playable">[可播放]</span>
  777. </div>
  778. <div class="bd">
  779. <p class="">
  780. 导演: 刘镇伟 Jeffrey Lau&nbsp;&nbsp;&nbsp;主演: 周星驰 Stephen Chow / 吴孟达 Man Tat Ng...<br>
  781. 1995&nbsp;/&nbsp;中国香港 中国大陆&nbsp;/&nbsp;喜剧 爱情 奇幻 古装
  782. </p>
  783. <div class="star">
  784. <span class="rating45-t"></span>
  785. <span class="rating_num" property="v:average">9.2</span>
  786. <span property="v:best" content="10.0"></span>
  787. <span>1074098人评价</span>
  788. </div>
  789. <p class="quote">
  790. <span class="inq">一生所爱。</span>
  791. </p>
  792. </div>
  793. </div>
  794. </div>
  795. </li>
  796. <li>
  797. <div class="item">
  798. <div class="pic">
  799. <em class="">18</em>
  800. <a href="https://movie.douban.com/subject/5912992/">
  801. <img width="100" alt="熔炉" src="https://img9.doubanio.com/view/photo/s_ratio_poster/public/p1363250216.jpg" class="">
  802. </a>
  803. </div>
  804. <div class="info">
  805. <div class="hd">
  806. <a href="https://movie.douban.com/subject/5912992/" class="">
  807. <span class="title">熔炉</span>
  808. <span class="title">&nbsp;/&nbsp;도가니</span>
  809. <span class="other">&nbsp;/&nbsp;无声呐喊(港) / 漩涡</span>
  810. </a>
  811. </div>
  812. <div class="bd">
  813. <p class="">
  814. 导演: 黄东赫 Dong-hyuk Hwang&nbsp;&nbsp;&nbsp;主演: 孔侑 Yoo Gong / 郑有美 Yu-mi Jung /...<br>
  815. 2011&nbsp;/&nbsp;韩国&nbsp;/&nbsp;剧情
  816. </p>
  817. <div class="star">
  818. <span class="rating45-t"></span>
  819. <span class="rating_num" property="v:average">9.3</span>
  820. <span property="v:best" content="10.0"></span>
  821. <span>661701人评价</span>
  822. </div>
  823. <p class="quote">
  824. <span class="inq">我们一路奋战不是为了改变世界,而是为了不让世界改变我们。</span>
  825. </p>
  826. </div>
  827. </div>
  828. </div>
  829. </li>
  830. <li>
  831. <div class="item">
  832. <div class="pic">
  833. <em class="">19</em>
  834. <a href="https://movie.douban.com/subject/25662329/">
  835. <img width="100" alt="疯狂动物城" src="https://img1.doubanio.com/view/photo/s_ratio_poster/public/p2315672647.jpg" class="">
  836. </a>
  837. </div>
  838. <div class="info">
  839. <div class="hd">
  840. <a href="https://movie.douban.com/subject/25662329/" class="">
  841. <span class="title">疯狂动物城</span>
  842. <span class="title">&nbsp;/&nbsp;Zootopia</span>
  843. <span class="other">&nbsp;/&nbsp;优兽大都会(港) / 动物方城市(台)</span>
  844. </a>
  845. <span class="playable">[可播放]</span>
  846. </div>
  847. <div class="bd">
  848. <p class="">
  849. 导演: 拜伦·霍华德 Byron Howard / 瑞奇·摩尔 Rich Moore&nbsp;&nbsp;&nbsp;主演: 金妮弗·...<br>
  850. 2016&nbsp;/&nbsp;美国&nbsp;/&nbsp;喜剧 动画 冒险
  851. </p>
  852. <div class="star">
  853. <span class="rating45-t"></span>
  854. <span class="rating_num" property="v:average">9.2</span>
  855. <span property="v:best" content="10.0"></span>
  856. <span>1281626人评价</span>
  857. </div>
  858. <p class="quote">
  859. <span class="inq">迪士尼给我们营造的乌托邦就是这样,永远善良勇敢,永远出乎意料。</span>
  860. </p>
  861. </div>
  862. </div>
  863. </div>
  864. </li>
  865. <li>
  866. <div class="item">
  867. <div class="pic">
  868. <em class="">20</em>
  869. <a href="https://movie.douban.com/subject/1307914/">
  870. <img width="100" alt="无间道" src="https://img3.doubanio.com/view/photo/s_ratio_poster/public/p2564556863.jpg" class="">
  871. </a>
  872. </div>
  873. <div class="info">
  874. <div class="hd">
  875. <a href="https://movie.douban.com/subject/1307914/" class="">
  876. <span class="title">无间道</span>
  877. <span class="title">&nbsp;/&nbsp;無間道</span>
  878. <span class="other">&nbsp;/&nbsp;Infernal Affairs / Mou gaan dou</span>
  879. </a>
  880. <span class="playable">[可播放]</span>
  881. </div>
  882. <div class="bd">
  883. <p class="">
  884. 导演: 刘伟强 / 麦兆辉&nbsp;&nbsp;&nbsp;主演: 刘德华 / 梁朝伟 / 黄秋生<br>
  885. 2002&nbsp;/&nbsp;中国香港&nbsp;/&nbsp;剧情 犯罪 悬疑
  886. </p>
  887. <div class="star">
  888. <span class="rating45-t"></span>
  889. <span class="rating_num" property="v:average">9.2</span>
  890. <span property="v:best" content="10.0"></span>
  891. <span>875223人评价</span>
  892. </div>
  893. <p class="quote">
  894. <span class="inq">香港电影史上永不过时的杰作。</span>
  895. </p>
  896. </div>
  897. </div>
  898. </div>
  899. </li>
  900. <li>
  901. <div class="item">
  902. <div class="pic">
  903. <em class="">21</em>
  904. <a href="https://movie.douban.com/subject/1291560/">
  905. <img width="100" alt="龙猫" src="https://img9.doubanio.com/view/photo/s_ratio_poster/public/p2540924496.jpg" class="">
  906. </a>
  907. </div>
  908. <div class="info">
  909. <div class="hd">
  910. <a href="https://movie.douban.com/subject/1291560/" class="">
  911. <span class="title">龙猫</span>
  912. <span class="title">&nbsp;/&nbsp;となりのトトロ</span>
  913. <span class="other">&nbsp;/&nbsp;邻居托托罗 / 邻家的豆豆龙</span>
  914. </a>
  915. <span class="playable">[可播放]</span>
  916. </div>
  917. <div class="bd">
  918. <p class="">
  919. 导演: 宫崎骏 Hayao Miyazaki&nbsp;&nbsp;&nbsp;主演: 日高法子 Noriko Hidaka / 坂本千夏 Ch...<br>
  920. 1988&nbsp;/&nbsp;日本&nbsp;/&nbsp;动画 奇幻 冒险
  921. </p>
  922. <div class="star">
  923. <span class="rating45-t"></span>
  924. <span class="rating_num" property="v:average">9.2</span>
  925. <span property="v:best" content="10.0"></span>
  926. <span>909974人评价</span>
  927. </div>
  928. <p class="quote">
  929. <span class="inq">人人心中都有个龙猫,童年就永远不会消失。</span>
  930. </p>
  931. </div>
  932. </div>
  933. </div>
  934. </li>
  935. <li>
  936. <div class="item">
  937. <div class="pic">
  938. <em class="">22</em>
  939. <a href="https://movie.douban.com/subject/1291841/">
  940. <img width="100" alt="教父" src="https://img9.doubanio.com/view/photo/s_ratio_poster/public/p616779645.jpg" class="">
  941. </a>
  942. </div>
  943. <div class="info">
  944. <div class="hd">
  945. <a href="https://movie.douban.com/subject/1291841/" class="">
  946. <span class="title">教父</span>
  947. <span class="title">&nbsp;/&nbsp;The Godfather</span>
  948. <span class="other">&nbsp;/&nbsp;Mario Puzo&#39;s The Godfather</span>
  949. </a>
  950. <span class="playable">[可播放]</span>
  951. </div>
  952. <div class="bd">
  953. <p class="">
  954. 导演: 弗朗西斯·福特·科波拉 Francis Ford Coppola&nbsp;&nbsp;&nbsp;主演: 马龙·白兰度 M...<br>
  955. 1972&nbsp;/&nbsp;美国&nbsp;/&nbsp;剧情 犯罪
  956. </p>
  957. <div class="star">
  958. <span class="rating45-t"></span>
  959. <span class="rating_num" property="v:average">9.3</span>
  960. <span property="v:best" content="10.0"></span>
  961. <span>665009人评价</span>
  962. </div>
  963. <p class="quote">
  964. <span class="inq">千万不要记恨你的对手,这样会让你失去理智。</span>
  965. </p>
  966. </div>
  967. </div>
  968. </div>
  969. </li>
  970. <li>
  971. <div class="item">
  972. <div class="pic">
  973. <em class="">23</em>
  974. <a href="https://movie.douban.com/subject/1849031/">
  975. <img width="100" alt="当幸福来敲门" src="https://img1.doubanio.com/view/photo/s_ratio_poster/public/p1312700628.jpg" class="">
  976. </a>
  977. </div>
  978. <div class="info">
  979. <div class="hd">
  980. <a href="https://movie.douban.com/subject/1849031/" class="">
  981. <span class="title">当幸福来敲门</span>
  982. <span class="title">&nbsp;/&nbsp;The Pursuit of Happyness</span>
  983. <span class="other">&nbsp;/&nbsp;寻找快乐的故事(港) / 追求快乐</span>
  984. </a>
  985. <span class="playable">[可播放]</span>
  986. </div>
  987. <div class="bd">
  988. <p class="">
  989. 导演: 加布里尔·穆奇诺 Gabriele Muccino&nbsp;&nbsp;&nbsp;主演: 威尔·史密斯 Will Smith ...<br>
  990. 2006&nbsp;/&nbsp;美国&nbsp;/&nbsp;剧情 传记 家庭
  991. </p>
  992. <div class="star">
  993. <span class="rating45-t"></span>
  994. <span class="rating_num" property="v:average">9.1</span>
  995. <span property="v:best" content="10.0"></span>
  996. <span>1086829人评价</span>
  997. </div>
  998. <p class="quote">
  999. <span class="inq">平民励志片。 </span>
  1000. </p>
  1001. </div>
  1002. </div>
  1003. </div>
  1004. </li>
  1005. <li>
  1006. <div class="item">
  1007. <div class="pic">
  1008. <em class="">24</em>
  1009. <a href="https://movie.douban.com/subject/3319755/">
  1010. <img width="100" alt="怦然心动" src="https://img1.doubanio.com/view/photo/s_ratio_poster/public/p501177648.jpg" class="">
  1011. </a>
  1012. </div>
  1013. <div class="info">
  1014. <div class="hd">
  1015. <a href="https://movie.douban.com/subject/3319755/" class="">
  1016. <span class="title">怦然心动</span>
  1017. <span class="title">&nbsp;/&nbsp;Flipped</span>
  1018. <span class="other">&nbsp;/&nbsp;萌动青春 / 青春萌动</span>
  1019. </a>
  1020. <span class="playable">[可播放]</span>
  1021. </div>
  1022. <div class="bd">
  1023. <p class="">
  1024. 导演: 罗伯·莱纳 Rob Reiner&nbsp;&nbsp;&nbsp;主演: 玛德琳·卡罗尔 Madeline Carroll / 卡...<br>
  1025. 2010&nbsp;/&nbsp;美国&nbsp;/&nbsp;剧情 喜剧 爱情
  1026. </p>
  1027. <div class="star">
  1028. <span class="rating45-t"></span>
  1029. <span class="rating_num" property="v:average">9.1</span>
  1030. <span property="v:best" content="10.0"></span>
  1031. <span>1266040人评价</span>
  1032. </div>
  1033. <p class="quote">
  1034. <span class="inq">真正的幸福是来自内心深处。</span>
  1035. </p>
  1036. </div>
  1037. </div>
  1038. </div>
  1039. </li>
  1040. <li>
  1041. <div class="item">
  1042. <div class="pic">
  1043. <em class="">25</em>
  1044. <a href="https://movie.douban.com/subject/6786002/">
  1045. <img width="100" alt="触不可及" src="https://img9.doubanio.com/view/photo/s_ratio_poster/public/p1454261925.jpg" class="">
  1046. </a>
  1047. </div>
  1048. <div class="info">
  1049. <div class="hd">
  1050. <a href="https://movie.douban.com/subject/6786002/" class="">
  1051. <span class="title">触不可及</span>
  1052. <span class="title">&nbsp;/&nbsp;Intouchables</span>
  1053. <span class="other">&nbsp;/&nbsp;闪亮人生(港) / 逆转人生(台)</span>
  1054. </a>
  1055. </div>
  1056. <div class="bd">
  1057. <p class="">
  1058. 导演: 奥利维·那卡什 Olivier Nakache / 艾力克·托兰达 Eric Toledano&nbsp;&nbsp;&nbsp;主...<br>
  1059. 2011&nbsp;/&nbsp;法国&nbsp;/&nbsp;剧情 喜剧
  1060. </p>
  1061. <div class="star">
  1062. <span class="rating45-t"></span>
  1063. <span class="rating_num" property="v:average">9.2</span>
  1064. <span property="v:best" content="10.0"></span>
  1065. <span>710625人评价</span>
  1066. </div>
  1067. <p class="quote">
  1068. <span class="inq">满满温情的高雅喜剧。</span>
  1069. </p>
  1070. </div>
  1071. </div>
  1072. </div>
  1073. </li>
  1074. </ol>
  1075. <div class="paginator">
  1076. <span class="prev">
  1077. &lt;前页
  1078. </span>
  1079. <span class="thispage">1</span>
  1080. <a href="?start=25&amp;filter=" >2</a>
  1081. <a href="?start=50&amp;filter=" >3</a>
  1082. <a href="?start=75&amp;filter=" >4</a>
  1083. <a href="?start=100&amp;filter=" >5</a>
  1084. <a href="?start=125&amp;filter=" >6</a>
  1085. <a href="?start=150&amp;filter=" >7</a>
  1086. <a href="?start=175&amp;filter=" >8</a>
  1087. <a href="?start=200&amp;filter=" >9</a>
  1088. <a href="?start=225&amp;filter=" >10</a>
  1089. <span class="next">
  1090. <link rel="next" href="?start=25&amp;filter="/>
  1091. <a href="?start=25&amp;filter=" >后页&gt;</a>
  1092. </span>
  1093. <span class="count">(共250条)</span>
  1094. </div>
  1095. </div>
  1096. <div class="aside">
  1097. <p class="pl">
  1098. 豆瓣用户每天都在对“看过”的电影进行“很差”到“力荐”的评价,豆瓣根据每部影片看过的人数以及该影片所得的评价等综合数据,通过算法分析产生豆瓣电影 Top 250。
  1099. </p>
  1100. <div id="dale_movie_top250_bottom_right"></div>
  1101. <!-- douban ad begin -->
  1102. <div class="mobile-app-entrance block5 app-movie">
  1103. <a class="entrance-link" href="https://www.douban.com/doubanapp/frodo">
  1104. <div class="entrance-qrcode">
  1105. <img src="https://img3.doubanio.com/f/movie/a02f6ed325fc52e220f299d51e730c422e2bcd16/pics/movie/douban_app_ad/qrcode.png" alt="扫码下载豆瓣 App" width="80" height="80" />
  1106. </div>
  1107. <div class="entrance-info">
  1108. <span class="app-icon icon-movie"></span>
  1109. <span class="main-title">豆瓣</span>
  1110. <span class="sub-title">让好电影来找你</span>
  1111. </div>
  1112. </a>
  1113. </div>
  1114. <!-- douban ad end -->
  1115. </div>
  1116. <div class="extra">
  1117. </div>
  1118. </div>
  1119. </div>
  1120. <div id="footer">
  1121. <div class="footer-extra"></div>
  1122. <span id="icp" class="fleft gray-link">
  1123. &copy; 2005-2020 douban.com, all rights reserved 北京豆网科技有限公司
  1124. </span>
  1125. <a href="https://www.douban.com/hnypt/variformcyst.py" style="display: none;"></a>
  1126. <span class="fright">
  1127. <a href="https://www.douban.com/about">关于豆瓣</a>
  1128. · <a href="https://www.douban.com/jobs">在豆瓣工作</a>
  1129. · <a href="https://www.douban.com/about?topic=contactus">联系我们</a>
  1130. · <a href="https://www.douban.com/about/legal">法律声明</a>
  1131. · <a href="https://help.douban.com/?app=movie" target="_blank">帮助中心</a>
  1132. · <a href="https://www.douban.com/doubanapp/">移动应用</a>
  1133. · <a href="https://www.douban.com/partner/">豆瓣广告</a>
  1134. </span>
  1135. </div>
  1136. </div>
  1137. <!-- COLLECTED JS -->
  1138. <link rel="stylesheet" type="text/css" href="https://img3.doubanio.com/f/shire/8377b9498330a2e6f056d863987cc7a37eb4d486/css/ui/dialog.css" />
  1139. <link rel="stylesheet" type="text/css" href="https://img3.doubanio.com/f/movie/4aca95d66d37ec0712b3d19973b5d8feb75f2f05/css/movie/mod/reg_login_pop.css" />
  1140. <script type="text/javascript" src="https://img3.doubanio.com/f/shire/77323ae72a612bba8b65f845491513ff3329b1bb/js/do.js" data-cfg-autoload="false"></script>
  1141. <script type="text/javascript" src="https://img3.doubanio.com/f/shire/383a6e43f2108dc69e3ff2681bc4dc6c72a5ffb0/js/ui/dialog.js"></script>
  1142. <script type="text/javascript">
  1143. var HTTPS_DB='https://www.douban.com';
  1144. var account_pop={open:function(o,e){e?referrer="?referrer="+encodeURIComponent(e):referrer="?referrer="+window.location.href;var n="",i="",t=448;n="用户登录",i="https://accounts.douban.com/passport/login_popup?source=movie";var r=document.location.protocol+"//"+document.location.hostname,a=dui.Dialog({width:340,title:n,height:t,cls:"account_pop",isHideTitle:!0,modal:!0,content:"<iframe scrolling='no' frameborder='0' width='340' height='"+t+"' src='"+i+"' name='"+r+"'></iframe>"},!0),c=a.node;if(c.undelegate(),c.delegate(".dui-dialog-close","click",function(){var o=$("body");o.find("#login_msk").hide(),o.find(".account_pop").remove()}),$(window).width()<478){var d="";"reg"===o?d=HTTPS_DB+"/accounts/register"+referrer:"login"===o&&(d=HTTPS_DB+"/accounts/login"+referrer),window.location.href=d}else a.open();$(window).bind("message",function(o){"https://accounts.douban.com"===o.originalEvent.origin&&(c.find("iframe").css("height",o.originalEvent.data),c.height(o.originalEvent.data),a.update())})}};Douban&&Douban.init_show_login&&(Douban.init_show_login=function(o){var e=$(o);e.click(function(){var o=e.data("ref")||"";return account_pop.open("login",o),!1})}),Do(function(){$("body").delegate(".pop_register","click",function(o){o.preventDefault();var e=$(this).data("ref")||"";return account_pop.open("reg",e),!1}),$("body").delegate(".pop_login","click",function(o){o.preventDefault();var e=$(this).data("ref")||"";return account_pop.open("login",e),!1})});
  1145. </script>
  1146. <script type="text/javascript">
  1147. (function (global) {
  1148. var newNode = global.document.createElement('script'),
  1149. existingNode = global.document.getElementsByTagName('script')[0],
  1150. adSource = '//erebor.douban.com/',
  1151. userId = '',
  1152. browserId = 'Pp8Rh7_QuUA',
  1153. criteria = '3:/top250',
  1154. preview = '',
  1155. debug = false,
  1156. adSlots = ['dale_movie_top250_bottom_right'];
  1157. global.DoubanAdRequest = {src: adSource, uid: userId, bid: browserId, crtr: criteria, prv: preview, debug: debug};
  1158. global.DoubanAdSlots = (global.DoubanAdSlots || []).concat(adSlots);
  1159. newNode.setAttribute('type', 'text/javascript');
  1160. newNode.setAttribute('src', '//img1.doubanio.com/dnRiY2ZoMy9mL2FkanMvNTRjNWIyYmFlNjFkZTJlYTBlN2VmZDk1ODc2MGY3ZTk2OGZjNmQwNS9hZC5yZWxlYXNlLmpz');
  1161. newNode.setAttribute('async', true);
  1162. existingNode.parentNode.insertBefore(newNode, existingNode);
  1163. })(this);
  1164. </script>
  1165. <script type="text/javascript">
  1166. var _paq = _paq || [];
  1167. _paq.push(['trackPageView']);
  1168. _paq.push(['enableLinkTracking']);
  1169. (function() {
  1170. var p=(('https:' == document.location.protocol) ? 'https' : 'http'), u=p+'://fundin.douban.com/';
  1171. _paq.push(['setTrackerUrl', u+'piwik']);
  1172. _paq.push(['setSiteId', '100001']);
  1173. var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
  1174. g.type='text/javascript';
  1175. g.defer=true;
  1176. g.async=true;
  1177. g.src=p+'://img3.doubanio.com/dae/fundin/piwik.js';
  1178. s.parentNode.insertBefore(g,s);
  1179. })();
  1180. </script>
  1181. <script type="text/javascript">
  1182. var setMethodWithNs = function(namespace) {
  1183. var ns = namespace ? namespace + '.' : ''
  1184. , fn = function(string) {
  1185. if(!ns) {return string}
  1186. return ns + string
  1187. }
  1188. return fn
  1189. }
  1190. var gaWithNamespace = function(fn, namespace) {
  1191. var method = setMethodWithNs(namespace)
  1192. fn.call(this, method)
  1193. }
  1194. var _gaq = _gaq || []
  1195. , accounts = [
  1196. { id: 'UA-7019765-1', namespace: 'douban' }
  1197. , { id: 'UA-7019765-19', namespace: '' }
  1198. ]
  1199. , gaInit = function(account) {
  1200. gaWithNamespace(function(method) {
  1201. gaInitFn.call(this, method, account)
  1202. }, account.namespace)
  1203. }
  1204. , gaInitFn = function(method, account) {
  1205. _gaq.push([method('_setAccount'), account.id]);
  1206. _gaq.push([method('_setSampleRate'), '5']);
  1207. _gaq.push([method('_addOrganic'), 'google', 'q'])
  1208. _gaq.push([method('_addOrganic'), 'baidu', 'wd'])
  1209. _gaq.push([method('_addOrganic'), 'soso', 'w'])
  1210. _gaq.push([method('_addOrganic'), 'youdao', 'q'])
  1211. _gaq.push([method('_addOrganic'), 'so.360.cn', 'q'])
  1212. _gaq.push([method('_addOrganic'), 'sogou', 'query'])
  1213. if (account.namespace) {
  1214. _gaq.push([method('_addIgnoredOrganic'), '豆瓣'])
  1215. _gaq.push([method('_addIgnoredOrganic'), 'douban'])
  1216. _gaq.push([method('_addIgnoredOrganic'), '豆瓣网'])
  1217. _gaq.push([method('_addIgnoredOrganic'), 'www.douban.com'])
  1218. }
  1219. if (account.namespace === 'douban') {
  1220. _gaq.push([method('_setDomainName'), '.douban.com'])
  1221. }
  1222. _gaq.push([method('_setCustomVar'), 1, 'responsive_view_mode', 'desktop', 3])
  1223. _gaq.push([method('_setCustomVar'), 2, 'login_status', '0', 2]);
  1224. _gaq.push([method('_trackPageview')])
  1225. }
  1226. for(var i = 0, l = accounts.length; i < l; i++) {
  1227. var account = accounts[i]
  1228. gaInit(account)
  1229. }
  1230. ;(function() {
  1231. var ga = document.createElement('script');
  1232. ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
  1233. ga.setAttribute('async', 'true');
  1234. document.documentElement.firstChild.appendChild(ga);
  1235. })()
  1236. </script>
  1237. <!-- dae-web-movie--default-b59d5cf95-vzn86-->
  1238. <script>_SPLITTEST=''</script>
  1239. </body>
  1240. </html>
  1241. Process finished with exit code 0

As the output shows, this simply fetches the entire page, which is clearly not what we want. Next, let's refine it.

 

4. Parsing the HTML with a regular expression

def parse_douban_top(html):
    # Raw string, so that \d reaches the regex engine and is not treated as a string escape.
    regex = r'<em class="">(\d+)</em>.*?<span class="title">(.*?)</span>.*?<p class="">(.*?)</p>.*?<span class="rating_num" property="v:average">(.*?)</span>.*?<span>(.*?)</span>.*?<span class="inq">(.*?)</span>'
    # re.S lets "." match newlines, so ".*?" can span several lines of HTML.
    pattern = re.compile(regex, re.S)
    for item in pattern.findall(html):
        # The description fragment still contains &nbsp; and <br>. Replace them
        # with spaces, then collapse every run of whitespace into a single space.
        content = item[2].replace('&nbsp;', ' ').replace('<br>', ' ')
        content = ' '.join(content.split())
        yield {
            "index": item[0],      # rank
            "name": item[1],       # first title
            "describe": content,   # director, cast, year, country, genre
            "star": item[3],       # rating
            "evaluate": item[4],   # number of ratings
            "title": item[5]       # one-line quote
        }
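The third capture group (the `<p class="">` fragment) mixes director, cast, year, country and genre in one string. As a rough sketch of how the director alone could be pulled out with one more regular expression, the snippet below works on the raw fragment, before `&nbsp;` is replaced, and relies on the `导演:` label and the `&nbsp;` separator visible in the page output. The names `DIRECTOR_RE` and `extract_directors` are hypothetical helpers for illustration, not part of the original code.

```python
import re

# Text between the "导演:" label and the first "&nbsp;" or "<br>" that follows it.
DIRECTOR_RE = re.compile(r'导演:\s*(.*?)\s*(?:&nbsp;|<br>)', re.S)


def extract_directors(raw_fragment):
    """Return the director names found in a raw <p class=""> fragment."""
    match = DIRECTOR_RE.search(raw_fragment)
    if match is None:
        return []
    # Several directors are separated by "/" on the page.
    return [name.strip() for name in match.group(1).split('/') if name.strip()]


# Fragment copied from the page output (entry 20).
sample = (
    '\n导演: 刘伟强 / 麦兆辉&nbsp;&nbsp;&nbsp;主演: 刘德华 / 梁朝伟 / 黄秋生<br>\n'
    '2002&nbsp;/&nbsp;中国香港&nbsp;/&nbsp;剧情 犯罪 悬疑\n'
)
print(extract_directors(sample))
```

Inside `parse_douban_top`, such a helper would be called on `item[2]` before the cleanup step, since the cleanup removes the `&nbsp;` separator the expression depends on.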

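The pattern used in `parse_douban_top`,

<em class="">(\d+)</em>.*?<span class="title">(.*?)</span>.*?<p class="">(.*?)</p>.*?<span class="rating_num" property="v:average">(.*?)</span>.*?<span>(.*?)</span>.*?<span class="inq">(.*?)</span>

is written specifically for the markup of the Douban Top 250 page. Its six capture groups are, in order: the rank, the first title, the description block (director, cast, year, country, genre), the rating, the number of ratings, and the one-line quote.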
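To see what each capture group returns, the pattern can be tried on a single entry copied (and trimmed) from the page output. This is only a minimal sketch; the names `PATTERN` and `sample` are illustrative. Note the `re.S` flag: it lets `.` match newlines, so each `.*?` can skip over the line breaks between tags.

```python
import re

PATTERN = re.compile(
    r'<em class="">(\d+)</em>.*?<span class="title">(.*?)</span>'
    r'.*?<p class="">(.*?)</p>'
    r'.*?<span class="rating_num" property="v:average">(.*?)</span>'
    r'.*?<span>(.*?)</span>.*?<span class="inq">(.*?)</span>',
    re.S,
)

# One entry from the page output, reduced to the tags the pattern looks at.
sample = '''<li>
<em class="">18</em>
<span class="title">熔炉</span>
<span class="title">&nbsp;/&nbsp;도가니</span>
<p class="">
导演: 黄东赫 Dong-hyuk Hwang&nbsp;&nbsp;&nbsp;主演: 孔侑 Yoo Gong / 郑有美 Yu-mi Jung /...<br>
2011&nbsp;/&nbsp;韩国&nbsp;/&nbsp;剧情
</p>
<span class="rating_num" property="v:average">9.3</span>
<span property="v:best" content="10.0"></span>
<span>661701人评价</span>
<span class="inq">我们一路奋战不是为了改变世界,而是为了不让世界改变我们。</span>
</li>'''

matches = PATTERN.findall(sample)
for rank, name, describe, star, evaluate, quote in matches:
    print(rank, name, star, evaluate)
```

The bare `<span>` in the pattern skips `<span property="v:best" ...>` because that tag carries attributes, so the fifth group lands on the ratings count. One caveat: if an entry has no `<span class="inq">`, the lazy `.*?` keeps scanning into the next entry, which would merge two movies into one match.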

 

Full code:

import re

import requests


def get_douban_top(url):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
    }
    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        return response.text
    return None


def parse_douban_top(html):
    # Raw string, so that \d reaches the regex engine and is not treated as a string escape.
    regex = r'<em class="">(\d+)</em>.*?<span class="title">(.*?)</span>.*?<p class="">(.*?)</p>.*?<span class="rating_num" property="v:average">(.*?)</span>.*?<span>(.*?)</span>.*?<span class="inq">(.*?)</span>'
    # re.S lets "." match newlines, so ".*?" can span several lines of HTML.
    pattern = re.compile(regex, re.S)
    for item in pattern.findall(html):
        # Replace &nbsp; and <br> with spaces, then collapse runs of whitespace.
        content = item[2].replace('&nbsp;', ' ').replace('<br>', ' ')
        content = ' '.join(content.split())
        yield {
            "index": item[0],
            "name": item[1],
            "describe": content,
            "star": item[3],
            "evaluate": item[4],
            "title": item[5]
        }


def main():
    url = 'https://movie.douban.com/top250'
    html = get_douban_top(url)
    if html is None:
        # get_douban_top returns None for any non-200 response.
        print('Failed to fetch', url)
        return
    for item in parse_douban_top(html):
        print(item)


if __name__ == '__main__':
    main()
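The code above only reads the first page. The paginator in the page output links to `?start=25&filter=`, `?start=50&filter=` and so on up to `?start=225&filter=`, and reports 250 entries in total, so the remaining pages can be reached by stepping `start` in increments of 25. A possible sketch follows; the names `page_urls`, `PAGE_SIZE` and `TOTAL` are assumptions for illustration, as is the guess that `?start=0&filter=` returns the same first page.

```python
BASE_URL = 'https://movie.douban.com/top250'
PAGE_SIZE = 25   # entries per page, as on the first page
TOTAL = 250      # "(共250条)" in the paginator


def page_urls(base_url=BASE_URL, total=TOTAL, page_size=PAGE_SIZE):
    """Yield the URL of every result page, mirroring the ?start=N&filter= links."""
    for start in range(0, total, page_size):
        yield f'{base_url}?start={start}&filter='


for url in page_urls():
    print(url)
```

In `main`, each of these URLs could be passed to `get_douban_top` and the result to `parse_douban_top`. Pausing briefly between requests (for example with `time.sleep`) is a reasonable courtesy to the site.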

 

Output:

 

 

That's all. I hope this helps other beginners like me.
