
Hands-On Tutorial: Building an Object Detection Model with the Official YOLOv4 (Illustrated with a Custom Defect Detection Dataset)


This post is a detailed hands-on tutorial on developing and building an object detection model with the official YOLOv4. Earlier posts cover related hands-on tutorials in the same series; feel free to read them if you are interested:

"A Hands-On Tutorial on Building an Instance Segmentation Model with YOLOv7"

"YOLOv7: Complete Training and Inference from Scratch on Your Own Dataset"

"DETR (DEtection TRansformer): Building an Object Detection Model on a Custom Dataset"

"A Hands-On Tutorial on Building an Instance Segmentation Model with yolov5-v7.0"

"Building the Lightweight YOLOv5-Lite from Scratch on Your Own Dataset (Weld Quality Inspection)"

"Building the Lightweight NanoDet from Scratch on Your Own Dataset (Phone-Use Detection)"

"Building Your Own Image Recognition Model with the New YOLOv5-v6.2"

"Building an Object Detection Model with YOLOv5-v6.1/2 on a Custom Dataset (Marine Life Detection)"

"Full Training and Inference with the Ultra-Lightweight Yolo-FastestV2 on a Custom Dataset (Handwritten Chinese Character Detection)"

"A Hands-On Tutorial on Building an Object Detection Model with YOLOv8 (Weld Seam Quality Inspection)"

When I first worked with v3 and v4, as I recall, models were all trained with the Darknet framework and configured entirely through cfg files; the projects only moved fully to PyTorch from v5 onward, and that is still the case today.
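
To make the cfg format concrete, here is a minimal sketch of how such a file can be parsed into a list of typed sections. This mirrors the parse_cfg-style helpers found in most Darknet ports; the function name and structure here are illustrative assumptions, not code from the official repo:

def parse_cfg(path):
    """Parse a Darknet .cfg file into a list of {key: value} dicts."""
    sections = []
    with open(path, 'r', encoding='utf-8') as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith('#'):
                continue  # skip blank lines and comments
            if line.startswith('[') and line.endswith(']'):
                # a new section header such as [convolutional] or [yolo]
                sections.append({'type': line[1:-1]})
            else:
                key, _, value = line.partition('=')
                sections[-1][key.strip()] = value.strip()
    return sections

# Example usage: count how often each layer type appears
#   from collections import Counter
#   print(Counter(s['type'] for s in parse_cfg('cfg/yolov4.cfg')))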

The yolov4.cfg file is shown below:

[net]
batch=64
subdivisions=8
# Training
#width=512
#height=512
width=608
height=608
channels=3
momentum=0.949
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.0013
burn_in=1000
max_batches = 500500
policy=steps
steps=400000,450000
scales=.1,.1
#cutmix=1
mosaic=1

#:104x104 54:52x52 85:26x26 104:13x13 for 416

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-7

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-10

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-28

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-28

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-16

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=mish

##########################

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

### SPP ###

[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6

### End SPP ###

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = 85

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -1, -3

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = 54

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -1, -3

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

##########################

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear

[yolo]
mask = 0,1,2
anchors = 12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
classes=80
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
scale_x_y = 1.2
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6
max_delta=5

[route]
layers = -4

[convolutional]
batch_normalize=1
size=3
stride=2
pad=1
filters=256
activation=leaky

[route]
layers = -1, -16

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear

[yolo]
mask = 3,4,5
anchors = 12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
classes=80
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
scale_x_y = 1.1
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6
max_delta=5

[route]
layers = -4

[convolutional]
batch_normalize=1
size=3
stride=2
pad=1
filters=512
activation=leaky

[route]
layers = -1, -37

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear

[yolo]
mask = 6,7,8
anchors = 12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
classes=80
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
scale_x_y = 1.05
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6
max_delta=5
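
One thing worth noting before moving on: every [yolo] head above is configured for COCO's 80 classes, which is why the convolution immediately before each head has filters=255. When you retarget the cfg at your own dataset, the Darknet convention is to set classes in each [yolo] section to your class count and the filters of the preceding convolution to (classes + 5) * 3, since each of the 3 anchors per scale predicts x, y, w, h, objectness, and one score per class. A quick sanity check in Python (the 6-class figure below is just a hypothetical defect dataset, not a value from this project):

def head_filters(num_classes, anchors_per_scale=3):
    # each anchor predicts (x, y, w, h, objectness) plus one score per class
    return anchors_per_scale * (num_classes + 5)

print(head_filters(80))  # 255, matching the cfg above
print(head_filters(6))   # 33, for a hypothetical 6-class defect dataset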

The yolov4-tiny.cfg file is shown below:

[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=1
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.00261
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[route]
layers=-1
groups=2
group_id=1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[route]
layers = -1,-2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -6,-1

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[route]
layers=-1
groups=2
group_id=1

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[route]
layers = -1,-2

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -6,-1

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[route]
layers=-1
groups=2
group_id=1

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[route]
layers = -1,-2

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -6,-1

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

##################################

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear

[yolo]
mask = 3,4,5
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
classes=80
num=6
jitter=.3
scale_x_y = 1.05
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
ignore_thresh = .7
truth_thresh = 1
random=0
resize=1.5
nms_kind=greedynms
beta_nms=0.6

[route]
layers = -4

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 23

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear

[yolo]
mask = 1,2,3
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
classes=80
num=6
jitter=.3
scale_x_y = 1.05
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
ignore_thresh = .7
truth_thresh = 1
random=0
resize=1.5
nms_kind=greedynms
beta_nms=0.6
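
Note the [route] sections in the tiny backbone that carry groups=2 and group_id=1: rather than concatenating layers, they split the referenced layer's output channels into two groups and forward only the second one, which is how the CSP-style partial connections are written in cfg syntax. In PyTorch terms this corresponds roughly to the following (a sketch of the semantics, not code from the repo):

import torch

x = torch.randn(1, 64, 104, 104)  # output of the preceding convolution
half = x.chunk(2, dim=1)[1]       # groups=2, group_id=1 -> keep the second half of the channels
print(half.shape)                 # torch.Size([1, 32, 104, 104])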

At first I quite liked this format: it is very concise, and training directly with Darknet is convenient. But as the models evolved and more and more components were swapped in and out, Darknet became increasingly unsuitable. Compared with v3 and v5, YOLOv4 also sits in a somewhat awkward position. Searching GitHub for yolov4 gives the following results:

The top-ranked project is pytorch-YOLOv4:

Judging from its README, this is only a minimal implementation:

The official implementation should be the following:

Looking closely, you will find that the official author provides both a YOLOv3-style implementation and a YOLOv5-style implementation. This post walks through the complete workflow using the YOLOv3-style YOLOv4 project as the baseline:

First, download the required project:

After downloading and unpacking it locally, it looks like this:

Download the two pretrained weights files from the web and place them in the weights directory:

Then copy a dataset from one of your previous yolov5 projects into the current project directory. I had just finished a steel defect detection project based on yolov5, so that dataset could be reused directly; if you don't have a ready-made dataset, you can follow the steps in my earlier yolov5 hands-on tutorials to build one. A sketch of the expected data config follows:
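
Because this fork follows the yolov5 project layout, the file passed to --data is a yolov5-style YAML. A minimal sketch of what data/self.yaml might look like is below; the paths, class count, and class names are placeholders for your own defect dataset, not values shipped with the repo:

# data/self.yaml -- hypothetical example for a steel defect dataset
train: ./datasets/steel/images/train  # directory of training images (placeholder path)
val: ./datasets/steel/images/val      # directory of validation images (placeholder path)

nc: 6  # number of classes in your dataset
names: ['crazing', 'inclusion', 'patches', 'pitted_surface', 'rolled-in_scale', 'scratches']  # placeholder names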

Here I chose to develop and train on the yolov4-tiny model, simply so that everything computes faster.

Modify the argument defaults in train.py as follows:

parser = argparse.ArgumentParser()
parser.add_argument('--weights', type=str, default='weights/yolov4-tiny.weights', help='initial weights path')
parser.add_argument('--cfg', type=str, default='cfg/yolov4-tiny.cfg', help='model.yaml path')
parser.add_argument('--data', type=str, default='data/self.yaml', help='data.yaml path')
parser.add_argument('--hyp', type=str, default='data/hyp.scratch.yaml', help='hyperparameters path')
parser.add_argument('--epochs', type=int, default=100)
parser.add_argument('--batch-size', type=int, default=8, help='total batch size for all GPUs')
parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='[train, test] image sizes')
parser.add_argument('--rect', action='store_true', help='rectangular training')
parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')
parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
parser.add_argument('--notest', action='store_true', help='only test final epoch')
parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check')
parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters')
parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
parser.add_argument('--cache-images', action='store_true', help='cache images for faster training')
parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')
parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
parser.add_argument('--single-cls', action='store_true', help='train as single-class dataset')
parser.add_argument('--adam', action='store_true', help='use torch.optim.Adam() optimizer')
parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')
parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify')
parser.add_argument('--log-imgs', type=int, default=16, help='number of images for W&B logging, max 100')
parser.add_argument('--workers', type=int, default=8, help='maximum number of dataloader workers')
parser.add_argument('--project', default='runs/train', help='save to project/name')
parser.add_argument('--name', default='exp', help='save to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
opt = parser.parse_args()

Then simply run the following in the terminal:

python train.py

Alternatively, you can launch training with explicit command-line arguments:

python train.py --device 0 --batch-size 16 --img 640 640 --data self.yaml --cfg cfg/yolov4-tiny.cfg --weights 'weights/yolov4-tiny.weights' --name yolov4-tiny

Use whichever form you prefer.

The terminal output when training starts looks like this:

A screenshot taken when training completes:

With training finished, let's look at the result files:

As you can see, the result files differ considerably from a yolov5 project's: the only evaluation artifact is a single PR plot, so if you are writing a paper, yolov5 is still the better choice.

The PR curve:

The training visualizations:

The label-distribution visualization:

The weights directory:

This also differs greatly from yolov5: a yolov5 run keeps only two .pt files, the best and the latest, whereas this yolov4 project produced 19 checkpoint files. The saving is extremely thorough, a bit like yolov7 but with even more variants than v7.

If you are interested, you can follow the steps above to develop and build your own object detection model.
