System environment: CentOS 7
Install tesseract:
yum-config-manager --add-repo https://download.opensuse.org/repositories/home:/Alexander_Pozdnyakov/CentOS_7/
sudo rpm --import https://build.opensuse.org/projects/home:Alexander_Pozdnyakov/public_key
yum install tesseract
yum install tesseract-langpack-deu
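Before going further it is worth confirming that the install worked and noting which major version yum pulled in, since the trained data files downloaded later are organized by Tesseract version. The following is a minimal sketch; the helper name tesseract_major is made up for illustration, and the parsing assumes the first line of the version output looks like "tesseract 4.1.1".

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the major version from the first line of
# `tesseract --version` output, e.g. "tesseract 4.1.1" gives "4".
tesseract_major() {
    printf '%s\n' "$1" | head -n 1 | sed -E 's/^[^0-9]*([0-9]+).*/\1/'
}

if command -v tesseract >/dev/null 2>&1; then
    # Some releases write the version to stderr, so capture both streams.
    version_output="$(tesseract --version 2>&1)"
    echo "major version: $(tesseract_major "$version_output")"
else
    echo "tesseract not found in PATH; the yum install may have failed"
fi
```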
Install the PHP package:
composer require thiagoalessio/tesseract_ocr
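Note: PHP must be allowed to call the system function (that is, it must not be listed in disable_functions), otherwise the package will not work.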
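A quick way to check this from the shell is to read disable_functions and look for command-execution functions in it. This is only a sketch: the helper name blocked_exec_functions is hypothetical, and the list of functions it checks (exec, system, shell_exec, proc_open) is my assumption about what a wrapper around an external binary might call, not something taken from the package documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given the value of PHP's disable_functions setting,
# print which command-execution functions are blocked (space separated).
blocked_exec_functions() {
    local disabled fn blocked=""
    # Drop spaces so "exec, system" and "exec,system" are treated alike.
    disabled="$(printf '%s' "$1" | tr -d ' ')"
    # Assumed list of functions; adjust it to what your package version calls.
    for fn in exec system shell_exec proc_open; do
        case ",$disabled," in
            *",$fn,"*) blocked="$blocked $fn" ;;
        esac
    done
    echo "${blocked# }"
}

if command -v php >/dev/null 2>&1; then
    disabled="$(php -r 'echo ini_get("disable_functions");')"
    echo "blocked: $(blocked_exec_functions "$disabled")"
else
    echo "php not found in PATH"
fi
```

Keep in mind that the command-line php binary and the web server (for example php-fpm) often load different php.ini files, so an empty result here does not guarantee the same setting for web requests.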
Usage:
<?php
require './vendor/autoload.php';

use thiagoalessio\TesseractOCR\TesseractOCR;

// Run OCR on the image with the default language and dump the recognized text.
$ret = (new TesseractOCR('./text.png'))
    ->run();
var_dump($ret);
My test image: [image]
The output: [image]
At this point Chinese text is not recognized, so we install the trained data for English, Traditional Chinese, and Simplified Chinese.
wget --no-check-certificate https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
wget --no-check-certificate https://github.com/tesseract-ocr/tessdata/raw/4.00/chi_sim.traineddata
wget --no-check-certificate https://github.com/tesseract-ocr/tessdata/raw/master/chi_tra.traineddata
Other language packs can be found here:
https://tesseract-ocr.github.io/tessdoc/Data-Files#data-files-for-version-400-november-29-2016
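Then move the files into the tessdata directory of my installation, which is:
mv *.traineddata /usr/local/share/tessdata/
(Use cp instead of mv to keep the downloaded copies.)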
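To confirm that tesseract actually picks up the new files, you can list the languages it knows and then run recognition with the Chinese data selected. This is a minimal sketch: the helper name missing_langs is hypothetical, ./text.png is the test image from the PHP example, and if the list comes back empty your tessdata directory may be somewhere other than the path above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `tesseract --list-langs` output on stdin and
# prints every requested language code that is not in the list.
missing_langs() {
    local available lang
    available="$(cat)"
    for lang in "$@"; do
        # -x matches the whole line, -F treats the code as a fixed string.
        printf '%s\n' "$available" | grep -qxF "$lang" || echo "$lang"
    done
}

if command -v tesseract >/dev/null 2>&1; then
    # Some releases print the list to stderr, so merge both streams.
    missing="$(tesseract --list-langs 2>&1 | missing_langs eng chi_sim chi_tra)"
    if [ -n "$missing" ]; then
        echo "not installed:" $missing
    elif [ -f ./text.png ]; then
        # Recognize Simplified Chinese and English in the same image.
        tesseract ./text.png stdout -l chi_sim+eng
    fi
else
    echo "tesseract not found in PATH"
fi
```

On the PHP side, the wrapper exposes a lang() method for the same purpose, so the earlier example becomes (new TesseractOCR('./text.png'))->lang('chi_sim', 'eng')->run();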