tesseract
=========

![](https://badge.imagelayers.io/vimagick/tesseract:latest.svg)

[Tesseract][1] is an Open Source OCR engine, available under the Apache 2.0
license. It can be used directly, or (for programmers) using an API. It
supports a wide variety of languages.

Tesseract doesn't have a built-in GUI, but there are several available from the
3rdParty page.

Quick Start
-----------

```bash
$ alias tesseract='docker run --rm -u $(id -u):$(id -g) -v `pwd`:/data -w /data vimagick/tesseract'

$ tesseract input.png output -l eng --psm 3
$ cat output.txt
The (quick) [brown] {fox} jumps!

$ tesseract chinese.jpg stdout -l chi_tra --psm 8 --oem 0
學習
```
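
The alias above is convenient at an interactive prompt, but bash does not expand aliases inside scripts by default. As a minimal sketch, the same `docker run` invocation can be wrapped in a shell function and looped over several images. The name `ocr_batch` and its argument order are hypothetical, not part of this image. Only the current directory is mounted at `/data`, so image paths are assumed to be relative to it. Each result is written next to its input, for example `scan.png` produces `scan.txt`.

```bash
#!/usr/bin/env bash
# Sketch only: batch OCR with the vimagick/tesseract image from Quick Start.

# Function equivalent of the Quick Start alias, so it also works in scripts.
tesseract() {
  docker run --rm -u "$(id -u):$(id -g)" -v "$PWD":/data -w /data vimagick/tesseract "$@"
}

# ocr_batch LANG IMAGE...   (hypothetical helper)
# Runs tesseract once per image, using the same flags as the Quick Start example.
ocr_batch() {
  local lang=$1
  shift
  local img
  for img in "$@"; do
    # Strip the extension for the output base; tesseract appends .txt itself.
    tesseract "$img" "${img%.*}" -l "$lang" --psm 3 || echo "failed: $img" >&2
  done
}

# Usage: ocr_batch eng *.png
```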

[1]: https://github.com/tesseract-ocr/tesseract