OCR with OpenCV and Tesseract (Installation Guide)
Introduction
Tesseract OCR provides powerful algorithms for recognizing text in images. The text can be small, large, or skewed, and these cases are handled through flags passed to the algorithm. Another useful feature is the ability to extract confidence values not only for the whole text, but also for individual words and even letters.
OpenCV is a widely used library for computer vision tasks. It can read static images or frames from a camera, transform them into a form suitable for the OCR algorithm, and then apply that algorithm.
Note:
Since OpenCV 3.0, Tesseract OCR has been integrated into the text module. For more information, see the official OpenCV documentation.
Installation
Installing OpenCV is relatively easy. You can either install the packaged version from the PPA with a package manager such as Synaptic, or compile the library from source by following the instructions on the OpenCV website.
To install Tesseract, you can either run
sudo apt-get install libtesseract-dev
Or you can compile it from source. To do so, use the following instructions (they assume 8 virtual cores; otherwise, pass make the number of cores your platform has):
- Initially, we need to install the base libs
sudo apt-get install libpng-dev libjpeg-dev libtiff-dev zlib1g-dev
sudo apt-get install gcc g++
sudo apt-get install autoconf automake libtool
- Getting Leptonica
wget http://www.leptonica.org/source/leptonica-1.70.tar.gz
tar -zxvf leptonica-1.70.tar.gz
cd leptonica-1.70
./configure
make -j8 -l8
sudo make install
sudo ldconfig
- Finally, installing Tesseract OCR
svn checkout http://tesseract-ocr.googlecode.com/svn/trunk/
cd trunk/
./autogen.sh
./configure
make -j8 -l8
sudo make install
sudo ldconfig
sudo make install-langs
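The make calls in the steps above hardcode 8 jobs. As a minimal sketch, assuming a system with GNU coreutils (for nproc) or getconf, the job count can instead be derived from the machine. The variable name JOBS is only an illustration and is not part of either build system.

```shell
#!/bin/sh
# Pick a job count for make from the number of available cores.
# nproc ships with GNU coreutils; getconf is tried if nproc is missing,
# and 1 is the last-resort fallback.
JOBS=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Print the command instead of running it, so the sketch is safe to try anywhere.
echo "make -j${JOBS} -l${JOBS}"
```

In the Leptonica and Tesseract steps, `make -j8 -l8` would then become `make -j"$JOBS" -l"$JOBS"`.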
After following the instructions, the commands
pkg-config tesseract --libs
and
pkg-config opencv --libs
should return something similar to
-L/usr/local/lib -ltesseract
and
/usr/local/lib/libopencv_calib3d.so /usr/local/lib/libopencv_contrib.so /usr/local/lib/libopencv_core.so /* and continue with all other modules */
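The two checks above can also be wrapped in a small script. This is only a sketch: check_module is a hypothetical helper name, and the module names tesseract and opencv are the ones used in this guide (some newer OpenCV builds register themselves as opencv4 instead, so the name may need adjusting).

```shell
#!/bin/sh
# check_module is a hypothetical helper: it asks pkg-config whether a module
# is registered and prints "found: ..." or "missing: ...".
check_module() {
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "missing: pkg-config itself is not installed"
        return 2
    fi
    if pkg-config --exists "$1"; then
        # Show the linker flags, as in the manual check.
        echo "found: $1 ($(pkg-config --libs "$1"))"
        return 0
    fi
    echo "missing: $1"
    return 1
}

for module in tesseract opencv; do
    check_module "$module" || true   # keep going so both results are shown
done
```

If a module is reported as missing after a source build, rerunning sudo ldconfig and checking that /usr/local/lib/pkgconfig is on PKG_CONFIG_PATH is a reasonable first step.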
Sample Application
Now that everything is installed, we need to test that it works. An easy way to do so is to build a small sample application, which will be covered in the next article.