KNIME Image Processing - Tesseract (OCR) Extension

The KNIME Tesseract (OCR) integration enables Optical Character Recognition (OCR) in KNIME. OCR converts text in images into characters, which can then be processed further, e.g. with the KNIME TextMining Extension. Please note that this integration is still in a BETA state, and we welcome any feedback. The extension is based on the open source OCR engine Tesseract OCR, which was originally developed by HP Labs between 1985 and 1995 and has been developed by Google since 2005. For the integration, this extension builds on the Tess4J Library.

Important Note: The Tess4J Integration does not work on Mac yet. We are working on it!

Download

You can install the KNIME Tesseract (OCR) Integration from the nightly or stable community contributions update site. See

<< Community Contributions >>

Examples

An example workflow can be found on the Example Server or in the NodeGuide.

Contributors and Contact

Jonathan Hale (University of Konstanz)
Christian Dietz (University of Konstanz)