[TIKA-2366] Add image cropping functionality to TesseractOCRParser - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Trivial
Resolution: Unresolved
Affects Version/s: 1.14
Fix Version/s: None
Component/s: ocr
Labels:
- ImageMagick
- crop
- images
- ocr
- pdf
Environment:

ImageMagick-7.0.5, Tesseract 3.0.5

Description

I am using Tika's TesseractOCRParser to read scanned PDF files. It would be useful if TesseractOCRParser could apply ImageMagick's crop operation to each page image before OCR, so that document headers and footers can be excluded from the extracted text.
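
To illustrate the kind of preprocessing being requested, here is a minimal sketch of how a crop step could be expressed as an ImageMagick command line. It is not Tika code and does not use any Tika API: the class name `CropCommandSketch`, the method `buildCropCommand`, and its parameters are hypothetical, and the page dimensions and band heights are made-up example values. It assumes the ImageMagick 7 `magick` executable, the `-crop WxH+X+Y` geometry, and `+repage` to reset the canvas offset after cropping.

```java
import java.util.ArrayList;
import java.util.List;

public class CropCommandSketch {

    /**
     * Builds an ImageMagick argument list that keeps the full page width
     * but drops a band of headerPx pixels at the top and footerPx pixels
     * at the bottom. All names here are illustrative, not part of Tika.
     */
    public static List<String> buildCropCommand(String input, String output,
                                                int pageWidth, int pageHeight,
                                                int headerPx, int footerPx) {
        int keptHeight = pageHeight - headerPx - footerPx;
        if (pageWidth <= 0 || keptHeight <= 0 || headerPx < 0 || footerPx < 0) {
            throw new IllegalArgumentException("crop region is empty or negative");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("magick");                 // ImageMagick 7 entry point
        cmd.add(input);
        cmd.add("-crop");
        // Geometry is WIDTHxHEIGHT+XOFFSET+YOFFSET, measured from the top-left corner.
        cmd.add(pageWidth + "x" + keptHeight + "+0+" + headerPx);
        cmd.add("+repage");                // discard the virtual canvas offset left by -crop
        cmd.add(output);
        return cmd;
    }

    public static void main(String[] args) {
        // Example values only: a 2550x3300 page with a 200 px header and 150 px footer.
        List<String> cmd = buildCropCommand("page.png", "page-cropped.png",
                                            2550, 3300, 200, 150);
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list could be handed to `ProcessBuilder` to run the crop before the image is passed to Tesseract. How such an option would be exposed in the parser's configuration is left open here.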

People

Assignee: Unassigned

Reporter: Zachary Lee Jones

Votes: 0

Watchers: 1

Dates

Created: 16/May/17 14:19

Updated: 16/May/17 14:19

Time Tracking

Not Specified