Tika / TIKA-1703

Can't Specify Tesseract Data Folder Distinct from Tesseract Executable Path

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.9
    • Fix Version/s: 1.11
    • Component/s: parser
    • Labels:
      None

      Description

      If a user specifies the path to the Tesseract executable using TesseractOCRConfig.setTesseractPath, Tika assumes that the Tesseract data folder (usually called the 'tessdata' folder) is in the same location. This usually holds on Windows, where everything is installed into one central location.

      However, this is not necessarily the case on Linux. If Tesseract is built from source, for example, the data folder is installed in a different location from the Tesseract executable.

      One way to fix this would be to add an option for specifying the location of the Tesseract data folder separately from the path to the executable.
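
A minimal sketch of the proposed behavior, for illustration only. The class `OcrConfig`, its `setTessdataPath` setter, and the helper methods are hypothetical stand-ins, not Tika's actual `TesseractOCRConfig` API, and the paths are made-up examples. It assumes the data folder is handed to Tesseract through the `TESSDATA_PREFIX` environment variable, and that an unset data path falls back to the executable path, which is the behavior described above.

```java
import java.util.HashMap;
import java.util.Map;

public class TessdataPathSketch {

    /** Hypothetical config holder; not Tika's TesseractOCRConfig. */
    public static final class OcrConfig {
        private String tesseractPath = "";
        private String tessdataPath = "";

        public void setTesseractPath(String path) { this.tesseractPath = path; }
        public void setTessdataPath(String path) { this.tessdataPath = path; }
        public String getTesseractPath() { return tesseractPath; }
        public String getTessdataPath() { return tessdataPath; }
    }

    /**
     * Returns the explicitly configured data folder if there is one,
     * otherwise falls back to the executable's folder.
     */
    public static String resolveTessdataPrefix(OcrConfig config) {
        if (!config.getTessdataPath().isEmpty()) {
            return config.getTessdataPath();
        }
        return config.getTesseractPath();
    }

    /**
     * Copies the base environment and adds TESSDATA_PREFIX when a location
     * is known. Whether the variable should name the tessdata folder itself
     * or its parent depends on the Tesseract version, so the value is passed
     * through unchanged here.
     */
    public static Map<String, String> buildEnvironment(OcrConfig config,
                                                       Map<String, String> base) {
        Map<String, String> env = new HashMap<>(base);
        String prefix = resolveTessdataPrefix(config);
        if (!prefix.isEmpty()) {
            env.put("TESSDATA_PREFIX", prefix);
        }
        return env;
    }

    public static void main(String[] args) {
        OcrConfig config = new OcrConfig();
        config.setTesseractPath("/usr/local/bin");     // example path
        config.setTessdataPath("/usr/local/share");    // example path
        Map<String, String> env = buildEnvironment(config, new HashMap<>());
        System.out.println("TESSDATA_PREFIX=" + env.get("TESSDATA_PREFIX"));
    }
}
```

The resulting map could be copied into `ProcessBuilder.environment()` before launching the executable. Keeping the fallback means existing setups that only set the executable path would continue to behave as they do today.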

              People

              • Assignee:
                chrismattmann Chris A. Mattmann
                Reporter:
                taidan19 Christian Wolfe