SOLR-1813: Support Arabic PDF extraction

Details

    Description

      Tika/PDFBox supports extracting Arabic text from PDF files, but Solr does not currently bundle the optional dependency that this requires.
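
      One quick way to see whether an optional library is visible to the extraction code is to probe the classpath. The sketch below is illustrative only. It assumes the missing dependency is ICU4J (as the attached icu4j-4_2_1.jar suggests) and uses `com.ibm.icu.text.Bidi` as the probe class. The class `OptionalDependencyCheck` and its method are hypothetical names, not part of Solr, Tika, or PDFBox.

```java
public class OptionalDependencyCheck {
    // Assumption: this class ships in the ICU4J jar; it is used here only as a probe.
    static final String ICU4J_PROBE_CLASS = "com.ibm.icu.text.Bidi";

    // Returns true if the named class can be located by this class's loader.
    // The class is looked up without being initialized.
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, OptionalDependencyCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        boolean present = isClassAvailable(ICU4J_PROBE_CLASS);
        System.out.println("ICU4J on classpath: " + present);
    }
}
```

      If the probe reports `false`, adding the jar to the classpath used by the extraction module would be the first thing to try, assuming ICU4J is indeed the dependency in question.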

      Attachments

        1. arabic.pdf (12 kB, Robert Muir)
        2. icu4j-4_2_1.jar (6.05 MB, Robert Muir)
        3. SOLR-1813.patch (2 kB, Robert Muir)

          People

            gsingers Grant Ingersoll
            rcmuir Robert Muir
            Votes: 0
            Watchers: 0
