Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 2.0.2
    • Fix Version/s: None
    • Component/s: FontBox, Text extraction
    • Labels: None

      Description

      Hi Team,
      I have two PDFs in the Gujarati language that use different fonts: the first uses the Shruti font and the second uses the LMG-RUPE font. The Tika parser reads the Shruti PDF correctly and gives the correct output, but the LMG-RUPE PDF gives wrong output. The metadata is the same for both PDFs.
      1) https://drive.google.com/open?id=0B4Sse_x7pvrqRnRETzNsUk1BY0k (Shruti font)
      2) https://drive.google.com/open?id=0B4Sse_x7pvrqVC0zb2NqTzNvYVU (LMG-RUPE font)
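
One rough way to quantify "wrong output" when comparing the two files is to measure how much of the extracted text actually falls in the Unicode Gujarati block (U+0A80 to U+0AFF). The sketch below is only an illustration under assumptions: the class name `GujaratiCheck` and the method `gujaratiRatio` are hypothetical, and the input string is assumed to be text already extracted by Tika or PDFBox. A low ratio for the LMG-RUPE file would be consistent with the font mapping its glyphs to non-Gujarati character codes, although this report does not confirm that as the cause.

```java
public class GujaratiCheck {

    // Returns the fraction of non-whitespace code points that belong to the
    // Unicode Gujarati block (U+0A80..U+0AFF). Returns 0.0 for empty input.
    static double gujaratiRatio(String text) {
        int total = 0;
        int gujarati = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                continue; // ignore spaces and line breaks
            }
            total++;
            if (Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.GUJARATI) {
                gujarati++;
            }
        }
        return total == 0 ? 0.0 : (double) gujarati / total;
    }

    public static void main(String[] args) {
        // Assumption: the extracted text is passed as the first argument.
        String extracted = args.length > 0 ? args[0] : "";
        System.out.println(gujaratiRatio(extracted));
    }
}
```

Running this on the text extracted from each PDF gives two numbers that can be compared directly; the threshold for "wrong" is a judgment call and is not defined by this report.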

        Attachments

        1. PDFBOX-3445-rupen-debugger.png
          118 kB
          Tilman Hausherr
        2. PDFBOX-3445-rupen.pdf
          48 kB
          Tilman Hausherr

              People

              • Assignee: Unassigned
              • Reporter: gopalbhalala
              • Votes: 0
              • Watchers: 2