PDFBox / PDFBOX-568

testextract failure on Linux and Mac OS X

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0-incubator
    • Fix Version/s: 1.3.1
    • Component/s: Text extraction
    • Labels: None

      Description

      As discussed on the mailing list, the extraction test case seems to fail on non-Windows platforms.

      The troublesome test file is sample_fonts_solidconvertor.pdf, and the textextract.log file says the following (^@ is U+0000 and � is U+FFFD):

      Lines differ at index expected:46-253 actual:46-65533
      FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected line: 8 at actual line: 8
      expected line was: "@V@e^@r^@d^@a^@n^@a^@:@ ^@T@o^@t^@o^@ @j@e^@ @p@o^@k^@u^@s^@n^@ý^@ @t@e^@x^@t^@ @s@ ^A"
      actual line was: "@V@e^@r^@d^@a^@n^@a^@:@ ^@T@o^@t^@o^@ @j@e^@ @p@o^@k^@u^@s^@n^@�@ ^@t@e^@x^@t^@ @s@ ^A"
      Lines differ at index expected:4-253 actual:4-65533
      FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected line: 10 at actual line: 10
      expected line was: "AY^A~@ý^@á^@í^@é"
      actual line was: "AY^A~@�@�@�^@�"
      Lines differ at index expected:52-253 actual:52-65533
      FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected line: 11 at actual line: 11
      expected line was: "@S@a^@n^@s^@ @s@e^@r^@i^@f^@:@ ^@T@o^@t^@o^@ @j@e^@ @p@o^@k^@u^@s^@n^@ý^@ @t@e^@x^@t^@ @s@ ^A"
      actual line was: "@S@a^@n^@s^@ @s@e^@r^@i^@f^@:@ ^@T@o^@t^@o^@ @j@e^@ @p@o^@k^@u^@s^@n^@�@ ^@t@e^@x^@t^@ @s@ ^A"
      Lines differ at index expected:4-253 actual:4-65533
      FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected line: 13 at actual line: 13
      expected line was: "AY^A~@ý^@á^@í^@é"
      actual line was: "AY^A~@�@�@�^@�"
      Preparing to parse sample_fonts_solidconvertor.pdf for sorted test
      Lines differ at index expected:46-253 actual:46-65533
      FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected line: 8 at actual line: 8
      expected line was: "@V@e^@r^@d^@a^@n^@a^@:@ ^@T@o^@t^@o^@ @j@e^@ @p@o^@k^@u^@s^@n^@ý^@ @t@e^@x^@t^@ @s@ ^A"
      actual line was: "@V@e^@r^@d^@a^@n^@a^@:@ ^@T@o^@t^@o^@ @j@e^@ @p@o^@k^@u^@s^@n^@�@ ^@t@e^@x^@t^@ @s@ ^A"
      Lines differ at index expected:0-253 actual:0-65533
      FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected line: 10 at actual line: 10
      expected line was: "@á^@í^@é"
      actual line was: "@�@�@�@�"
      Lines differ at index expected:52-253 actual:52-65533
      FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected line: 11 at actual line: 11
      expected line was: "@S@a^@n^@s^@ @s@e^@r^@i^@f^@:@ ^@T@o^@t^@o^@ @j@e^@ @p@o^@k^@u^@s^@n^@ý^@ @t@e^@x^@t^@ @s@ ^A"
      actual line was: "@S@a^@n^@s^@ @s@e^@r^@i^@f^@:@ ^@T@o^@t^@o^@ @j@e^@ @p@o^@k^@u^@s^@n^@�@ ^@t@e^@x^@t^@ @s@ ^A"
      Lines differ at index expected:4-253 actual:4-65533
      FAILURE: Line mismatch for file sample_fonts_solidconvertor.pdf at expected line: 13 at actual line: 13
      expected line was: "A~^AY@ý^@á^@í^@é"
      actual line was: "A~^AY@�@�@�^@�"
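
      In each mismatch above, the two numbers after the index appear to be character codes: 253 is U+00FD (ý) and 65533 is U+FFFD, the Unicode replacement character. One way such a substitution can arise is when a byte such as 0xFD is decoded as UTF-8 (a common default on Linux and Mac OS X) instead of a single-byte Latin encoding (a common default on Windows). The issue does not confirm that this is the cause here, so the following is only a minimal sketch of that assumed mechanism. The class and method names are hypothetical and are not part of PDFBox.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not PDFBox code.
public class DefaultCharsetDemo {

    // Decode the bytes with the given charset and return the code of the first char.
    static int decodeFirstChar(byte[] bytes, Charset charset) {
        return new String(bytes, charset).charAt(0);
    }

    public static void main(String[] args) {
        // 0xFD is 'ý' in ISO-8859-1, but is not a valid UTF-8 sequence on its own.
        byte[] yAcute = { (byte) 0xFD };

        // Assumption: the platform default charset is what differs between the systems.
        System.out.println("default charset: " + Charset.defaultCharset());

        // Decoding as ISO-8859-1 yields U+00FD (253).
        System.out.println(decodeFirstChar(yAcute, StandardCharsets.ISO_8859_1));

        // Decoding as UTF-8 replaces the malformed byte with U+FFFD (65533).
        System.out.println(decodeFirstChar(yAcute, StandardCharsets.UTF_8));
    }
}
```

      If this is indeed the mechanism, passing an explicit charset wherever the test reads or writes text would make the comparison independent of the platform default.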

          People

          • Assignee: Unassigned
          • Reporter: Jukka Zitting
