PDFBOX-1512: TextPositionComparator is not compatible with Java 7

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.1, 2.0.0
    • Fix Version/s: 1.8.8, 2.0.0
    • Component/s: Text extraction
    • Labels:
      None
    • Environment:
      Java 7

      Description

      The TextPositionComparator causes the following exception when running on Java 7: Unexpected RuntimeException from org.apache.tika.parser.ParserDecorator$1@9007fa2 Original cause: Comparison method violates its general contract!

      I think the problem is with this check:

      if ( yDifference < .1 ||
      (pos2YBottom >= pos1YTop && pos2YBottom <= pos1YBottom) ||
      (pos1YBottom >= pos2YTop && pos1YBottom <= pos2YBottom))

      as it violates the contract requirement:

      The implementor must also ensure that the relation is transitive: ((compare(x, y)>0) && (compare(y, z)>0)) implies compare(x, z)>0.

      Finally, the implementor must ensure that compare(x, y)==0 implies that sgn(compare(x, z))==sgn(compare(y, z)) for all z.

      Java 7 is now strict about this: its sort implementation can throw an exception when it detects that the comparator contract is violated.
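
      To make the transitivity problem concrete, here is a minimal sketch that reuses the overlap check quoted above. It is not the PDFBox class: Box, SAME_LINE_THEN_X and violatesTransitivity are hypothetical names, and the rule "compare by x when the check matches, otherwise by bottom y" is an assumption about how the check is used. Three boxes are chosen so that a overlaps b vertically, b overlaps c, but a does not overlap c.

```java
import java.util.Comparator;

final class OverlapComparatorDemo {

    /** Hypothetical stand-in for a text position: x offset plus vertical extent. */
    static final class Box {
        final float x;
        final float top;
        final float bottom;

        Box(float x, float top, float bottom) {
            this.x = x;
            this.top = top;
            this.bottom = bottom;
        }
    }

    /** Assumed rule: same line (per the quoted check) sorts by x, otherwise by bottom y. */
    static final Comparator<Box> SAME_LINE_THEN_X = new Comparator<Box>() {
        @Override
        public int compare(Box pos1, Box pos2) {
            float yDifference = Math.abs(pos1.bottom - pos2.bottom);
            boolean sameLine = yDifference < .1
                    || (pos2.bottom >= pos1.top && pos2.bottom <= pos1.bottom)
                    || (pos1.bottom >= pos2.top && pos1.bottom <= pos2.bottom);
            return sameLine
                    ? Float.compare(pos1.x, pos2.x)
                    : Float.compare(pos1.bottom, pos2.bottom);
        }
    };

    /** True when x > y and y > z hold but x > z does not. */
    static <T> boolean violatesTransitivity(Comparator<T> c, T x, T y, T z) {
        return c.compare(x, y) > 0 && c.compare(y, z) > 0 && !(c.compare(x, z) > 0);
    }

    public static void main(String[] args) {
        Box a = new Box(30f, 0f, 10f);   // overlaps b vertically, to the right of b
        Box b = new Box(20f, 8f, 18f);   // overlaps c vertically, to the right of c
        Box c = new Box(10f, 16f, 26f);  // does not overlap a, and lies below it
        System.out.println("transitivity violated: "
                + violatesTransitivity(SAME_LINE_THEN_X, a, b, c));
    }
}
```

      Here a sorts after b and b sorts after c (both pairs fall back to x), yet a sorts before c (that pair falls back to y), so no consistent ordering of the three boxes exists.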

        Attachments

        1. TextPositionComparator.java
          3 kB
          Benjamin Papez
        2. immo-kurier_arsenal_93x62.pdf
          1.63 MB
          Hannes Erven
        3. WFI_PDFParser_TextPostionComparator.txt
          3 kB
          SCHAEFER B.S.
        4. FOP-2252.pdf
          205 kB
          Tilman Hausherr
        5. illustration-of-inconsistent-sorting.png
          3 kB
          Hannes Erven
        6. TopoContained.txt
          0.0 kB
          Maruan Sahyoun
        7. TopoContained.pdf
          19 kB
          Maruan Sahyoun
        8. TopoOverlap.txt
          0.0 kB
          Maruan Sahyoun
        9. TopoOverlap.pdf
          18 kB
          Maruan Sahyoun
        10. Topo.txt
          0.0 kB
          Maruan Sahyoun
        11. Topo.pdf
          17 kB
          Maruan Sahyoun
        12. quicksort.patch
          8 kB
          Uwe

              People

              • Assignee:
                lehmi Andreas Lehmkühler
                Reporter:
                bpapez Benjamin Papez
              • Votes:
                12
                Watchers:
                23
