  1. PDFBox
  2. PDFBOX-1512

TextPositionComparator is not compatible with Java 7

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.1, 2.0.0
    • Fix Version/s: 1.8.8, 2.0.0
    • Component/s: Text extraction
    • Labels: None
    • Environment: Java 7

    Description

      TextPositionComparator causes the following exception when running on Java 7:

      Unexpected RuntimeException from org.apache.tika.parser.ParserDecorator$1@9007fa2 Original cause: Comparison method violates its general contract!

      I think the problem is with this check:

      if ( yDifference < .1 ||
      (pos2YBottom >= pos1YTop && pos2YBottom <= pos1YBottom) ||
      (pos1YBottom >= pos2YTop && pos1YBottom <= pos2YBottom))

      as it violates the contract requirement:

      The implementor must also ensure that the relation is transitive: ((compare(x, y)>0) && (compare(y, z)>0)) implies compare(x, z)>0.

      Finally, the implementor must ensure that compare(x, y)==0 implies that sgn(compare(x, z))==sgn(compare(y, z)) for all z.

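      Java 7 is stricter here: its default sort implementation for objects (TimSort) can throw an IllegalArgumentException with the message above when it detects that the comparator contract is violated.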
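
      To illustrate how an overlap-based "same line" check like the one quoted above can break transitivity, here is a minimal sketch. It is not the PDFBox source: the Box class, its fields, and the fallback ordering by bottom edge are hypothetical simplifications, and only the shape of the condition is taken from the report. Three boxes are chosen so that a overlaps b and b overlaps c vertically, but a and c do not overlap.

```java
public class OverlapComparatorDemo {

    /** Hypothetical stand-in for a text position: x offset plus vertical extent (y grows downward). */
    public static final class Box {
        final float x;
        final float top;
        final float bottom;

        public Box(float x, float top, float bottom) {
            this.x = x;
            this.top = top;
            this.bottom = bottom;
        }
    }

    /**
     * Simplified comparison modelled on the check quoted in the report:
     * boxes that overlap vertically are treated as being on the same line
     * and ordered by x; otherwise they are ordered by their bottom edge.
     */
    public static int compare(Box p1, Box p2) {
        float yDifference = Math.abs(p1.bottom - p2.bottom);
        boolean sameLine = yDifference < .1
                || (p2.bottom >= p1.top && p2.bottom <= p1.bottom)
                || (p1.bottom >= p2.top && p1.bottom <= p2.bottom);
        if (sameLine) {
            return Float.compare(p1.x, p2.x);
        }
        return Float.compare(p1.bottom, p2.bottom);
    }

    public static void main(String[] args) {
        Box a = new Box(3f, 0f, 10f);   // overlaps b vertically
        Box b = new Box(2f, 8f, 18f);   // overlaps both a and c
        Box c = new Box(1f, 16f, 26f);  // overlaps b, but not a

        System.out.println(Integer.signum(compare(a, b))); // prints 1  (same line, a.x > b.x)
        System.out.println(Integer.signum(compare(b, c))); // prints 1  (same line, b.x > c.x)
        System.out.println(Integer.signum(compare(a, c))); // prints -1 (different lines, a is higher)
    }
}
```

      Here compare(a, b) > 0 and compare(b, c) > 0, yet compare(a, c) < 0, which contradicts the transitivity requirement. Whether a given sort call actually throws depends on the input size and order, so a violation like this can go unnoticed on small lists.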

      Attachments

        1. FOP-2252.pdf
          205 kB
          Tilman Hausherr
        2. illustration-of-inconsistent-sorting.png
          3 kB
          Hannes Erven
        3. immo-kurier_arsenal_93x62.pdf
          1.63 MB
          Hannes Erven
        4. quicksort.patch
          8 kB
          Uwe
        5. TextPositionComparator.java
          3 kB
          Benjamin Papez
        6. Topo.pdf
          17 kB
          Maruan Sahyoun
        7. Topo.txt
          0.0 kB
          Maruan Sahyoun
        8. TopoContained.pdf
          19 kB
          Maruan Sahyoun
        9. TopoContained.txt
          0.0 kB
          Maruan Sahyoun
        10. TopoOverlap.pdf
          18 kB
          Maruan Sahyoun
        11. TopoOverlap.txt
          0.0 kB
          Maruan Sahyoun
        12. WFI_PDFParser_TextPostionComparator.txt
          3 kB
          SCHAEFER B.S.

            People

              Assignee: lehmi Andreas Lehmkühler
              Reporter: bpapez Benjamin Papez
              Votes: 12
              Watchers: 20