  1. PDFBox
  2. PDFBOX-1512

TextPositionComparator is not compatible with Java 7

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.1, 2.0.0
    • Fix Version/s: 1.8.8, 2.0.0
    • Component/s: Text extraction
    • Labels: None
    • Environment: Java 7

    Description

      The TextPositionComparator causes the following exception when running on Java 7: Unexpected RuntimeException from org.apache.tika.parser.ParserDecorator$1@9007fa2 Original cause: Comparison method violates its general contract!

      I think the problem is with this check:

      if ( yDifference < .1 ||
      (pos2YBottom >= pos1YTop && pos2YBottom <= pos1YBottom) ||
      (pos1YBottom >= pos2YTop && pos1YBottom <= pos2YBottom))

      as it violates the contract requirement:

      The implementor must also ensure that the relation is transitive: ((compare(x, y)>0) && (compare(y, z)>0)) implies compare(x, z)>0.

      Finally, the implementor must ensure that compare(x, y)==0 implies that sgn(compare(x, z))==sgn(compare(y, z)) for all z.
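
To make the second requirement concrete, here is a minimal sketch of how an overlap-based "same line" test can break it. `Span`, `BY_LINE` and the coordinates are hypothetical stand-ins chosen for illustration, not PDFBox classes; only the shape of the condition is taken from the check quoted above, and the x comparison that a real comparator would perform for same-line positions is replaced by a plain `return 0`.

```java
import java.util.Comparator;

final class OverlapComparatorDemo {
    /** Hypothetical stand-in for a text position: only the vertical extent is modelled. */
    static final class Span {
        final double top;    // smaller y value
        final double bottom; // larger y value

        Span(double top, double bottom) {
            this.top = top;
            this.bottom = bottom;
        }
    }

    /** Same shape as the quoted check: vertically overlapping spans count as one line. */
    static final Comparator<Span> BY_LINE = new Comparator<Span>() {
        @Override
        public int compare(Span p1, Span p2) {
            double yDifference = Math.abs(p1.bottom - p2.bottom);
            boolean sameLine = yDifference < .1
                    || (p2.bottom >= p1.top && p2.bottom <= p1.bottom)
                    || (p1.bottom >= p2.top && p1.bottom <= p2.bottom);
            if (sameLine) {
                return 0; // assumption: the x comparison is left out of this sketch
            }
            return Double.compare(p1.bottom, p2.bottom);
        }
    };

    public static void main(String[] args) {
        Span x = new Span(0, 10);
        Span y = new Span(8, 18);  // overlaps x
        Span z = new Span(16, 26); // overlaps y, but not x

        System.out.println(Integer.signum(BY_LINE.compare(x, y))); // 0
        System.out.println(Integer.signum(BY_LINE.compare(y, z))); // 0
        System.out.println(Integer.signum(BY_LINE.compare(x, z))); // -1
    }
}
```

Here compare(x, y) is 0, yet sgn(compare(x, z)) is -1 while sgn(compare(y, z)) is 0, so "overlaps" is not an equivalence relation and the comparator built on it cannot satisfy the contract.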

      Java 7 is stricter here: its new default sort implementation (TimSort) throws an IllegalArgumentException when it detects that the contract is violated.
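
Since the sort only reports a violation when it happens to run into one, a comparator can also be checked directly against a small sample. The following is a rough brute-force sketch under that assumption; `ContractCheck`, `firstViolation`, `FUZZY` and `EXACT` are hypothetical names introduced here, not part of PDFBox or the JDK.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ContractCheck {
    /** Tolerance comparator: values closer than 1.0 are treated as equal. */
    static final Comparator<Double> FUZZY = new Comparator<Double>() {
        @Override
        public int compare(Double a, Double b) {
            return Math.abs(a - b) < 1.0 ? 0 : Double.compare(a, b);
        }
    };

    /** Plain numeric ordering, used as a well-behaved reference. */
    static final Comparator<Double> EXACT = new Comparator<Double>() {
        @Override
        public int compare(Double a, Double b) {
            return Double.compare(a, b);
        }
    };

    /** Returns a description of the first violation found in the sample, or null if none. */
    static <T> String firstViolation(List<T> items, Comparator<? super T> cmp) {
        for (T x : items) {
            for (T y : items) {
                int xy = Integer.signum(cmp.compare(x, y));
                if (xy != -Integer.signum(cmp.compare(y, x))) {
                    return "antisymmetry: " + x + ", " + y;
                }
                for (T z : items) {
                    int yz = Integer.signum(cmp.compare(y, z));
                    int xz = Integer.signum(cmp.compare(x, z));
                    if (xy > 0 && yz > 0 && xz <= 0) {
                        return "transitivity: " + x + ", " + y + ", " + z;
                    }
                    if (xy == 0 && xz != yz) {
                        return "consistency of ties: " + x + ", " + y + ", " + z;
                    }
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Double> sample = Arrays.asList(0.0, 0.6, 1.2);
        System.out.println(firstViolation(sample, FUZZY)); // consistency of ties: 0.0, 0.6, 1.2
        System.out.println(firstViolation(sample, EXACT)); // null
    }
}
```

The check is cubic in the sample size, so it is only meant for small test inputs, and passing it on one sample does not prove the comparator is consistent in general.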

      Attachments

        1. WFI_PDFParser_TextPostionComparator.txt
          3 kB
          SCHAEFER B.S.
        2. TopoOverlap.txt
          0.0 kB
          Maruan Sahyoun
        3. TopoOverlap.pdf
          18 kB
          Maruan Sahyoun
        4. TopoContained.txt
          0.0 kB
          Maruan Sahyoun
        5. TopoContained.pdf
          19 kB
          Maruan Sahyoun
        6. Topo.txt
          0.0 kB
          Maruan Sahyoun
        7. Topo.pdf
          17 kB
          Maruan Sahyoun
        8. TextPositionComparator.java
          3 kB
          Benjamin Papez
        9. quicksort.patch
          8 kB
          Uwe
        10. immo-kurier_arsenal_93x62.pdf
          1.63 MB
          Hannes Erven
        11. illustration-of-inconsistent-sorting.png
          3 kB
          Hannes Erven
        12. FOP-2252.pdf
          205 kB
          Tilman Hausherr

            People

              Assignee: Andreas Lehmkühler (lehmi)
              Reporter: Benjamin Papez (bpapez)
              Votes: 12
              Watchers: 20
