Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.16
    • Component/s: None
    • Labels:
      None

      Description

      Hi
      Your project is licensed under the Apache License, Version 2.0,
      but your code pulls in code from json.org [1], which is under Douglas Crockford’s bad license and is non-free [2].
      The usage restriction in that license makes it incompatible with the Open Source Definition and
      the Free Software Definition. Because the Tika binary distribution includes this software,
      it effectively becomes proprietary software itself.
      You may reply that the json.org license is acceptable to you, but for many Linux distributions it is not.

      I hope to continue to maintain Tika for Fedora, without having to run into these problems.

      Please try to replace it with one of the many free alternatives.

      Regards

      [1]
      ./tika-1.11/tika-parsers/src/main/java/org/apache/tika/parser/journal/GrobidRESTParser.java
      ./tika-1.11/tika-parsers/src/main/java/org/apache/tika/parser/journal/JournalParser.java
      ./tika-1.11/tika-parsers/src/main/java/org/apache/tika/parser/journal/TEIParser.java

      [2]
      https://wiki.debian.org/qa.debian.org/jsonevil
      http://www.sonatype.com/people/2012/03/use-json-well-youd-better-not-be-evil/
      http://tanguy.ortolo.eu/blog/article46/json-license

      1. deps_new.txt
        171 kB
        Tim Allison

        Issue Links

        There are no Sub-Tasks for this issue.

          Activity

          gagravarr Nick Burch added a comment -

          The JSON license has been approved for use by Apache Projects by the ASF Legal Affairs committee, without affecting the license conditions of the overall software, see http://www.apache.org/legal/resolved.html#json

          If you feel that's incorrect, you'd need to take that up with the Legal Affairs committee on legal-discuss@ - they're the ones qualified / charged with deciding on this sort of stuff, not us!

          puntogil gil cattaneo added a comment -

          Hi
          That does not surprise me much. By that reasoning, though, the ASL license is no longer free.
          I wish you would conclude that it has slipped into the lower category "B".
          Regards

          chrismattmann Chris A. Mattmann added a comment -

          Hi gil cattaneo Nick Burch - Nick answered and beat me to it. The ASF considers this license OK to include in ASF products: http://www.apache.org/legal/resolved.html#json

          Thus it shouldn't be a problem at all.

          chrismattmann Chris A. Mattmann added a comment -

          Per http://www.apache.org/legal/resolved.html#json

          gagravarr Nick Burch added a comment -

          The ASF legal team have recently changed their mind on the license (see https://lists.apache.org/thread.html/9627a9278d263378a2045d4bffccb6e83b9f01bb783c6dd6fa325faf@%3Clegal-discuss.apache.org%3E), so we'll now need to change this

          gagravarr Nick Burch added a comment -

          Ted Dunning has produced a hopefully drop-in replacement (based on a suitable library I believe, with method signatures tweaked). We could give this a try, and check everything passes - might be quicker and easier than swapping the whole json implementation...

          <dependency>
            <groupId>com.tdunning</groupId>
            <artifactId>json</artifactId>
            <version>1.0</version>
          </dependency>
          

          Could someone try that and see if all unit tests still pass?

          thammegowda Thamme Gowda added a comment -

          I attempted to fix this, but I hit a blocker:

          org.apache.tika.parser.journal.TEIParser uses the org.json.XML.toJSONObject() method. This utility method doesn't exist in the replacement library.

          Almost all the references I found on the internet for "converting XML to JSON string" point to the same code example I am trying to get rid of!

          tallison@mitre.org Tim Allison added a comment -

          Y, I ran into the same problem...it shouldn't be too hard to write a SAX parser for the original XML and get rid of json entirely. I'm frankly lazy and haven't bothered getting the journal parser running locally. If someone could share example xml output, it shouldn't take more than a few hours, I'd think.
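
A minimal sketch of the SAX approach suggested above, using only the JDK's built-in parser and no JSON library. It is not the code that was eventually committed for this issue; the class name `TeiTitleSketch`, the choice to collect `<title>` elements, and the sample XML are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of Tika.
public class TeiTitleSketch {

    /** Collects the trimmed text of every <title> element, in document order. */
    static class TitleHandler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        private StringBuilder current; // non-null only while inside a <title>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("title".equals(localName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // SAX may deliver one text node in several chunks, so append rather than assign.
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(localName) && current != null) {
                titles.add(current.toString().trim());
                current = null;
            }
        }
    }

    static List<String> extractTitles(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // required so that localName is populated
        SAXParser parser = factory.newSAXParser();
        TitleHandler handler = new TitleHandler();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input, loosely shaped like a TEI header.
        String xml = "<TEI xmlns=\"http://www.tei-c.org/ns/1.0\"><teiHeader><fileDesc><titleStmt>"
                + "<title>A sample paper</title></titleStmt></fileDesc></teiHeader></TEI>";
        System.out.println(extractTitles(xml));
    }
}
```

The handler writes values straight into a plain Java list, so the intermediate XML-to-JSON conversion (and with it the call to org.json.XML.toJSONObject()) is not needed.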

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Tika-trunk #1280 (See https://builds.apache.org/job/Tika-trunk/1280/)
          TIKA-1804 – convert json parsing to SAX in TEIParser, step 1: test (tallison: https://github.com/apache/tika/commit/b290cd79d652741a0c8249f3860583b4169f6455)

          • (add) tika-parsers/src/test/resources/test-documents/testTEI.xml
          • (add) tika-parsers/src/test/java/org/apache/tika/parser/journal/TEITest.java
          chris.a.mattmann@jpl.nasa.gov Mattmann, Chris A (388J) added a comment -

          Hi Everyone,

          I will be out of the office 5/29 – 6/6 on Vacation.

          During this time, email and cell can be used for emergencies.

          Thank you.

          Cheers,

          Chris Mattmann

          tallison@mitre.org Tim Allison added a comment -

          Thanks to Nick Burch, I dropped Ted Dunning's replacement in, refactored the TEIParser dramatically, and everything works.

          I noticed that deeplearning4j still uses org.json, so I excluded it, on the theory that tika-dl requires tika-parsers, which would bring in Ted Dunning's replacement. As long as nothing in dl4j calls toJSONObject(), we should be ok?

          Anyone know if that'll cause a problem?

          Also, even though I installed grobid and got it running, I get timeout errors on the test file, so I can't test grobid... I did dump TEI.xml from one PDF, and the old and the new output are equivalent.

          So, once I make the commit, it would be great if someone with a grobid that works on our test document could try it out.
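
One way to sanity-check the exclusion described above is to ask the class loader whether `org.json.XML` is still reachable at runtime. This is only a sketch: it assumes the replacement jar ships no `org.json.XML` class at all (the thread only says that XML.toJSONObject() is missing), and the class name `OrgJsonCheckSketch` and its helper methods are hypothetical.

```java
import java.security.CodeSource;

// Hypothetical helper; not part of Tika or dl4j.
public class OrgJsonCheckSketch {

    /** Returns true if the named class can be located, without initializing it. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, OrgJsonCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the jar or directory a class was loaded from, or a placeholder string. */
    static String codeSourceOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, OrgJsonCheckSketch.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself typically have no code source.
            return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not found)";
        }
    }

    public static void main(String[] args) {
        // If this prints true, some jar on the classpath still provides org.json.XML.
        System.out.println("org.json.XML present: " + isOnClasspath("org.json.XML"));
        // Shows which jar supplies org.json.JSONObject, if any.
        System.out.println("org.json.JSONObject from: " + codeSourceOf("org.json.JSONObject"));
    }
}
```

Running this against the assembled classpath complements `mvn dependency:tree`: the tree shows what Maven resolves, while this shows what the JVM actually loads.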

          tallison@mitre.org Tim Allison added a comment -

          Looks good to me...

          If someone could run dl4j and/or grobid with the actual unit tests, that'd be great! I'll leave this open for a few days.

          hudson Hudson added a comment -

          FAILURE: Integrated in Jenkins build Tika-trunk #1294 (See https://builds.apache.org/job/Tika-trunk/1294/)
          TIKA-1804 – remove dependency on org.json (tallison: https://github.com/apache/tika/commit/1760249f26536047e54e3932f47910217a6c81f5)

          • (edit) tika-dl/pom.xml
          • (edit) tika-parsers/src/test/java/org/apache/tika/parser/journal/TEITest.java
          • (edit) CHANGES.txt
          • (edit) tika-parsers/src/main/java/org/apache/tika/parser/journal/GrobidRESTParser.java
          • (edit) tika-parsers/pom.xml
          • (edit) tika-parsers/src/main/java/org/apache/tika/parser/ner/corenlp/CoreNLPNERecogniser.java
          • (add) tika-parsers/src/main/java/org/apache/tika/parser/journal/TEIDOMParser.java
          • (delete) tika-parsers/src/main/java/org/apache/tika/parser/journal/TEIParser.java
          tallison@mitre.org Tim Allison added a comment -

          I opened https://github.com/deeplearning4j/deeplearning4j/issues/3561 to request that they swap out json.org.

          tallison@mitre.org Tim Allison added a comment -

          Adam Gibson swapped in Ted Dunning's code with no problem. Thank you!

          I think we'll be ok with excluding org.json from dl4j's dependency and relying on Ted's code.

          agibsonccc Adam Gibson added a comment -

          Hey folks - I double checked the other dl4j components as well. There shouldn't be any issues from here.

          tallison@mitre.org Tim Allison added a comment -

          Thank you Adam Gibson! We excluded it from deeplearning4j-keras. mvn dependency:tree doesn't show it appearing in anything else, but to confirm, it isn't brought in by deeplearning4j-modelimport, datavec-data-image or nd4j-native-platform, right?

          agibsonccc Adam Gibson added a comment -

          deeplearning4j-core was the only thing bringing that in (that was a transitive dep to model import)

          tallison@mitre.org Tim Allison added a comment -

          Great. Thank you!

          chrismattmann Chris A. Mattmann added a comment -

          great work Tim & Adam & team!


            People

            • Assignee:
              Unassigned
              Reporter:
              puntogil gil cattaneo
            • Votes:
              0
              Watchers:
              8
