Tika / TIKA-699

Automatic checks against backwards-incompatible API changes

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0
    • Component/s: None
    • Labels: None

      Description

      As we get closer to 1.x we should add tooling like the Maven Clirr plugin [1] to guard against accidental backwards-incompatible API changes.

      [1] http://mojo.codehaus.org/clirr-maven-plugin/
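
      For orientation, a Clirr check of this kind is normally wired into the build through the plugin section of a Maven POM. The fragment below is only a sketch, not the configuration from the attached patch: the 0.9 baseline and the binding to the verify phase are assumptions, and a real build would also pin a plugin version.

```xml
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>clirr-maven-plugin</artifactId>
  <configuration>
    <!-- Assumed baseline: the last release whose API should stay compatible -->
    <comparisonVersion>0.9</comparisonVersion>
  </configuration>
  <executions>
    <execution>
      <!-- Assumed binding: fail the build during verify on incompatible changes -->
      <phase>verify</phase>
      <goals>
        <goal>check</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

      With a setup like this, the check goal compares the classes being built against the published baseline artifact and reports differences such as removed classes or changed signatures.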

        Activity

        Jukka Zitting created issue -
        Michael McCandless added a comment -

        +1, this sounds awesome.

        We could really use this in Lucene too!

        Jukka Zitting added a comment -

        The attached patch adds the required clirr-maven-plugin configuration.

        It currently reports the following problems:

        [ERROR] org.apache.tika.io.ByteArrayOutputStream: Class org.apache.tika.io.ByteArrayOutputStream removed
        [ERROR] org.apache.tika.io.IOUtils: Method 'public java.io.InputStream toBufferedInputStream(java.io.InputStream)' has been removed
        [ERROR] org.apache.tika.metadata.MSOffice: Changed type of field LAST_PRINTED from java.lang.String to org.apache.tika.metadata.Property
        [ERROR] org.apache.tika.metadata.MSOffice: Changed type of field LAST_SAVED from java.lang.String to org.apache.tika.metadata.Property
        [ERROR] org.apache.tika.sax.SecureContentHandler: Parameter 2 of 'public SecureContentHandler(org.xml.sax.ContentHandler, org.apache.tika.io.CountingInputStream)' has changed its type to org.apache.tika.io.TikaInputStream

        The first two are from revision 1125422 for TIKA-375 to get rid of unused code.

        The next two are from revision 1100053 for TIKA-656 to get properly typed metadata keys.

        The last one is from revision 1124385 for TIKA-645 to avoid extra layers of stream wrapping.

        All of these could be worked around fairly easily to restore full backwards compatibility with Tika 0.9. The question is whether we want to do so, especially since the jump from 0.x to 1.x offers a clean point for getting rid of old baggage in the API.

        If we don't want to fix these issues, then we should commit this change only after 1.0 is released, and use the 1.0 release as the reference point for future compatibility checks.
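
        To illustrate what "working around" two of the reported kinds of breakage could look like, here is a minimal Java sketch. It is not the Tika code: CompatSketch, StreamUtils, LegacyStream, ModernStream and Handler are hypothetical stand-ins. It shows a removed method restored as a deprecated delegate, and a changed constructor parameter handled with a deprecated overload that adapts the old type. A changed field type has no equivalent overload trick, so it is not covered here.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;

/** Sketch only: every class and method name below is hypothetical, not Tika's API. */
class CompatSketch {

    /** Utility whose old method would otherwise be reported as removed. */
    static final class StreamUtils {
        private StreamUtils() {}

        /** Current entry point: reuses the stream if it is already buffered. */
        static BufferedInputStream buffer(InputStream in) {
            return in instanceof BufferedInputStream
                    ? (BufferedInputStream) in
                    : new BufferedInputStream(in);
        }

        /** @deprecated old signature retained as a thin delegate; use buffer(). */
        @Deprecated
        static InputStream toBufferedInputStream(InputStream in) {
            return buffer(in);
        }
    }

    /** Parameter type accepted by the previous release (assumed). */
    static class LegacyStream extends BufferedInputStream {
        LegacyStream(InputStream in) { super(in); }
    }

    /** Parameter type preferred by the current release (assumed). */
    static class ModernStream extends BufferedInputStream {
        ModernStream(InputStream in) { super(in); }
    }

    static class Handler {
        private final ModernStream stream;

        Handler(ModernStream stream) { this.stream = stream; }

        /** @deprecated old constructor kept as an overload that adapts to the new type. */
        @Deprecated
        Handler(LegacyStream stream) { this(new ModernStream(stream)); }

        ModernStream getStream() { return stream; }
    }

    public static void main(String[] args) {
        InputStream raw = new ByteArrayInputStream(new byte[] {1, 2, 3});
        // Call sites written against the old API still compile and run.
        InputStream buffered = StreamUtils.toBufferedInputStream(raw);
        Handler handler = new Handler(new LegacyStream(buffered));
        System.out.println(buffered instanceof BufferedInputStream);
        System.out.println(handler.getStream() != null);
    }
}
```

        The trade-off is the one raised above: each retained overload is extra API surface to carry, and the adapting constructor in this sketch adds a wrapping layer, which may or may not be acceptable in a given case.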

        Jukka Zitting made changes -
        Field Original Value New Value
        Attachment 0001-TIKA-699-Automatic-checks-against-backwards-incompat.patch [ 12491774 ]
        Jukka Zitting added a comment -

        Added checks for tika-core in revision 1179318.

        Jukka Zitting made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Assignee Jukka Zitting [ jukkaz ]
        Fix Version/s 1.0 [ 12317967 ]
        Resolution Fixed [ 1 ]

          People

          • Assignee: Jukka Zitting
          • Reporter: Jukka Zitting
          • Votes: 0
          • Watchers: 0
