Commons Compress
  COMPRESS-253

BZip2CompressorInputStream reads fewer bytes from truncated file than CPython's bz2 implementation

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.1, 1.6, 1.7
    • Fix Version/s: 1.8
    • Component/s: Compressors
    • Labels:

      Description

      Jython includes support for decompressing bz2 files using Commons Compress and shares regression tests with CPython. The CPython test test_read_truncated in test_bz2.py passes under CPython but fails under Jython.

      The BZip2CompressorInputStream is able to read 769 bytes from the truncated data rather than the 770 bytes that the CPython bz2 implementation can read.
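
      One way to measure a discrepancy like this is to count how many bytes a stream delivers before it reaches end of stream or fails. The following is a minimal sketch using only the JDK; the class and method names (TruncatedReadCounter, countReadableBytes) are hypothetical and are not part of Commons Compress or of the attached test.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class TruncatedReadCounter {

    /**
     * Reads one byte at a time until end of stream or an IOException
     * and returns how many bytes were successfully delivered.
     */
    static int countReadableBytes(InputStream in) {
        int count = 0;
        try {
            while (in.read() != -1) {
                count++;
            }
        } catch (IOException e) {
            // Truncated or corrupt input: keep the count reached so far.
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in input; a real check would wrap truncated bz2 data instead.
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
        System.out.println(countReadableBytes(in));
    }
}
```

      With Commons Compress on the classpath, the same helper could presumably be handed a new BZip2CompressorInputStream wrapped around the truncated data; according to this report that yields 769 on the affected versions, where CPython reads 770.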

      1. compress-253.patch
        12 kB
        Stefan Bodewig
      2. PythonTruncatedBzip2Test.java
        4 kB
        Indra Talip

        Activity

        Indra Talip created issue -
        Indra Talip added a comment - - edited

        Attached file contains test cases translated to commons-compress/junit from the Jython/CPython regression tests for reading truncated files.

        Indra Talip made changes -
        Field Original Value New Value
        Attachment PythonTruncatedBzip2Test.java [ 12621769 ]
        Stefan Bodewig added a comment -

        The reason for this is that BZip2CompressorInputStream always calculates the next byte when reading one.

        The attached patch (against current trunk) changes this and removes a piece of state at the same time. All existing tests inside Commons Compress pass with the change, but we are pretty close to release 1.7 and I don't feel comfortable adding the change now without more extensive testing.
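
        The effect described in this comment can be illustrated with a toy model. The sketch below is not the bzip2 code and not the attached patch; every name in it (LookaheadDemo, EagerReader, LazyReader, decodeNext, countUntilFailure) is hypothetical. It assumes a trivial "decoder" in which each output byte consumes one input byte, and compares a reader that computes the following byte before returning the current one with a reader that decodes only on demand.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class LookaheadDemo {

    /** Toy decoder step: one output byte per input byte; running out of input is an error. */
    static int decodeNext(InputStream src) throws IOException {
        int b = src.read();
        if (b < 0) {
            throw new EOFException("unexpected end of stream");
        }
        return b;
    }

    /** Eager: computes the following byte before handing back the current one. */
    static final class EagerReader extends InputStream {
        private final InputStream src;
        private int pending;

        EagerReader(InputStream src) throws IOException {
            this.src = src;
            this.pending = decodeNext(src);
        }

        @Override
        public int read() throws IOException {
            int current = pending;
            // If this throws, 'current' is never returned to the caller.
            pending = decodeNext(src);
            return current;
        }
    }

    /** Lazy: decodes a byte only when the caller asks for it. */
    static final class LazyReader extends InputStream {
        private final InputStream src;

        LazyReader(InputStream src) {
            this.src = src;
        }

        @Override
        public int read() throws IOException {
            return decodeNext(src);
        }
    }

    /** Counts the bytes delivered before end of stream or an IOException. */
    static int countUntilFailure(InputStream in) {
        int n = 0;
        try {
            while (in.read() != -1) {
                n++;
            }
        } catch (IOException e) {
            // Stop counting at the first failure.
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        byte[] truncated = {10, 20, 30};
        // Eager reader loses the last decodable byte: prints 2.
        System.out.println(countUntilFailure(new EagerReader(new ByteArrayInputStream(truncated))));
        // Lazy reader delivers all three: prints 3.
        System.out.println(countUntilFailure(new LazyReader(new ByteArrayInputStream(truncated))));
    }
}
```

        In this toy model the eager reader delivers one byte fewer than the lazy one on the same truncated input, which mirrors the 769 versus 770 difference reported above.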

        Stefan Bodewig made changes -
        Attachment compress-253.patch [ 12621773 ]
        Stefan Bodewig made changes -
        Labels bzip patch
        Sebb made changes -
        Summary "BZip2CompressorInputStream reads less bytes from truncated file than CPython's bz2 implementation" changed to "BZip2CompressorInputStream reads fewer bytes from truncated file than CPython's bz2 implementation"
        Indra Talip added a comment -

        With the patch applied to commons-compress all the Jython regression tests from commit b1304c8 on my fork of Jython for the bz2 module pass successfully. Thanks

        Stefan Bodewig added a comment -

        fixed with svn revision 1559687

        Stefan Bodewig made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s 1.8 [ 12325950 ]
        Resolution Fixed [ 1 ]
        Transition: Open to Resolved
        Time In Source Status: 13d 3h 17m
        Execution Times: 1
        Last Executer: Stefan Bodewig
        Last Execution Date: 20/Jan/14 13:26

          People

          • Assignee:
            Unassigned
            Reporter:
            Indra Talip
          • Votes:
            0
            Watchers:
            2