ORC-24

C++ reader for direct string encodings occasionally skips bytes


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: C++
    • Labels: None

    Description

      The ORC C++ direct string column reader can occasionally skip bytes in the blob stream.

      The necessary conditions are:

      • The column is a string column and is directly encoded.
      • The blob stream for the row batch crosses a compression block boundary.
      • There is a null value near the end of the compression block.
      • The value held in the length slot of that null value would cross the block boundary, but the length of the following value does not.
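
      The conditions above can be illustrated with a small, self-contained sketch. This is a hypothetical model and not the ORC reader code or the attached patch: the function name bytesConsumedFromBlock, its parameters, and the ignoreNullLengths flag are all assumptions made for illustration. It assumes the reader walks the length slots and stops taking bytes from the current block as soon as a slot appears to cross the boundary. If the slot of a null value is consulted, a stale length there can end the walk early and leave the bytes of the following value unread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model (not the actual ORC implementation).
// Returns how many bytes of the current compression block are consumed
// before the reader decides the next slot crosses the block boundary.
// lengths[i] is only meaningful when notNull[i] is true; for a null the
// slot may hold an arbitrary stale value.
std::size_t bytesConsumedFromBlock(const std::vector<int64_t>& lengths,
                                   const std::vector<bool>& notNull,
                                   std::size_t bytesLeftInBlock,
                                   bool ignoreNullLengths) {
  assert(lengths.size() == notNull.size());
  std::size_t used = 0;
  for (std::size_t i = 0; i < lengths.size(); ++i) {
    const bool isNull = !notNull[i];
    if (isNull && ignoreNullLengths) {
      continue;  // a null occupies no bytes in the blob stream
    }
    const std::size_t len = static_cast<std::size_t>(lengths[i]);
    if (used + len > bytesLeftInBlock) {
      break;  // this slot appears to cross the block boundary
    }
    if (!isNull) {
      used += len;  // only non-null values consume blob bytes
    }
  }
  return used;
}
```

      With lengths {3, 100, 2}, a null in the middle slot, and 5 bytes left in the block, consulting the null's slot stops the walk after 3 bytes, so the 2 bytes of the last value are never taken from this block. Skipping the null's slot consumes all 5 bytes.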

      Attachments

        1. ORC-24.patch
          0.6 kB
          Aliaksei Sandryhaila


          People

            Assignee: Owen O'Malley (omalley)
            Reporter: Owen O'Malley (omalley)
            Votes: 0
            Watchers: 3
