Apache Arrow / ARROW-62

Format: Are the nulls bits 0 or 1 for null values?

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.1.0
    • Component/s: Format
    • Labels:
      None

      Description

      As Dan Robinson pointed out on the mailing list (thank you for catching this!), the format documents are inconsistent with the imported ValueVectors code in how they represent nulls. Since I drafted these format documents initially, I'll take the blame for the inconsistency, but:

      • Drill / ValueVectors uses the value 0 for null data, and 1 for non-null data
      • The format document currently states the opposite (values are null if the bit is set)

      I can see arguments both ways, but one argument for the ValueVectors style is that values must be explicitly set to be non-null, so uninitialized values cannot be accidentally interpreted as non-null. When initializing a bitmap, one can memset the bits to 0, then set them to 1 as non-null values are appended during construction.
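
      The construction pattern described above can be sketched roughly as follows. This is only an illustration of the ValueVectors-style convention (0 = null, 1 = non-null), not code from Arrow or Drill: the class name `ValidityBitmapBuilder` and its methods are hypothetical, and the least-significant-bit-first ordering within each byte is an assumption that this issue does not specify.

```python
class ValidityBitmapBuilder:
    """Hypothetical builder: bit 1 means non-null, bit 0 means null."""

    def __init__(self, capacity: int) -> None:
        # Zero-initialize the buffer, the Python analogue of
        # memset(bits, 0, nbytes): every slot starts out null.
        self.bits = bytearray((capacity + 7) // 8)
        self.capacity = capacity
        self.length = 0

    def append(self, is_valid: bool) -> None:
        if self.length >= self.capacity:
            raise IndexError("bitmap is full")
        if is_valid:
            # Assumed bit order: least-significant bit first within a byte.
            self.bits[self.length // 8] |= 1 << (self.length % 8)
        # A null value needs no write; the bit is already 0.
        self.length += 1

    def is_null(self, i: int) -> bool:
        return ((self.bits[i // 8] >> (i % 8)) & 1) == 0


builder = ValidityBitmapBuilder(capacity=8)
for valid in (True, False, True):
    builder.append(valid)
```

      With this convention, a slot that was never written (index 3 and up in the usage above) reads as null rather than as a valid value, which is the safety property argued for in the description.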

            People

            • Assignee: Wes McKinney (wesm)
            • Reporter: Wes McKinney (wesm)