  1. Apache Arrow
  2. ARROW-5862

[Java] Provide dictionary builder

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.15.0
    • Component/s: Java

      Description

      The dictionary builder serves the following scenario, which is frequently encountered in practice when dictionary encoding is involved: the dictionary values are not known a priori, so they must be determined dynamically as new data arrive.

      In particular, when a new value arrives, it is checked against the dictionary. If it is already present, it is simply ignored; otherwise, it is added to the dictionary.

      Once all values have been evaluated, the dictionary can be considered complete, and encoding can start.

      Code using a dictionary builder should look like this:

      DictionaryBuilder<IntVector> dictionaryBuilder = ...
      dictionaryBuilder.startBuild();
      ...
      dictionaryBuilder.addValue(newValue);
      ...
      dictionaryBuilder.endBuild();
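
      The build-then-encode flow described above can be illustrated with a minimal, self-contained sketch. It is not the Arrow implementation: the class name SimpleDictionaryBuilder, its use of plain Java collections instead of Arrow vectors, the boolean return value of addValue, and the encode method are all assumptions made for illustration. Only the method names startBuild, addValue, and endBuild come from the snippet above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical stand-in for the proposed dictionary builder.
 * It uses plain collections rather than Arrow vectors.
 */
final class SimpleDictionaryBuilder<T> {
    // Maps each distinct value to its index in the dictionary.
    private final Map<T, Integer> indexByValue = new HashMap<>();
    // Distinct values in first-seen order; position == dictionary index.
    private final List<T> values = new ArrayList<>();
    private boolean building;

    /** Starts a fresh build, discarding any previous dictionary. */
    void startBuild() {
        indexByValue.clear();
        values.clear();
        building = true;
    }

    /**
     * Adds the value if it is not yet in the dictionary.
     * Returns true if it was added, false if it was already present.
     */
    boolean addValue(T value) {
        if (!building) {
            throw new IllegalStateException("startBuild() must be called first");
        }
        if (indexByValue.containsKey(value)) {
            return false; // already known: ignore
        }
        indexByValue.put(value, values.size());
        values.add(value);
        return true;
    }

    /** Ends the build and returns a read-only snapshot of the dictionary. */
    List<T> endBuild() {
        building = false;
        return Collections.unmodifiableList(new ArrayList<>(values));
    }

    /** Encodes a value as its dictionary index; valid only after endBuild(). */
    int encode(T value) {
        if (building) {
            throw new IllegalStateException("dictionary is not complete yet");
        }
        Integer index = indexByValue.get(value);
        if (index == null) {
            throw new IllegalArgumentException("value not in dictionary: " + value);
        }
        return index;
    }
}
```

      In this sketch, indices are assigned in first-seen order, so feeding the values 7, 3, 7 yields the dictionary [7, 3], and 3 encodes to index 1. Rejecting encode calls before endBuild mirrors the statement that encoding starts only once the dictionary is complete.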

              People

              • Assignee: Liya Fan (fan_li_ya)
              • Reporter: Liya Fan (fan_li_ya)
              • Votes: 0
              • Watchers: 3

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 6h 40m