  1. Apache Arrow
  2. ARROW-5862

[Java] Provide dictionary builder


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.15.0
    • Component/s: Java

    Description

      The dictionary builder serves a scenario that is frequently encountered in practice when dictionary encoding is involved: the dictionary values are not known a priori, so they must be determined dynamically as new data arrive.

      In particular, when a new value arrives, it is checked against the dictionary. If it is already present, it is simply ignored; otherwise, it is added to the dictionary.

      When all values have been evaluated, the dictionary can be considered complete, and encoding can start.

      Code using a dictionary builder should look like this:

      DictionaryBuilder<IntVector> dictionaryBuilder = ...
      dictionaryBuilder.startBuild();
      ...
      dictionaryBuilder.addValue(newValue);
      ...
      dictionaryBuilder.endBuild();
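
      To make the lifecycle above concrete, here is a minimal plain-Java sketch of the same idea. It is not the Arrow API: the class SimpleDictionaryBuilder, its encode method, and the choice to have addValue return the dictionary index are all assumptions made for illustration, and the sketch uses a HashMap over boxed values rather than Arrow vectors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the proposed builder; not the Arrow API. */
class SimpleDictionaryBuilder<T> {
    // Maps each distinct value to its dictionary index.
    private final Map<T, Integer> indexOf = new HashMap<>();
    // Distinct values in first-seen order; position == dictionary index.
    private final List<T> values = new ArrayList<>();
    private boolean building = false;

    /** Resets any previous state and opens the dictionary for additions. */
    void startBuild() {
        indexOf.clear();
        values.clear();
        building = true;
    }

    /** Adds the value if it is new; returns its index either way (assumed return type). */
    int addValue(T value) {
        if (!building) {
            throw new IllegalStateException("call startBuild() first");
        }
        Integer existing = indexOf.get(value);
        if (existing != null) {
            return existing; // already in the dictionary: nothing to add
        }
        int index = values.size();
        indexOf.put(value, index);
        values.add(value);
        return index;
    }

    /** Closes the dictionary and returns a read-only snapshot of its values. */
    List<T> endBuild() {
        building = false;
        return Collections.unmodifiableList(new ArrayList<>(values));
    }

    /** Encodes data against the completed dictionary (hypothetical helper). */
    int[] encode(List<T> data) {
        if (building) {
            throw new IllegalStateException("call endBuild() before encoding");
        }
        int[] out = new int[data.size()];
        for (int i = 0; i < out.length; i++) {
            Integer index = indexOf.get(data.get(i));
            if (index == null) {
                throw new IllegalArgumentException("value not in dictionary: " + data.get(i));
            }
            out[i] = index;
        }
        return out;
    }
}

class DictionaryBuilderDemo {
    public static void main(String[] args) {
        List<String> incoming = Arrays.asList("a", "b", "a", "c");

        SimpleDictionaryBuilder<String> builder = new SimpleDictionaryBuilder<>();
        builder.startBuild();
        for (String value : incoming) {
            builder.addValue(value); // the duplicate "a" is ignored
        }
        List<String> dictionary = builder.endBuild();

        System.out.println(dictionary);                                // [a, b, c]
        System.out.println(Arrays.toString(builder.encode(incoming))); // [0, 1, 0, 2]
    }
}
```

      Keeping both a map and a list is one possible design: the map gives a constant-time membership test on each arrival, while the list preserves first-seen order so that indices stay stable once assigned.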

            People

              Assignee: Liya Fan (fan_li_ya)
              Reporter: Liya Fan (fan_li_ya)
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 6h 40m