  1. Apache Arrow
  2. ARROW-17301

[C++] Implement compute function "binary_slice"


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 8.0.1
    • Fix Version/s: 11.0.0
    • Component/s: C++

    Description

      In some situations, users need a way to extract one or more contiguous bytes from each element of a binary or string-like array, for example start 1, end 3, step 1, which yields two bytes. This is similar to the existing "binary_replace_slice" function. Slicing would then be available in two units of measurement: bytes and code units.
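
      The requested behavior can be sketched in plain Python. This is only an illustration: it assumes the usual 0-based, half-open start/stop/step convention, and the helper name `slice_bytes` is hypothetical, not an Arrow API.

```python
from typing import List, Optional


def slice_bytes(
    values: List[Optional[bytes]], start: int, stop: int, step: int = 1
) -> List[Optional[bytes]]:
    """Slice every element byte-wise; null (None) elements stay null."""
    return [None if v is None else v[start:stop:step] for v in values]


# A binary value, a UTF-8 encoded Chinese string, and a null.
values = [b"abcdef", "中文".encode("utf-8"), None]

# start 1, stop 3, step 1 selects two bytes from each non-null element,
# regardless of how many characters those bytes belong to.
sliced = slice_bytes(values, start=1, stop=3, step=1)
```

      For the second element the result is two raw bytes from the middle of the first character, which is exactly what a byte-oriented slice is expected to return.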

       

       

      Use case:

      Here is one example that describes why a function that extracts binary data in byte units is needed:

      In a distributed database, data has a distribution policy and an associated hash algorithm for each data type. Considering only string-like and binary types here, the hash algorithm needs to split a value into byte ranges for its calculation: for example, take bytes 1-4, cast them to an integer and shift it left by 16 bits, then take bytes 5-6, cast them to an integer and combine it with the result of the previous step, and so on. The 'utf8_slice_codeunits' function partly meets this requirement when all characters are ASCII. But if the string contains Chinese, one Chinese character may occupy three bytes, so a slice from start 1 to end 3 covers three UTF-8 characters that may take nine bytes. That does not fit the hash algorithm, which needs only 3 bytes. So a function is needed that slices by byte without requiring a cast and takes the same arguments as 'utf8_slice_codeunits'; it could be called 'binary_slice_byteunit'.

            People

              Assignee: Kshiteej K (kshitij12345)
              Reporter: ChenTsing (chenbaggio)

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h