  1. CarbonData
  2. CARBONDATA-3365

Support Apache Arrow vector filling from CarbonData SDK

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: None
    • Labels:
      None

      Description

      Background: 
      Apache Arrow is a cross-language development platform for in-memory 
      data. It specifies a standardised, language-independent columnar memory 
      format for flat and hierarchical data, organised for efficient analytic 
      operations on modern hardware. 
      If the CarbonData SDK can fill Arrow vectors, data read from carbondata 
      files can be used for analytics in any programming language. For example, 
      an Arrow vector filled by the CarbonData Java SDK can be read from Python, 
      C, C++ and the other languages supported by Arrow. 
      This also widens the range of CarbonData use cases, since Arrow is 
      already integrated with many query engines. 
      Implementation: 
      Stage1: 
      After the SDK reads the carbondata file, convert the carbon rows and fill 
      the Arrow vector. 
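
As a rough illustration of Stage 1, the sketch below pivots row-oriented records into per-column vectors, each with a value buffer and a validity bitmap (one bit per row, least-significant bit first, 1 meaning non-null, as in the Arrow columnar format). It is a plain-Python model only: `ColumnVector` and `fill_vectors` are hypothetical names, not part of the CarbonData SDK or the Arrow libraries, and the actual implementation may differ.

```python
from dataclasses import dataclass, field
from typing import Any, List, Sequence


@dataclass
class ColumnVector:
    """Hypothetical stand-in for an Arrow vector: values plus a validity bitmap."""
    name: str
    values: List[Any] = field(default_factory=list)
    validity: bytearray = field(default_factory=bytearray)  # 1 bit per row, LSB first
    length: int = 0

    def append(self, value: Any) -> None:
        # Grow the bitmap by one byte every 8 rows.
        if self.length % 8 == 0:
            self.validity.append(0)
        # Set the bit only for non-null values; null rows keep a 0 bit.
        if value is not None:
            self.validity[self.length // 8] |= 1 << (self.length % 8)
        self.values.append(value)
        self.length += 1

    def is_null(self, index: int) -> bool:
        return ((self.validity[index // 8] >> (index % 8)) & 1) == 0


def fill_vectors(schema: Sequence[str], rows: Sequence[Sequence[Any]]) -> List[ColumnVector]:
    """Convert rows (as a row-based reader might return them) into column vectors."""
    vectors = [ColumnVector(name) for name in schema]
    for row in rows:
        if len(row) != len(vectors):
            raise ValueError("row width does not match schema")
        for vector, value in zip(vectors, row):
            vector.append(value)
    return vectors


if __name__ == "__main__":
    # Assumed sample data standing in for rows read from a carbondata file.
    sample_rows = [(1, "a"), (2, None), (3, "c")]
    for vec in fill_vectors(["id", "name"], sample_rows):
        print(vec.name, vec.values, bin(vec.validity[0]))
```

The row-by-row copy is the cost that Stage 2 aims to avoid: wrapping the Arrow vector around the SDK's own vector would let the reader fill columnar buffers directly.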
      Stage2: 
      Deep integration with the carbon vector. The carbon SDK vector does not 
      currently support filling complex columns. 
      Once it does, the Arrow vector can be wrapped around the carbon SDK vector 
      for deep integration. 

            People

            • Assignee:
              Unassigned
              Reporter:
              Ajantha_Bhat Ajantha Bhat
            • Votes:
              0
              Watchers:
              1

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Not Specified
                Remaining:
                0h
                Logged:
                12h 10m