Apache Hudi / HUDI-3995

Bulk insert row writer perf improvements

Details

    Description

      EDIT
      ====

      While investigating perf hits in Bulk Insert, a few issues were found:

      1. `NonPartitionedKeyGenerator` does not implement `getRecordKey` and `getPartitionPath` for `InternalRow`, so the default implementation is invoked, which converts the row to Avro.
      2. HUDI-3993: Using a UDF to fetch record keys similarly requires deserializing each `InternalRow` into a `Row`.
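
The cost pattern behind both items can be sketched without Spark or Hudi on the classpath. The snippet below is only an illustration: `InternalRowLike`, `RowLike`, `KeyGen`, `FallbackKeyGen`, `DirectKeyGen` and `countConversions` are hypothetical stand-ins, not Hudi or Spark APIs. It shows an interface whose default method converts every row before extracting the key, and an override that reads the key from the internal representation directly.

```java
import java.util.Arrays;
import java.util.List;

public class KeyGenSketch {
    // Hypothetical stand-in for an engine-internal row (not Spark's InternalRow).
    static final class InternalRowLike {
        final Object[] values;
        InternalRowLike(Object... values) { this.values = values; }
    }

    // Hypothetical stand-in for the deserialized, user-facing row.
    static final class RowLike {
        final Object[] values;
        RowLike(Object[] values) { this.values = values; }
    }

    // Counts how many per-row conversions were performed.
    static int conversions = 0;

    // Models the expensive per-record conversion step.
    static RowLike deserialize(InternalRowLike row) {
        conversions++;
        return new RowLike(Arrays.copyOf(row.values, row.values.length));
    }

    interface KeyGen {
        String getRecordKey(RowLike row);

        // Default fallback: convert first, then reuse the RowLike path.
        default String getRecordKey(InternalRowLike row) {
            return getRecordKey(deserialize(row));
        }
    }

    // Implements only the RowLike variant, so internal rows hit the fallback.
    static class FallbackKeyGen implements KeyGen {
        final int keyOrdinal;
        FallbackKeyGen(int keyOrdinal) { this.keyOrdinal = keyOrdinal; }

        @Override
        public String getRecordKey(RowLike row) {
            return String.valueOf(row.values[keyOrdinal]);
        }
    }

    // Overrides the internal-row variant and reads the key field directly.
    static class DirectKeyGen extends FallbackKeyGen {
        DirectKeyGen(int keyOrdinal) { super(keyOrdinal); }

        @Override
        public String getRecordKey(InternalRowLike row) {
            return String.valueOf(row.values[keyOrdinal]);
        }
    }

    // Extracts keys for all rows and returns the number of conversions incurred.
    static int countConversions(KeyGen gen, List<InternalRowLike> rows) {
        conversions = 0;
        for (InternalRowLike row : rows) {
            gen.getRecordKey(row);
        }
        return conversions;
    }

    public static void main(String[] args) {
        List<InternalRowLike> rows = Arrays.asList(
            new InternalRowLike("k1", 10),
            new InternalRowLike("k2", 20));
        System.out.println("fallback conversions: " + countConversions(new FallbackKeyGen(0), rows));
        System.out.println("direct conversions: " + countConversions(new DirectKeyGen(0), rows));
    }
}
```

In this toy model the fallback path pays one conversion per record while the direct path pays none, which is the kind of per-record overhead the two items above describe.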

       

            People

              shivnarayan sivabalan narayanan

                Time Tracking

                  Estimated: 12h
                  Remaining: 12h
                  Logged: Not Specified