  1. Apache Hudi
  2. HUDI-3993

Avoid calling into Spark UDF in Bulk Insert


Details

    Description

      Currently, calling into a Spark UDF within Hudi's Bulk Insert causes a 20% performance gap compared with a raw Parquet bulk insert into a non-partitioned table.

       


            People

              alexey.kudinkin Alexey Kudinkin
