Apache Hudi / HUDI-2777

Data import performance deteriorates because multiple Spark jobs are started when data is written to disks.

Details

    Description

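      When an RDD has many partitions and RDD.isEmpty ultimately returns true, Spark cannot answer from the first partition alone. It launches a series of jobs, scanning roughly five times as many partitions in total after each job, until every partition has been checked. These extra jobs slow down the write.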
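
      The escalation described above can be illustrated with a small simulation. This is a simplified model, not Spark or Hudi code: it assumes that RDD.isEmpty behaves like take(1), that the first job scans one partition, and that each later job scans the number of partitions already scanned multiplied by a scale-up factor of 4 (Spark's documented default for `spark.rdd.limit.scaleUpFactor`). The function name `jobs_for_empty_check` is hypothetical.

```python
from typing import List


def jobs_for_empty_check(num_partitions: int, scale_up_factor: int = 4) -> List[int]:
    """Model the jobs launched while checking an RDD that turns out to be empty.

    Returns the number of partitions scanned by each successive job.
    Assumption: the first job scans 1 partition; every later job scans
    (partitions scanned so far) * scale_up_factor, capped at what remains.
    """
    jobs = []
    scanned = 0
    while scanned < num_partitions:
        # First round tries a single partition; later rounds escalate.
        to_try = 1 if scanned == 0 else scanned * scale_up_factor
        # Never scan past the last partition.
        to_try = min(to_try, num_partitions - scanned)
        jobs.append(to_try)
        scanned += to_try
    return jobs


# 200 empty partitions: cumulative scanned counts are 1, 5, 25, 125, 200.
print(jobs_for_empty_check(200))  # [1, 4, 20, 100, 75]
```

      Under these assumptions, an empty RDD with 200 partitions needs five separate jobs before isEmpty can return true, and the cumulative partition count grows fivefold per job, which matches the increment pattern reported in this issue.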

            People

              liuhe0702