Apache Hudi / HUDI-7957

Data skew when writing with bulk_insert + bucket_index enabled

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: spark-sql

    Description

      As reported in https://github.com/apache/hudi/issues/11565, when bulk insert runs in row-writer mode ("bulk insert as row") on a table that uses the bucket index, the written data becomes skewed because of the partitioner algorithm.
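
      The ticket does not show the partitioner logic, so the following is only a rough sketch of how hash-based assignment of bucket keys to write tasks can produce skew in general. It is not Hudi's code: the CRC32 hash, the key layout, and the names task_for and task_load are assumptions made for illustration. With 8 (partition path, bucket id) keys spread over 16 tasks, at least half of the tasks are guaranteed to receive no rows, and any hash collision stacks several buckets on one task.

```python
import zlib
from collections import Counter


def task_for(partition_path: str, bucket_id: int, num_tasks: int) -> int:
    """Map a (partition path, bucket id) key to a write task (hypothetical scheme)."""
    key = f"{partition_path}_{bucket_id}".encode()
    # CRC32 is used only because it is deterministic across runs;
    # it is an assumption, not the hash used by Hudi or Spark.
    return zlib.crc32(key) % num_tasks


def task_load(partition_paths, num_buckets, num_tasks, rows_per_bucket):
    """Count how many rows each task would write under the scheme above."""
    load = Counter({task: 0 for task in range(num_tasks)})
    for path in partition_paths:
        for bucket_id in range(num_buckets):
            load[task_for(path, bucket_id, num_tasks)] += rows_per_bucket
    return load


# 2 partitions x 4 buckets = 8 keys, distributed over 16 tasks.
load = task_load(
    ["dt=2024-07-01", "dt=2024-07-02"],
    num_buckets=4,
    num_tasks=16,
    rows_per_bucket=1000,
)
idle = [task for task, rows in sorted(load.items()) if rows == 0]

print("rows per task:", dict(sorted(load.items())))
print("idle tasks:", idle)
```

      Comparing the busiest task with the idle ones gives a quick measure of the imbalance. Whether the real partitioner behaves this way should be checked against the linked GitHub issue.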

            People

              Assignee: Knight Chess (knightchess)
              Reporter: Knight Chess (knightchess)
              Votes: 0
              Watchers: 1

              Dates

                Created:
                Updated: