Spark / SPARK-18931

Create empty staging directory in partitioned table on insert

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.0.2
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      CREATE TABLE temp.test_partitioning_4 (
      num string
      )
      PARTITIONED BY (
      day string)
      stored as parquet

      On every

      INSERT INTO TABLE temp.test_partitioning_4 PARTITION (day)
      select day, count as num from
      hss.session where year=2016 and month=4
      group by day

      a new directory such as ".hive-staging_hive_2016-12-19_15-55-11_298_3412488541559534475-4" is created on HDFS and left behind empty. This is a big issue, because I run this insert every day, and a growing pile of empty directories is very bad for HDFS.
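
To gauge how many of these leftovers have accumulated, a directory listing of the table location can be filtered by name. The following is only a minimal sketch: the regular expression is an assumption inferred from the single directory name quoted above, `find_staging_dirs` is a hypothetical helper, and the example paths are made up. It does not touch HDFS; it only filters strings that were obtained elsewhere.

```python
import re

# Assumed naming pattern, inferred from the one example in this report:
# .hive-staging_hive_<yyyy-MM-dd>_<HH-mm-ss>_<digits>_<digits>-<digits>
STAGING_DIR = re.compile(
    r"^\.hive-staging_hive_\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}_\d+_\d+-\d+$"
)


def find_staging_dirs(entries):
    """Return the entries whose last path component looks like a staging dir."""
    matches = []
    for entry in entries:
        # Compare only the base name, ignoring any trailing slash.
        base_name = entry.rstrip("/").rsplit("/", 1)[-1]
        if STAGING_DIR.match(base_name):
            matches.append(entry)
    return matches


if __name__ == "__main__":
    # Hypothetical listing of a partitioned table's location.
    listing = [
        "/warehouse/temp.db/test_partitioning_4/day=1",
        "/warehouse/temp.db/test_partitioning_4/"
        ".hive-staging_hive_2016-12-19_15-55-11_298_3412488541559534475-4",
    ]
    for path in find_staging_dirs(listing):
        print(path)
```

Matching on the full timestamp-and-id shape rather than just the `.hive-staging` prefix is a deliberately conservative choice, so that unrelated hidden directories are not picked up.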

People

    • Assignee: Unassigned
    • Reporter: Egor Pahomov (epahomov)
    • Votes: 0
    • Watchers: 3
