  1. CarbonData
  2. CARBONDATA-1427

After Splitting Partition, Data doesn't get Divided to Different Partitions.


Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.2.0
    • Fix Version/s: None
    • Component/s: data-query
    • Labels: None
    • Environment: spark 2.1

    Description

      When a SPLIT PARTITION query is run on a partitioned table, the existing data is not affected at all. SHOW PARTITIONS does list the new partitions, and the old partition no longer appears.

      However, the rows that belonged to the old partition stay where they were. Ideally, the existing data should be redistributed according to the new partitions when the split is performed. Currently that only happens after subsequent loads, at which point the data ends up in the latest partitions.

      Example :
      1. Create Table :
      DROP TABLE IF EXISTS list_partition_table;

      CREATE TABLE list_partition_table(shortField SHORT, intField INT, bigintField LONG, doubleField DOUBLE, timestampField TIMESTAMP, decimalField DECIMAL(18,2), dateField DATE, charField CHAR(5), floatField FLOAT, complexData ARRAY<STRING> ) PARTITIONED BY (stringField STRING) STORED BY 'carbondata' TBLPROPERTIES('PARTITION_TYPE'='LIST', 'LIST_INFO'='Asia, (China, Europe, NoPartition)');

      2. Load Data :
      load data inpath 'hdfs://localhost:54310/CSV/list_partition_table.csv' into table list_partition_table options('FILEHEADER'='shortfield,intfield,bigintfield,doublefield,stringfield,timestampfield,decimalfield,datefield,charfield,floatfield,complexdata', 'COMPLEX_DELIMITER_LEVEL_1'='$','COMPLEX_DELIMITER_LEVEL_2'='#');

      3. Show Partitions :
      show partitions list_partition_table;
      +----------------------------------------------+
      | partition                                    |
      +----------------------------------------------+
      | 0, stringfield = DEFAULT                     |
      | 1, stringfield = Asia                        |
      | 2, stringfield = China, Europe, NoPartition  |
      +----------------------------------------------+
      3 rows selected (0.09 seconds)

      4. Split Partition :
      ALTER TABLE list_partition_table SPLIT PARTITION(2) INTO('China', '(Europe, NoPartition)' );

      5. Show Partition :
      show partitions list_partition_table;
      +---------------------------------------+
      | partition                             |
      +---------------------------------------+
      | 0, stringfield = DEFAULT              |
      | 1, stringfield = Asia                 |
      | 3, stringfield = China                |
      | 4, stringfield = Europe, NoPartition  |
      +---------------------------------------+
      4 rows selected (0.065 seconds)

      The partition list is updated, but the existing data stays as it was (not repartitioned), in the same old partition.
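
      To make the expected outcome concrete, here is a small toy model in Python. It is only a sketch and not CarbonData code: the helper names (`route`, `split_partition`, `redistribute`) and the sample rows are hypothetical, and the rule that new partitions are numbered after the highest existing id is an assumption inferred from the ids 3 and 4 in the output above.

```python
from collections import defaultdict

# Partition layout before the split, as reported by SHOW PARTITIONS.
# Partition 0 is the default partition (None = "anything not listed elsewhere").
before = {0: None, 1: {"Asia"}, 2: {"China", "Europe", "NoPartition"}}


def route(value, layout, default_id=0):
    """Return the id of the list partition that contains value."""
    for pid, values in layout.items():
        if values is not None and value in values:
            return pid
    return default_id


def split_partition(layout, old_id, new_lists):
    """Drop old_id and add one new partition per entry in new_lists.

    Assumption: new ids continue after the highest existing id.
    """
    if set().union(*new_lists) != layout[old_id]:
        raise ValueError("new lists must cover exactly the old partition's values")
    new_layout = {pid: vals for pid, vals in layout.items() if pid != old_id}
    next_id = max(layout) + 1
    for vals in new_lists:
        new_layout[next_id] = set(vals)
        next_id += 1
    return new_layout


def redistribute(rows_by_partition, layout):
    """Re-route every stored row under layout (the behaviour this report expects)."""
    result = defaultdict(list)
    for rows in rows_by_partition.values():
        for value in rows:
            result[route(value, layout)].append(value)
    return dict(result)


# Hypothetical partition-column values; the real CSV contents are not shown here.
rows = ["Asia", "China", "Europe", "NoPartition", "Korea"]
stored = defaultdict(list)
for value in rows:
    stored[route(value, before)].append(value)

after = split_partition(before, 2, [{"China"}, {"Europe", "NoPartition"}])
expected = redistribute(stored, after)
print(expected)
```

      In this model, the reported behaviour corresponds to `stored` being left untouched after the split (rows still filed under partition 2), whereas the expected behaviour corresponds to `expected`, in which no rows remain under the old partition id.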

      Attachments

        1. list_partition_table.csv
          1 kB
          Neha Bhardwaj
        2. screenshot-1.png
          280 kB
          Cao, Lionel

          People

            lucao Cao, Lionel
            nehabhardwaj Neha Bhardwaj