  1. CarbonData
  2. CARBONDATA-159

Carbon should support a primary key and key mapping tables via table_property

Details

    • Important

    Description

      As we know, Carbon supports an MDK (multi-dimensional key) index. By design, a filter or filter combination on the left-most columns gets good performance.
      However, if the leading key is a high-cardinality column (for example, more than 100 million distinct values), only a filter on the leading key gets good performance. Filters on the following columns and on other high-cardinality columns do not, because those columns are close to unsorted.
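      The effect described above can be illustrated with a small simulation. This is only a sketch and not CarbonData code: it assumes rows are sorted by a unique leading key, cut into fixed-size blocks, and pruned by per-block min/max ranges. The function name `blocks_matching` and all sizes are hypothetical.

```python
import random


def blocks_matching(rows, block_size, col, value):
    """Count blocks whose [min, max] range on column `col` may contain `value`."""
    hits = 0
    for start in range(0, len(rows), block_size):
        values = [row[col] for row in rows[start:start + block_size]]
        if min(values) <= value <= max(values):
            hits += 1
    return hits


random.seed(0)
n, block_size = 10_000, 100

# Column 0: unique, high-cardinality leading key (assumed for illustration).
leading = random.sample(range(10**9), n)
# Column 1: another high-cardinality column that follows the leading key.
following = [random.randrange(10**6) for _ in range(n)]

# Sorting by (leading, following) mimics a multi-column sort order.
rows = sorted(zip(leading, following))
probe = rows[1234]
total_blocks = n // block_size

leading_hits = blocks_matching(rows, block_size, 0, probe[0])
following_hits = blocks_matching(rows, block_size, 1, probe[1])

print(f"leading key filter scans {leading_hits} of {total_blocks} blocks")
print(f"following column filter scans {following_hits} of {total_blocks} blocks")
```

      Because the leading key is unique, the sort order of the second column inside each block is effectively random, so nearly every block's min/max range covers the probed value and almost nothing can be skipped.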
      I suggest adding a key mapping function. The table property would look like this:
      create table (low cardinality column to high cardinality column)
      table_property(
      primary_key h_col3,
      index_key_mapping(h_col1,h_col2)
      )

      Columns are ordered from low cardinality to high cardinality:
      col1,col2,col3,col4.....col10,h_col1,h_col2,h_col3

      During data loading, Carbon will create an internal index table A that records all the (value --> position) entries of the primary_key. It looks like this:
      h_col3         list of blocklets
      18682114091    [blockid1+blockletid1],[blockid4+blockletid10]....
      18683343442    [blockid2+blockletid4],[blockid23+blockletid5]....
      ...            .....
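
      A minimal sketch of how such an index table could be built, assuming each blocklet exposes the list of primary key values it contains. The function `build_primary_key_index` and the input layout are hypothetical and are not part of CarbonData.

```python
from collections import defaultdict


def build_primary_key_index(blocklets):
    """Map each primary key value to the (block_id, blocklet_id) pairs holding it.

    `blocklets` is a hypothetical input: {(block_id, blocklet_id): [pk values]}.
    """
    index = defaultdict(list)
    for position, pk_values in blocklets.items():
        # De-duplicate so a blocklet is listed once per key value.
        for pk in set(pk_values):
            index[pk].append(position)
    return dict(index)


# Sample data modeled on the h_col3 values shown in the table above.
blocklets = {
    (1, 1): [18682114091, 18682114091],
    (2, 4): [18683343442],
    (4, 10): [18682114091],
    (23, 5): [18683343442],
}

pk_index = build_primary_key_index(blocklets)
for pk, positions in pk_index.items():
    print(pk, positions)
```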

      It will also create two key mapping tables:
      table 1:
      ---------------------------------------
      h_col2    h_col3
      jarray    18682114091
      ramana    18683343442
      ......    .......

      table 2:
      -----------------------------------------
      h_col1    h_col3
      77647     18682114091
      99899     18683343442
      ......    .......
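
      The two key mapping tables could be derived from the loaded rows roughly as follows. This is a sketch under the assumption that one mapped value may correspond to several primary keys. `build_key_mapping` is a hypothetical helper, not an existing CarbonData function.

```python
def build_key_mapping(rows, mapped_col, primary_key):
    """Map each value of `mapped_col` to the set of primary key values seen with it."""
    mapping = {}
    for row in rows:
        mapping.setdefault(row[mapped_col], set()).add(row[primary_key])
    return mapping


# Sample rows modeled on the values in table 1 and table 2 above.
rows = [
    {"h_col1": 77647, "h_col2": "jarray", "h_col3": 18682114091},
    {"h_col1": 99899, "h_col2": "ramana", "h_col3": 18683343442},
]

table1 = build_key_mapping(rows, "h_col2", "h_col3")  # h_col2 -> h_col3
table2 = build_key_mapping(rows, "h_col1", "h_col3")  # h_col1 -> h_col3
print(table1)
print(table2)
```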

      1) If the filter is on col1-col10, the original MDK capability is used.
      2) If the filter is on h_col3 (the primary key), the system scans the index table to get the blocklet positions, then uses them to fetch the data directly.
      3) If the filter is on h_col1 or h_col2, the system first scans the key mapping table to get the primary key list, then proceeds as in 2).
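
      The three lookup paths can be sketched as below. The sketch assumes the index table is keyed on the primary key h_col3 and the mapping tables cover h_col1 and h_col2, as in the tables above. `resolve_blocklets` and the in-memory dictionaries are hypothetical stand-ins for the proposed internal tables.

```python
PRIMARY_KEY = "h_col3"

# Index table A: primary key value -> blocklet positions (block_id, blocklet_id).
PK_INDEX = {
    18682114091: [(1, 1), (4, 10)],
    18683343442: [(2, 4), (23, 5)],
}

# Key mapping tables: mapped column value -> primary key values.
KEY_MAPPINGS = {
    "h_col1": {77647: {18682114091}, 99899: {18683343442}},
    "h_col2": {"jarray": {18682114091}, "ramana": {18683343442}},
}


def resolve_blocklets(column, value):
    """Return blocklet positions for an equality filter, or None to fall back to MDK."""
    if column == PRIMARY_KEY:
        primary_keys = {value}  # path 2: go straight to the index table
    elif column in KEY_MAPPINGS:
        # Path 3: translate the value to primary keys first.
        primary_keys = KEY_MAPPINGS[column].get(value, set())
    else:
        return None  # path 1: low-cardinality column, use the MDK index
    positions = []
    for pk in sorted(primary_keys):
        positions.extend(PK_INDEX.get(pk, []))
    return positions


print(resolve_blocklets("h_col3", 18682114091))
print(resolve_blocklets("h_col2", "ramana"))
print(resolve_blocklets("col1", "anything"))
```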

          People

            Assignee: Unassigned
            Reporter: Heng Qiu (jarray888)
            Votes: 0
            Watchers: 2

              Time Tracking

                Original Estimate: 720h
                Remaining Estimate: 720h
                Time Spent: Not Specified