Apache AsterixDB / ASTERIXDB-1698

Secondary index doesn't follow the compaction policy


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: STO - Storage
    • Labels: None
    • Environment: master : 4819ea44723b87a68406d248782861cf6e5d3305

    Description

      Here is the DDL for the dataset:

      create dataset ds_tweet(typeTweet) if not exists primary key id using compaction policy prefix (("max-mergable-component-size"="134217728"),("max-tolerance-component-count"="10")) with filter on create_at ;
      create index text_idx if not exists on ds_tweet("text") type keyword;
      

      In this case, I want smaller components, capped at about 128 MB (134217728 bytes). During the data ingestion phase this works well: each text_idx component is also small (~80 MB), so I assume the secondary index followed the component size constraint as well?

      After ingestion, I found that I needed to build another index,

      create index time_idx if not exists on ds_tweet(create_at) type btree;
      

      When the index build finished, I found that time_idx had not followed the constraint: it ended up with one giant 1.2 GB component on each partition.
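
To put numbers on the mismatch, here is a small Python sketch that compares the reported component sizes with the configured limit. It is not AsterixDB code; the names `MAX_MERGEABLE_COMPONENT_SIZE`, `observed`, and `exceeds_limit` are hypothetical, and it assumes the "M" and "G" in the report are binary units (MiB and GiB).

```python
# Limit taken from the DDL: "max-mergable-component-size"="134217728" (bytes).
MAX_MERGEABLE_COMPONENT_SIZE = 134_217_728

MIB = 1024 ** 2
GIB = 1024 ** 3

# Approximate sizes quoted in the report (binary units are an assumption).
observed = {
    "text_idx": 80 * MIB,        # per component, built during ingestion
    "time_idx": int(1.2 * GIB),  # single component per partition, built afterwards
}


def exceeds_limit(size_bytes, limit=MAX_MERGEABLE_COMPONENT_SIZE):
    """Return True if a component is larger than the configured limit."""
    return size_bytes > limit


for name, size in observed.items():
    ratio = size / MAX_MERGEABLE_COMPONENT_SIZE
    print(f"{name}: {size / MIB:.0f} MiB, {ratio:.1f}x the limit, "
          f"exceeds={exceeds_limit(size)}")
```

Under these assumptions the limit is exactly 128 MiB, the text_idx components stay below it, and the time_idx component is several times larger than the limit.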

          People

            imaxon Ian Maxon
            javierjia Jianfeng Jia