Uploaded image for project: 'Apache AsterixDB'
  1. Apache AsterixDB
  2. ASTERIXDB-2339

Improve Inverted Index Merge Performance

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: STO - Storage
    • Labels:
      None

      Description

      Currently, the merge of inverted index is implemented by a full range scan, i.e., token+key pairs are generated and fed into a priority queue to obtain a global ordering. However, it is typical that a token can correspond to tens or hundreds (or even much more) keys. As a result, comparisons of tokens are wasted because for many times tokens would be the same. To improve this, we can have two priority queues, one for tokens and one for keys. For each token, we merge their inverted lists using the key priority queue. After that, we fetch the next token from the token queue, and merge their inverted lists again.

        Attachments

          Activity

            People

            • Assignee:
              luochen01 Chen Luo
              Reporter:
              luochen01 Chen Luo
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: