Apache AsterixDB / ASTERIXDB-1249

Self index join chooses wrong probe/index branch

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: COMP - Compiler
    • Labels: None

    Description

      DDLs:

      drop dataverse test if exists;
      create dataverse test;
      use dataverse test;
      
      create type TwitterUserType as closed {
          screen-name: string,
          lang: string,
          friends-count: int64,
          statuses-count: int64,
          name: string,
          followers-count: int64
      }
      
      create type TweetMessageType as closed {
          tweetid: int64,
          user: TwitterUserType,
          sender-location: point,
          send-time: datetime,
          referred-topics: {{ string }},
          message-text: string,
          countA: int64,
          countB: int64
      }
      
      create dataset TweetMessages(TweetMessageType)
      primary key tweetid;
      
      create index twmSndLocIx on TweetMessages(sender-location) type rtree;
      create index msgCountAIx on TweetMessages(countA) type btree;
      create index msgCountBIx on TweetMessages(countB) type btree;
      create index msgTextIx on TweetMessages(message-text) type keyword;
      

      Query:

      for $t1 in dataset('TweetMessages')
      for $t2 in dataset('TweetMessages')
      let $n := create-circle($t1.sender-location, 0.5)
      where spatial-intersect($t2.sender-location, $n)
      order by $t2.tweetid
      return {"tweetid2":$t2.tweetid, "loc2":$t2.sender-location};
      

      Optimized plan:

      distribute result [%0->$$10]
      -- DISTRIBUTE_RESULT  |PARTITIONED|
        exchange 
        -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
          project ([$$10])
          -- STREAM_PROJECT  |PARTITIONED|
            assign [$$10] <- [function-call: asterix:closed-record-constructor, Args:[AString: {tweetid2}, %0->$$15, AString: {loc2}, %0->$$13]]
            -- ASSIGN  |PARTITIONED|
              exchange 
              -- SORT_MERGE_EXCHANGE [$$15(ASC) ]  |PARTITIONED|
                order (ASC, %0->$$15) 
                -- STABLE_SORT [$$15(ASC)]  |PARTITIONED|
                  exchange 
                  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                    project ([$$13, $$15])
                    -- STREAM_PROJECT  |PARTITIONED|
                      select (function-call: asterix:spatial-intersect, Args:[%0->$$13, function-call: asterix:create-circle, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$0, AInt32: {2}], ADouble: {0.5}]])
                      -- STREAM_SELECT  |PARTITIONED|
                        project ([$$0, $$13, $$15])
                        -- STREAM_PROJECT  |PARTITIONED|
                          exchange 
                          -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                            unnest-map [$$14, $$0] <- function-call: asterix:index-search, Args:[AString: {TweetMessages}, AInt32: {0}, AString: {test}, AString: {TweetMessages}, ABoolean: {true}, ABoolean: {false}, ABoolean: {false}, AInt32: {1}, %0->$$27, AInt32: {1}, %0->$$27, TRUE, TRUE, TRUE]
                            -- BTREE_SEARCH  |PARTITIONED|
                              exchange 
                              -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                order (ASC, %0->$$27) 
                                -- STABLE_SORT [$$27(ASC)]  |PARTITIONED|
                                  exchange 
                                  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                    project ([$$27, $$13, $$15])
                                    -- STREAM_PROJECT  |PARTITIONED|
                                      exchange 
                                      -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                        unnest-map [$$23, $$24, $$25, $$26, $$27] <- function-call: asterix:index-search, Args:[AString: {twmSndLocIx}, AInt32: {1}, AString: {test}, AString: {TweetMessages}, ABoolean: {true}, ABoolean: {false}, ABoolean: {true}, AInt32: {4}, %0->$$19, %0->$$20, %0->$$21, %0->$$22]
                                        -- RTREE_SEARCH  |PARTITIONED|
                                          exchange 
                                          -- BROADCAST_EXCHANGE  |PARTITIONED|
                                            assign [$$19, $$20, $$21, $$22] <- [function-call: asterix:create-mbr, Args:[%0->$$13, AInt32: {2}, AInt32: {0}], function-call: asterix:create-mbr, Args:[%0->$$13, AInt32: {2}, AInt32: {1}], function-call: asterix:create-mbr, Args:[%0->$$13, AInt32: {2}, AInt32: {2}], function-call: asterix:create-mbr, Args:[%0->$$13, AInt32: {2}, AInt32: {3}]]
                                            -- ASSIGN  |PARTITIONED|
                                              project ([$$13, $$15])
                                              -- STREAM_PROJECT  |PARTITIONED|
                                                assign [$$13] <- [function-call: asterix:field-access-by-index, Args:[%0->$$1, AInt32: {2}]]
                                                -- ASSIGN  |PARTITIONED|
                                                  exchange 
                                                  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                                    data-scan []<-[$$15, $$1] <- test:TweetMessages
                                                    -- DATASOURCE_SCAN  |PARTITIONED|
                                                      exchange 
                                                      -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                                        empty-tuple-source
                                                        -- EMPTY_TUPLE_SOURCE  |PARTITIONED|
      

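      The optimized plan is incorrect: the index search doesn't use the right join condition, and hence the result is different from expected. In the plan, the R-tree search on twmSndLocIx takes as its search key the MBR that create-mbr builds from $$13, and $$13 is the sender-location (field index 2) of the scanned record $$1. Since $$13 and $$15 are also what the query returns as loc2 and tweetid2, the scanned branch is $t2. The circle from create-circle is applied only in the later select, to $$0, the record fetched through the primary index, so it plays no part in the index search. The search key is therefore a bare point of $t2 rather than the MBR of the 0.5-radius circle around $t1.sender-location, and the index search can drop pairs that satisfy the where clause.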
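
      The effect can be illustrated with a small Python model. This is only a sketch under stated assumptions, not AsterixDB code: the data set TWEETS is made up, index_search is a hypothetical linear-scan stand-in for an R-tree search that returns every indexed point whose MBR intersects the search MBR, and the three join functions are illustrative names for the expected semantics, the plan as printed above, and a plan that probes with the circle's MBR.

```python
import math

# Hypothetical toy data: tweetid -> sender-location (x, y).
TWEETS = {1: (0.0, 0.0), 2: (0.3, 0.0), 3: (5.0, 5.0)}
RADIUS = 0.5


def point_mbr(p):
    """Degenerate MBR of a point: (min_x, min_y, max_x, max_y)."""
    x, y = p
    return (x, y, x, y)


def circle_mbr(center, radius):
    """MBR of a circle, as a probe on $t1's circle would need."""
    x, y = center
    return (x - radius, y - radius, x + radius, y + radius)


def mbrs_intersect(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def in_circle(p, center, radius):
    return math.hypot(p[0] - center[0], p[1] - center[1]) <= radius


def index_search(tweets, search_mbr):
    """Assumed stand-in for an R-tree search over the indexed points."""
    return [(tid, loc) for tid, loc in tweets.items()
            if mbrs_intersect(point_mbr(loc), search_mbr)]


def expected(tweets, radius):
    """Nested-loop semantics of the query: tweetid of $t2 per matching pair."""
    return sorted(id2 for loc1 in tweets.values()
                  for id2, loc2 in tweets.items()
                  if in_circle(loc2, loc1, radius))


def plan_as_reported(tweets, radius):
    """Scan $t2, probe the index with the MBR of $t2's point, then select."""
    out = []
    for id2, loc2 in tweets.items():
        for _id1, loc1 in index_search(tweets, point_mbr(loc2)):
            if in_circle(loc2, loc1, radius):
                out.append(id2)
    return sorted(out)


def probe_with_circle(tweets, radius):
    """Scan $t1, probe the index with the MBR of its circle, then select."""
    out = []
    for loc1 in tweets.values():
        for id2, loc2 in index_search(tweets, circle_mbr(loc1, radius)):
            if in_circle(loc2, loc1, radius):
                out.append(id2)
    return sorted(out)


print(expected(TWEETS, RADIUS))           # [1, 1, 2, 2, 3]
print(plan_as_reported(TWEETS, RADIUS))   # [1, 2, 3]
print(probe_with_circle(TWEETS, RADIUS))  # [1, 1, 2, 2, 3]
```

      In this model, tweets 1 and 2 lie 0.3 apart, so each falls inside the other's circle, yet probing with a point MBR only ever finds candidates at exactly the same location and the two cross pairs are lost.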

          People

            wangsaeu Taewoo Kim
            buyingyi Yingyi Bu