Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-164

Improve query performance of optiq over mongo

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      optiq-mongo's aggregation framework pipeline could be optimized.

      Queries like "select <projectItems> from < tableExpression> where < booleanExpression>" should let "where <booleanExpression>" (i.e. {$match:{...}}) go in front of "select <projectItems>" (i.e. {$project:{...}}). This could allow indexes to be used so that mongo doesn't need to scan the whole collection, which will very likely improve query speed.

      optiq command:
      ```
      select * from visits where user_id='test_user_id' limit 10;
      ```

      current mongo command:
      ```
      db.visits.runCommand("aggregate",{
      pipeline: [
      { $project:

      { date: 1.0, user_id: 1.0, 'dim_user': 1.0, 'dim_tech': 1.0 }

      },
      { $match:

      { user_id: "test_user_id" }

      },
      ],
      explain:true}
      );
      ```
      result:
      ```
      "allPlans" : [
      {
      "cursor" : "BasicCursor",
      "n" : 1989102,
      "nscannedObjects" : 1989102,
      "nscanned" : 1989102,
      "indexBounds" : {

      }
      }
      ]
      ```

      proposed mongo command:
      ```
      db.visits.runCommand("aggregate",{
      pipeline: [
      { $match:

      { user_id: "test_user_id" }

      },
      { $project:

      { date: 1.0, user_id: 1.0, 'dim_user': 1.0, 'dim_tech': 1.0 }

      }
      ],
      explain:true}
      );
      ```
      result:
      ```
      "allPlans" : [
      {
      "cursor" : "BtreeCursor user_id_1_date_1",
      "n" : 394,
      "nscannedObjects" : 394,
      "nscanned" : 394,
      "indexBounds" : {
      "user_id" : [
      [
      "test_user_id",
      "test_user_id"
      ]
      ],
      "date" : [
      [

      { "$minElement" : 1 }

      ,

      { "$maxElement" : 1 }

      ]
      ]
      }
      }
      ]
      ```

      Btw, the "limit 10" doesn't work for me. It doesn't add a {$limit:10} pipeline.

      ---------------- Imported from GitHub ----------------
      Url: https://github.com/julianhyde/optiq/issues/164
      Created by: tikazyq
      Labels: performance,
      Assignee: julianhyde
      Created at: Sun Mar 02 17:10:31 CET 2014
      State: closed

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              github-import GitHub Import
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: