Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-164

Improve query performance of optiq over mongo

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • None
    • None

    Description

      optiq-mongo's aggregation framework pipeline could be optimized.

      Queries like "select <projectItems> from < tableExpression> where < booleanExpression>" should let "where <booleanExpression>" (i.e. {$match:{...}}) go in front of "select <projectItems>" (i.e. {$project:{...}}). This could allow indexes to be used so that mongo doesn't need to scan the whole collection, which will very likely improve query speed.

      optiq command:
      ```
      select * from visits where user_id='test_user_id' limit 10;
      ```

      current mongo command:
      ```
      db.visits.runCommand("aggregate",{
      pipeline: [
      { $project:

      { date: 1.0, user_id: 1.0, 'dim_user': 1.0, 'dim_tech': 1.0 }

      },
      { $match:

      { user_id: "test_user_id" }

      },
      ],
      explain:true}
      );
      ```
      result:
      ```
      "allPlans" : [
      {
      "cursor" : "BasicCursor",
      "n" : 1989102,
      "nscannedObjects" : 1989102,
      "nscanned" : 1989102,
      "indexBounds" : {

      }
      }
      ]
      ```

      proposed mongo command:
      ```
      db.visits.runCommand("aggregate",{
      pipeline: [
      { $match:

      { user_id: "test_user_id" }

      },
      { $project:

      { date: 1.0, user_id: 1.0, 'dim_user': 1.0, 'dim_tech': 1.0 }

      }
      ],
      explain:true}
      );
      ```
      result:
      ```
      "allPlans" : [
      {
      "cursor" : "BtreeCursor user_id_1_date_1",
      "n" : 394,
      "nscannedObjects" : 394,
      "nscanned" : 394,
      "indexBounds" : {
      "user_id" : [
      [
      "test_user_id",
      "test_user_id"
      ]
      ],
      "date" : [
      [

      { "$minElement" : 1 }

      ,

      { "$maxElement" : 1 }

      ]
      ]
      }
      }
      ]
      ```

      Btw, the "limit 10" doesn't work for me. It doesn't add a {$limit:10} pipeline.

      ---------------- Imported from GitHub ----------------
      Url: https://github.com/julianhyde/optiq/issues/164
      Created by: tikazyq
      Labels: performance,
      Assignee: julianhyde
      Created at: Sun Mar 02 17:10:31 CET 2014
      State: closed

      Attachments

        Activity

          People

            Unassigned Unassigned
            github-import GitHub Import
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: