Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-945

Sort algorithm that skips low-cardinality leading edge, such as a salting column



    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:


      Calcite should support a sort algorithm that skips low-cardinality leading edge.

      Consider a table EMP sorted on (gender, empno), where gender has 2 distinct values, and the query

      SELECT * FROM emp ORDER BY empno

      You cannot satisfy the query using the native sort order alone, but you can efficiently merge the two sorted runs.

      The query

      SELECT * FROM emp WHERE empno BETWEEN 100 AND 105 ORDER BY empno

      should be able to use a skip-scan.

      This applies to Phoenix, where "salting" columns are often added to the leading edge of the key. The salting column has a fixed cardinality, less than 256, chosen to ensure the right degree of parallelism. Calcite would not need to model salting columns explicitly, just recognize that they are low cardinality. Phoenix has such a sort algorithm; it just need to be modelled as a RelNode with appropriate cost model.

      This is issue is closely related to CALCITE-873. An equality constraint (i.e. a filter below the sort) ensures that a column entering the sort has cardinality C <= 1, whereas this issue considers where C is small. If you know C <= 1 you don't need a sort, but if C is small you can use a merge.

      Cc James R. Taylor, Wei Xue, Andrew Kyle Purtell.


          Issue Links



              • Assignee:
                julianhyde Julian Hyde
              • Votes:
                0 Vote for this issue
                1 Start watching this issue


                • Created: