Apache Drill / DRILL-6101

Optimize Implicit Columns Processing

Details

    Description

      Problem Description -

      • Apache Drill allows users to specify columns even for SELECT STAR queries
      • From my discussion with paul-rogers, Apache Calcite has a limitation where the extra columns are not provided
      • The workaround has been to always include all implicit columns for SELECT STAR queries
      • Unfortunately, the current implementation is very inefficient because implicit column values are duplicated; this leads to substantial performance degradation when the number of rows is large

      Suggested Optimization -

      • The NullableVarChar vector should be enhanced to efficiently store duplicate values
      • This will not only address the current Calcite limitations (for SELECT STAR queries) but also optimize all queries with implicit columns
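
      To illustrate the suggested optimization, here is a minimal sketch of one way a nullable VARCHAR column could avoid storing the same value once per row. It uses run-length encoding, which is an assumption: the issue does not say which mechanism should be used. The class RunLengthVarCharColumn and all of its methods are hypothetical and are not part of Drill's value vector API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical run-length-encoded nullable VARCHAR column.
 * Illustration only; this is not Drill's NullableVarChar vector.
 */
final class RunLengthVarCharColumn {
    // One entry per run of identical consecutive values; a null entry means SQL NULL.
    private final List<byte[]> runValues = new ArrayList<>();
    // Exclusive end row index of each run, in ascending order.
    private final List<Integer> runEnds = new ArrayList<>();
    private int rowCount = 0;

    /** Appends one row; extends the last run if the value repeats. */
    void append(String value) {
        byte[] bytes = value == null ? null : value.getBytes(StandardCharsets.UTF_8);
        int last = runValues.size() - 1;
        if (last >= 0 && Arrays.equals(runValues.get(last), bytes)) {
            runEnds.set(last, rowCount + 1);   // same value: only grow the run
        } else {
            runValues.add(bytes);              // new value: start a new run
            runEnds.add(rowCount + 1);
        }
        rowCount++;
    }

    /** Returns the value at the given row, or null for SQL NULL. */
    String get(int row) {
        if (row < 0 || row >= rowCount) {
            throw new IndexOutOfBoundsException("row " + row);
        }
        // Binary search for the first run whose exclusive end is greater than row.
        int lo = 0;
        int hi = runEnds.size() - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (runEnds.get(mid) > row) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        byte[] bytes = runValues.get(lo);
        return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
    }

    /** Number of logical rows in the column. */
    int rowCount() {
        return rowCount;
    }

    /** Number of values physically stored (one per run). */
    int storedValueCount() {
        return runValues.size();
    }
}
```

      In this sketch, a batch in which every row carries the same implicit column value (for example, the same file name) stores that value once regardless of the row count, at the cost of a binary search on reads.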

       


            People

              sachouche Salim Achouche
              Timothy Farkas Timothy Farkas
