Phoenix / PHOENIX-1598

Encode column names to save space and improve performance

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.10.0
    • Labels:
      None

      Description

      When creating a table using Phoenix DDL, replace the column names that the user gives with shorter names to save space. The user will still use the full names in SELECT statements and will get them back in the result set, but under the hood the infrastructure will translate the names to their shorter versions.

      Example:
      When creating a table with my_column_1, my_column_2, ..., the table will be created with a as the first column, b as the second one, etc.
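
      The mapping described above can be sketched roughly as follows. This is only an illustration of the idea in the description (full names mapped to a, b, c, ... in declaration order and translated back for result sets). The class ColumnNameEncoder and its methods are hypothetical names chosen for this sketch and are not taken from the Phoenix code base or from the attached patches, which may encode qualifiers differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Phoenix.
public class ColumnNameEncoder {
    // Full (user-visible) name -> short (stored) name, in declaration order.
    private final Map<String, String> fullToShort = new LinkedHashMap<>();
    // Reverse mapping, used when building result sets.
    private final Map<String, String> shortToFull = new LinkedHashMap<>();

    // Maps a zero-based column index to a short name:
    // 0 -> "a", 25 -> "z", 26 -> "aa", 27 -> "ab", ...
    static String shortNameFor(int index) {
        StringBuilder sb = new StringBuilder();
        int n = index;
        do {
            sb.append((char) ('a' + n % 26));
            n = n / 26 - 1;
        } while (n >= 0);
        return sb.reverse().toString();
    }

    // Returns the short name for a column, assigning the next free one
    // the first time the column is seen.
    public String encode(String fullName) {
        String existing = fullToShort.get(fullName);
        if (existing != null) {
            return existing;
        }
        String shortName = shortNameFor(fullToShort.size());
        fullToShort.put(fullName, shortName);
        shortToFull.put(shortName, fullName);
        return shortName;
    }

    // Translates a stored short name back to the name the user declared.
    public String decode(String shortName) {
        String fullName = shortToFull.get(shortName);
        if (fullName == null) {
            throw new IllegalArgumentException("Unknown encoded column: " + shortName);
        }
        return fullName;
    }

    public static void main(String[] args) {
        ColumnNameEncoder encoder = new ColumnNameEncoder();
        System.out.println(encoder.encode("my_column_1")); // a
        System.out.println(encoder.encode("my_column_2")); // b
        System.out.println(encoder.decode("a"));           // my_column_1
    }
}
```

      In this sketch the user-facing names never reach storage: writes go through encode and reads go through decode, so only the short names would be repeated in every stored cell.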

        Attachments

        1. PHOENIX-1598_master.patch
          1.42 MB
          Samarth Jain
        2. PHOENIX-1598-4.x-HBase-0.98.patch
          1.44 MB
          Samarth Jain

        Sub-Tasks

        1. Support encoded column qualifiers per column family (Sub-task, Resolved, Samarth Jain)
        2. Make joins work with encoded column names (Sub-task, Resolved, Samarth Jain)
        3. Add support for setting a storage scheme at table creation time (Sub-task, Resolved, Samarth Jain)
        4. Support null when columns have default values for immutable tables with encoding scheme COLUMNS_STORED_IN_SINGLE_CELL (Sub-task, Resolved, Thomas D'Silva)
        5. Support different encoding schemes (BYTE, SHORT, INTEGER) for storing encoded column qualifiers (Sub-task, Resolved, Samarth Jain)
        6. Make changes to IndexMaintainer backward compatible (Sub-task, Resolved, Samarth Jain)
        7. Add a CREATE IMMUTABLE TABLE construct to make immutable tables more explicit (Sub-task, Resolved, Thomas D'Silva)
        8. Parameterize tests for different encoding and storage schemes (Sub-task, Resolved, Thomas D'Silva)
        9. Add upgrade code to add the required metadata columns for supporting column encoding (Sub-task, Resolved, Samarth Jain)
        10. Add COLUMN_ENCODED_BYTES table property (Sub-task, Resolved, Thomas D'Silva)
        11. Fix bulkload for StorageScheme - ONE_CELL_PER_KEYVALUE_COLUMN (Sub-task, Resolved, Ankit Singhal)
        12. Data load gets 5-7X slower with mutable sparse columns (Sub-task, Resolved, Samarth Jain)
        13. Filter on value column for mutable encoded table is > 3X slower compared to non encoded table (Sub-task, Resolved, Samarth Jain)
        14. Upgrading from 4.8 or before to encodecolumns2 branch fails (Sub-task, Resolved, Samarth Jain)
        15. Make use of EncodedColumnQualifierCellsList for all column name mapping schemes (Sub-task, Resolved, Samarth Jain)
        16. Add a test case to test out CREATE TABLE IF NOT EXISTS code path (Sub-task, Resolved, Samarth Jain)
        17. Change tests extending BaseQueryIT to use unique table names (Sub-task, Resolved, Samarth Jain)
        18. Optimize BooleanExpressionFilter and ColumnProjectionFilter for tables with encoded columns (Sub-task, Resolved, Samarth Jain)
        19. Backward compatibility fails for immutable tables after column encoding patch (Sub-task, Resolved, Samarth Jain)
        20. Backward compatibility fails for joins (Sub-task, Resolved, Samarth Jain)
        21. Remove testUnfoundSingleColumnCaseStatement from CaseStatementIT (Sub-task, Resolved, Samarth Jain)

            People

            • Assignee: Samarth Jain (samarthjain)
            • Reporter: noam bulvik (noamb)
            • Votes: 0
            • Watchers: 11
