    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.2.0, 3.2.0
    • Labels:
      None

      Description

      Storing the guideposts as a VARBINARY ARRAY is potentially problematic, as pointed out by PHOENIX-1329. We'd hit the issue when collecting stats for a table whose trailing row key column is VARBINARY and whose values contain embedded null bytes. Because of this, we're better off storing the guideposts as a single VARBINARY and serializing/deserializing them in the following manner:
      <byte length as vint><bytes><byte length as vint><bytes>...

      We should also store the total number of guideposts in a separate KeyValue column. The schema of SYSTEM.STATS would then look like this:

          public static final String CREATE_STATS_TABLE_METADATA =
                  "CREATE TABLE " + SYSTEM_CATALOG_SCHEMA + ".\"" + SYSTEM_STATS_TABLE + "\"(\n" +
                  // PK columns
                  PHYSICAL_NAME + " VARCHAR NOT NULL," +
                  COLUMN_FAMILY + " VARCHAR," +
                  REGION_NAME + " VARCHAR," +
                  // KeyValue columns
                  GUIDE_POSTS + " VARBINARY," +
                  GUIDE_POSTS_COUNT + " SMALLINT," +
                  MIN_KEY + " VARBINARY," +
                  MAX_KEY + " VARBINARY," +
                  LAST_STATS_UPDATE_TIME + " DATE, " +
                  "CONSTRAINT " + SYSTEM_TABLE_PK_NAME + " PRIMARY KEY ("
                  + PHYSICAL_NAME + ","
                  + COLUMN_FAMILY + "," + REGION_NAME + "))\n" +
                  // TODO: should we support versioned stats?
                  // Install split policy to prevent a physical table's stats from being split across regions.
                  HTableDescriptor.SPLIT_POLICY + "='" + MetaDataSplitPolicy.class.getName() + "'\n";
      

      Then the serialization code in StatisticsTable.addStats() would need to change to populate the GUIDE_POSTS_COUNT and serialize the GUIDE_POSTS in the new format.
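
      As a rough illustration of the new write path, the sketch below packs a list of guideposts into the <byte length as vint><bytes>... layout. GuidePostEncoder, writeVarInt and encode are hypothetical names, not Phoenix code, and the 7-bit varint used here only stands in for the vint; the vint encoding Phoenix actually uses may have a different byte layout.

```java
import java.io.ByteArrayOutputStream;
import java.util.List;

// Hypothetical helper (not Phoenix code): packs guideposts as <length><bytes>...
class GuidePostEncoder {

    // Writes a non-negative int as a 7-bit varint, low-order group first.
    // Assumption: this stands in for the "vint" of the proposed format.
    static void writeVarInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write(value);
    }

    // Each guidepost is written as its length followed by its raw bytes, so
    // embedded null bytes need no escaping and no separator byte is required.
    static byte[] encode(List<byte[]> guidePosts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] guidePost : guidePosts) {
            writeVarInt(out, guidePost.length);
            out.write(guidePost, 0, guidePost.length);
        }
        return out.toByteArray();
    }
}
```

      Under this sketch, the value for GUIDE_POSTS_COUNT would simply be guidePosts.size(), written alongside the encoded bytes.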

      The deserialization code is isolated to StatisticsUtil.readStatistics(). It would need to read the GUIDE_POSTS_COUNT first for estimated sizing, and then deserialize the GUIDE_POSTS in the new format.
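
      A minimal sketch of the read side, assuming the length prefix is a 7-bit varint (low-order group first) rather than whatever vint encoding Phoenix ends up using. GuidePostDecoder and decode are hypothetical names; the count argument plays the role of GUIDE_POSTS_COUNT and is used only to pre-size the result list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not Phoenix code): unpacks <length><bytes>... into guideposts.
class GuidePostDecoder {

    // count is a sizing hint only; the buffer itself determines how many
    // guideposts are actually returned.
    static List<byte[]> decode(byte[] buf, int count) {
        List<byte[]> guidePosts = new ArrayList<>(Math.max(count, 0));
        int pos = 0;
        while (pos < buf.length) {
            // Read the varint length prefix, 7 bits per byte.
            int length = 0;
            int shift = 0;
            int b;
            do {
                b = buf[pos++] & 0xFF;
                length |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);

            if (length < 0 || pos + length > buf.length) {
                throw new IllegalArgumentException("Truncated guidepost at offset " + pos);
            }
            // Copy exactly `length` bytes; null bytes inside are ordinary data.
            guidePosts.add(Arrays.copyOfRange(buf, pos, pos + length));
            pos += length;
        }
        return guidePosts;
    }
}
```

      Because every entry carries its own length, the reader never has to scan for a separator byte, which is what makes embedded null bytes safe.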

        Attachments

        1. Phoenix-1333_1.patch
          50 kB
          ramkrishna.s.vasudevan
        2. PHOENIX-1333_2.patch
          52 kB
          James Taylor
        3. Phoenix-1333.patch
          15 kB
          ramkrishna.s.vasudevan

            People

            • Assignee:
              ram_krish ramkrishna.s.vasudevan
              Reporter:
              jamestaylor James Taylor