CHUKWA-700: Revisit Chukwa metrics schema design for HBase


    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.6.0
    • Fix Version/s: None
    • Component/s: Data Collection
    • Labels:
      None
    • Environment:

      MacOSX, Java

      Description

      The current Chukwa HBase schema looks like this:

      <timestamp>-<primaryKey>   <columnFamily>:<cell>...
      

      Because the row key begins with a monotonically increasing timestamp, writes cannot be distributed evenly across region servers without periodic special handling.

      It is time to revise the schema. The proposed schema looks like this:

      <hhddmmyyyy>-<primaryId>  cf:<cell>...
      

      The timestamp is stored with each cell. The row key splits data by hour, so a full hour of metrics is stored in the same row. The primary key is replaced with a hash ID of the primary key. Metrics are aggregated through the following tables:

      chukwaMetrics -> chukwaMetricsMonthly -> chukwaMetricsYearly
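
The proposed row key could be assembled roughly as follows. This is only a sketch under stated assumptions, not Chukwa's actual code: the issue does not say which hash function produces the primary ID, so MD5 in hex is used here purely for illustration, and `<hhddmmyyyy>` is read as a 24-hour hour, day, month, and year in UTC. The class and method names (`RowKeySketch`, `hourBucket`, `hashId`, `rowKey`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper illustrating the <hhddmmyyyy>-<primaryId> row key.
final class RowKeySketch {
    // Assumption: hour (24h), day, month, year, all in UTC.
    private static final DateTimeFormatter HOUR_BUCKET =
        DateTimeFormatter.ofPattern("HHddMMyyyy").withZone(ZoneOffset.UTC);

    // All timestamps within the same hour map to the same bucket string,
    // so one row holds a full hour of metrics.
    public static String hourBucket(long epochMillis) {
        return HOUR_BUCKET.format(Instant.ofEpochMilli(epochMillis));
    }

    // Assumption: MD5 hex stands in for the unspecified "hash id".
    public static String hashId(String primaryKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(primaryKey.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Row key in the proposed layout; the exact timestamp is not part of
    // the key and would be stored with each cell instead.
    public static String rowKey(long epochMillis, String primaryKey) {
        return hourBucket(epochMillis) + "-" + hashId(primaryKey);
    }
}
```

Calling `RowKeySketch.rowKey(System.currentTimeMillis(), "host1.example.com")` would return the key for the current hour and that (made-up) primary key; two samples taken within the same hour for the same primary key yield the same row key.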

            People

            • Assignee:
              Unassigned
              Reporter:
              eyang Eric Yang
