  1. Chukwa (retired)
  2. CHUKWA-700

Revisit Chukwa metrics schema design for HBase


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.6.0
    • Fix Version/s: None
    • Component/s: Data Collection
    • Labels: None
    • Environment: MacOSX, Java

    Description

      The current Chukwa HBase schema looks like this:

      <timestamp>-<primaryKey>   <columnFamily>:<cell>...
      

      Because the row key starts with a monotonically increasing timestamp, writes cannot be distributed evenly across region servers without periodic special handling.

      It is time to revise the schema. The proposed schema looks like this:

      <hhddmmyyyy>-<primaryId>  cf:<cell>...
      

      The timestamp is stored with each cell. The row key splits data by hour, so a full hour of metrics is stored in the same row. The primary key is replaced with a hash ID of the primary key. Metrics are aggregated through the following chain of tables:

      chukwaMetrics -> chukwaMetricsMonthly -> chukwaMetricsYearly
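
As a rough illustration of the proposed row key, the following minimal sketch builds `<hhddmmyyyy>-<primaryId>` from an epoch timestamp and a primary key. The issue does not specify the hash function, the length of the hash ID, or the time zone; the MD5 prefix of 8 hex characters, the UTC bucketing, and the names `RowKeySketch`, `primaryId`, and `rowKey` are all assumptions made for this example and are not part of Chukwa.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class RowKeySketch {
    // Hour bucket in the hhddmmyyyy order from the proposal.
    // Using UTC is an assumption; the issue does not name a time zone.
    private static final DateTimeFormatter HOUR_BUCKET =
        DateTimeFormatter.ofPattern("HHddMMyyyy").withZone(ZoneOffset.UTC);

    // Hypothetical hash ID: the first 4 bytes of an MD5 digest, hex encoded.
    // The issue only says "hash id of the primary key".
    static String primaryId(String primaryKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(primaryKey.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 4; i++) {
                hex.append(String.format("%02x", digest[i] & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Row key: <hhddmmyyyy>-<primaryId>. The exact timestamp is not part of
    // the key; per the proposal it is stored with each cell instead.
    static String rowKey(long epochMillis, String primaryKey) {
        return HOUR_BUCKET.format(Instant.ofEpochMilli(epochMillis))
            + "-" + primaryId(primaryKey);
    }

    public static void main(String[] args) {
        System.out.println(rowKey(System.currentTimeMillis(), "host1.example.com/cpu"));
    }
}
```

Under these assumptions, two samples from the same source that fall within the same hour produce the same row key, which matches the description of a full hour of metrics sharing one row.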

      People

        Assignee: Unassigned
        Reporter: Eric Yang (eyang)