  1. Apache IoTDB
  2. IOTDB-1809

Squeeze MeasurementSchema

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Core/Engine
    • Labels: None

    Description

      Each timeseries is associated with a MeasurementSchema, which contains the `measurementId`, data type, encoding, compression type, and properties. However, because there are only a few data types, encodings, and compression types, and the properties are mostly null, MeasurementSchemas are highly redundant.

      More specifically, we currently have 7 data types, 9 encodings, and 8 compression types, so there are at most 7*9*8=504 distinct MeasurementSchemas (ignoring `measurementId` and properties). However, each timeseries creates its own MeasurementSchema, so with 1M timeseries at most about 0.05% of the MeasurementSchema instances are distinct.
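      The arithmetic above can be checked with a short sketch. The class and method names below are hypothetical and are not part of IoTDB; the inputs (7, 9, 8, and 1M timeseries) are the figures quoted in this description.

```java
public class SchemaRedundancy {
    /** Upper bound on distinct schemas when measurementId and properties are ignored. */
    static long maxDistinctSchemas(int dataTypes, int encodings, int compressions) {
        return (long) dataTypes * encodings * compressions;
    }

    /** Largest possible share of distinct instances, as a percentage. */
    static double distinctPercent(long distinct, long totalInstances) {
        return 100.0 * distinct / totalInstances;
    }

    public static void main(String[] args) {
        long distinct = maxDistinctSchemas(7, 9, 8);
        long timeseries = 1_000_000L;
        System.out.println("distinct schemas (upper bound): " + distinct);
        System.out.printf("distinct share: %.4f%%%n", distinctPercent(distinct, timeseries));
    }
}
```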

      If we squeeze `measurementId` out of MeasurementSchema, different timeseries can share the same MeasurementSchema instance, which greatly reduces the number of instances. In the example above, about 1M MeasurementSchema instances would be eliminated. Assuming 20 bytes per instance, that saves about 20MB of memory, and the saving grows almost linearly with the number of timeseries.
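      One possible way to share schemas is to intern them in a pool keyed by their own fields. The following is only a minimal sketch under stated assumptions, not the actual IoTDB implementation: `SharedSchema`, `Measurement`, `intern`, and the small stand-in enums are hypothetical names, the enums list only a few illustrative values, and properties are assumed to be null and are therefore left out.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class SharedSchemaSketch {
    // Stand-in enums for illustration only; they do not list every real value.
    enum DataType { INT32, INT64, FLOAT, DOUBLE }
    enum Encoding { PLAIN, RLE }
    enum Compression { UNCOMPRESSED, SNAPPY }

    /** Immutable schema without measurementId, so one instance can serve many timeseries. */
    static final class SharedSchema {
        final DataType type;
        final Encoding encoding;
        final Compression compression;

        private SharedSchema(DataType type, Encoding encoding, Compression compression) {
            this.type = type;
            this.encoding = encoding;
            this.compression = compression;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof SharedSchema)) return false;
            SharedSchema s = (SharedSchema) o;
            return type == s.type && encoding == s.encoding && compression == s.compression;
        }

        @Override
        public int hashCode() {
            return Objects.hash(type, encoding, compression);
        }
    }

    /** Per-timeseries entry: keeps its own measurementId plus a reference to a shared schema. */
    static final class Measurement {
        final String measurementId;
        final SharedSchema schema;

        Measurement(String measurementId, SharedSchema schema) {
            this.measurementId = measurementId;
            this.schema = schema;
        }
    }

    // Pool of canonical instances; its size is bounded by the number of combinations.
    private static final Map<SharedSchema, SharedSchema> POOL = new ConcurrentHashMap<>();

    /** Returns the canonical instance for the given combination, creating it on first use. */
    static SharedSchema intern(DataType type, Encoding encoding, Compression compression) {
        SharedSchema candidate = new SharedSchema(type, encoding, compression);
        SharedSchema existing = POOL.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    static int poolSize() {
        return POOL.size();
    }

    public static void main(String[] args) {
        Measurement[] series = new Measurement[1000];
        for (int i = 0; i < series.length; i++) {
            // Every timeseries uses the same combination, so all share one schema instance.
            series[i] = new Measurement("s" + i,
                    intern(DataType.INT32, Encoding.RLE, Compression.SNAPPY));
        }
        System.out.println("timeseries: " + series.length + ", pooled schemas: " + poolSize());
    }
}
```

      Note that in this sketch each timeseries still holds a reference to its shared schema, so the net saving per timeseries is the size of a schema instance minus the size of that reference.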

          People

            Assignee: Unassigned
            Reporter: Tian Jiang (jt2594838)