  1. James Server
  2. JAMES-3430

Reduce Cassandra overall load by restructuring messageV2 table


Details

    • Improvement
    • Status: Closed
    • Major
    • Resolution: Fixed
    • master, 3.6.0
    • master
    • cassandra, mailbox
    • None

    Description

      Analysis

      Today the messagev2 metadata table is the largest table of our schema (11GB on a 6 million message installation).

      Reads on it are abnormally slow, and this is the query we spend the most time executing while serving IMAP:

      (see attached screenshot)

      That is by far our most expensive query (per row), even though no LWT is involved!

      Looking at table stats:

      Number of partitions (estimate): 6100310
      SSTable Compression Ratio: 0.43905338269472705
      Space used (total): 2037635717
      

      Running nodetool status, we see that the total space occupied in Cassandra is `4.97 GiB`, so the messagev2 table occupies about 40% of total storage space, which is a lot compared to the other message metadata tables (~180MB, so 10x less).
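The storage share quoted above can be re-derived from the two figures in the issue. This is a minimal sketch, assuming that "Space used (total)" is expressed in bytes and that the nodetool status figure covers the same node; the function name is illustrative only.

```python
# Figures quoted in the issue.
TABLE_BYTES = 2_037_635_717   # "Space used (total)" for the messagev2 table
TOTAL_GIB = 4.97              # total occupied space reported by nodetool status

GIB = 1024 ** 3               # bytes in one GiB


def table_share(table_bytes: int, total_gib: float) -> float:
    """Return the fraction of total storage occupied by one table."""
    return (table_bytes / GIB) / total_gib


share = table_share(TABLE_BYTES, TOTAL_GIB)
print(f"messagev2: {TABLE_BYTES / GIB:.2f} GiB, {share:.0%} of total storage")
```

Under these assumptions the share comes out slightly below the 40% quoted, which is consistent with the rounded figure used in the analysis.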

      Doing a tablehistograms analysis:

      Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                                    (micros)          (micros)           (bytes)                  
      50%             1.00              0.00            315.85               770                12
      75%             1.00              0.00            545.79               770                12
      95%             1.00              0.00           1131.75              1109                17
      98%             1.00              0.00           1629.72              1331                17
      99%             1.00              0.00           1955.67              1597                20
      Min             0.00              0.00             88.15                30                 0
      Max             1.00              0.00           3379.39           1131752               770
      

      We see that the cell count is high, and that the partition size in bytes is high (other message metadata tables are between 125 and 250 bytes, so 6 times less).
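As a cross-check, the table stats and the histogram can be compared with a little arithmetic. The sketch below assumes that the compression ratio is compressed size divided by uncompressed size and that "Space used (total)" counts compressed bytes for this table only; it is a rough estimate, not a measurement.

```python
# Figures quoted in the table stats above.
PARTITIONS = 6_100_310
COMPRESSION_RATIO = 0.43905338269472705
TABLE_BYTES = 2_037_635_717


def bytes_per_partition(total_bytes: int, partitions: int, ratio: float):
    """Return (on-disk, estimated uncompressed) average bytes per partition."""
    on_disk = total_bytes / partitions
    # Assumption: ratio = compressed / uncompressed, so divide to undo it.
    uncompressed = on_disk / ratio
    return on_disk, uncompressed


on_disk, uncompressed = bytes_per_partition(TABLE_BYTES, PARTITIONS, COMPRESSION_RATIO)
print(f"on disk: {on_disk:.0f} B/partition, uncompressed: {uncompressed:.0f} B/partition")
```

The uncompressed estimate lands close to the 770-byte median partition size in the histogram, which suggests the two sets of figures describe the same data.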

      Knowing our data model, each message has a set of properties, each composed of a namespace, a name and a value. These are stored as a list of UDTs, which is really space inefficient. The compression ratio (see above) does not compensate for this.

      These properties are well defined, set by the StoreMailboxManager and only include:

      • mediaType
      • subType
      • ContentId
      • ContentDescription
      • ContentTransferEncoding
      • ContentLocation
      • ContentDispositionType
      • ContentDispositionParameters
      • ContentTypeParameters
      • ContentMD5
      • ContentLanguage
      • Charset
      • MimepartDelimites
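To illustrate why a list of (namespace, name, value) triples costs more than fixed columns, here is a rough size model. It is only a sketch: the per-field length prefix and per-element cell path sizes are assumptions about the serialization, the namespace strings are hypothetical placeholders, and real on-disk sizes also depend on timestamps, flags and compression.

```python
from typing import List, NamedTuple


class Property(NamedTuple):
    namespace: str
    name: str
    value: str


LENGTH_PREFIX = 4  # assumed: each serialized UDT field carries a 4-byte length
CELL_PATH = 16     # assumed: each list element is keyed by a 16-byte timeuuid


def udt_list_bytes(properties: List[Property]) -> int:
    """Estimate payload bytes when every property is stored as a UDT list element."""
    total = 0
    for prop in properties:
        total += CELL_PATH
        total += sum(LENGTH_PREFIX + len(field.encode("utf-8")) for field in prop)
    return total


def column_bytes(properties: List[Property]) -> int:
    """Estimate payload bytes when only the values are stored, one per fixed column."""
    return sum(len(prop.value.encode("utf-8")) for prop in properties)


# Hypothetical sample: the namespaces below are placeholders, not the real ones.
sample = [
    Property("urn:example:content-type", "mediaType", "text"),
    Property("urn:example:content-type", "subType", "plain"),
    Property("urn:example:content-type-params", "charset", "utf-8"),
]

print("UDT list:", udt_list_bytes(sample), "bytes")
print("columns: ", column_bytes(sample), "bytes")
```

In this model the namespace and name strings are repeated for every message, while with dedicated columns they live once in the schema and only the values are stored per row.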

      Why?

      Here is the conclusion:

      • Slow reads on the messagev2 table have a large impact on IMAP performance (and on Cassandra performance)
      • This slowness is due to the corresponding space used on disk (more data = slower reads)
      • This extra space is due to an inefficient storage format of the property fields
      • By restructuring the way we store these properties we can reclaim disk space and thus improve read speed

      How?

      • Restructure CassandraMessageDAO (and the underlying table) to store each known property in its own column instead of a UDT list. Unknown properties shall be rejected.

      This avoids the use of a collection on a critical table and thus should significantly speed up related operations.

      A data migration (messagev3) will be needed.
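The restructuring described above can be sketched as a mapping from property names to dedicated columns, with unknown properties rejected. This is an illustration only and not the CassandraMessageDAO implementation: the column names, the error type and the function are hypothetical, and only a subset of the listed properties is included for brevity.

```python
from typing import Dict

# Hypothetical mapping from known property names to dedicated column names.
KNOWN_COLUMNS: Dict[str, str] = {
    "mediaType": "media_type",
    "subType": "sub_type",
    "ContentId": "content_id",
    "ContentLanguage": "content_language",
    "Charset": "charset",
}


class UnknownPropertyError(ValueError):
    """Raised when a property has no dedicated column."""


def to_columns(properties: Dict[str, str]) -> Dict[str, str]:
    """Map property names to column names, rejecting anything unknown."""
    unknown = sorted(set(properties) - set(KNOWN_COLUMNS))
    if unknown:
        raise UnknownPropertyError(f"unknown properties: {', '.join(unknown)}")
    return {KNOWN_COLUMNS[name]: value for name, value in properties.items()}


row = to_columns({"mediaType": "text", "subType": "plain", "Charset": "utf-8"})
print(row)
```

Rejecting unknown names up front keeps the table schema fixed, which is what allows the collection to be dropped in favour of plain columns.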

      Definition of done

      • Measure, within tests, the space occupied per message for the old and the new table structure
      • IMAP performance tests


          People

            Unassigned Unassigned
            btellier Benoit Tellier