James Server / JAMES-2852

Optimizing CassandraBlobStore deleteBucket

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Blob, cassandra
    • Labels: None

    Description

      Currently, CassandraBlobStore needs to iterate over all the blobs of a bucket in order to delete that bucket.

      These were our design considerations:

      We wanted to avoid the "wide row" issue (with many blobs stored in the same bucket, the maximum size of a cell would have been exceeded) and to optimize data distribution across the cluster. For these reasons, we had to choose a primary key with a finer granularity than just the bucket: we chose to rely on both the bucket and the object identifier. This makes deleting a bucket a slow operation, as all blobs not in the default bucket need to be iterated over.
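
      The cost described above can be illustrated with a minimal in-memory sketch. It is not the James implementation: the class name, the map standing in for the Cassandra table, and the "vault-..." bucket names are all hypothetical. It only shows that, when rows are keyed by (bucket, blob id) and nothing is indexed by bucket alone, deleting one bucket means examining every stored key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: a map keyed by (bucket, blobId) stands in for a table
// whose primary key is finer-grained than the bucket.
public class FullScanBucketDeletion {
    private final Map<Map.Entry<String, String>, byte[]> blobs = new HashMap<>();

    public void save(String bucket, String blobId, byte[] data) {
        blobs.put(Map.entry(bucket, blobId), data);
    }

    // Deletes every blob of the given bucket and returns how many keys
    // had to be examined to find them (always the whole table here).
    public int deleteBucket(String bucket) {
        int examined = 0;
        List<Map.Entry<String, String>> toDelete = new ArrayList<>();
        for (Map.Entry<String, String> key : blobs.keySet()) {
            examined++;
            if (key.getKey().equals(bucket)) {
                toDelete.add(key);
            }
        }
        toDelete.forEach(blobs::remove);
        return examined;
    }

    public int size() {
        return blobs.size();
    }

    public static void main(String[] args) {
        FullScanBucketDeletion store = new FullScanBucketDeletion();
        store.save("vault-1", "a", new byte[] {1});
        store.save("vault-2", "b", new byte[] {2});
        store.save("vault-2", "c", new byte[] {3});
        // Only one blob belongs to vault-1, yet all three keys are examined.
        System.out.println("examined=" + store.deleteBucket("vault-1"));
        System.out.println("remaining=" + store.size());
    }
}
```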

      The only usage so far is the vault, which currently relies on 13 buckets, hence the overhead introduced is reasonable.

      However, this cost will increase as we expand our usage of buckets.

      Later on, we could introduce a time series to easily retrieve the blobs stored in a bucket and avoid iterating over unrelated blobs.
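
      One possible reading of the time series idea is sketched below. The issue does not specify the design, so everything here is an assumption: the class name, the daily slice size, and the in-memory maps standing in for tables. The index is keyed by (bucket, time slice), which keeps each index partition bounded while letting deleteBucket read only the slices of the targeted bucket.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: blobs stay keyed by (bucket, blobId), and a secondary
// index lists blob ids per bucket and per day slice.
public class TimeSeriesBucketIndex {
    private final Map<Map.Entry<String, String>, byte[]> blobs = new HashMap<>();
    private final Map<String, Map<Instant, Set<String>>> index = new HashMap<>();

    public void save(String bucket, String blobId, byte[] data, Instant now) {
        blobs.put(Map.entry(bucket, blobId), data);
        // Assumed slice size: one day. A real design would tune this.
        Instant slice = now.truncatedTo(ChronoUnit.DAYS);
        index.computeIfAbsent(bucket, b -> new TreeMap<>())
             .computeIfAbsent(slice, s -> new HashSet<>())
             .add(blobId);
    }

    // Deletes the bucket by walking only its own slices and returns the
    // number of blob ids visited; other buckets are never read.
    public int deleteBucket(String bucket) {
        Map<Instant, Set<String>> slices = index.remove(bucket);
        if (slices == null) {
            return 0;
        }
        int visited = 0;
        for (Set<String> blobIds : slices.values()) {
            for (String blobId : blobIds) {
                visited++;
                blobs.remove(Map.entry(bucket, blobId));
            }
        }
        return visited;
    }

    public int size() {
        return blobs.size();
    }
}
```

      The trade-off in this sketch is an extra index write on every save, in exchange for a deletion whose cost depends on the size of the deleted bucket rather than on the total number of blobs.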

          People

            Assignee: Unassigned
            Reporter: Benoit Tellier (btellier)
