  1. Jackrabbit Oak
  2. OAK-4810

FileDataStore: support SHA-2



    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: blob
    • Labels:


      The FileDataStore currently uses SHA-1, but that algorithm is deprecated. We should support other algorithms as well (mainly SHA-256).

      Migration should be painless (no long downtime). I think default for writing (if not configured explicitly) could still be SHA-1. But when reading, SHA-256 should also be supported (depending on the identifier). That way, the new Oak version for all repositories (in a cluster + shared datastore) can be installed "slowly".

      After all repositories are running with the new Oak version, the configuration for SHA-256 can be enabled. That way, SHA-256 is used for new binaries. Both SHA-1 and SHA-256 are supported for reading.
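A minimal sketch of how the read path could pick the algorithm from the identifier alone. It assumes identifiers are plain hex-encoded digests, so that SHA-1 yields 40 characters and SHA-256 yields 64; the class and method names are hypothetical and not part of Oak.

```java
final class DigestAlgorithmDetector {

    private DigestAlgorithmDetector() {
    }

    /**
     * Returns the JCA algorithm name implied by the length of a hex identifier.
     * Assumption: the identifier is the bare hex digest, with no prefix or suffix.
     */
    static String algorithmFor(String identifier) {
        if (!identifier.matches("[0-9a-fA-F]+")) {
            throw new IllegalArgumentException("Not a hex identifier: " + identifier);
        }
        switch (identifier.length()) {
            case 40:
                return "SHA-1";   // 160 bits = 40 hex characters
            case 64:
                return "SHA-256"; // 256 bits = 64 hex characters
            default:
                throw new IllegalArgumentException(
                        "Unsupported identifier length: " + identifier.length());
        }
    }
}
```

With a check like this, a node running the new version can read binaries written by either algorithm, regardless of which one is configured for writing.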

      One potential downside is that deduplication would suffer a bit: if a blob with the same content as an existing SHA-1 entry is added again, the digest-based match would fail because the new identifier is SHA-256. That can be mitigated by computing both digests if the need arises. The costs of doing so are some additional file operations and CPU, and a slower migration to SHA-256.
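The mitigation of computing two digests could look roughly like the sketch below, which feeds both `MessageDigest` instances from a single pass over the stream. The `DualDigest` class is hypothetical; how the store would then look up the SHA-1 identifier before writing under the SHA-256 one is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DualDigest {
    final String sha1Hex;
    final String sha256Hex;

    private DualDigest(String sha1Hex, String sha256Hex) {
        this.sha1Hex = sha1Hex;
        this.sha256Hex = sha256Hex;
    }

    /** Reads the stream once and updates both digests with every chunk. */
    static DualDigest of(InputStream in) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                sha1.update(buffer, 0, n);
                sha256.update(buffer, 0, n);
            }
            return new DualDigest(hex(sha1.digest()), hex(sha256.digest()));
        } catch (NoSuchAlgorithmException e) {
            // Both algorithms are required on every Java platform.
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lower-case hex encoding of a byte array. */
    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }
}
```

The extra CPU cost is the second hash over the same bytes; the extra file operation would be the existence check for the SHA-1 identifier.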

      Some other open questions:

      • While we are at it, it might make sense to additionally support SHA-3 and other algorithms (make it configurable). But then the length of the identifier alone might not be enough to tell which algorithm was used, so maybe add a prefix.
      • The number of subdirectory levels: should we keep it as is, or reduce it (for example, by one level)?
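Both open questions can be illustrated with a small sketch. The prefix format (`sha256-<hex>`), the prefix names, and the `PrefixedBlobId` class are assumptions made up for illustration; nothing here is decided. Identifiers without a prefix are treated as legacy SHA-1. The path helper assumes the layout uses two hex characters per directory level, with the number of levels as a parameter.

```java
final class PrefixedBlobId {

    private PrefixedBlobId() {
    }

    /**
     * Maps a hypothetical "<prefix>-<hex>" identifier to a JCA algorithm name.
     * Identifiers without a prefix are assumed to be legacy SHA-1.
     */
    static String algorithmOf(String id) {
        int dash = id.indexOf('-');
        if (dash < 0) {
            return "SHA-1";
        }
        String prefix = id.substring(0, dash);
        switch (prefix) {
            case "sha256":
                return "SHA-256";
            case "sha3_256":
                return "SHA3-256";
            default:
                throw new IllegalArgumentException("Unknown prefix: " + prefix);
        }
    }

    /**
     * Builds the relative path for a hex digest, using two hex characters
     * per directory level, followed by the full digest as the file name.
     */
    static String relativePath(String hex, int levels) {
        if (levels < 0 || hex.length() < 2 * levels) {
            throw new IllegalArgumentException("Digest too short for " + levels + " levels");
        }
        StringBuilder path = new StringBuilder();
        for (int i = 0; i < levels; i++) {
            path.append(hex, 2 * i, 2 * i + 2).append('/');
        }
        return path.append(hex).toString();
    }
}
```

For example, `relativePath("abcdef0123", 3)` gives `ab/cd/ef/abcdef0123`, while two levels give `ab/cd/abcdef0123`. Changing the level count for existing files would require either moving them or looking in both locations, which is part of what makes the question open.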





              • Assignee: Thomas Mueller (thomasm)


                • Created: