In HADOOP-15621 we implemented TTL for authoritative directory listings and added ExpirableMetadata. DDBPathMetadata extends PathMetadata, which extends ExpirableMetadata, so all metadata entries in DynamoDB can expire, but the implementation is not done yet.
To complete this feature the following should be done:
- Add new tests for metadata entry and tombstone expiry to ITestS3GuardTtl
- Implement metadata entry and tombstone expiry
I would like to start a debate on whether we need separate expiry times for entries and tombstones. My +1 is on not using separate settings, so there would be only one config name and value.
In HADOOP-13649 the metadata TTL was implemented in LocalMetadataStore, using an existing feature of Guava's cache implementation. Expiry is set with fs.s3a.s3guard.local.ttl.
- LocalMetadataStore's TTL and this TTL are different. That TTL uses the Guava cache's internal support for expiring its entries; this one is an S3AFileSystem-level solution in S3Guard, a layer above all metadata stores.
- This is also not the same as, and does not use, DynamoDB's TTL feature. We need different behavior than what DynamoDB promises: cleaning up once a day with a background job is not usable for this feature, although it can be used as a general cleanup solution separately and independently from S3Guard.
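To illustrate the kind of expire-after-write semantics the Guava cache gives LocalMetadataStore, here is a minimal plain-Java sketch (not the actual Hadoop or Guava code; the mutable `now` field is a hypothetical stand-in for the system clock so the behavior is deterministic):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal expire-after-write cache, illustrating the semantics that
// Guava's CacheBuilder provides inside LocalMetadataStore. All names
// here are illustrative, not from the Hadoop codebase.
class TtlCache<K, V> {
  private static class Entry<V> {
    final V value;
    final long writtenAt;
    Entry(V value, long writtenAt) {
      this.value = value;
      this.writtenAt = writtenAt;
    }
  }

  private final long ttlMillis;
  private final Map<K, Entry<V>> map = new HashMap<>();
  long now = 0; // stand-in for System.currentTimeMillis()

  TtlCache(long ttlMillis) {
    this.ttlMillis = ttlMillis;
  }

  void put(K key, V value) {
    map.put(key, new Entry<>(value, now));
  }

  /** Returns null if the entry is absent or older than the TTL. */
  V get(K key) {
    Entry<V> e = map.get(key);
    if (e == null || now - e.writtenAt >= ttlMillis) {
      map.remove(key);
      return null;
    }
    return e.value;
  }
}
```

The key point for the debate above: this expiry lives inside one store's cache, whereas the proposed TTL is enforced by S3Guard itself and therefore works the same way for every MetadataStore implementation.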
- Use the same TTL for entries and authoritative directory listings.
- Any entry can expire; once it has, the metadata store returns null for it.
- Add two new methods, pruneExpiredTtl() and pruneExpiredTtl(String keyPrefix), to the MetadataStore interface. These methods will delete all expired metadata from the metadata store.
- Use the last_updated field in the metadata store for both file metadata and authoritative directory listing expiry.
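The points above could be sketched roughly as follows. Only pruneExpiredTtl()/pruneExpiredTtl(String) and the last_updated field come from this proposal; the other names and signatures are hypothetical simplifications, not the real Hadoop API:

```java
// Hypothetical, simplified sketch of the proposed expiry support.
interface MetadataStore {
  /** Delete every metadata entry whose last_updated is older than the TTL. */
  void pruneExpiredTtl();

  /** Same, restricted to keys under the given prefix. */
  void pruneExpiredTtl(String keyPrefix);
}

// Simplified stand-in for ExpirableMetadata: a single TTL covers file
// entries, tombstones and authoritative directory listings alike.
class ExpirableMetadata {
  private long lastUpdated; // the last_updated field persisted in the store

  void setLastUpdated(long lastUpdated) {
    this.lastUpdated = lastUpdated;
  }

  long getLastUpdated() {
    return lastUpdated;
  }

  /** True once the entry is older than the TTL; the store then returns null. */
  boolean isExpired(long ttlMillis, long now) {
    return now - lastUpdated >= ttlMillis;
  }
}
```

With a single isExpired check driven by last_updated, the read path can decide to return null, and the two prune methods can share the same comparison when deleting expired rows.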