Details
-
Sub-task
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
None
-
None
Description
After running integration test ITestS3AFileSystemContract, I found the following items are not cleaned up in DynamoDB:
parent=/mliu-s3guard/user/mliu/s3afilesystemcontract/testRenameDirectoryAsExisting/dir, child=subdir parent=/mliu-s3guard/user/mliu/s3afilesystemcontract/testRenameDirectoryAsExistingNew/newdir/subdir, child=file2
At first I thought it’s similar to HADOOP-14226 or HADOOP-14227, and we need to be careful when cleaning up test data.
Then I found it’s a bug in the code of integrating S3Guard with S3AFileSystem: for rename we miss sub-directory items to put (dest) and delete (src). The reason is that in S3A, we delete those fake directory objects if they are not necessary, e.g. non-empty. So when we list the objects to rename, the object summaries will only return file objects. This has two consequences after rename:
- there will be left items for src path in metadata store - left-overs will confuse get(Path) which should return null
- we are not persisting the whole subtree for dest path to metadata store - this will break the DynamoDBMetadataStore invariant: if a path exists, all its ancestors will also exist in the table.
UPDATE: modified test case ITestS3AFileSystemContract:: testRenameDirectoryAsExistingDirectory() will fail w/o this patch; it passes w/ this patch. If the test case makes sense, the proposal follows.
Existing tests are not complaining about this though. If this is a real bug, let’s address it here.
Attachments
Attachments
Issue Links
- is depended upon by
-
HADOOP-13998 Merge initial S3guard release into trunk
- Resolved
- relates to
-
HADOOP-14255 S3A to delete unnecessary fake directory objects in mkdirs()
- Resolved