Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- None
- None
- StorageEngine-Backlog
Description
Currently, the filename format of each tsfile is {file_created_time}-{version_id}-{inner_space_merge_num}-{cross_space_merge_num}.tsfile. Within one time partition, the order of tsfiles is guaranteed by version_id; for example, 1651825804093-2-0-0.tsfile comes after 1651825804092-1-0-0.tsfile.
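To make the current naming scheme concrete, here is a minimal Python sketch that parses a tsfile name into its four fields and orders files by version_id. The names `TsFileName`, `parse_tsfile_name` and `sort_by_version_id` are hypothetical and are not taken from the IoTDB code base.

```python
from typing import NamedTuple


class TsFileName(NamedTuple):
    """Fields of a tsfile name (hypothetical helper type, for illustration)."""
    file_created_time: int
    version_id: int
    inner_space_merge_num: int
    cross_space_merge_num: int


def parse_tsfile_name(name: str) -> TsFileName:
    """Split '<time>-<version>-<inner>-<cross>.tsfile' into integer fields."""
    suffix = ".tsfile"
    if not name.endswith(suffix):
        raise ValueError(f"not a tsfile name: {name!r}")
    parts = name[: -len(suffix)].split("-")
    if len(parts) != 4:
        raise ValueError(f"expected 4 dash-separated fields: {name!r}")
    return TsFileName(*(int(p) for p in parts))


def sort_by_version_id(names):
    """Current semantics: order within a time partition follows version_id."""
    return sorted(names, key=lambda n: parse_tsfile_name(n).version_id)


files = ["1651825804093-2-0-0.tsfile", "1651825804092-1-0-0.tsfile"]
print(sort_by_version_id(files))
```

With the two example files above, the file with version_id 1 sorts before the file with version_id 2.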
https://issues.apache.org/jira/projects/IOTDB/issues/IOTDB-3100?filter=myopenissues#
The problem is that filename conflicts may occur in the cross-space compaction and load scenarios. In cross-space compaction, assume 3-2-0-0.tsfile, 4-3-0-0.tsfile and 5-5-0-0.tsfile exist in the sequence folder. If 4-3-0-0.tsfile is selected, compaction cannot generate 3 or more target files, because only 2 version_ids (3 and 4) are available between 2 and 5, so some large target files may be generated. In load, assume 3-2-0-0.tsfile and 3-3-0-0.tsfile exist in the sequence folder. No more sequence files can be loaded between 3-2-0-0.tsfile and 3-3-0-0.tsfile, because no version_id is left between 2 and 3; such files can only be loaded into the unsequence folder.
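The arithmetic behind both scenarios can be sketched as follows. This is only an illustration: `version_id_of` and `free_version_ids` are hypothetical helpers, and the sketch assumes that the version_id of a file selected for compaction becomes reusable by its target files.

```python
def version_id_of(name: str) -> int:
    """Return the second dash-separated field of a tsfile name."""
    suffix = ".tsfile"
    if not name.endswith(suffix):
        raise ValueError(f"not a tsfile name: {name!r}")
    return int(name[: -len(suffix)].split("-")[1])


def free_version_ids(prev_name: str, next_name: str) -> list:
    """version_ids strictly between two files that stay in place."""
    return list(range(version_id_of(prev_name) + 1, version_id_of(next_name)))


# Compaction: 4-3-0-0.tsfile is selected, its neighbours stay in place,
# so target files must take version_ids between 2 and 5.
compaction_slots = free_version_ids("3-2-0-0.tsfile", "5-5-0-0.tsfile")
print(len(compaction_slots))  # number of target files that can be named

# Load: nothing fits between adjacent version_ids 2 and 3.
load_slots = free_version_ids("3-2-0-0.tsfile", "3-3-0-0.tsfile")
print(len(load_slots))
```

Under these assumptions the compaction case leaves two usable version_ids and the load case leaves none, which matches the limits described above.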
In response to these problems, the format won't be changed, but the meaning of file_created_time and version_id will be. Instead of version_id, file_created_time will guarantee the order of tsfiles; if two tsfiles have the same file_created_time, version_id is used to order them. This semantic change may affect the query, compaction and load modules.
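A minimal sketch of the proposed ordering, assuming it reduces to a two-level sort key of (file_created_time, version_id). The function names and the sample filenames are hypothetical and chosen only to show the tie-break.

```python
def proposed_sort_key(name: str) -> tuple:
    """Order by file_created_time first, then by version_id on ties."""
    suffix = ".tsfile"
    if not name.endswith(suffix):
        raise ValueError(f"not a tsfile name: {name!r}")
    fields = name[: -len(suffix)].split("-")
    file_created_time, version_id = int(fields[0]), int(fields[1])
    return (file_created_time, version_id)


def sort_tsfiles(names):
    """Sort tsfile names under the proposed semantics."""
    return sorted(names, key=proposed_sort_key)


# Hypothetical names: two files share file_created_time 3,
# so version_id decides their relative order.
files = ["5-1-0-0.tsfile", "3-7-0-0.tsfile", "3-2-0-0.tsfile"]
print(sort_tsfiles(files))
```

Note that in this sketch 5-1-0-0.tsfile sorts last even though it has the smallest version_id, since version_id only matters when file_created_time is equal.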
Attachments
Issue Links
- is duplicated by IOTDB-3086 Change TsFile name generation to support load and split tsfiles (Closed)
- links to