Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Hi, after reading the source code of `SeriesReaderFactory.createUnSeqMergeReader`, I think the implementation may use too much memory, and it also opens an unnecessary read stream.
In this function, each call to `chunkLoader.getChunk(chunkMetaData)` loads a complete Chunk, including its raw data, into memory.
I think (though I am not sure whether it is a good idea) we could keep only the ChunkMetaData in memory (in a concise format that stores just the useful fields) and read at most one page per Chunk: if a Chunk's start time is later than the other chunks', we do not need to read its page yet. See the sketch below.
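A minimal sketch of that idea, assuming hypothetical names (`ConciseChunkMeta`, `LazyUnSeqMergeReader`, `loadFirstPage`) that are not in the current code base: a priority queue orders chunks by start time, and only the chunk at the head of the queue ever has a page loaded.

```java
import java.util.PriorityQueue;

/** Hypothetical concise metadata: just the fields needed to schedule reads. */
final class ConciseChunkMeta {
    final long startTime; // minimum timestamp in the chunk
    final long offset;    // byte offset of the chunk in the TsFile

    ConciseChunkMeta(long startTime, long offset) {
        this.startTime = startTime;
        this.offset = offset;
    }
}

final class LazyUnSeqMergeReader {
    // Chunks ordered by start time; none of their pages are in memory yet.
    private final PriorityQueue<ConciseChunkMeta> pending =
            new PriorityQueue<>((a, b) -> Long.compare(a.startTime, b.startTime));

    void add(ConciseChunkMeta meta) {
        pending.add(meta);
    }

    /**
     * Loads at most one page: the first page of the chunk with the smallest
     * start time. Chunks with later start times stay untouched on the queue.
     */
    byte[] nextPage() {
        ConciseChunkMeta head = pending.poll();
        return head == null ? null : loadFirstPage(head.offset);
    }

    private byte[] loadFirstPage(long offset) {
        // Placeholder: a real implementation would seek to `offset` in the
        // TsFile and decode a single page (~64 KB), not the whole chunk.
        return new byte[64 * 1024];
    }
}
```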
Suppose a page is 64 KB and, in the worst case, each Chunk contains only one page. Then 1 TB of overflow data gives 1 TB / 64 KB = 16 million ChunkMetadata entries. If we keep them in memory concisely (it seems only the start time and the file offset are needed; say three 8-byte longs per entry to be generous), the memory cost is about 16 M * 24 B = 384 MB. (By contrast, the first pages of all chunks add up to 64 KB * 16 M = 1 TB, i.e. the entire dataset, so we cannot keep all of them in memory.)
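For concreteness, here is the arithmetic above as a self-contained check (the three-longs-per-entry figure is an assumption):

```java
// Back-of-the-envelope check of the figures above. Assumes 64 KB pages,
// one page per chunk in the worst case, and three 8-byte fields per
// in-memory metadata entry.
public class OverflowMemoryEstimate {
    public static void main(String[] args) {
        long totalDataBytes = 1L << 40;               // 1 TB of overflow data
        long pageBytes = 64L << 10;                   // 64 KB per page
        long chunkCount = totalDataBytes / pageBytes; // 2^24 ~= 16.8M chunks
        long metaBytes = chunkCount * (8 + 8 + 8);    // concise metadata
        long firstPageBytes = chunkCount * pageBytes; // one page per chunk

        System.out.printf("chunks: %d%n", chunkCount);                   // 16777216
        System.out.printf("metadata: %d MB%n", metaBytes >> 20);         // 384
        System.out.printf("first pages: %d GB%n", firstPageBytes >> 30); // 1024
    }
}
```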
Does anyone have a good idea? (kakayu)
By the way, in the `EngineChunkReader` class, the file input stream (`TsFileSequenceReader`) is never used, so we do not need to keep it as a field.