Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 0.6.0
Description
We encountered a problem in our Hudi production environment that is very similar to HUDI-945.
Software environment: Spark 2.4.5, Hudi 0.6
Scenario: consuming Kafka data and writing it to Hudi using Spark Streaming (not Structured Streaming).
Problem: as time goes by, the number of /tmp/***** file handles held by the executor process keeps increasing:
"
/tmp/10ded0f7-1bcc-4316-91e9-9b4d0507e1e0
/tmp/49251680-0efd-4cc4-a55e-1af2038d3900
/tmp/cc7dd284-3444-4c17-a5c8-84b3090c17f9
"
Reason analysis: HoodieMergeHandle uses an ExternalSpillableMap, which in turn uses a DiskBasedMap to spill overflow data to disk. However, the spill file's stream is closed and the file deleted only by a shutdown hook when the JVM exits; when the program calls clear(), the stream is not closed and the file is not deleted. As a result, the process holds more and more open file handles over time, eventually leading to errors. This is very similar to HUDI-945.
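The leak pattern can be summarized in a minimal, self-contained sketch (hypothetical class and method names, not the actual DiskBasedMap source; the fix direction assumes the eager-cleanup approach taken for HUDI-945):
{code:java}
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.UUID;

// Hypothetical sketch of the leak described above: the spill file is
// registered for deletion only via a JVM shutdown hook, so clearing the
// map leaves both the open stream and the /tmp file behind.
public class SpillFileSketch {
  private final File spillFile;
  private final BufferedOutputStream writeStream;

  public SpillFileSketch() throws IOException {
    // Like DiskBasedMap, create a randomly named spill file under /tmp.
    spillFile = new File("/tmp", UUID.randomUUID().toString());
    writeStream = new BufferedOutputStream(new FileOutputStream(spillFile));
    // Cleanup runs only at JVM exit, never when the map is cleared.
    Runtime.getRuntime().addShutdownHook(new Thread(this::closeAndDelete));
  }

  // Buggy clear(): resets in-memory state but keeps the file handle open,
  // so a long-running executor accumulates handles over time.
  public void clearLeaky() {
    // in-memory bookkeeping reset only; writeStream and spillFile survive
  }

  // Fix direction (as in the HUDI-945 cleanup): eagerly close the stream
  // and delete the spill file when the map is cleared or closed.
  public void closeAndDelete() {
    try {
      writeStream.close();
    } catch (IOException ignored) {
      // best-effort close
    }
    spillFile.delete();
  }
}
{code}
With eager cleanup, each clear/close releases its file handle immediately instead of waiting for JVM exit, so the handle count stays bounded.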
Issue Links
- Blocked: HUDI-945 Cleanup spillable map files eagerly as part of close (Closed)