I am running a process that needs to crawl a tree structure containing ~10K images, copy the images to the local disk, process these images, and copy them back to HDFS.
My problem is the following: after about 10 hours of processing, the processes crash with a std::bad_alloc exception (I use Hadoop Pipes to run existing software). When I run fuse_dfs in debug mode, I get an OutOfMemoryError saying that there is no more room in the heap.
While the process is running, top and ps show fuse_dfs using a steadily increasing amount of memory until some limit is reached. From that point on, the memory used oscillates. I suppose this is because the system starts relying on virtual memory.
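A minimal sketch of how this growth could be logged over time instead of watching top by hand. It assumes a Linux host where /proc/<pid>/status is readable. The function names, the PID, and the sampling interval are placeholders and not part of fuse_dfs or Hadoop. The VmSwap field may be missing on older kernels, in which case it is reported as None.

```python
import time


def parse_status_kb(status_text, field):
    """Return the value in kB of a field such as 'VmRSS' from the text of
    /proc/<pid>/status, or None if the field is not present."""
    for line in status_text.splitlines():
        if line.startswith(field + ":"):
            # Expected shape: "VmRSS:     123456 kB"
            return int(line.split()[1])
    return None


def sample(pid):
    """Read resident and swapped-out memory (in kB) for one process."""
    with open(f"/proc/{pid}/status") as f:
        text = f.read()
    return parse_status_kb(text, "VmRSS"), parse_status_kb(text, "VmSwap")


def monitor(pid, interval_s=60.0, max_samples=None):
    """Print one tab-separated line per sample: timestamp, VmRSS, VmSwap.

    Stops when the process disappears or after max_samples samples
    (max_samples=None means run until the process exits)."""
    taken = 0
    while max_samples is None or taken < max_samples:
        try:
            rss_kb, swap_kb = sample(pid)
        except FileNotFoundError:
            print(f"process {pid} is gone")
            break
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp}\t{rss_kb}\t{swap_kb}", flush=True)
        taken += 1
        time.sleep(interval_s)
```

Calling monitor(12345, interval_s=60) with the PID of the fuse_dfs process (12345 is only a placeholder) and redirecting the output to a file would give a timeline that can be compared against the job's progress. A VmSwap value that starts rising when VmRSS flattens out would be consistent with the virtual-memory guess above.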
This leads me to conclude that there is a memory leak in fuse_dfs, since the only other programs running are Hadoop and the existing software, both of which have been thoroughly tested in the past.
My knowledge of how to track down memory leaks is rather limited, so I would appreciate some instructions on how to get more insight into this issue.