Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
1.20, 1.19.1
None
None
Reproduced on Windows 2012 R2 and Ubuntu 18.04.
Java: jdk1.8.0_151
Description
I have an application that extracts text from multiple files on a file share. The application keeps running out of memory, even with about 26 GB dedicated to the heap.
The heap dumps show an "fDTDDecl" buffer that creates very large char arrays and never releases that memory. In the picture you can see a heap dump with 4 SAXParsers holding onto a large chunk of memory. The fourth one is expanded to show that all of it is held by the "fDTDDecl" field. This dump is from a scaled-down execution (not a 26 GB heap).
It looks like that DTD field should never be that large, so I'm wondering whether this is a bug in Xerces instead. I can easily reproduce the issue by attempting to extract text from large .pst files.
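
For anyone who wants to look at parser memory outside the application, a minimal harness might look like the sketch below. It uses only the JDK's built-in SAX parser and repeatedly parses a small synthetic document with an internal DTD subset through one reused SAXParser, so that a heap dump can be taken while it runs. The class name, the helper names, and the synthetic DTD content are all hypothetical. It is also only an assumption that reusing a parser on DOCTYPE-bearing input exercises the same "fDTDDecl" buffer; the heap dumps above do not confirm that, and this sketch does not involve .pst files at all.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical probe class; not part of the application or of any library.
class ReusedParserProbe {

    // Builds a small XML document whose internal DTD subset declares
    // entityCount entities. The content is synthetic, for illustration only.
    static String docWithDtd(int entityCount) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\"?>\n<!DOCTYPE root [\n");
        for (int i = 0; i < entityCount; i++) {
            sb.append("<!ENTITY e").append(i)
              .append(" \"value").append(i).append("\">\n");
        }
        sb.append("]>\n<root/>");
        return sb.toString();
    }

    // Parses the same document 'iterations' times with ONE SAXParser instance,
    // resetting it between runs. Returns the number of successful parses.
    static int parseRepeatedly(int iterations, int entityCount) {
        byte[] doc = docWithDtd(entityCount).getBytes(StandardCharsets.UTF_8);
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            int parsed = 0;
            for (int i = 0; i < iterations; i++) {
                parser.parse(new ByteArrayInputStream(doc), new DefaultHandler());
                parser.reset(); // return the parser to its initial configuration
                parsed++;
            }
            return parsed;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("probe parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Iteration and entity counts are arbitrary; raise them to keep the
        // process alive long enough to capture a heap dump.
        int parsed = parseRepeatedly(1000, 200);
        System.out.println("Parsed " + parsed + " documents with one reused SAXParser");
    }
}
```

Comparing a heap dump from this run against one where a fresh SAXParser is created per document would be one way to see whether the retained memory follows the parser instance.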