PDFBox / PDFBOX-4450

java.lang.OutOfMemoryError when validating pdf

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.13
    • Fix Version/s: 2.0.14, 3.0.0 PDFBox
    • Component/s: Preflight
    • Labels:
      None

      Description

      Preflight throws java.lang.OutOfMemoryError (GC overhead limit exceeded) when validating the attached PDF.

       

      Environment:

      Linux 64-bit (Arch Linux)

      Java 8

      java -version
      java version "1.8.0_131"
      Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
      Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

      JVM args used to test: 

      java -Xmx2048m -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider
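
To confirm that the heap limit and the color management property above are actually in effect in the JVM that runs the validation, a small check along these lines could be run with the same arguments. This is only a sketch; the class name HeapCheck is hypothetical and not part of PDFBox.

```java
public class HeapCheck {
    /** Returns the maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // With -Xmx2048m this should report a value close to 2048
        // (the exact figure depends on the JVM and garbage collector).
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
        // Prints null if -Dsun.java2d.cmm was not passed on the command line.
        System.out.println("CMM provider: " + System.getProperty("sun.java2d.cmm"));
    }
}
```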

       

      PDF that triggers the error:

      lean-from-the-trenches.pdf

       

      Console output
      
      Jan 30, 2019 10:25:58 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
      WARNING: Using fallback font ArialMT for base font Symbol
      Jan 30, 2019 10:25:58 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
      WARNING: Using fallback font ArialMT for base font ZapfDingbats
      Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
      at java.util.Arrays.copyOfRange(Arrays.java:3664)
      at java.lang.String.<init>(String.java:207)
      at java.lang.StringBuilder.toString(StringBuilder.java:407)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1587)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1587)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1587)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1587)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1587)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.getDictionaryString(COSDictionary.java:1559)
      at org.apache.pdfbox.cos.COSDictionary.toString(COSDictionary.java:1531)
      at org.apache.pdfbox.preflight.xobject.XObjFormValidator.checkGroup(XObjFormValidator.java:138)
      at org.apache.pdfbox.preflight.xobject.XObjFormValidator.validate(XObjFormValidator.java:73)
      at org.apache.pdfbox.preflight.process.reflect.GraphicObjectPageValidationProcess.validate(GraphicObjectPageValidationProcess.java:74)
      at org.apache.pdfbox.preflight.utils.ContextHelper.callValidation(ContextHelper.java:84)
      at org.apache.pdfbox.preflight.utils.ContextHelper.validateElement(ContextHelper.java:57)
      at org.apache.pdfbox.preflight.process.reflect.ResourcesValidationProcess.validateXObjects(ResourcesValidationProcess.java:224)
      at org.apache.pdfbox.preflight.process.reflect.ResourcesValidationProcess.validate(ResourcesValidationProcess.java:81)
      at org.apache.pdfbox.preflight.utils.ContextHelper.callValidation(ContextHelper.java:84)
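
The trace shows COSDictionary.toString() recursing through nested dictionaries while building a single string. One plausible reading is that the rendered text grows very quickly when nested objects are referenced more than once. The sketch below illustrates that effect only; Node is a hypothetical stand-in and is not the PDFBox implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class NestedToStringDemo {
    /** Hypothetical stand-in for a dictionary-like object; not a PDFBox class. */
    static final class Node {
        final List<Node> children = new ArrayList<>();

        /** Naive recursive rendering: every reference is expanded in full. */
        String render() {
            StringBuilder sb = new StringBuilder("{");
            for (Node child : children) {
                sb.append(child.render());
            }
            return sb.append('}').toString();
        }
    }

    /** Builds a chain in which each level references the next level twice. */
    static Node build(int depth) {
        Node node = new Node();
        if (depth > 0) {
            Node shared = build(depth - 1);
            node.children.add(shared);
            node.children.add(shared);
        }
        return node;
    }

    public static void main(String[] args) {
        // Only depth + 1 objects exist, but the rendered length roughly doubles per level.
        for (int depth = 0; depth <= 16; depth += 4) {
            System.out.println("depth " + depth + " -> "
                + build(depth).render().length() + " chars");
        }
    }
}
```

In this toy model the object graph stays small while the string doubles with each level, which is the kind of growth that can exhaust a heap even when the input file is modest in size.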

       

      Code used:

       

      import java.io.File;
      import java.util.ArrayList;
      import java.util.List;
      import org.apache.pdfbox.preflight.PreflightDocument;
      import org.apache.pdfbox.preflight.ValidationResult.ValidationError;
      import org.apache.pdfbox.preflight.parser.PreflightParser;
      
      public class Validator {
        private final File file;
        private List<ValidationError> errorList = new ArrayList<ValidationError>();
      
        public Validator(File file) {
          this.file = file;
        }
      
        // Errors collected by the most recent call to validate().
        public List<ValidationError> getErrors() {
          return errorList;
        }
      
        // Runs preflight validation; returns true if any validation errors were found.
        public boolean validate() throws Exception {
          PreflightDocument document = null;
          try {
            PreflightParser parser = new PreflightParser(file);
            parser.parse();
            document = parser.getPreflightDocument();
            document.validate();
            errorList = document.getResult().getErrorsList();
          }
          finally {
            if (document != null) {
              try {
                document.close();
              } catch (Exception ignored) {
                // Best-effort cleanup; a failure to close must not mask the validation outcome.
              }
            }
          }
          return !errorList.isEmpty();
        }
      }
      

       

       

        Attachments

        1. lean-from-the-trenches-p136.pdf
          128 kB
          Tilman Hausherr
        2. lean-from-the-trenches.pdf
          12.47 MB
          Dana Shaw

            People

            • Assignee: Tilman Hausherr (tilman)
            • Reporter: Dana Shaw (dshaw)
            • Votes: 0
            • Watchers: 3
