  1. Hadoop Common
  2. HADOOP-12404

Disable caching for JarURLConnection to avoid sharing JarFile with other users when loading resource from URL in Configuration class.

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: conf
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Disable caching for JarURLConnection so that the JarFile is not shared with other users when the Configuration class loads a resource from a URL.
      Currently Configuration#parse calls url.openStream to get the InputStream that DocumentBuilder parses.
      Based on the JDK source code, the calling sequence is
      url.openStream => handler.openConnection.getInputStream => new JarURLConnection => JarURLConnection.connect => factory.get(getJarFileURL(), getUseCaches()) => URLJarFile.getInputStream => JarFile.getInputStream => ZipFile.getInputStream
      If URLConnection#getUseCaches returns true (the default), the URLJarFile is shared by every caller that opens the same URL. If another user closes the shared URLJarFile, every InputStream returned by URLJarFile#getInputStream is closed as well, as the JDK documentation for ZipFile#close states.
      As a result, we saw the following exception in rare cases on a heavily loaded system, which caused a Hive job to fail:

      2014-10-21 23:44:41,856 ERROR org.apache.hadoop.hive.ql.exec.Task: Ended Job = job_1413909398487_3696 with exception 'java.lang.RuntimeException(java.io.IOException: Stream closed)'
      java.lang.RuntimeException: java.io.IOException: Stream closed
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2484)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2337)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2254)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:861)
        at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:2030)
        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:479)
        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:469)
        at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:187)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:582)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:580)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:580)
        at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:598)
        at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:288)
        at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
        at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)
        at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1516)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1283)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1101)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:924)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:919)
        at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145)
        at org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
        at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:502)
        at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:213)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.IOException: Stream closed
        at java.util.zip.InflaterInputStream.ensureOpen(InflaterInputStream.java:67)
        at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:142)
        at java.io.FilterInputStream.read(FilterInputStream.java:133)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(XMLEntityManager.java:2902)
        at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:302)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1753)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(XMLEntityScanner.java:1426)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2807)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
        at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
        at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2325)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2313)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2384)
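
The shared-JarFile failure mode described above can probably be reproduced outside Hadoop. The following is a minimal sketch, not code from the patch: the class name SharedJarFileDemo, the helper names, and the entry name demo-site.xml are all hypothetical. It opens a stream through one JarURLConnection, closes the JarFile obtained through a second connection to the same URL, and reports whether the subsequent read fails. It assumes the JDK behaves as the description says, that is, cached connections to the same URL share one JarFile.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.JarURLConnection;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class SharedJarFileDemo {

  /** Writes a one-entry jar to a temp file and returns a jar: URL for that entry. */
  static URL createJarUrl() throws IOException {
    Path jar = Files.createTempFile("shared-jar-demo", ".jar");
    jar.toFile().deleteOnExit();
    try (OutputStream out = Files.newOutputStream(jar);
         JarOutputStream jos = new JarOutputStream(out)) {
      jos.putNextEntry(new JarEntry("demo-site.xml"));
      jos.write("<configuration/>".getBytes(StandardCharsets.UTF_8));
      jos.closeEntry();
    }
    return URI.create("jar:" + jar.toUri() + "!/demo-site.xml").toURL();
  }

  /**
   * Opens a stream on one connection, closes the JarFile obtained through a
   * second connection to the same URL, then tries to read from the stream.
   * Returns true if the read throws an IOException.
   */
  static boolean readFailsAfterOtherClose(URL url, boolean useCaches) throws IOException {
    JarURLConnection reader = (JarURLConnection) url.openConnection();
    reader.setUseCaches(useCaches);
    JarURLConnection other = (JarURLConnection) url.openConnection();
    other.setUseCaches(useCaches);
    try (InputStream in = reader.getInputStream()) {
      // The "other user" closes the JarFile it was handed.
      other.getJarFile().close();
      try {
        in.read();
        return false;
      } catch (IOException e) {
        return true;
      }
    }
  }

  public static void main(String[] args) throws IOException {
    URL url = createJarUrl();
    System.out.println("useCaches=true:  read fails = " + readFailsAfterOtherClose(url, true));
    System.out.println("useCaches=false: read fails = " + readFailsAfterOtherClose(url, false));
  }
}
```

With caching enabled the two connections are expected to receive the same JarFile, so closing it from one side invalidates the other side's stream. With caching disabled each connection gets its own JarFile and the read should succeed.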
      

      Disabling JarURLConnection's cache also saves a little memory.
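
The proposed change can be sketched as follows. This is an illustration under stated assumptions rather than the contents of the attached patch: the class UncachedJarParse and the helpers createJarUrl, openUncached, and parse are hypothetical names, and the sample XML is made up. The idea is to call url.openConnection instead of url.openStream, turn off caching when the connection is a JarURLConnection, and only then obtain the InputStream for DocumentBuilder.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.JarURLConnection;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class UncachedJarParse {

  /** Builds a temp jar holding a small XML resource (made-up content) and returns its jar: URL. */
  static URL createJarUrl() throws IOException {
    Path jar = Files.createTempFile("uncached-jar-demo", ".jar");
    jar.toFile().deleteOnExit();
    String xml = "<configuration><property><name>demo.key</name>"
        + "<value>demo</value></property></configuration>";
    try (OutputStream out = Files.newOutputStream(jar);
         JarOutputStream jos = new JarOutputStream(out)) {
      jos.putNextEntry(new JarEntry("demo-site.xml"));
      jos.write(xml.getBytes(StandardCharsets.UTF_8));
      jos.closeEntry();
    }
    return URI.create("jar:" + jar.toUri() + "!/demo-site.xml").toURL();
  }

  /** Opens the URL, disabling caching first when it is a jar: connection. */
  static InputStream openUncached(URL url) throws IOException {
    URLConnection connection = url.openConnection();
    if (connection instanceof JarURLConnection) {
      // Must be set before the connection is established.
      connection.setUseCaches(false);
    }
    return connection.getInputStream();
  }

  /** Parses the resource through a private, unshared JarFile. */
  static Document parse(DocumentBuilder builder, URL url) throws IOException, SAXException {
    try (InputStream in = openUncached(url)) {
      return builder.parse(in, url.toString());
    }
  }

  public static void main(String[] args) throws Exception {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = parse(builder, createJarUrl());
    System.out.println("root element: " + doc.getDocumentElement().getTagName());
  }
}
```

Because the connection is not cached, the JarFile behind the stream belongs to this caller alone, so no other user of the same URL can close it mid-parse. The trade-off is that each load reopens the jar instead of reusing a cached handle.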

      Attachments

        1. HADOOP-12404.000.patch
          1 kB
          Zhihai Xu

            People

              Assignee: Zhihai Xu
              Reporter: Zhihai Xu
              Votes: 0
              Watchers: 9
