
HADOOP-15913: XML parsing error in a heavily multi-threaded environment


    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 2.7.3
    • Fix Version/s: None
    • Component/s: common
    • Labels: None

      Description

      We encountered this problem in a production environment; the stack trace looks like this:

      ERROR org.apache.hadoop.hive.ql.exec.Task: Ended Job = job_1541600895081_0580 with exception 'java.lang.NullPointerException(Inflater has been closed)'
      java.lang.NullPointerException: Inflater has been closed
              at java.util.zip.Inflater.ensureOpen(Inflater.java:389)
              at java.util.zip.Inflater.inflate(Inflater.java:257)
              at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152)
              at java.io.FilterInputStream.read(FilterInputStream.java:133)
              at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
              at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
              at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
              at java.io.InputStreamReader.read(InputStreamReader.java:184)
              at java.io.BufferedReader.fill(BufferedReader.java:154)
              at java.io.BufferedReader.readLine(BufferedReader.java:317)
              at java.io.BufferedReader.readLine(BufferedReader.java:382)
              at javax.xml.parsers.FactoryFinder.findJarServiceProvider(FactoryFinder.java:319)
              at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:255)
              at javax.xml.parsers.DocumentBuilderFactory.newInstance(DocumentBuilderFactory.java:121)
              at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2524)
              at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2501)
              at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2407)
              at org.apache.hadoop.conf.Configuration.get(Configuration.java:983)
              at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:2007)
              at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:479)
              at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:469)
              at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:188)
              at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:601)
              at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:599)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
              at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:599)
              at org.apache.hadoop.mapred.JobClient.getJobInner(JobClient.java:609)
              at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:639)
              at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:294)
              at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:558)
              at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:457)
              at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:141)
              at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
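
      The bottom half of the trace shows why this path is hot: every JobConf constructor runs checkAndWarnDeprecation -> Configuration.get -> getProps -> loadResources -> loadResource, which re-parses the XML resources through DocumentBuilderFactory.newInstance(). The NPE at the top means the jar's InflaterInputStream was closed while the JAXP FactoryFinder was still reading a META-INF/services entry from it; with shared JarFile caching, a close from another thread (for example when a session's "add jar" classloader is torn down) can likely invalidate streams that other threads are still reading. A minimal stress sketch of this code path (not from the original report; the class name and counts are illustrative, and on its own it may not trigger the race without something concurrently closing the jar stream):

      import org.apache.hadoop.conf.Configuration;

      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.concurrent.TimeUnit;

      // Hypothetical stress test: call Configuration.get() from many threads so that
      // loadResource() -> DocumentBuilderFactory.newInstance() executes concurrently,
      // mimicking the hive.server2.async.exec.threads pool in the repro steps below.
      public class ConfigurationStress {
          public static void main(String[] args) throws InterruptedException {
              ExecutorService pool = Executors.newFixedThreadPool(50); // async exec threads
              for (int i = 0; i < 120; i++) {                          // one task per query
                  pool.submit(() -> {
                      // A fresh Configuration forces the XML defaults to be re-parsed,
                      // which goes through the JAXP FactoryFinder service lookup.
                      Configuration conf = new Configuration();
                      conf.get("mapreduce.job.name");
                  });
              }
              pool.shutdown();
              pool.awaitTermination(5, TimeUnit.MINUTES);
          }
      }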

      We can reproduce it in our test environment with the steps below:
      1. Set the following HiveServer2 configs:

      hive.server2.async.exec.threads = 50
      hive.server2.async.exec.wait.queue.size = 100
      

      2. Open 4 Beeline terminals on 4 different nodes.
      3. Run 30 queries in each Beeline terminal. Each query includes an "add jar xxx.jar" statement, like this:

      add jar mykeytest-1.0-SNAPSHOT.jar;
      create temporary function ups as 'com.xxx.manager.GetCommentNameOrId';
      insert into test partition(tjrq = ${my_no}, ywtx = '${my_no2}' )
      select  dt.d_year as i_brand
             ,item.i_brand_id as i_item_sk
             ,ups(item.i_brand) as i_product_name
             ,sum(ss_ext_sales_price) as i_category_id
       from  date_dim dt
            ,store_sales
            ,item
       where dt.d_date_sk = store_sales.ss_sold_date_sk
         and store_sales.ss_item_sk = item.i_item_sk
         and item.i_manufact_id = 436
         and dt.d_moy=12
       group by dt.d_year
            ,item.i_brand
            ,item.i_brand_id
       order by dt.d_year
      

      All 120 of these queries connect to a single HiveServer2 instance.

      Run all the queries concurrently, and the stack trace above will appear in the HiveServer2 log.
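
      A mitigation sometimes applied to this class of failure (not proposed in this report, and it sidesteps the race rather than fixing it) is to pin the JAXP factory implementation so that FactoryFinder never performs the jar service-provider scan seen at the top of the stack trace. A sketch, assuming the Xerces implementation bundled with Oracle/OpenJDK 7 and 8; in practice the property is usually passed as -Djavax.xml.parsers.DocumentBuilderFactory=... on the HiveServer2 JVM command line:

      // Equivalent to passing the -D flag on the JVM command line; must run before
      // the first DocumentBuilderFactory.newInstance() call anywhere in the process.
      public class PinJaxpFactory {
          public static void main(String[] args) {
              System.setProperty("javax.xml.parsers.DocumentBuilderFactory",
                  "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl");
              // ... launch the real workload after the property is set.
          }
      }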

            People

            • Assignee:
              Unassigned
            • Reporter:
              Yeliang Cang
