Hadoop Map/Reduce / MAPREDUCE-6654

Possible NPE in JobHistoryEventHandler#handleEvent


Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Critical
    • Resolution: Unresolved

    Description

      I have seen an NPE thrown from JobHistoryEventHandler#handleEvent:

      2016-03-14 16:42:15,231 INFO [Thread-69] org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler failed in state STOPPED; cause: java.lang.NullPointerException
      java.lang.NullPointerException
      	at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:570)
      	at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:382)
      	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
      	at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
      	at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
      	at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
      	at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
      	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1651)
      	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
      	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1147)
      	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:573)
      	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:620)
      

      In the version where this exception is thrown, line 570 of JobHistoryEventHandler.java is:

      mi.writeEvent(historyEvent);

      IMHO, this may be caused by an exception in an earlier step. Specifically, in a Kerberized environment, creating the event writer triggers a call to decrypt the EEK, and the connection to the KMS timed out. The exception is below:

       
      2016-03-14 16:41:57,559 ERROR [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Error JobHistoryEventHandler in handleEvent: EventType: AM_STARTED
      java.net.SocketTimeoutException: Read timed out
      	at java.net.SocketInputStream.socketRead0(Native Method)
      	at java.net.SocketInputStream.read(SocketInputStream.java:152)
      	at java.net.SocketInputStream.read(SocketInputStream.java:122)
      	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
      	at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
      	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
      	at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
      	at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
      	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
      	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
      	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:520)
      	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:505)
      	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:779)
      	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:185)
      	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:181)
      	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94)
      	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:181)
      	at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
      	at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1420)
      	at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1522)
      	at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1507)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:407)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:400)
      	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:400)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:343)
      	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:917)
      	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:898)
      	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:795)
      	at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.createEventWriter(JobHistoryEventHandler.java:428)
      	at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.setupEventWriter(JobHistoryEventHandler.java:468)
      	at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:553)
      	at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$1.run(JobHistoryEventHandler.java:326)
      	at java.lang.Thread.run(Thread.java:745)
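
To make the suspected chain of events concrete, here is a minimal, self-contained sketch of the pattern. It is not the Hadoop source: the class NpeRepro, its MetaInfo stand-in, the fileMap field and the kmsReachable flag are all hypothetical names chosen for illustration, and the sketch assumes that a failed writer setup leaves no per-job state behind for a later event to find.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the handler; NOT the real JobHistoryEventHandler.
class NpeRepro {
    /** Stand-in for the per-job state that owns the event writer. */
    static class MetaInfo {
        void writeEvent(String event) {
            // A real implementation would append the event to the history file.
        }
    }

    // Assumed bookkeeping: job id -> per-job writer state.
    private final Map<String, MetaInfo> fileMap = new HashMap<>();

    /** Simulates writer setup; fails the way the KMS read timeout does in the log. */
    void setupEventWriter(String jobId, boolean kmsReachable) throws IOException {
        if (!kmsReachable) {
            throw new IOException("Read timed out");
        }
        fileMap.put(jobId, new MetaInfo());
    }

    /** First event: the setup failure is logged, and no state is registered. */
    void handleFirstEvent(String jobId, boolean kmsReachable) {
        try {
            setupEventWriter(jobId, kmsReachable);
        } catch (IOException e) {
            System.err.println("Error in handleEvent: " + e.getMessage());
        }
    }

    /** Later event (for example at stop time): uses the lookup result unchecked. */
    void handleLaterEvent(String jobId, String event) {
        MetaInfo mi = fileMap.get(jobId);
        mi.writeEvent(event); // throws NullPointerException if setup failed earlier
    }

    public static void main(String[] args) {
        NpeRepro handler = new NpeRepro();
        handler.handleFirstEvent("job_1", false); // simulated KMS timeout
        try {
            handler.handleLaterEvent("job_1", "JOB_FINISHED");
        } catch (NullPointerException e) {
            System.out.println("NPE reproduced on later event");
        }
    }
}
```

Under these assumptions the second call fails with the same kind of NPE as in the first stack trace, even though the root cause was the earlier timeout.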
      

      We should handle this scenario more gracefully instead of throwing an NPE.
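
One possible shape for such handling is a null guard before the write. The sketch below is only an illustration and is not necessarily what the attached patches do: GuardedHistoryHandler, MetaInfo, fileMap and the boolean return value are hypothetical, and skipping the event with a log message is just one policy (failing with a descriptive exception would be another).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a guarded write; NOT the actual Hadoop fix.
class GuardedHistoryHandler {
    /** Stand-in for the per-job state that owns the event writer. */
    static class MetaInfo {
        int written = 0;

        void writeEvent(String event) {
            written++; // a real implementation would write to the history file
        }
    }

    // Assumed bookkeeping: job id -> per-job writer state.
    final Map<String, MetaInfo> fileMap = new HashMap<>();

    /** Returns true if the event was written, false if it was skipped. */
    boolean handleEvent(String jobId, String event) {
        MetaInfo mi = fileMap.get(jobId);
        if (mi == null) {
            // Writer setup never completed for this job, so report and skip
            // instead of dereferencing null.
            System.err.println("No history writer for " + jobId
                + "; skipping event " + event);
            return false;
        }
        mi.writeEvent(event);
        return true;
    }

    public static void main(String[] args) {
        GuardedHistoryHandler handler = new GuardedHistoryHandler();
        // No writer registered yet: the event is skipped.
        System.out.println(handler.handleEvent("job_1", "JOB_FINISHED")); // prints false
        handler.fileMap.put("job_1", new MetaInfo());
        // Writer state present: the event is written.
        System.out.println(handler.handleEvent("job_1", "JOB_FINISHED")); // prints true
    }
}
```

With a guard like this, the log would point at the missing writer rather than surfacing an unrelated-looking NPE during shutdown.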

      Attachments

        1. MAPREDUCE-6654-v2.patch
          7 kB
          Junping Du
        2. MAPREDUCE-6654-v2.1.patch
          7 kB
          Junping Du
        3. MAPREDUCE-6654.patch
          6 kB
          Junping Du


          People

            junping_du Junping Du
            xiaochen Xiao Chen
