Eagle (Retired) / EAGLE-568

Set max retry times for HDFS problems, otherwise skip instead of blocking whole topology



    Description

      Problem and Exception

      2016-09-23 16:20:03 Thread-2-mrHistoryJobExecutor org.apache.eagle.jpm.mr.history.storm.JobHistorySpout [ERROR] fail crawling job history file and continue ...
      java.lang.RuntimeException: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1687617170-10.8.211.11-1404135347814:blk_6965456919_1107923643088 file=/mapred/history/done/2016/09/23/000716/job_1470077190071_716284_conf.xml
              at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2398) ~[stormjar.jar:na]
              at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2261) ~[stormjar.jar:na]
              at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2168) ~[stormjar.jar:na]
              at org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2210) ~[stormjar.jar:na]
              at java.lang.Iterable.forEach(Iterable.java:74) ~[na:1.8.0_91]
              at org.apache.eagle.jpm.util.Utils.fetchJobType(Utils.java:112) ~[stormjar.jar:na]
              at org.apache.eagle.jpm.mr.history.parser.JHFEventReaderBase.<init>(JHFEventReaderBase.java:121) ~[stormjar.jar:na]
              at org.apache.eagle.jpm.mr.history.parser.JHFMRVer2EventReader.<init>(JHFMRVer2EventReader.java:47) ~[stormjar.jar:na]
              at org.apache.eagle.jpm.mr.history.parser.JHFParserFactory.getParser(JHFParserFactory.java:49) ~[stormjar.jar:na]
              at org.apache.eagle.jpm.mr.history.crawler.DefaultJHFInputStreamCallback.onInputStream(DefaultJHFInputStreamCallback.java:55) ~[stormjar.jar:na]
              at org.apache.eagle.jpm.mr.history.crawler.AbstractJobHistoryDAO.readFileContent(AbstractJobHistoryDAO.java:163) ~[stormjar.jar:na]
              at org.apache.eagle.jpm.mr.history.crawler.JHFCrawlerDriverImpl.crawl(JHFCrawlerDriverImpl.java:173) ~[stormjar.jar:na]
              at org.apache.eagle.jpm.mr.history.storm.JobHistorySpout.nextTuple(JobHistorySpout.java:166) ~[stormjar.jar:na]
              at backtype.storm.daemon.executor$fn__5629$fn__5644$fn__5673.invoke(executor.clj:585) [storm-core-0.9.3.2.2.0.0-2041.jar:0.9.3.2.2.0.0-2041]
              at backtype.storm.util$async_loop$fn__452.invoke(util.clj:465) [storm-core-0.9.3.2.2.0.0-2041.jar:0.9.3.2.2.0.0-2041]
              at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
              at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
      Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1687617170-10.8.211.11-1404135347814:blk_6965456919_1107923643088 file=/mapred/history/done/2016/09/23/000716/job_1470077190071_716284_conf.xml
              at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:880) ~[stormjar.jar:na]
              at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:560) ~[stormjar.jar:na]
              at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:790) ~[stormjar.jar:na]
              at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:837) ~[stormjar.jar:na]
              at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:645) ~[stormjar.jar:na]
              at java.io.FilterInputStream.read(FilterInputStream.java:83) ~[na:1.8.0_91]
              at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(XMLEntityManager.java:2899) ~[na:1.8.0_91]
              at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674) ~[na:1.8.0_91]
              at com.sun.org.apache.xerces.internal.impl.XMLVersionDetector.determineDocVersion(XMLVersionDetector.java:189) ~[na:1.8.0_91]
              at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:812) ~[na:1.8.0_91]
              at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777) ~[na:1.8.0_91]
              at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) ~[na:1.8.0_91]
              at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243) ~[na:1.8.0_91]
              at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339) ~[na:1.8.0_91]
              at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121) ~[na:1.8.0_91]
              at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2239) ~[stormjar.jar:na]
              at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2312) ~[stormjar.jar:na]
              ... 16 common frames omitted
      

      Proposed Solution

      Set a maximum number of retries for HDFS read failures such as the BlockMissingException above. Once the limit is reached for a job history file, skip that file and move on to the next one instead of retrying it forever and blocking the whole topology. A sketch of the idea follows.
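
      A minimal sketch of the approach, assuming a per-file failure counter kept by the crawler. The class, method, and field names below (HdfsRetrySkipSketch, maxRetryTimes, failureCounts, JobHistoryFileReader) are illustrative assumptions, not the actual JHFCrawlerDriverImpl/JobHistorySpout change:

      import java.util.HashMap;
      import java.util.Map;

      public class HdfsRetrySkipSketch {

          private final int maxRetryTimes;                        // assumed to come from crawler configuration
          private final Map<String, Integer> failureCounts = new HashMap<>();

          public HdfsRetrySkipSketch(int maxRetryTimes) {
              this.maxRetryTimes = maxRetryTimes;
          }

          /** Called from the crawl loop for each job history file path. */
          public void crawlWithRetryLimit(String filePath, JobHistoryFileReader reader) {
              if (failureCounts.getOrDefault(filePath, 0) >= maxRetryTimes) {
                  // give up on this file so the rest of the queue keeps flowing
                  System.err.println("Skip " + filePath + " after " + maxRetryTimes + " failed attempts");
                  return;
              }
              try {
                  reader.read(filePath);                          // may throw on HDFS errors such as a missing block
                  failureCounts.remove(filePath);                 // success: clear any previous failures for this file
              } catch (Exception e) {                             // HDFS read problem
                  failureCounts.merge(filePath, 1, Integer::sum); // count the failure; retry on the next crawl round
              }
          }

          /** Stand-in for the real job history file reader (hypothetical interface). */
          public interface JobHistoryFileReader {
              void read(String filePath) throws Exception;
          }
      }

      With this shape, a file whose block stays missing is attempted at most maxRetryTimes before it is skipped, so one corrupt conf.xml no longer stalls JobHistorySpout.nextTuple indefinitely.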



People

    Assignee: wujinhu
    Reporter: Hao Chen (haoch)