Tajo / TAJO-1289

History reader fails to get the query information after a successful query execution

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.10.0
    • Component/s: TajoMaster
    • Labels:
      None

      Description

      When a query finishes successfully, its status is written to the query history. Once the history is written, TajoMaster reads the status of the finished query back from that history.

      However, when TajoMaster reads the query history, an I/O exception occurs and the Tajo CLI hangs indefinitely. The full stack trace follows.

      2015-01-09 00:27:54,019 INFO org.apache.tajo.master.querymaster.QueryInProgress: Stop query:q_1420730837752_0001
      2015-01-09 00:27:54,019 INFO org.apache.tajo.master.rm.TajoWorkerResourceManager: Release Resource: 0.0,512
      2015-01-09 00:27:54,019 INFO org.apache.tajo.master.rm.TajoWorkerResourceManager: Released QueryMaster (q_1420730837752_0001) resource.
      2015-01-09 00:27:54,019 INFO org.apache.tajo.master.querymaster.QueryInProgress: q_1420730837752_0001 QueryMaster stopped
      2015-01-09 00:27:54,032 INFO org.apache.tajo.util.history.HistoryWriter: Create query history file: hdfs://localhost:7020/tmp/tajo-jihoon/staging/history/20150109/query-list/query-list-002754.hist
      2015-01-09 00:27:54,899 ERROR org.apache.tajo.util.history.HistoryReader: Reading error:hdfs://localhost:7020/tmp/tajo-jihoon/staging/history/20150107/query-list/query-list-131932.hist, Cannot obtain block length for LocatedBlock{BP-1604697128-192.168.0.12-1412676388616:blk_1073741964_1140; getBlockSize()=1356; corrupt=false; offset=0; locs=[127.0.0.1:50010]}
      java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1604697128-192.168.0.12-1412676388616:blk_1073741964_1140; getBlockSize()=1356; corrupt=false; offset=0; locs=[127.0.0.1:50010]}
              at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:350)
              at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:294)
              at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:231)
              at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:224)
              at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1295)
              at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300)
              at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296)
              at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
              at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:296)
              at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764)
              at org.apache.tajo.util.history.HistoryReader.getQueries(HistoryReader.java:88)
              at org.apache.tajo.util.history.HistoryReader.getQueryInfo(HistoryReader.java:294)
              at org.apache.tajo.master.querymaster.QueryJobManager.getFinishedQuery(QueryJobManager.java:131)
              at org.apache.tajo.master.TajoMasterClientService$TajoMasterClientProtocolServiceHandler.getQueryStatus(TajoMasterClientService.java:471)
              at org.apache.tajo.ipc.TajoMasterClientProtocol$TajoMasterClientProtocolService$2.callBlockingMethod(TajoMasterClientProtocol.java:551)
              at org.apache.tajo.rpc.BlockingRpcServer$ServerHandler.messageReceived(BlockingRpcServer.java:103)
              at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
              at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
              at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
              at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
              at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70)
              at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
              at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
              at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
              at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
              at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
              at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
              at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
              at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
              at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
              at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
              at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
              at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
              at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
              at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
              at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
              at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
              at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
              at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      

        Activity

        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-CODEGEN-build #202 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/202/)
        TAJO-1289: History reader fails to get the query information after a successful query execution. (jinho) (jhkim: rev a15b5fab7b1475f5cb4e5eba842f1c4b17166b58)

        • tajo-core/src/test/java/org/apache/tajo/util/history/TestHistoryWriterReader.java
        • tajo-core/src/main/java/org/apache/tajo/querymaster/QueryMaster.java
        • tajo-core/src/main/java/org/apache/tajo/master/QueryManager.java
        • CHANGES
        • tajo-core/src/main/java/org/apache/tajo/util/history/HistoryWriter.java
        • tajo-core/src/test/java/org/apache/tajo/client/TestTajoClient.java
        • tajo-core/src/main/java/org/apache/tajo/util/history/HistoryReader.java
        • tajo-core/src/main/java/org/apache/tajo/master/QueryInfo.java
        • tajo-core/src/main/java/org/apache/tajo/worker/TaskRunnerManager.java
        • tajo-core/src/main/java/org/apache/tajo/master/QueryInProgress.java
        • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
        • tajo-common/src/main/java/org/apache/tajo/util/Bytes.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-build #563 (See https://builds.apache.org/job/Tajo-master-build/563/)
        TAJO-1289: History reader fails to get the query information after a successful query execution. (jinho) (jhkim: rev a15b5fab7b1475f5cb4e5eba842f1c4b17166b58)

        • tajo-core/src/main/java/org/apache/tajo/querymaster/QueryMaster.java
        • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
        • tajo-core/src/main/java/org/apache/tajo/master/QueryManager.java
        • tajo-core/src/main/java/org/apache/tajo/master/QueryInProgress.java
        • tajo-core/src/test/java/org/apache/tajo/util/history/TestHistoryWriterReader.java
        • tajo-core/src/main/java/org/apache/tajo/util/history/HistoryWriter.java
        • tajo-core/src/main/java/org/apache/tajo/util/history/HistoryReader.java
        • tajo-common/src/main/java/org/apache/tajo/util/Bytes.java
        • tajo-core/src/main/java/org/apache/tajo/master/QueryInfo.java
        • CHANGES
        • tajo-core/src/main/java/org/apache/tajo/worker/TaskRunnerManager.java
        • tajo-core/src/test/java/org/apache/tajo/client/TestTajoClient.java
        jhkim Jinho Kim added a comment -

        committed it
        Thank you!

        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/356

        hyunsik Hyunsik Choi added a comment -

        +1 ship it!

        tajoqa Tajo QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12694088/TAJO-1289.patch
        against master revision release-0.9.0-rc0-151-g17c6dff.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The applied patch does not increase the total number of javadoc warnings.

        +1 checkstyle. The patch generated 0 code style errors.

        -1 findbugs. The patch appears to introduce 186 new Findbugs (version 2.0.3) warnings.

        -1 release audit. The applied patch generated 370 release audit warnings.

        +1 core tests. The patch passed unit tests in tajo-common tajo-core.

        Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/575//testReport/
        Release audit warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/575//artifact/incubator-tajo/patchprocess/patchReleaseAuditProblems.txt
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/575//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-core.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/575//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-common.html
        Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/575//console

        This message is automatically generated.

        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on the pull request:

        https://github.com/apache/tajo/pull/356#issuecomment-71148607

        I've added the following two configuration properties:
        ```
        tajo.history.query.replication
        tajo.history.task.replication
        ```
        Please review again
        Thanks
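A minimal sketch of how the two properties above could be read, assuming a plain `java.util.Properties` object as a stand-in for Tajo's own configuration class. The class name `HistoryReplicationConfig`, the helper `replication`, and the fallback value of 1 (taken from the `REPLICATION = 1` constant shown in the diff under review) are illustrative assumptions, not the committed implementation.

```java
import java.util.Properties;

// Hypothetical helper; only the two property keys come from the pull request comment.
class HistoryReplicationConfig {
  static final String QUERY_REPLICATION_KEY = "tajo.history.query.replication";
  static final String TASK_REPLICATION_KEY = "tajo.history.task.replication";

  // Assumed fallback, mirroring the REPLICATION = 1 constant in the reviewed diff.
  static final short DEFAULT_REPLICATION = 1;

  /** Returns the replication factor stored under key, or the fallback if it is unset. */
  static short replication(Properties conf, String key) {
    String raw = conf.getProperty(key);
    if (raw == null || raw.trim().isEmpty()) {
      return DEFAULT_REPLICATION;
    }
    short value = Short.parseShort(raw.trim());
    if (value < 1) {
      throw new IllegalArgumentException(key + " must be >= 1, got " + value);
    }
    return value;
  }
}
```

With such a helper, a deployment that treats history as important data could raise the value for one key while leaving the other at its fallback.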

        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on the pull request:

        https://github.com/apache/tajo/pull/356#issuecomment-71135527

        Thanks, guys, for the review. I’ll commit it shortly.
        If you have any further review comments, please let me know.

        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/356#discussion_r23426339

        --- Diff: tajo-core/src/main/java/org/apache/tajo/util/history/HistoryWriter.java ---
        @@ -49,8 +56,10 @@
        public static final String QUERY_LIST = "query-list";
        public static final String QUERY_DETAIL = "query-detail";
        public static final String HISTORY_FILE_POSTFIX = ".hist";
        + private static short REPLICATION = 1;
        --- End diff --

        Sure!

        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/356#discussion_r23426319

        --- Diff: tajo-core/src/main/java/org/apache/tajo/master/QueryManager.java ---
        @@ -121,19 +126,27 @@ public EventHandler getEventHandler() {

        public synchronized Collection<QueryInfo> getFinishedQueries() {
        --- End diff --

        I agree with you; we should add pagination.

        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/356#issuecomment-71070983

        The patch looks good to me. I've left some minor comments.

        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/356#discussion_r23396975

        --- Diff: tajo-core/src/main/java/org/apache/tajo/util/history/HistoryWriter.java ---
        @@ -49,8 +56,10 @@
        public static final String QUERY_LIST = "query-list";
        public static final String QUERY_DETAIL = "query-detail";
        public static final String HISTORY_FILE_POSTFIX = ".hist";
        + private static short REPLICATION = 1;
        --- End diff --

        Since history is not critical data in many cases, a replication factor of one would usually be enough. In some cases, however, it can be very important. How about making it configurable?

        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/356#discussion_r23396104

        --- Diff: tajo-core/src/main/java/org/apache/tajo/master/QueryManager.java ---
        @@ -121,19 +126,27 @@ public EventHandler getEventHandler() {

        public synchronized Collection<QueryInfo> getFinishedQueries() {
        --- End diff --

        In some cases, this may return a large number of history entries, which will cause long latency. We should improve the paging feature later in a new JIRA issue.
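To illustrate the paging idea raised here, the following is a rough sketch of returning one fixed-size page of a finished-query list instead of the whole collection. `FinishedQueryPager` and `page` are hypothetical names, and the sketch slices a list that is already in memory; it does not show how Tajo reads history files.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; the class and method names are not from Tajo.
class FinishedQueryPager {
  /** Returns the zero-based page pageIndex of all, holding at most pageSize items. */
  static <T> List<T> page(List<T> all, int pageIndex, int pageSize) {
    if (pageIndex < 0 || pageSize <= 0) {
      throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
    }
    // Use long arithmetic so a large page index cannot overflow int.
    long from = (long) pageIndex * pageSize;
    if (from >= all.size()) {
      return Collections.emptyList(); // requested page lies past the end
    }
    int to = (int) Math.min(from + pageSize, (long) all.size());
    // Copy the slice so the page does not keep a view onto the source list.
    return new ArrayList<>(all.subList((int) from, to));
  }
}
```

Slicing in memory only bounds the size of the response. Reducing the latency mentioned above would also require reading only the history files that back the requested page.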

        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/356#issuecomment-71035462

        +1

        Thank you for your contribution.
        It looks good to me, and the history server ran as expected on my test server.

        githubbot ASF GitHub Bot added a comment -

        GitHub user jinossy opened a pull request:

        https://github.com/apache/tajo/pull/356

        TAJO-1289: History reader fails to get the query information after a successful query execution

        The main problem is that some history files in HDFS are corrupted. When a query finishes, TajoClient gets the query status from the finished-query list, but that list is backed by stored files.
        So I’ve added an LRU map to cache finished queries, and fixed a hidden race condition bug.
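As an illustration of the LRU map idea, here is a minimal sketch built on `java.util.LinkedHashMap` in access order. The class name `FinishedQueryCache`, the capacity of 2, and the status strings are assumptions made for the example; the actual patch may use a different map implementation and value type.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache for finished-query info; not the class used in the patch.
class FinishedQueryCache<K, V> extends LinkedHashMap<K, V> {
  private final int capacity;

  FinishedQueryCache(int capacity) {
    // accessOrder = true keeps entries ordered from least to most recently accessed.
    super(16, 0.75f, true);
    this.capacity = capacity;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Called after each insertion; returning true evicts the least recently used entry.
    return size() > capacity;
  }
}

class FinishedQueryCacheDemo {
  public static void main(String[] args) {
    // LinkedHashMap is not thread-safe, so wrap it before sharing it across threads.
    Map<String, String> cache =
        Collections.synchronizedMap(new FinishedQueryCache<String, String>(2));
    cache.put("q_1420730837752_0001", "SUCCEEDED"); // placeholder status values
    cache.put("q_1420730837752_0002", "SUCCEEDED");
    cache.get("q_1420730837752_0001");              // marks _0001 as recently used
    cache.put("q_1420730837752_0003", "SUCCEEDED"); // evicts _0002
    System.out.println(cache.keySet());
  }
}
```

Serving the status of a just-finished query from such a cache means the client's status request does not depend on opening a history file in HDFS, which is where the exception in the report was thrown.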

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/jinossy/tajo TAJO-1289

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/356.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #356


        commit 4996ad1bb0c0995bc763b88d80d0fe7b37a43fa1
        Author: jhkim <jhkim@apache.org>
        Date: 2015-01-22T06:41:15Z

        TAJO-1289: History reader fails to get the query information after a successful query execution



          People

          • Assignee:
            jhkim Jinho Kim
            Reporter:
            jihoonson Jihoon Son
          • Votes:
            0
            Watchers:
            4
