Apache AsterixDB / ASTERIXDB-1336

Operator fails to find the required temp file(s)


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      Runtime setting: Running at scale with 9 machines (27 partitions)

      The following query fails because the probe phase of the hash join cannot find the build file that corresponds to a spilled partition:

      count(
        for $t in dataset Orders
        where $t.o_orderkey >= 0 and $t.o_orderkey < 50000000
        return {
          "o_orderkey": $t.o_orderkey,
          "o_custkey": $t.o_custkey,
          "o_orderstatus": $t.o_orderstatus,
          "o_totalprice": $t.o_totalprice,
          "o_orderdate": $t.o_orderdate,
          "o_orderpriority": $t.o_orderpriority,
          "o_clerk": $t.o_clerk,
          "o_shippriority": $t.o_shippriority,
          "o_comment": $t.o_comment,
          "o_lineitems": for $l in dataset LineItem
                         where $l.l_orderkey = $t.o_orderkey and
                               $l.l_orderkey >= 0 and $l.l_orderkey < 50000000
                         return {
                           "l_partkey": $l.l_partkey,
                           "l_suppkey": $l.l_suppkey,
                           "l_linenumber": $l.l_linenumber,
                           "l_quantity": $l.l_quantity,
                           "l_extendedprice": $l.l_extendedprice,
                           "l_discount": $l.l_discount,
                           "l_tax": $l.l_tax,
                           "l_returnflag": $l.l_returnflag,
                           "l_linestatus": $l.l_linestatus,
                           "l_shipdate": $l.l_shipdate,
                           "l_commitdate": $l.l_commitdate,
                           "l_receiptdate": $l.l_receiptdate,
                           "l_shipinstruct": $l.l_shipinstruct,
                           "l_shipmode": $l.l_shipmode,
                           "l_comment": $l.l_comment
                         }
        }
      );
      

      The error log, from the node controller (NC) side:

      org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.hyracks.api.exceptions.HyracksDataException: java.io.FileNotFoundException: /mnt/data/sdd/pouria/asterixTpch/cs9/./RelS4319758000653184014.waf (No such file or directory)
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:218)
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
      	at org.apache.hyracks.control.nc.Task.run(Task.java:261)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.hyracks.api.exceptions.HyracksDataException: java.io.FileNotFoundException: /mnt/data/sdd/pouria/asterixTpch/cs9/./RelS4319758000653184014.waf (No such file or directory)
      	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
      	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:212)
      	... 5 more
      Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.hyracks.api.exceptions.HyracksDataException: java.io.FileNotFoundException: /mnt/data/sdd/pouria/asterixTpch/cs9/./RelS4319758000653184014.waf (No such file or directory)
      	at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.close(IndexSearchOperatorNodePushable.java:230)
      	at org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.close(AbstractOneInputOneOutputOneFramePushRuntime.java:57)
      	at org.apache.hyracks.algebricks.runtime.operators.std.AssignRuntimeFactory$1.close(AssignRuntimeFactory.java:122)
      	at org.apache.hyracks.algebricks.runtime.operators.std.EmptyTupleSourceRuntimeFactory$1.close(EmptyTupleSourceRuntimeFactory.java:60)
      	at org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$1.initialize(AlgebricksMetaOperatorDescriptor.java:116)
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:205)
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:202)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	... 3 more
      Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: java.io.FileNotFoundException: /mnt/data/sdd/pouria/asterixTpch/cs9/./RelS4319758000653184014.waf (No such file or directory)
      	at org.apache.hyracks.control.nc.io.IOManager.open(IOManager.java:81)
      	at org.apache.hyracks.dataflow.common.io.RunFileReader.open(RunFileReader.java:47)
      	at org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode$1.applyInMemHashJoin(OptimizedHybridHashJoinOperatorDescriptor.java:658)
      	at org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode$1.joinPartitionPair(OptimizedHybridHashJoinOperatorDescriptor.java:475)
      	at org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode$1.close(OptimizedHybridHashJoinOperatorDescriptor.java:426)
      	at org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.close(AbstractOneInputOneOutputOneFramePushRuntime.java:57)
      	at org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.close(AbstractOneInputOneOutputOneFramePushRuntime.java:57)
      	at org.apache.hyracks.algebricks.runtime.operators.std.AssignRuntimeFactory$1.close(AssignRuntimeFactory.java:122)
      	at org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$2.close(AlgebricksMetaOperatorDescriptor.java:153)
      	at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.close(IndexSearchOperatorNodePushable.java:227)
      	... 11 more
      Caused by: java.io.FileNotFoundException: /mnt/data/sdd/pouria/asterixTpch/cs9/./RelS4319758000653184014.waf (No such file or directory)
      	at java.io.RandomAccessFile.open0(Native Method)
      	at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
      	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
      	at org.apache.hyracks.control.nc.io.FileHandle.open(FileHandle.java:70)
      	at org.apache.hyracks.control.nc.io.IOManager.open(IOManager.java:79)
      	... 20 more
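
      The innermost frames of the trace show a RandomAccessFile being opened on a run file that is no longer on disk. The sketch below reproduces only that failure mode in isolation, assuming a build phase that spills a partition to a file and a probe phase that later reopens it read-only. The class name, method names, and the "build-N.waf" naming are hypothetical and are not taken from the Hyracks code base; the sketch makes no claim about the root cause of this issue.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class SpilledPartitionSketch {

    // Build phase (hypothetical): spill one partition's bytes to a run file
    // and return the path that the probe phase is expected to reopen.
    public static Path spillBuildPartition(Path dir, int partition, byte[] tuples)
            throws IOException {
        Path runFile = dir.resolve("build-" + partition + ".waf");
        Files.write(runFile, tuples);
        return runFile;
    }

    // Probe phase (hypothetical): reopen the build run file read-only and
    // return its length. In "r" mode RandomAccessFile never creates the file,
    // so a missing file surfaces as java.io.FileNotFoundException.
    public static long openBuildRunFile(Path runFile) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(runFile.toFile(), "r")) {
            return raf.length();
        }
    }

    // Returns true when reopening the run file fails because it is missing.
    public static boolean probeFailsWithMissingFile(Path runFile) throws IOException {
        try {
            openBuildRunFile(runFile);
            return false;
        } catch (FileNotFoundException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("spill-sketch");
        Path runFile = spillBuildPartition(dir, 0, new byte[] {1, 2, 3});
        System.out.println("file present, probe fails: " + probeFailsWithMissingFile(runFile));

        // Simulate the run file disappearing between the build and probe phases.
        Files.delete(runFile);
        System.out.println("file deleted, probe fails: " + probeFailsWithMissingFile(runFile));
        Files.delete(dir);
    }
}
```

      In this sketch the probe succeeds while the run file exists and fails with FileNotFoundException once the file is deleted, which matches the exception type at the bottom of the trace above.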
      

          People

            Assignee: Jianfeng Jia (javierjia)
            Reporter: Pouria Pirzadeh (pouria)