Apache AsterixDB / ASTERIXDB-1152

DatasetLifecycleManager returns wrong index because of resourceID fetching issues

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      When a dataset has multiple secondary indexes and you try to bulk load data into it, an exception is thrown. Young-Seok and I traced it to one cause: IndexLifecycleManager returns the wrong index because of a resource ID fetching issue. Specifically, if an index file already exists on disk, it is cached as a local resource. When indexDataFlowHelper tries to get the index by dataset and resource name, IndexLifecycleManager resolves a wrong resource ID, one that belongs to a different, currently existing index (e.g., an LSMBTree index is returned for LSMInvertedIndexDataflowHelper).

      You can reproduce this issue by running an ExecutionTest case that creates multiple secondary indexes and does a bulk load: index-leftouterjoin/probe-pidx-with-join-btree-sidx1.

      You will then see the following exception, which results from a mismatch between the expected index type and the actual index.

      org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.ArrayIndexOutOfBoundsException: 2
      at org.apache.hyracks.dataflow.std.sort.ExternalSortRunMerger.process(ExternalSortRunMerger.java:215)
      at org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:194)
      at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:85)
      at org.apache.hyracks.control.nc.Task.run(Task.java:255)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
      Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
      at org.apache.hyracks.storage.am.common.tuples.TypeAwareTupleWriter.getFieldSlotsBytes(TypeAwareTupleWriter.java:126)
      at org.apache.hyracks.storage.am.common.tuples.TypeAwareTupleWriter.bytesRequired(TypeAwareTupleWriter.java:40)
      at org.apache.hyracks.storage.am.lsm.btree.tuples.LSMBTreeTupleWriter.bytesRequired(LSMBTreeTupleWriter.java:43)
      at org.apache.hyracks.storage.am.btree.frames.BTreeNSMLeafFrame.getBytesRequriedToWriteTuple(BTreeNSMLeafFrame.java:54)
      at org.apache.hyracks.storage.am.btree.impls.BTree$BTreeBulkLoader.add(BTree.java:974)
      at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree$LSMBTreeBulkLoader.add(LSMBTree.java:690)
      at org.apache.hyracks.storage.am.common.dataflow.IndexBulkLoadOperatorNodePushable.nextFrame(IndexBulkLoadOperatorNodePushable.java:95)
      at org.apache.hyracks.dataflow.common.comm.io.AbstractFrameAppender.flush(AbstractFrameAppender.java:83)
      at org.apache.hyracks.dataflow.std.sort.AbstractFrameSorter.flush(AbstractFrameSorter.java:176)
      at org.apache.hyracks.dataflow.std.sort.ExternalSortRunMerger.process(ExternalSortRunMerger.java:128)
      ... 6 more
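
The kind of mismatch described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: every class and method below (ToyLifecycleManager, corruptCachedId, getTypedIndex, and so on) is hypothetical and is not the AsterixDB or Hyracks API, and the way the wrong resource ID ends up in the cache is simulated directly rather than derived from the real code. It shows a name-to-ID lookup returning a B-tree where an inverted index was expected, and a type check that reports the mismatch up front instead of failing later inside a tuple writer.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: all names here are hypothetical, not the AsterixDB/Hyracks API.
final class ResourceIdMismatchSketch {

    interface Index {
        String kind();
    }

    static final class BTreeIndex implements Index {
        public String kind() { return "LSMBTree"; }
    }

    static final class InvertedIndex implements Index {
        public String kind() { return "LSMInvertedIndex"; }
    }

    /** Hypothetical manager: resource name -> resource ID -> open index. */
    static final class ToyLifecycleManager {
        private final Map<String, Long> resourceIdsByName = new HashMap<>();
        private final Map<Long, Index> indexesById = new HashMap<>();

        void register(String resourceName, long resourceId, Index index) {
            resourceIdsByName.put(resourceName, resourceId);
            indexesById.put(resourceId, index);
        }

        // Simulates the assumed fault: a cached name now maps to another index's ID.
        void corruptCachedId(String resourceName, long wrongResourceId) {
            resourceIdsByName.put(resourceName, wrongResourceId);
        }

        Index getIndex(String resourceName) {
            Long id = resourceIdsByName.get(resourceName);
            return id == null ? null : indexesById.get(id);
        }
    }

    /** Fails fast with a descriptive message when the resolved index has the wrong type. */
    static <T extends Index> T getTypedIndex(ToyLifecycleManager mgr, String resourceName,
            Class<T> expected) {
        Index index = mgr.getIndex(resourceName);
        if (!expected.isInstance(index)) {
            String actual = index == null ? "none" : index.kind();
            throw new IllegalStateException("Resource '" + resourceName + "' resolved to "
                    + actual + " but " + expected.getSimpleName() + " was expected");
        }
        return expected.cast(index);
    }

    /** Builds a manager in which the inverted index name points at the B-tree's ID. */
    static ToyLifecycleManager buildManagerWithStaleId() {
        ToyLifecycleManager mgr = new ToyLifecycleManager();
        mgr.register("ds/btree_sidx", 1L, new BTreeIndex());
        mgr.register("ds/keyword_sidx", 2L, new InvertedIndex());
        mgr.corruptCachedId("ds/keyword_sidx", 1L);
        return mgr;
    }

    public static void main(String[] args) {
        ToyLifecycleManager mgr = buildManagerWithStaleId();
        // The unchecked lookup silently hands back the B-tree.
        System.out.println(mgr.getIndex("ds/keyword_sidx").kind());
        try {
            getTypedIndex(mgr, "ds/keyword_sidx", InvertedIndex.class);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the unchecked lookup returns the B-tree without complaint, which mirrors how the real failure only surfaces later as an ArrayIndexOutOfBoundsException in the bulk loader. The checked lookup is one possible defensive measure, not the fix that was applied to this issue.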

              People

              • Assignee: Murtadha Makki Al Hubail (mhubail)
              • Reporter: Taewoo Kim (wangsaeu)
              • Votes: 0
              • Watchers: 3
