  1. Apache AsterixDB
  2. ASTERIXDB-1152

DatasetLifecycleManager returns wrong index because of resourceID fetching issues

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed

    Description

      When a dataset has multiple secondary indexes and you try to bulkload data into it, an exception is thrown. The cause that Young-Seok and I observed is that IndexLifecycleManager returns the wrong index because of resourceID fetching issues. Specifically, if an index file already exists on disk, it is cached as a local resource. When indexDataFlowHelper tries to get the index by dataset and resource name, IndexLifecycleManager returns a resource ID that belongs to a different existing index (e.g., an LSMBTree index is returned for LSMInvertedIndexDataflowHelper).
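
      The report does not say exactly how the wrong resource ID is selected. As a rough illustration only, the sketch below shows one way a cached lookup "by dataset and resource name" could hand back another index's resource: a cache that effectively keys on the dataset alone. All class and method names here (ResourceLookupSketch, LocalResource, DatasetKeyedCache, NameKeyedCache) are hypothetical and are not AsterixDB or Hyracks APIs.

```java
import java.util.HashMap;
import java.util.Map;

public class ResourceLookupSketch {
    // Hypothetical stand-in for a cached local resource; not an AsterixDB class.
    static final class LocalResource {
        final long resourceId;
        final String resourceName;
        final String indexType;

        LocalResource(long resourceId, String resourceName, String indexType) {
            this.resourceId = resourceId;
            this.resourceName = resourceName;
            this.indexType = indexType;
        }
    }

    // Assumed faulty variant: the key ignores the resource name, so every
    // index of a dataset resolves to whichever resource was cached first.
    static final class DatasetKeyedCache {
        private final Map<String, LocalResource> byDataset = new HashMap<>();

        void register(String dataset, LocalResource r) {
            byDataset.putIfAbsent(dataset, r);
        }

        LocalResource lookup(String dataset, String resourceName) {
            return byDataset.get(dataset); // resourceName is never consulted
        }
    }

    // Variant that keys on dataset plus resource name, so each index
    // resolves to its own resource ID.
    static final class NameKeyedCache {
        private final Map<String, LocalResource> byName = new HashMap<>();

        void register(String dataset, LocalResource r) {
            byName.put(dataset + "/" + r.resourceName, r);
        }

        LocalResource lookup(String dataset, String resourceName) {
            return byName.get(dataset + "/" + resourceName);
        }
    }

    public static void main(String[] args) {
        LocalResource btree = new LocalResource(1L, "btreeIdx", "LSMBTree");
        LocalResource inverted = new LocalResource(2L, "invIdx", "LSMInvertedIndex");

        DatasetKeyedCache faulty = new DatasetKeyedCache();
        faulty.register("ds", btree);
        faulty.register("ds", inverted);

        NameKeyedCache keyed = new NameKeyedCache();
        keyed.register("ds", btree);
        keyed.register("ds", inverted);

        // The inverted-index helper asks for "invIdx" in both cases.
        System.out.println("dataset-keyed: " + faulty.lookup("ds", "invIdx").indexType);
        System.out.println("name-keyed:    " + keyed.lookup("ds", "invIdx").indexType);
    }
}
```

      In this sketch the dataset-keyed cache returns the LSMBTree resource for the inverted-index request, which mirrors the symptom described above; the actual fix in AsterixDB may differ.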

      You can reproduce this issue by running the ExecutionTest case that creates multiple secondary indexes and then does a bulkload: index-leftouterjoin/probe-pidx-with-join-btree-sidx1.

      You will then see the following exception, which results from a mismatch between the expected index type and the actual index.

      org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.ArrayIndexOutOfBoundsException: 2
      at org.apache.hyracks.dataflow.std.sort.ExternalSortRunMerger.process(ExternalSortRunMerger.java:215)
      at org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:194)
      at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:85)
      at org.apache.hyracks.control.nc.Task.run(Task.java:255)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
      Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
      at org.apache.hyracks.storage.am.common.tuples.TypeAwareTupleWriter.getFieldSlotsBytes(TypeAwareTupleWriter.java:126)
      at org.apache.hyracks.storage.am.common.tuples.TypeAwareTupleWriter.bytesRequired(TypeAwareTupleWriter.java:40)
      at org.apache.hyracks.storage.am.lsm.btree.tuples.LSMBTreeTupleWriter.bytesRequired(LSMBTreeTupleWriter.java:43)
      at org.apache.hyracks.storage.am.btree.frames.BTreeNSMLeafFrame.getBytesRequriedToWriteTuple(BTreeNSMLeafFrame.java:54)
      at org.apache.hyracks.storage.am.btree.impls.BTree$BTreeBulkLoader.add(BTree.java:974)
      at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree$LSMBTreeBulkLoader.add(LSMBTree.java:690)
      at org.apache.hyracks.storage.am.common.dataflow.IndexBulkLoadOperatorNodePushable.nextFrame(IndexBulkLoadOperatorNodePushable.java:95)
      at org.apache.hyracks.dataflow.common.comm.io.AbstractFrameAppender.flush(AbstractFrameAppender.java:83)
      at org.apache.hyracks.dataflow.std.sort.AbstractFrameSorter.flush(AbstractFrameSorter.java:176)
      at org.apache.hyracks.dataflow.std.sort.ExternalSortRunMerger.process(ExternalSortRunMerger.java:128)
      ... 6 more
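
      The trace ends in a tuple writer computing field-slot bytes, which is consistent with a writer configured for one index receiving tuples shaped for another. The sketch below is an assumption about that mechanism, not the Hyracks implementation: a hypothetical writer holds one per-field flag array sized for the index it was built for, and walks it using the incoming tuple's field count. SketchTupleWriter and the 2-byte slot size are invented for illustration.

```java
public class FieldCountMismatchSketch {
    // Hypothetical writer: one "fixed length" flag per field the index expects.
    static final class SketchTupleWriter {
        private final boolean[] fixedLength;

        SketchTupleWriter(boolean[] fixedLength) {
            this.fixedLength = fixedLength;
        }

        // Each variable-length field in the tuple needs an offset slot.
        // The loop trusts the tuple's field count, so a tuple with more
        // fields than the writer was configured for indexes past the array.
        int fieldSlotsBytes(int tupleFieldCount) {
            int slots = 0;
            for (int i = 0; i < tupleFieldCount; i++) {
                if (!fixedLength[i]) {
                    slots++;
                }
            }
            return slots * 2; // assumed 2 bytes per slot
        }
    }

    public static void main(String[] args) {
        // Writer built for a two-field index.
        SketchTupleWriter writer = new SketchTupleWriter(new boolean[] {false, true});

        // A matching two-field tuple works.
        System.out.println("two-field tuple: " + writer.fieldSlotsBytes(2) + " slot bytes");

        // A three-field tuple meant for a different index fails at field 2.
        try {
            writer.fieldSlotsBytes(3);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("three-field tuple failed: " + e.getMessage());
        }
    }
}
```

      Under these assumptions the failing position is 2, the first field beyond what the two-field writer knows about, which matches the index reported in the exception above.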

            People

              mhubail Murtadha Makki Al Hubail
              wangsaeu Taewoo Kim
