Accumulo / ACCUMULO-3646

Duplicate entries when iterator emits entries past seek() range

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.6.1
    • Fix Version/s: 1.7.0
    • Component/s: docs
    • Labels: None
    • Environment: Ubuntu 14.04, Accumulo 1.6.1, Hadoop 2.6.0, Zookeeper 3.4.6

    Description

      The documentation for SortedKeyValueIterator's seek() method states that an iterator may return keys past the range passed to seek(). However, if an iterator configured at scan time returns keys past that range, the client sees those keys multiple times when it uses a BatchScanner. This does not occur when the client uses a Scanner. The behavior is unrelated to the VersioningIterator and to the entries actually stored in the table. It also affects MiniAccumulo.

      If this is intended, we should update the SortedKeyValueIterator seek() documentation with a warning that returning keys past the seek() range may result in a client seeing duplicate keys. If this is not intended, then it is a bug.
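      The effect described above can be reproduced in a toy model. The sketch below uses plain Java collections, not the Accumulo API: the class SeekRangeDemo, the RowRange type, and the split point "f" are all assumptions made for illustration (the logs show the real split at f%00). It assumes one seek() per range and an iterator that honors the start of the range but not its end, and it compares that with a version that stops at the end of the range.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Toy model only: plain Java collections, not the Accumulo API. */
public class SeekRangeDemo {

    /** Half-open row range [start, end); null means unbounded on that side. */
    public static final class RowRange {
        final String start;
        final String end;

        public RowRange(String start, String end) {
            this.start = start;
            this.end = end;
        }

        boolean beforeStart(String row) {
            return start != null && row.compareTo(start) < 0;
        }

        boolean afterEnd(String row) {
            return end != null && row.compareTo(end) >= 0;
        }
    }

    /** Rows the hypothetical injecting iterator emits, in sorted order. */
    static final TreeSet<String> INJECTED = new TreeSet<>(Arrays.asList("a1", "c1", "m1"));

    /** Models one seek(): skips rows before the range start, optionally stops at its end. */
    public static List<String> seek(RowRange range, boolean clipToRange) {
        List<String> out = new ArrayList<>();
        for (String row : INJECTED) {
            if (range.beforeStart(row)) {
                continue;
            }
            if (clipToRange && range.afterEnd(row)) {
                break; // rows are sorted, so nothing later can be in range
            }
            out.add(row);
        }
        return out;
    }

    /** Models a client that concatenates the results of one seek() per range. */
    public static List<String> scanAll(boolean clipToRange) {
        List<RowRange> ranges = Arrays.asList(new RowRange(null, "f"), new RowRange("f", null));
        List<String> results = new ArrayList<>();
        for (RowRange range : ranges) {
            results.addAll(seek(range, clipToRange));
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println("unclipped: " + scanAll(false)); // [a1, c1, m1, m1]
        System.out.println("clipped:   " + scanAll(true));  // [a1, c1, m1]
    }
}
```

      In this model the duplicate disappears once each seek() stops at the end of its own range. Whether a real iterator should clip this way, or the documentation should only warn about it, is the question this issue raises.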

      Test code: See InjectTest

      • method testInjectOnScan_Empty fails because it uses a BatchScanner
      • method testInjectOnScan_Empty_Reg passes because it uses a Scanner

      In these methods, the InjectIterator emits entries that go beyond the seek() range. We confirm this by placing a DebugIterator directly after it in the iterator stack.

      Logs when using the BatchScanner. Notice that the "m1" row is returned twice, once after each seek():

      2015-03-05 06:05:34,768 [graphulo.BranchIterator] INFO : class edu.mit.ll.graphulo.InjectIterator: init on scope scan
      2015-03-05 06:05:34,768 [graphulo.BranchIterator] INFO : class edu.mit.ll.graphulo.InjectIterator: init on scope scan
      2015-03-05 06:05:34,770 [iterators.DebugIterator] DEBUG: init(edu.mit.ll.graphulo.InjectIterator@e9fe846, {}, org.apache.accumulo.tserver.TabletIteratorEnvironment@b99fd03)
      2015-03-05 06:05:34,771 [iterators.DebugIterator] DEBUG: 0x516E9F1F seek((-inf,f%00; : [] 9223372036854775807 false), [], false)
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F hasTop() --> true
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopKey() --> a1 colF3:colQ3 [] 1425553534769 false
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F hasTop() --> true
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopKey() --> a1 colF3:colQ3 [] 1425553534769 false
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopValue() --> 1
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F next()
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F hasTop() --> true
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopKey() --> c1 colF3:colQ3 [] 1425553534769 false
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F hasTop() --> true
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopKey() --> c1 colF3:colQ3 [] 1425553534769 false
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F hasTop() --> true
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopKey() --> c1 colF3:colQ3 [] 1425553534769 false
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopValue() --> 1
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F next()
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F hasTop() --> true
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopKey() --> m1 colF3:colQ3 [] 1425553534769 false
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F hasTop() --> true
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopKey() --> m1 colF3:colQ3 [] 1425553534769 false
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F hasTop() --> true
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopKey() --> m1 colF3:colQ3 [] 1425553534769 false
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopValue() --> 1
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F next()
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F hasTop() --> false
      2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F hasTop() --> false
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x516E9F1F hasTop() --> false
      2015-03-05 06:05:34,770 [iterators.DebugIterator] DEBUG: init(edu.mit.ll.graphulo.InjectIterator@2528a1f1, {}, org.apache.accumulo.tserver.TabletIteratorEnvironment@244a532a)
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA seek([f%00; : [] 9223372036854775807 false,+inf), [], false)
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA hasTop() --> true
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA getTopKey() --> m1 colF3:colQ3 [] 1425553534769 false
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA hasTop() --> true
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA getTopKey() --> m1 colF3:colQ3 [] 1425553534769 false
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA hasTop() --> true
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA getTopKey() --> m1 colF3:colQ3 [] 1425553534769 false
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA getTopValue() --> 1
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA next()
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA hasTop() --> false
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA hasTop() --> false
      2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA hasTop() --> false
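      Logs like the one above can be checked mechanically. The following sketch is a hypothetical helper, not part of Accumulo or of the test code: it assumes that each getTopValue() call marks one emitted entry and that the most recent getTopKey() on the same DebugIterator instance names its row. It counts how often each row is emitted across all instances.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: counts rows emitted in DebugIterator log lines. */
public class DebugLogSummary {

    // Captures the instance id, the call name, and the first token after "-->".
    private static final Pattern CALL = Pattern.compile(
            "DEBUG: (0x[0-9A-F]+) (getTopKey|getTopValue)\\(\\)(?: --> (\\S+))?");

    /** Returns row -> number of getTopValue() calls made while that row was on top. */
    public static Map<String, Integer> countEmittedRows(List<String> lines) {
        Map<String, String> lastRowByInstance = new HashMap<>();
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = CALL.matcher(line);
            if (!m.find()) {
                continue; // init, seek, hasTop and next lines are ignored
            }
            String instance = m.group(1);
            if (m.group(2).equals("getTopKey")) {
                lastRowByInstance.put(instance, m.group(3));
            } else {
                String row = lastRowByInstance.get(instance);
                if (row != null) {
                    counts.merge(row, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Four lines taken from the log above, two per DebugIterator instance.
        List<String> sample = Arrays.asList(
            "2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopKey() --> m1 colF3:colQ3 [] 1425553534769 false",
            "2015-03-05 06:05:34,772 [iterators.DebugIterator] DEBUG: 0x516E9F1F getTopValue() --> 1",
            "2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA getTopKey() --> m1 colF3:colQ3 [] 1425553534769 false",
            "2015-03-05 06:05:34,773 [iterators.DebugIterator] DEBUG: 0x5DBB88BA getTopValue() --> 1");
        System.out.println(countEmittedRows(sample)); // {m1=2}
    }
}
```

      Any row with a count above 1 was emitted by more than one seek(), which is the duplicate the BatchScanner client sees.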
      

            People

              shutchis Shana Hutchison
              shutchis Shana Hutchison

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m