Cassandra / CASSANDRA-2316

NoSuchElement exception on node which is streaming a repair

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 0.7.5
    • Component/s: Core
    • Labels:

      Description

      Running latest SVN snapshot of 0.7.

      When I ran a repair on a node, that node's neighbor threw the following exception. Let me know what other info could be helpful.

       INFO 23:43:44,358 Streaming to /10.251.166.15
      ERROR 23:50:21,321 Fatal exception in thread Thread[CompactionExecutor:1,1,main]
      java.util.NoSuchElementException
              at com.google.common.collect.AbstractIterator.next(AbstractIterator.java:146)
              at org.apache.cassandra.service.AntiEntropyService$Validator.add(AntiEntropyService.java:366)
              at org.apache.cassandra.db.CompactionManager.doValidationCompaction(CompactionManager.java:825)
              at org.apache.cassandra.db.CompactionManager.access$800(CompactionManager.java:56)
              at org.apache.cassandra.db.CompactionManager$6.call(CompactionManager.java:358)
              at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
              at java.util.concurrent.FutureTask.run(FutureTask.java:166)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
              at java.lang.Thread.run(Thread.java:636)
      
      1. 2316-assert.txt
        1 kB
        Jonathan Ellis

        Issue Links

          Activity

          Transition: Open -> Resolved
          Time in source status: 37d 15h 51m
          Execution times: 1
          Last executer: Jonathan Ellis
          Last execution date: 19/Apr/11 00:04
          Gavin made changes -
          Workflow: patch-available, re-open possible [ 12752735 ] -> reopen-resolved, no closed status, patch-avail, testing [ 12758415 ]
          Gavin made changes -
          Workflow: no-reopen-closed, patch-avail [ 12607500 ] -> patch-available, re-open possible [ 12752735 ]
          Hudson added a comment -

          Integrated in Cassandra-0.7 #447 (See https://hudson.apache.org/hudson/job/Cassandra-0.7/447/)

          Jonathan Ellis made changes -
          Reviewer: stu -> stuhood
          Jonathan Ellis made changes -
          Status: Open [ 1 ] -> Resolved [ 5 ]
          Assignee: Stu Hood [ stuhood ] -> Jonathan Ellis [ jbellis ]
          Reviewer: stu
          Resolution: Fixed [ 1 ]
          Stu Hood added a comment -

          +1 For the assert.

          Stu Hood made changes -
          Link: This issue relates to CASSANDRA-2324 [ CASSANDRA-2324 ]
          Jonathan Ellis made changes -
          Attachment: 2316-assert.txt [ 12474026 ]
          Jonathan Ellis added a comment -

          Proposed assert attached.

          Stu Hood added a comment -

          Order matters, because there will be up to 2^16 invalid ranges. If keys arrive out of order we will consume ranges that should have contained keys, possibly leading us to consume all invalid ranges.

          Either way, an assert that keys are arriving in order would be handy here.
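
To illustrate the ordering concern above, here is a minimal sketch, not the Cassandra code and not the attached 2316-assert.txt patch. It assumes a toy token space of three integer ranges, each covering (left, right], and a hypothetical `feed` method that consumes ranges forwards only. It uses an explicit check rather than the Java `assert` keyword so the behaviour does not depend on running with `-ea`.

```java
class KeyOrderSketch {
    // Hypothetical ranges covering the toy token space (0, 30]; each entry is (left, right].
    static final int[][] RANGES = { {0, 10}, {10, 20}, {20, 30} };

    /**
     * Feeds tokens to a forward-only range cursor.
     * Returns "ok", "ranges exhausted", or an out-of-order message when checkOrder is set.
     */
    static String feed(boolean checkOrder, int... tokens) {
        int index = 0;
        int lastToken = Integer.MIN_VALUE;
        for (int token : tokens) {
            // The ordering check: fail fast with a clear message.
            if (checkOrder && token < lastToken) {
                return "out of order: " + token + " after " + lastToken;
            }
            lastToken = token;
            // Ranges are only ever consumed forwards, so a token that belongs to an
            // already-consumed range can never be matched again.
            while (!(token > RANGES[index][0] && token <= RANGES[index][1])) {
                index++;
                if (index == RANGES.length) {
                    return "ranges exhausted";
                }
            }
        }
        return "ok";
    }

    public static void main(String[] args) {
        System.out.println(feed(false, 5, 15, 25)); // ok
        System.out.println(feed(false, 25, 5));     // ranges exhausted
        System.out.println(feed(true, 25, 5));      // out of order: 5 after 25
    }
}
```

In this toy model the ranges cover every token, yet a single out-of-order key still runs the cursor off the end, which is the failure mode an ordering check would surface earlier.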

          Jonathan Ellis added a comment -

          Since we iterate over each key in the CF, order shouldn't actually matter, should it?

          Stu Hood added a comment -

          I wonder if this is a keys-out-of-order problem?

          Stu Hood added a comment -

          "Invalid" ranges in the tree are ranges that need to be hashed. The idea was that the tree could be persisted between repair sessions, and ranges would be invalidated as writes arrived: then the validation compaction would only need to compact invalid ranges of the tree.

          In the current implementation, the tree will only contain invalid ranges, since it is being created from scratch for every repair.
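
As a rough illustration of why a freshly built tree leaves no row uncovered, the sketch below builds a toy tree by halving an integer token space and marks every leaf invalid. The halving scheme, the `Leaf` class, and the method names are assumptions made for this example, not the actual tree implementation in Cassandra.

```java
import java.util.ArrayList;
import java.util.List;

class FreshTreeSketch {
    /** Hypothetical leaf covering (left, right]; invalid means "still needs hashing". */
    static final class Leaf {
        final int left;
        final int right;
        final boolean invalid;

        Leaf(int left, int right, boolean invalid) {
            this.left = left;
            this.right = right;
            this.invalid = invalid;
        }

        boolean contains(int token) {
            return token > left && token <= right;
        }
    }

    /** Builds a tree from scratch: 2^depth leaves over (left, right], all invalid. */
    static List<Leaf> freshTree(int left, int right, int depth) {
        List<Leaf> leaves = new ArrayList<>();
        split(left, right, depth, leaves);
        return leaves;
    }

    private static void split(int left, int right, int depth, List<Leaf> out) {
        if (depth == 0) {
            out.add(new Leaf(left, right, true)); // nothing has been hashed yet
            return;
        }
        int mid = left + (right - left) / 2;
        split(left, mid, depth - 1, out);
        split(mid, right, depth - 1, out);
    }

    /** Counts the invalid leaves whose range contains the token. */
    static int invalidLeavesContaining(List<Leaf> leaves, int token) {
        int count = 0;
        for (Leaf leaf : leaves) {
            if (leaf.invalid && leaf.contains(token)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Leaf> leaves = freshTree(0, 1024, 4);
        System.out.println(leaves.size());                         // 16
        System.out.println(invalidLeavesContaining(leaves, 1));    // 1
        System.out.println(invalidLeavesContaining(leaves, 1024)); // 1
    }
}
```

Under these assumptions every token in the space lands in exactly one invalid leaf, so "only invalid ranges" and "all ranges" coincide for a tree that is rebuilt on each repair.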

          Jonathan Ellis made changes -
          Fix Version/s: 0.7.5 [ 12316288 ]
          Affects Version/s: 0.6 [ 12314361 ]
          Affects Version/s: 0.7.3 [ 12316182 ]
          Priority: Minor [ 4 ] -> Major [ 3 ]
          Component/s: Core [ 12312978 ]
          Jonathan Ellis added a comment -

          Looks like this dates back to 0.6.

          Jonathan Ellis made changes -
          Assignee: Stu Hood [ stuhood ]
          Jonathan Ellis added a comment -

          The loop in validator.add apparently assumes that some range will contain any given row. But

          • ranges is supposed to hold only the "invalid" ranges, not all ranges
          • comments in validator.add say it is called for each row in the CF
          • so any rows that are not part of an "invalid" range will cause this exception

          So either my superficial understanding of what "invalid" ranges are is broken, or the comments are wrong, or I'm surprised we're not hitting this a lot more frequently.
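
The failure mode described above can be reproduced with a small sketch. This is not the AntiEntropyService code: `Range`, `Validator`, and `placesAll` are hypothetical stand-ins, and tokens are plain integers. The only behaviour it assumes is a loop that keeps calling `next()` on a range iterator until some range contains the row.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class ValidatorLoopSketch {
    /** Hypothetical range over integer tokens, covering (left, right]. */
    static final class Range {
        final int left;
        final int right;

        Range(int left, int right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(int token) {
            return token > left && token <= right;
        }
    }

    /** Hypothetical validator that walks a fixed list of "invalid" ranges. */
    static final class Validator {
        private final Iterator<Range> ranges;
        private Range current;

        Validator(List<Range> invalidRanges) {
            this.ranges = invalidRanges.iterator();
            this.current = ranges.next();
        }

        /** Advances until some range contains the token; never checks hasNext(). */
        void add(int token) {
            while (!current.contains(token)) {
                current = ranges.next(); // NoSuchElementException once the ranges run out
            }
        }
    }

    /** Returns true if every token was placed, false if the ranges were exhausted. */
    static boolean placesAll(List<Range> invalidRanges, int... tokens) {
        Validator validator = new Validator(invalidRanges);
        try {
            for (int token : tokens) {
                validator.add(token);
            }
            return true;
        } catch (NoSuchElementException e) {
            return false;
        }
    }

    /** Two invalid ranges that leave tokens above 20 uncovered. */
    static List<Range> partialCoverage() {
        return Arrays.asList(new Range(0, 10), new Range(10, 20));
    }

    public static void main(String[] args) {
        System.out.println(placesAll(partialCoverage(), 5, 15)); // true
        System.out.println(placesAll(partialCoverage(), 5, 25)); // false
    }
}
```

A row whose token falls outside every listed range drains the iterator, which matches the `NoSuchElementException` raised from `AbstractIterator.next` in the reported stack trace.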

          Jason Harvey created issue -

            People

            • Assignee: Jonathan Ellis
            • Reporter: Jason Harvey
            • Reviewer: Stu Hood
            • Votes: 0
            • Watchers: 1
