Cassandra / CASSANDRA-4259

Bug in SSTableReader.getSampleIndexesForRanges(...) causes uneven InputSplits generation for Hadoop mappers

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Fixed
    • 1.1.1
    • None
    • None
    • Environment: small Cassandra cluster with 2 nodes, version 1.1.0.

      Tokens: 0, 85070591730234615865843651857942052864

      Hadoop 1.0.1 and Pig 0.10.0.

    • Normal

    Description

      Running a simple MapReduce job on a Cassandra column family creates multiple small mappers for one half of the ring and one big mapper for the other half. The upper part (85... - 0) is cut into smaller slices, while the lower part (0 - 85...) generates one big input slice. Having one mapper process half of the ring is hugely inefficient. The progress meter for this mapper is also incorrect: it goes to 100% in a couple of seconds, then stays at 100% for an hour or two.

      I've investigated the problem a bit. I think it is related to incorrect output from 'nodetool rangekeysample'. On the node responsible for the part (0 - 85...) the output is empty! On the other node it works fine.

      I think the bug is in SSTableReader.getSampleIndexesForRanges(...). These two lines:

      RowPosition leftPosition = range.left.maxKeyBound();
      RowPosition rightPosition = range.left.maxKeyBound();

      should be changed to:

      RowPosition leftPosition = range.left.maxKeyBound();
      RowPosition rightPosition = range.right.maxKeyBound();

      After that fix the output of nodetool is correct and the whole ring is split into small mappers.
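
      To illustrate why the duplicated range.left yields an empty sample, here is a minimal sketch. It does not use Cassandra's classes: SampleRangeSketch, firstIndexAfter, sampleIndexes, and the plain long tokens in a sorted list are all hypothetical names and simplifications chosen for illustration. It only shows that resolving both ends of a non-wrapping range from the same bound gives an empty index interval.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the index-sample lookup; not Cassandra's actual code.
class SampleRangeSketch {
    // Index of the first sample strictly greater than the given token.
    static int firstIndexAfter(List<Long> samples, long token) {
        int idx = Collections.binarySearch(samples, token);
        return idx >= 0 ? idx + 1 : -idx - 1;
    }

    // Returns {from, to} with 'to' exclusive: the samples that fall in (leftBound, rightBound].
    static int[] sampleIndexes(List<Long> samples, long leftBound, long rightBound) {
        int from = firstIndexAfter(samples, leftBound);
        int to = firstIndexAfter(samples, rightBound);
        return new int[] { from, to };
    }

    // Number of samples covered by an index interval.
    static int count(int[] idx) {
        return Math.max(0, idx[1] - idx[0]);
    }

    public static void main(String[] args) {
        List<Long> samples = Arrays.asList(10L, 20L, 30L, 40L, 50L);
        long left = 0L;
        long right = 45L;

        // Buggy variant: both bounds are derived from the left token.
        int buggy = count(sampleIndexes(samples, left, left));
        // Fixed variant: the right bound is derived from the right token.
        int fixed = count(sampleIndexes(samples, left, right));

        System.out.println("buggy sample count: " + buggy);
        System.out.println("fixed sample count: " + fixed);
    }
}
```

      With no sample keys reported for a range, whatever consumes the samples has nothing to split on, which would be consistent with the single large input slice described above.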

      The other half of the ring works fine because of an extra 'if' in the code:

      int right = Range.isWrapAround(range.left, range.right)...

      As a result, the bug does not show up in a one-node cluster or in the "last" ring partition in multi-node clusters.
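
      A rough sketch of how a wrap-around check could mask the bug follows. It is an assumption about the shape of the logic, not the real method body: WrapAroundSketch, isWrapAround, and sampleCount are hypothetical, tokens are plain longs on a 0..100 ring, and the wrapping case is simplified by assuming the right token is the minimum token (as in the two-node cluster above), so nothing needs to be sampled after the wrap.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; names and logic are assumptions, not Cassandra's code.
class WrapAroundSketch {
    // A range (left, right] wraps around the ring when left is not below right.
    static boolean isWrapAround(long left, long right) {
        return left >= right;
    }

    // Index of the first sample strictly greater than the given token.
    static int firstIndexAfter(List<Long> samples, long token) {
        int idx = Collections.binarySearch(samples, token);
        return idx >= 0 ? idx + 1 : -idx - 1;
    }

    // Counts samples for (left, right]; 'buggy' reuses the left token for the right bound.
    static int sampleCount(List<Long> samples, long left, long right, boolean buggy) {
        long rightBound = buggy ? left : right;
        int from = firstIndexAfter(samples, left);
        // For a wrapping range the right bound is never consulted,
        // so a wrong rightBound cannot affect the result.
        int to = isWrapAround(left, right)
                ? samples.size()
                : firstIndexAfter(samples, rightBound);
        return Math.max(0, to - from);
    }

    public static void main(String[] args) {
        List<Long> samples = Arrays.asList(10L, 20L, 30L, 40L, 60L, 70L, 80L, 90L);

        // Non-wrapping range (0, 50]: only the fixed variant finds samples.
        System.out.println("(0, 50] buggy: " + sampleCount(samples, 0L, 50L, true));
        System.out.println("(0, 50] fixed: " + sampleCount(samples, 0L, 50L, false));

        // Wrapping range (50, 0]: both variants agree.
        System.out.println("(50, 0] buggy: " + sampleCount(samples, 50L, 0L, true));
        System.out.println("(50, 0] fixed: " + sampleCount(samples, 50L, 0L, false));
    }
}
```

      A one-node cluster owns a single range whose left and right tokens are equal, which this sketch also treats as wrapping, so the wrong right bound would go unnoticed there as well.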

      Can anyone look at it and verify my thoughts? I'm rather new to Cassandra.

          People

            br1985 Bartłomiej Romański
            Jonathan Ellis