CASSANDRA-4834

Old-style mapred interface only populates row key for first column when using wide rows

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 1.1.7
    • Component/s: Hadoop
    • Labels:
      None

      Description

      When using the ColumnFamilyRecordReader with the old-style Hadoop interface to iterate over wide row columns, the row key is only populated on the first column.
      See attached tests.
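
      A minimal sketch of one way this symptom can arise, using only java.nio and no Cassandra or Hadoop classes. It assumes (not confirmed by the text of this ticket) that the old-style next(key, value) call copies the reader's current row key into the caller-owned key buffer with ByteBuffer.put, which advances the source buffer's position. The class and method names below are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Cassandra.
class KeyCopyDemo {
    // Copies src into the caller-owned key buffer and returns the number of
    // bytes the caller can then read. put(src) moves src's position to its
    // limit, so a second call with the same src copies nothing.
    static int copyConsuming(ByteBuffer key, ByteBuffer src) {
        key.clear();
        key.put(src);
        key.flip();
        return key.remaining();
    }

    // Same copy, but reads through a duplicate. The duplicate shares the
    // bytes but has its own position, so src stays readable for later calls.
    static int copyDuplicated(ByteBuffer key, ByteBuffer src) {
        key.clear();
        key.put(src.duplicate());
        key.flip();
        return key.remaining();
    }

    public static void main(String[] args) {
        ByteBuffer key = ByteBuffer.allocate(16);

        // One row key shared by every column of a wide row.
        ByteBuffer rowKey = ByteBuffer.wrap("row1".getBytes(StandardCharsets.UTF_8));
        System.out.println(copyConsuming(key, rowKey)); // 4: first column sees the key
        System.out.println(copyConsuming(key, rowKey)); // 0: later columns see an empty key

        ByteBuffer rowKey2 = ByteBuffer.wrap("row1".getBytes(StandardCharsets.UTF_8));
        System.out.println(copyDuplicated(key, rowKey2)); // 4
        System.out.println(copyDuplicated(key, rowKey2)); // 4
    }
}
```

      Under that assumption, the first column of each wide row would carry the key and every later column of the same row would carry an empty one, which matches the behaviour reported here.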

      1. cassandra-1.1-CASSANDRA-4834.txt
        1 kB
        Ben Kempe
      2. TestJob.java
        3 kB
        Ben Kempe
      3. TestJobOldHadoop.java
        3 kB
        Ben Kempe
      4. trunk-CASSANDRA-4834.txt
        1 kB
        Ben Kempe

        Activity

        Gavin made changes -
        Workflow patch-available, re-open possible [ 12753507 ] reopen-resolved, no closed status, patch-avail, testing [ 12756645 ]
        Gavin made changes -
        Workflow no-reopen-closed, patch-avail [ 12730412 ] patch-available, re-open possible [ 12753507 ]
        Brandon Williams made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Brandon Williams added a comment -

        Committed, thanks.

        Ben Kempe made changes -
        Attachment cassandra-1.1-CASSANDRA-4834.txt [ 12552117 ]
        Ben Kempe added a comment -

        Added a patch for the cassandra-1.1 branch.

        Ben Kempe made changes -
        Comment [ added patch for cassandra-1.1 branch ]
        Ben Kempe made changes -
        Attachment cassandra-1.1-CASSANDRA-4834.txt [ 12552114 ]
        Ben Kempe made changes -
        Attachment cassandra-1.1-CASSANDRA-4834.txt [ 12552114 ]
        Brandon Williams added a comment -

        Ben Kempe can you rebase to 1.1? I don't see any reason to not include this there. Looks good otherwise, thanks!

        Ben Kempe made changes -
        Description When using the ColumnFamilyRecordReader with the old-style Hadoop interface to iterate over wide row columns, the row key is only populated on the first column.
        See attached tests.

        Ben Kempe made changes -
        Attachment trunk-CASSANDRA-4834.txt [ 12549760 ]
        Ben Kempe made changes -
        Attachment trunk-CASSANDRA-4834.txt [ 12550891 ]
        Ben Kempe added a comment -

        lastColumn also needs a duplicated ByteBuffer.
        Otherwise, if the map/reduce code consumes the column name buffer, the expression at
        https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordReader.java#L458
        will evaluate to false and the ColumnFamilyRecordReader (CFRR) can get into an infinite loop.
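
        A small sketch of why a consumed buffer can change the outcome of a comparison, using only java.nio. It does not reproduce the expression at the linked line; it only illustrates the general ByteBuffer behaviour, namely that equals() compares the remaining bytes, so a buffer whose position was advanced by user code no longer equals a fresh buffer with the same contents. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Cassandra.
class ConsumedBufferDemo {
    static ByteBuffer name(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    // Stands in for map/reduce code that reads the column name to the end.
    static void consume(ByteBuffer buf) {
        byte[] sink = new byte[buf.remaining()];
        buf.get(sink); // advances buf's position to its limit
    }

    // The reader hands out its own lastColumn buffer; user code consumes it.
    static boolean equalAfterSharing() {
        ByteBuffer lastColumn = name("col9");
        consume(lastColumn);
        return lastColumn.equals(name("col9"));
    }

    // The reader hands out a duplicate; its own buffer keeps position 0.
    static boolean equalAfterDuplicating() {
        ByteBuffer lastColumn = name("col9");
        consume(lastColumn.duplicate());
        return lastColumn.equals(name("col9"));
    }

    public static void main(String[] args) {
        System.out.println(equalAfterSharing());     // false
        System.out.println(equalAfterDuplicating()); // true
    }
}
```

        Handing out a duplicate keeps the reader's own position independent of whatever the caller does with the buffer, while still sharing the underlying bytes.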

        Jonathan Ellis made changes -
        Fix Version/s 1.1.7 [ 12323354 ]
        Affects Version/s 1.1.0 [ 12317615 ]
        Affects Version/s 1.1.5 [ 12322941 ]
        Jonathan Ellis made changes -
        Assignee Ben Kempe [ bkempe ]
        Reviewer brandon.williams
        Ben Kempe made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Ben Kempe made changes -
        Attachment trunk-CASSANDRA-4834.txt [ 12549760 ]
        Ben Kempe added a comment -

        patch

        Ben Kempe made changes -
        Field Original Value New Value
        Attachment TestJob.java [ 12549758 ]
        Attachment TestJobOldHadoop.java [ 12549759 ]
        Ben Kempe added a comment -

        TestJob uses the new-style Hadoop interface (org.apache.hadoop.mapreduce) and produces the correct output.
        TestJobOldHadoop uses the old-style interface (org.apache.hadoop.mapred) and does not populate the row key for columns after the first.

        Ben Kempe created issue -

          People

          • Assignee:
            Ben Kempe
            Reporter:
            Ben Kempe
            Reviewer:
            Brandon Williams
          • Votes:
            0
            Watchers:
            1
