Cassandra / CASSANDRA-4834

Old-style mapred interface only populates row key for first column when using wide rows

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 1.1.7
    • Component/s: Hadoop
    • Labels: None

      Description

      When using the ColumnFamilyRecordReader with the old-style Hadoop interface to iterate over wide row columns, the row key is only populated on the first column.
      See attached tests.
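The symptom is consistent with a reusable key buffer being filled from a source buffer that is consumed on the first copy. The ticket text does not state the root cause, so the following is a minimal sketch under that assumption, using only `java.nio`. `KeyCopySketch`, `copyConsuming`, and `copyDuplicated` are hypothetical names for illustration and are not taken from `ColumnFamilyRecordReader`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class KeyCopySketch {
    // Fills the caller-owned key buffer directly from src.
    // ByteBuffer.put(ByteBuffer) advances src's position to its limit,
    // so a second call with the same src copies nothing.
    static int copyConsuming(ByteBuffer key, ByteBuffer src) {
        key.clear();
        key.put(src);
        key.flip();
        return key.remaining();
    }

    // Fills the key buffer from a duplicate of src.
    // The duplicate shares the bytes but has its own position, so src stays readable.
    static int copyDuplicated(ByteBuffer key, ByteBuffer src) {
        key.clear();
        key.put(src.duplicate());
        key.flip();
        return key.remaining();
    }

    public static void main(String[] args) {
        ByteBuffer key = ByteBuffer.allocate(16); // stands in for the key passed to next(key, value)

        ByteBuffer rowKey = ByteBuffer.wrap("row1".getBytes(StandardCharsets.UTF_8));
        System.out.println(copyConsuming(key, rowKey)); // 4: first column gets the row key
        System.out.println(copyConsuming(key, rowKey)); // 0: later columns get an empty key

        ByteBuffer rowKey2 = ByteBuffer.wrap("row1".getBytes(StandardCharsets.UTF_8));
        System.out.println(copyDuplicated(key, rowKey2)); // 4
        System.out.println(copyDuplicated(key, rowKey2)); // 4
    }
}
```

With wide rows the same row key is handed out once per column, which is why a consumed source buffer would only show up from the second column onward.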

      1. trunk-CASSANDRA-4834.txt (1 kB, Ben Kempe)
      2. TestJobOldHadoop.java (3 kB, Ben Kempe)
      3. TestJob.java (3 kB, Ben Kempe)
      4. cassandra-1.1-CASSANDRA-4834.txt (1 kB, Ben Kempe)

        Activity

        Ben Kempe added a comment -

        TestJob is using the new mapred interface and produces the correct output.
        TestJobOldHadoop does not populate the row key for columns after the first.

        Ben Kempe added a comment -

        patch

        Ben Kempe added a comment -

        lastColumn also needs a duplicated ByteBuffer.
        Otherwise, if the column name is consumed in the map/reduce code, the expression in
        https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordReader.java#L458
        will evaluate to false and the CFRR can get into an infinite loop.
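The hazard described in this comment can be reproduced in isolation with plain `java.nio`. `ByteBuffer.equals` compares only the remaining bytes, so a buffer whose position was advanced by user code no longer equals a fresh buffer holding the same name. This is a sketch, not the reader's code: the expression at the linked line is not reproduced here, and `LastColumnSketch`, `consume`, and `sameColumn` are hypothetical stand-ins.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LastColumnSketch {
    // Simulates map/reduce code that reads a column name to the end.
    static String consume(ByteBuffer name) {
        byte[] bytes = new byte[name.remaining()];
        name.get(bytes); // advances name's position to its limit
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Hypothetical stand-in for a check that compares the remembered
    // last column name against the first name of the next page.
    static boolean sameColumn(ByteBuffer remembered, ByteBuffer fetched) {
        return remembered.equals(fetched);
    }

    public static void main(String[] args) {
        byte[] raw = "col9".getBytes(StandardCharsets.UTF_8);

        // Case 1: the remembered buffer is the same object handed to user code.
        ByteBuffer shared = ByteBuffer.wrap(raw);
        consume(shared);
        System.out.println(sameColumn(shared, ByteBuffer.wrap(raw))); // false

        // Case 2: a duplicate is remembered before user code sees the name.
        ByteBuffer handedOut = ByteBuffer.wrap(raw);
        ByteBuffer remembered = handedOut.duplicate();
        consume(handedOut);
        System.out.println(sameColumn(remembered, ByteBuffer.wrap(raw))); // true
    }
}
```

If a paging loop relies on such a comparison to recognise the column it has already returned, a check that is always false could plausibly keep the loop from advancing, which matches the infinite loop the comment warns about.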

        Brandon Williams added a comment -

        Ben Kempe, can you rebase to 1.1? I don't see any reason not to include this there. Looks good otherwise, thanks!

        Ben Kempe added a comment -

        added patch for cassandra-1.1 branch

        Brandon Williams added a comment -

        Committed, thanks.


          People

          • Assignee: Ben Kempe
          • Reporter: Ben Kempe
          • Reviewer: Brandon Williams