Cassandra / CASSANDRA-5225

Missing columns, errors when requesting specific columns from wide rows

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Fix Version/s: 1.2.2
    • Component/s: Core
    • Labels:
      None

      Description

      With Cassandra 1.2.1 (and probably 1.2.0), I'm seeing some problems with Thrift queries that request a set of specific column names when the row is very wide.

      To reproduce, I'm inserting 10 million columns into a single row and then randomly requesting three columns by name in a loop. It's common for only one or two of the three columns to be returned. I'm also seeing stack traces like the following in the Cassandra log:

      ERROR 13:12:01,017 Exception in thread Thread[ReadStage:76,5,main]
      java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: org.apache.cassandra.db.ColumnSerializer$CorruptColumnException: invalid column name length 0 (/var/lib/cassandra/data/Keyspace1/CF1/Keyspace1-CF1-ib-5-Data.db, 14035168 bytes remaining)
      	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1576)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: org.apache.cassandra.db.ColumnSerializer$CorruptColumnException: invalid column name length 0 (/var/lib/cassandra/data/Keyspace1/CF1/Keyspace1-CF1-ib-5-Data.db, 14035168 bytes remaining)
      	at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:69)
      	at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:81)
      	at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
      	at org.apache.cassandra.db.CollationController.collectTimeOrderedData(CollationController.java:133)
      	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
      	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1358)
      	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1215)
      	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1127)
      	at org.apache.cassandra.db.Table.getRow(Table.java:355)
      	at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:64)
      	at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1052)
      	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1572)
      	... 3 more
      

      This doesn't seem to happen when the row is smaller, so it might have something to do with incremental large row compaction.

      Attachments

      1. 5225.txt (3 kB, Sylvain Lebresne)
      2. corrected-pycassa-repro.py (1 kB, Daniel Meyer)
      3. pycassa-repro.py (1 kB, Tyler Hobbs)

          Activity

          Daniel Meyer added a comment - edited

          Fixed a small bug in the repro script. Please use the 'corrected' version.

          Daniel Meyer added a comment - Added a dtest to cover this scenario: https://github.com/riptano/cassandra-dtest/blob/75bffeba0af410a41eb97b269ae1c94f4227c312/wide_rows_test.py
          Jonathan Ellis added a comment -

          Nevertheless, the bug fixed here was a regression introduced by CASSANDRA-3885 for 1.2.0.

          Ahmed Bashir added a comment -

          The reporter for CASSANDRA-5210 (marked as a dupe of this) tested as far back as 0.8

          Jonathan Ellis added a comment -

          If you have a test case that fails against 1.1, please post it.

          Ahmed Bashir added a comment -

          This affects 1.1.x as well; will the fix be a part of 1.1.11?

          Sylvain Lebresne added a comment -

          Committed, thanks.

          Brandon Williams added a comment -

          +1

          Elden Bishop added a comment -

          This patch also fixes CASSANDRA-5210. I'll mark that one as a dupe.

          Brandon Williams added a comment -

          I applied the patch correctly, but the bug is in the pycassa script itself... I was hitting an edge case where it asked for the same column twice.
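
The edge case described above (a names query that lists the same column twice) can be avoided by sampling without replacement. The following is a minimal sketch, not the attached pycassa script: `pick_distinct_columns`, `num_columns`, and `k` are hypothetical names chosen for illustration, and it assumes columns are addressed by integer position.

```python
import random

def pick_distinct_columns(num_columns, k=3, rng=random):
    """Pick k distinct column positions out of range(num_columns).

    random.sample draws without replacement, so the same column can
    never be requested twice in one query.
    """
    return rng.sample(range(num_columns), k)

# Hypothetical row width matching the 10 million columns in the report.
columns = pick_distinct_columns(10_000_000)
```

With distinct names in the request, a check that expects exactly `k` columns back no longer produces a false failure when the random draw happens to repeat a name.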

          Mukund added a comment -

          Yes, the patch works with my test code too.

          Sylvain Lebresne added a comment -

          Are you sure you applied the patch correctly? I just tested the pycassa-repro.py test above: it fails every time without the patch but hasn't failed once with the patch.

          Brandon Williams added a comment -

          It still doesn't pass

          Sylvain Lebresne added a comment -

          Attaching a patch for this. Interestingly enough, the IndexHelperTest tests were broken too (which kind of makes it obvious this was doing the wrong thing).

          I haven't tried the pycassa repro script to validate that this fixes things, though.

          Sylvain Lebresne added a comment -

          From just eyeballing the code, I would say that the line at https://github.com/apache/cassandra/blob/cassandra-1.2/src/java/org/apache/cassandra/io/sstable/IndexHelper.java#L179 should be:

          if (!reversed)
          

          i.e. both branches should be inverted. The goal of the lastIndex parameter is to ignore index blocks we know are "behind" us. So when we go forward (not reversed), you'd want to look at [lastIndex, index.size()], not the contrary.
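
The window selection described above can be sketched in a few lines. This is an illustration only, not the IndexHelper code or the attached patch: `search_window`, `index_for`, and `BLOCKS` are hypothetical names, column names are plain integers, and the range [lastIndex, index.size()] is treated as half-open.

```python
from bisect import bisect_left

# Hypothetical index blocks: (first_name, last_name) per block.
BLOCKS = [(1, 5), (6, 10), (11, 15), (16, 20)]

def search_window(last_index, size, is_reversed):
    """Half-open (lo, hi) range of block positions still worth searching."""
    if not is_reversed:
        # Forward: everything before last_index is behind us.
        return last_index, size
    # Reversed: everything after last_index is behind us.
    return 0, last_index + 1

def index_for(name, blocks, last_index, is_reversed=False):
    """Position of the first block in the window whose last name is >= name."""
    lo, hi = search_window(last_index, len(blocks), is_reversed)
    last_names = [last for _, last in blocks]
    return bisect_left(last_names, name, lo, hi)

# Forward names query: 2, then 18, carrying the previous position along.
pos_2 = index_for(2, BLOCKS, last_index=0)
pos_18 = index_for(18, BLOCKS, last_index=pos_2)

# Reversed query: 13, then 7, starting from the last block.
rev_13 = index_for(13, BLOCKS, last_index=len(BLOCKS) - 1, is_reversed=True)
rev_7 = index_for(7, BLOCKS, last_index=rev_13, is_reversed=True)
```

In both directions the window shrinks only on the side that has already been passed, so a later name in the query can still reach the block that holds it.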

          Yuki Morishita added a comment -

          It looks like Cassandra is reading from the wrong column index here (https://github.com/apache/cassandra/blob/cassandra-1.2/src/java/org/apache/cassandra/db/columniterator/SSTableNamesIterator.java#L236).

          Suppose we have column indexes of [[1..5][6..10][11..15][16..20]] (numbers are column names), and we want to 'SELECT 2, 18 FROM CF'.
          First, we check '2' against the indexes and get indexes[0]. Next, we check '18' against the indexes with a lastIndexIdx of 0.
          Now, because we limit the second index check to the sublist indexes[0, lastIndexIdx + 1] here (https://github.com/apache/cassandra/blob/cassandra-1.2/src/java/org/apache/cassandra/io/sstable/IndexHelper.java#L186), it checks against only the first two indexes and gets the wrong index position, indexes[2]. So it thinks '18' is not in the sstable.

          In fact, when I removed the sublisting part from IndexHelper.indexFor, SSTableNamesIterator started returning correct values. But I don't know whether that's the right way to do it. Sylvain Lebresne?
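
The worked example above can be replayed with a small in-memory model. This is a sketch under stated assumptions, not Cassandra's code: `block_for`, `contains`, and `BLOCKS` are hypothetical names, column names are integers, and the restricted window is taken to cover the first two blocks, as the comment describes.

```python
from bisect import bisect_left

# Index blocks from the example: (first_name, last_name) per block.
BLOCKS = [(1, 5), (6, 10), (11, 15), (16, 20)]

def block_for(name, blocks, lo=0, hi=None):
    """Position of the first block in blocks[lo:hi] whose last name is >= name.

    Returns hi when no block inside the window qualifies.
    """
    hi = len(blocks) if hi is None else hi
    last_names = [last for _, last in blocks]
    return bisect_left(last_names, name, lo, hi)

def contains(name, blocks, pos):
    """True when pos is a real block and name falls inside its range."""
    return pos < len(blocks) and blocks[pos][0] <= name <= blocks[pos][1]

# Looking up '2' over the whole index finds the first block.
first = block_for(2, BLOCKS)

# Restricted lookup: only the first two blocks are searched for '18'
# (assumption: the sublist bounds are inclusive, per the comment).
restricted = block_for(18, BLOCKS, 0, first + 2)

# Unrestricted lookup: the whole index is searched for '18'.
full = block_for(18, BLOCKS)

found_restricted = contains(18, BLOCKS, restricted)
found_full = contains(18, BLOCKS, full)
```

In this model the restricted search lands on the [11..15] block, which does not hold 18, while the unrestricted search lands on [16..20], which does.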

          Brandon Williams added a comment - edited

          Bisect says the winner is CASSANDRA-3885, but I never encountered the corrupt sstable exception.

          Tyler Hobbs added a comment -

          Attached python script reproduces the issue with pycassa.


            People

            • Assignee:
              Sylvain Lebresne
              Reporter:
              Tyler Hobbs
              Reviewer:
              Brandon Williams
              Tester:
              Daniel Meyer
            • Votes:
              0
              Watchers:
              10
