  2. CASSANDRA-2628

Empty Result with Secondary Index Queries with "limit 1"


Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Fixed
    • 0.7.6, 0.8.0
    • Feature/2i Index
    • None
    • CentOS 5.5

    • Normal

    Description

      Secondary index queries with "limit 1" return an empty result. Cassandra returns the correct result for limits other than 1 (e.g. limit 2, limit 3, etc.).

      You can reproduce the problem with the programs attached to CASSANDRA-2406.

      • 1. Start a Cassandra cluster. It consists of 3 Cassandra nodes and distributes data with ByteOrderedPartitioner. The initial tokens of the nodes are ["31", "32", "33"].
      • 2. Create the keyspace and column family according to "create_table.cli".
      • 3. Execute "secondary_index_insertv2.py", which inserts a few hundred columns into the cluster.
      • 4. At this point, running the following commands in cassandra-cli returns the correct result.

      % bin/cassandra-cli
      [default@unknown] connect localhost/9160;
      [default@unknown] use SampleKS;
      [default@SampleKS] get SampleCF where up = 'up' limit 3;
      -------------------
      RowKey: 150
      => (column=date, value=150, timestamp=1304937931)
      => (column=up, value=up, timestamp=1304937931)
      -------------------
      RowKey: 151
      => (column=date, value=151, timestamp=1304937932)
      => (column=up, value=up, timestamp=1304937932)
      -------------------
      RowKey: 152
      => (column=date, value=152, timestamp=1304937932)
      => (column=up, value=up, timestamp=1304937932)
      3 Rows Returned.

      On the other hand, if you add the clause "date > 150" and set the limit to "1", you can reproduce the problem.

      [default@SampleKS] get SampleCF where up = 'up' and date > 150 limit 1;
      0 Row Returned.

      Two factors together cause this problem:

      • 1. The first column scanned does not match a specified clause such as "date > 150".
      • 2. The query uses "limit 1".

      Either factor alone does not cause the problem. For example, I get correct data when I change one of them as follows:

      • "limit 1" -> "limit 2"

        [default@SampleKS] get SampleCF where up = 'up' and date > 150 limit 2;
        -------------------
        RowKey: 151
        => (column=date, value=151, timestamp=1304937932)
        => (column=up, value=up, timestamp=1304937932)
        -------------------
        RowKey: 152
        => (column=date, value=152, timestamp=1304937932)
        => (column=up, value=up, timestamp=1304937932)
        2 Rows Returned.

      • "date > 150" -> "date >= 150"

        [default@SampleKS] get SampleCF where up = 'up' and date >= 150 limit 1;
        -------------------
        RowKey: 150
        => (column=date, value=150, timestamp=1304937931)
        => (column=up, value=up, timestamp=1304937931)
        1 Row Returned.
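
      The four outcomes above can be reproduced with a small toy model. This is only a sketch of one plausible mechanism, not Cassandra's code and not taken from the attached patch. It assumes that the scan pages through the index with a page size equal to the limit, that each later page restarts at the last key already seen (inclusive), and that the scan stops when a page contains nothing new. The function name `scan`, the parameter `min_page_size`, and the integer row keys are all hypothetical.

```python
# Toy model only: NOT Cassandra's implementation. All names are hypothetical.

def scan(index_keys, rows, predicate, limit, min_page_size=1):
    """Return up to `limit` keys from `index_keys` whose row satisfies `predicate`.

    Assumed behaviour: pages hold max(limit, min_page_size) index entries,
    every page after the first restarts at the last key seen (inclusive),
    and the scan gives up when a page contains no unseen entry.
    """
    page_size = max(limit, min_page_size)
    results = []
    start = 0          # position in index_keys where the next page begins
    last_seen = None   # last index entry already examined
    while len(results) < limit:
        page = index_keys[start:start + page_size]
        fresh = [k for k in page if k != last_seen]
        if not fresh:
            break  # the page held only the already-seen key: scan stops
        for key in fresh:
            if len(results) < limit and predicate(rows[key]):
                results.append(key)
        last_seen = page[-1]
        start = index_keys.index(last_seen)  # inclusive restart
    return results


# Ten rows shaped like the ones in the report (keys simplified to integers).
rows = {k: {"up": "up", "date": k} for k in range(150, 160)}
index_keys = sorted(rows)  # index entries for up = 'up'

def gt150(row):
    return row["date"] > 150

def ge150(row):
    return row["date"] >= 150

def any_row(row):
    return True

print(scan(index_keys, rows, any_row, 3))  # up = 'up' limit 3
print(scan(index_keys, rows, gt150, 1))    # date > 150 limit 1
print(scan(index_keys, rows, gt150, 2))    # date > 150 limit 2
print(scan(index_keys, rows, ge150, 1))    # date >= 150 limit 1
print(scan(index_keys, rows, gt150, 1, min_page_size=2))
```

      In this model, a page of size 1 that starts at an already-seen key can never contain a new entry, so the scan stalls as soon as the first entry fails the filter. Passing `min_page_size=2` avoids the stall in the toy model. Whether the actual fix works this way is not stated in this report.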

      Attachments

        1. 0001-2628.patch
          5 kB
          Sylvain Lebresne

          People

            slebresne Sylvain Lebresne
            muga_nishizawa Muga Nishizawa
            Sylvain Lebresne
            Jonathan Ellis