  Apache Drill / DRILL-3491

SELECT COUNT(*) FROM HBASE Returns Incorrect Row Count

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.0.0, 1.1.0
    • Fix Version/s: Future
    • Component/s: Storage - HBase
    • Labels:
    • Environment:

      CentOS6.5
      jdk1.8.0_45
      hadoop-2.6.0-cdh5.4.2
      hbase-1.0.0-cdh5.4.2
      IntelliJ14.1.4
      Maven3.0.5

      Description

      Create a table 'test' in HBase with 1 column family and 7 columns.
      Insert 100,000 rows into 'test' using the Java API, with every column set to the same value, "value".

      SELECT COUNT(<all>) FROM hbase.test
      returns an incorrect row count.

      Running count 'test' in the hbase shell returns the correct row count.

      SELECT COUNT(row_key) returns the correct count, and
      SELECT COUNT(<any subset of the columns>) is also correct.

      After clearing the table and inserting only 1,000 rows with the same number of columns, Drill returns the correct count. But after increasing the number of columns to 30, SELECT COUNT(<all>) returns an incorrect row count (only 673).

      Checking the table with count 'test' and scan 'test' in the hbase shell showed nothing unusual.
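
      To narrow down which query shape triggers the mismatch, it may help to run one COUNT query per variant and compare each result against the hbase shell count. The sketch below only builds the query strings. It assumes a column family named "f" and qualifiers "c1", "c2", and so on, none of which are given in this report, and it takes the COUNT(*) form from the issue title as one reading of <all>. The alias.family.qualifier form is how Drill addresses HBase columns.

```java
import java.util.ArrayList;
import java.util.List;

public class CountQueries {
    // Hypothetical name: the report does not state the column family.
    static final String FAMILY = "f";

    /** Column reference in Drill's alias.family.qualifier form (qualifier names are assumed). */
    static String columnRef(int index) {
        return "t." + FAMILY + ".c" + index;
    }

    /** Builds SELECT COUNT(expr) against the 'test' table from the report. */
    static String countQuery(String expr) {
        return "SELECT COUNT(" + expr + ") FROM hbase.test t";
    }

    /** One query for COUNT(*), one for row_key, then one per column. */
    static List<String> variants(int columnCount) {
        List<String> queries = new ArrayList<>();
        queries.add(countQuery("*"));
        queries.add(countQuery("t.row_key"));
        for (int i = 1; i <= columnCount; i++) {
            queries.add(countQuery(columnRef(i)));
        }
        return queries;
    }

    public static void main(String[] args) {
        // 7 columns matches the first scenario in the description.
        for (String query : variants(7)) {
            System.out.println(query);
        }
    }
}
```

      Running the generated queries for both the 7-column and 30-column tables should show whether only the all-columns form disagrees with the hbase shell, as the description suggests.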

            People

            • Assignee: Unassigned
            • Reporter: carrotyiyi Carrot Hu
            • Votes: 0
            • Watchers: 2
