  1. Apache Drill
  2. DRILL-3491

SELECT COUNT(*) FROM HBASE Returns Incorrect Row Count

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.0.0, 1.1.0
    • Fix Version/s: Future
    • Component/s: Storage - HBase
    • Environment: CentOS 6.5
      jdk1.8.0_45
      hadoop-2.6.0-cdh5.4.2
      hbase-1.0.0-cdh5.4.2
      IntelliJ 14.1.4
      Maven 3.0.5

    Description

      Create a table 'test' in HBase with 1 column family and 7 columns.
      Insert 100,000 rows into 'test' using the Java API, with every column set to the same value, "value".
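
      The setup above can be sketched without any HBase dependency by generating the equivalent hbase shell commands (create, put, count). This is only an illustration, not the reporter's actual loader, which used the Java API. The column family name "f", the qualifiers c0..cN and the row key format are assumptions, since the report does not state them.

```java
import java.util.ArrayList;
import java.util.List;

/** Emits hbase shell commands that build a table like the one described above. */
class HBaseReproScript {

    // Assumed names: the report only gives the table name 'test'.
    static final String TABLE = "test";
    static final String FAMILY = "f";

    /** Returns shell commands for a table with rowCount rows and columnCount columns. */
    static List<String> buildCommands(int rowCount, int columnCount) {
        List<String> commands = new ArrayList<>();
        commands.add(String.format("create '%s', '%s'", TABLE, FAMILY));
        for (int row = 0; row < rowCount; row++) {
            // Zero-padded keys keep rows in numeric order under HBase's byte-wise sort.
            String rowKey = String.format("row%06d", row);
            for (int col = 0; col < columnCount; col++) {
                // Every cell holds the same literal, as in the report.
                commands.add(String.format("put '%s', '%s', '%s:c%d', 'value'",
                        TABLE, rowKey, FAMILY, col));
            }
        }
        // The shell's own count is the reference number to compare Drill against.
        commands.add(String.format("count '%s'", TABLE));
        return commands;
    }

    public static void main(String[] args) {
        // Second scenario from the report: 1,000 rows with 30 columns.
        for (String command : buildCommands(1000, 30)) {
            System.out.println(command);
        }
    }
}
```

      Redirect the output to a file and pass that file to hbase shell to load the table. For the 100,000-row case, individual shell puts are slow, so batched puts through the Java client are the more practical route.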

      SELECT COUNT(<all>) FROM hbase.test
      returns an incorrect row count.

      Running count 'test' in the hbase shell returns the correct row count.

      SELECT COUNT(row_key) is correct,
      SELECT COUNT(<Any subset of the columns>) is also correct.

      After clearing the table and inserting only 1,000 rows with the same number of columns, Drill returns the correct count. But after increasing the number of columns to 30, SELECT COUNT(<all>) returns an incorrect row count (only 673).

      Checking the table with count 'test' and scan 'test' in the hbase shell showed nothing unusual.

          People

            Assignee: Unassigned
            Reporter: Carrot Hu (carrotyiyi)