Details
Sub-task
Status: Resolved
Resolution: Duplicate
Description
When you create a TABLE, we insert an empty key value into the first column family, which gives us one key value that we can count on being there for every row. For a VIEW, we don't do that, so we fall back on projecting every column into the scan. If there are lots of columns (for example, 60,000 in [this](https://groups.google.com/forum/_!topic/phoenix-hbase-user/JgQjlqC4-uw) case), the scan is very slow.
Instead, we should project everything only when absolutely necessary, that is, in these cases:
- An IS NULL expression
- A CASE WHEN with an ELSE expression
- Use of a row value constructor
- When a column in the primary key is used
- When there is no WHERE clause
- When there is a GROUP BY of a nullable expression
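The check described by the list above could be sketched roughly as follows. This is only an illustration in Java: the `QueryFeature` enum, the `ViewProjectionSketch` class, and the `mustProjectAllColumns` method are hypothetical names, not part of Phoenix's actual compiler API, and the sketch assumes some earlier step has already collected which of these features a query uses.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; none of these names exist in Phoenix itself.
class ViewProjectionSketch {

    // One constant per case in the list above that forces a full projection.
    enum QueryFeature {
        IS_NULL_EXPRESSION,
        CASE_WHEN_WITH_ELSE,
        ROW_VALUE_CONSTRUCTOR,
        PRIMARY_KEY_COLUMN_USED,
        NO_WHERE_CLAUSE,
        GROUP_BY_NULLABLE_EXPRESSION
    }

    // Returns true when the scan over a VIEW has to project every column.
    // Assumption: the caller has already analyzed the query and recorded
    // which of the listed features it contains.
    static boolean mustProjectAllColumns(Set<QueryFeature> featuresInQuery) {
        return !featuresInQuery.isEmpty();
    }

    public static void main(String[] args) {
        // A query with a WHERE clause on ordinary columns only: none of the cases apply.
        Set<QueryFeature> simpleFilter = EnumSet.noneOf(QueryFeature.class);
        System.out.println(mustProjectAllColumns(simpleFilter)); // prints "false"

        // A query containing an IS NULL expression.
        Set<QueryFeature> withIsNull = EnumSet.of(QueryFeature.IS_NULL_EXPRESSION);
        System.out.println(mustProjectAllColumns(withIsNull)); // prints "true"
    }
}
```

When the method returns false, the scan would only need to project the columns the query actually references, which is the saving this issue is after.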
We could potentially do the same for a TABLE, but the empty key value seems like a better trade-off as far as performance goes. In addition, we need the empty key value because a row cannot exist without at least one key value; without it, we could not support use cases that only define a primary key.