Details
-
Improvement
-
Status: Closed
-
Major
-
Resolution: Fixed
-
4.0.0
-
None
-
None
Description
For many applications one of the column of a table can be (and must be) naturally mapped
to HBase timestamp. What it gives us is the optimization on StoreScanner where HFiles with timestamps out of range of
a Scan operator will be omitted. Let us say that we have time-series type of data (EVENTS) and custom compaction, where we create
series of HFiles with continuous non-overlapping timestamp ranges.
CREATE TABLE IF NOT EXISTS ODS.EVENTS (
METRICID VARCHAR NOT NULL,
METRICNAME VARCHAR,
SERVICENAME VARCHAR NOT NULL,
ORIGIN VARCHAR NOT NULL,
APPID VARCHAR,
IPID VARCHAR,
NVALUE DOUBLE,
TIME TIMESTAMP NOT NULL /+ TIMESTAMP +/,
DATA VARCHAR,
SVALUE VARCHAR
CONSTRAINT PK PRIMARY KEY (METRICID, SERVICENAME, ORIGIN, APPID, IPID, TIME)
) SALT_BUCKETS=40, IMMUTABLE_ROWS=true,VERSIONS=1,DATA_BLOCK_ENCODING='NONE';
Make note on TIME TIMESTAMP NOT NULL /+ TIMESTAMP +/ - this is the Hint to Phoenix that the column
TIME must be mapped to HBase timestamp.
The Query:
Select all events of type 'X' for last 7 days
SELECT * from EVENTS WHERE METRICID = 'X' and TIME < NOW() and TIME > NOW() - 7*24*3600000; (this may be not correct SQL syntax of course)
These types of queries will be efficiently optimized if:
1. Phoenix maps TIME column to HBase timestamp
2. Phoenix smart enough to map WHERE clause on TIME attribute to Scan timerange
Although this :
Properties props = new Properties();
props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts));
Connection conn = DriverManager.connect(myUrl, props);
conn.createStatement().execute("UPSERT INTO myTable VALUES ('a')");
conn.commit();
will work in my case- it may not be efficient from performance point of view because for every INSERT/UPSERT
new Connection object and new Statement is created, beside this we still need the optimization 2. (see above).
Attachments
Attachments
Issue Links
- relates to
-
PHOENIX-967 Consider (some of) SQL 2011 temporal support
- Open