Description
Currently, Tajo uses HDFS as its primary storage. As a data warehouse system, however, Tajo should easily support various data sources.
To this end, I propose a generic storage handler interface that provides the following common storage operations:
- splitting input data
- exposing data locality
- accessing the storage's own catalog (if it provides one)
- creating a table
- removing a table
- adding default table properties and validating properties
- committing, rolling back, and cleaning up output tables
- getting physical table information, such as table volume
- managing a connection pool for connection-based storages
- adding storage-specific rewrite rules
- adding hooks for query phases
- exposing physical properties such as instant random access, indexability, read throughput, and write throughput
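To make the proposal concrete, here is a minimal Java sketch of what such an interface could look like, covering only a subset of the operations listed above (splitting, table creation and removal, default properties, commit and rollback, table volume, and one physical property). Every name in it (`StorageHandler`, `Split`, `InMemoryStorageHandler`, `stageOutput`, the `mem://` URI scheme) is hypothetical and chosen for illustration; none of it is taken from the Tajo code base, and the in-memory implementation exists only to show the contract in action.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a generic storage handler; not Tajo's actual API.
interface StorageHandler {
  /** Splits a table's input data into independently scannable pieces. */
  List<Split> getSplits(URI table, long splitSizeBytes);
  /** Creates the physical table with the given properties. */
  void createTable(URI table, Map<String, String> properties);
  /** Removes the physical table and any staged output. */
  void removeTable(URI table);
  /** Properties applied when the user does not set them explicitly. */
  Map<String, String> defaultProperties();
  /** Makes staged output visible to readers. */
  void commitOutput(URI table);
  /** Discards staged output. */
  void rollbackOutput(URI table);
  /** Physical size of the committed table data in bytes. */
  long getTableVolume(URI table);
  /** Physical property: whether instant random access is possible. */
  boolean supportsRandomAccess();
}

// One unit of input data plus the hosts that hold it (locality hint).
final class Split {
  final URI table;
  final long offset;
  final long length;
  final List<String> hosts;

  Split(URI table, long offset, long length, List<String> hosts) {
    this.table = table;
    this.offset = offset;
    this.length = length;
    this.hosts = hosts;
  }
}

// Toy implementation that only tracks byte counts; for illustration.
final class InMemoryStorageHandler implements StorageHandler {
  private final Map<URI, Long> committed = new HashMap<>();
  private final Map<URI, Long> staged = new HashMap<>();

  public void createTable(URI table, Map<String, String> properties) {
    if (committed.containsKey(table)) {
      throw new IllegalStateException("table already exists: " + table);
    }
    committed.put(table, 0L);
  }

  public void removeTable(URI table) {
    committed.remove(table);
    staged.remove(table);
  }

  public Map<String, String> defaultProperties() {
    Map<String, String> props = new HashMap<>();
    props.put("format", "memory"); // assumed property name
    return props;
  }

  // Helper outside the interface: records output that is not yet visible.
  void stageOutput(URI table, long bytes) {
    staged.merge(table, bytes, Long::sum);
  }

  public void commitOutput(URI table) {
    Long bytes = staged.remove(table);
    if (bytes != null) {
      committed.merge(table, bytes, Long::sum);
    }
  }

  public void rollbackOutput(URI table) {
    staged.remove(table);
  }

  public long getTableVolume(URI table) {
    return committed.getOrDefault(table, 0L);
  }

  public List<Split> getSplits(URI table, long splitSizeBytes) {
    if (splitSizeBytes <= 0) {
      throw new IllegalArgumentException("split size must be positive");
    }
    List<Split> splits = new ArrayList<>();
    long total = getTableVolume(table);
    for (long off = 0; off < total; off += splitSizeBytes) {
      long len = Math.min(splitSizeBytes, total - off);
      splits.add(new Split(table, off, len, Collections.singletonList("localhost")));
    }
    return splits;
  }

  public boolean supportsRandomAccess() {
    return true;
  }
}
```

In this sketch, staged output stays invisible until `commitOutput` is called, so a failed query can call `rollbackOutput` without affecting readers. A connection-based storage would likely need further methods (connection pooling, rewrite rules, query-phase hooks) that are left out here to keep the example short.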
Issue Links
- incorporates
  - TAJO-1163 TableDesc should use URI instead of Path. (Resolved)
- is related to
  - TAJO-1127 Implements HBaseStorageManager (Resolved)
  - TAJO-1123 Use Fragment instead of FileFragment. (Resolved)
  - TAJO-1879 Add enabled and description fields to tablespace entry in storage-site.json (Open)
- is required by
  - TAJO-367 Separate the locality information from Fragment (Resolved)
  - TAJO-1595 Pluggable Storage Handler (Resolved)
1. Add 'UPDATE' statement for OLTP-like storages | Open | Unassigned
2. Add 'Upsert' statement for OLTP-like storages | Open | Unassigned
3. Index support of underlying storages | Open | Unassigned
4. Implement formats loading and registration | Open | Unassigned