Details
-
New Feature
-
Status: Closed
-
Minor
-
Resolution: Fixed
Description
Currently, Apache Spot can be integrated only with physical Hive tables (Parquet).
This works well for new projects and parsers but is inconvenient for existing ones.
Supporting Hive views in Apache Spot would solve at least three problems:
1) Any existing schema can be easily converted to ODM. It only requires mapping ODM fields to the schema-specific ones, without rewriting existing parsers or exporting and re-importing existing data.
2) As a result, data is not duplicated.
3) Two similar data sources (such as Bluecoat and Squid, which are both proxies) can be unified into one view that represents a single kind of data: proxy.
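To illustrate points 1 and 3, here is a minimal sketch that generates the HiveQL for such a view. It only builds the statement as a string and does not connect to Hive. All table names (`bluecoat_logs`, `squid_logs`), source column names, target field names (`event_time`, `src_ip`, `dst_host`), and the `source_type` column are hypothetical placeholders, not actual ODM or parser fields.

```python
def select_for_source(table, mapping, source_label):
    """Build a SELECT that renames schema-specific columns to common field names.

    `mapping` is {target_field: source_column}; both sides are hypothetical here.
    """
    cols = [f"{src} AS {dst}" for dst, src in mapping.items()]
    # Tag each row with its origin so the sources stay distinguishable in the view.
    cols.append(f"'{source_label}' AS source_type")
    return f"SELECT {', '.join(cols)} FROM {table}"


def create_unified_view(view_name, sources):
    """Combine one SELECT per source with UNION ALL inside a single view."""
    selects = [select_for_source(t, m, label) for t, m, label in sources]
    body = "\nUNION ALL\n".join(selects)
    return f"CREATE VIEW IF NOT EXISTS {view_name} AS\n{body}"


# Hypothetical mappings: every source must expose the same target fields
# in the same order, otherwise the UNION ALL branches will not line up.
bluecoat = ("bluecoat_logs",
            {"event_time": "ts", "src_ip": "c_ip", "dst_host": "cs_host"},
            "bluecoat")
squid = ("squid_logs",
         {"event_time": "log_time", "src_ip": "client_addr", "dst_host": "req_host"},
         "squid")

ddl = create_unified_view("proxy", [bluecoat, squid])
print(ddl)
```

No data is copied: the view only stores the query, so the existing tables remain the single source. Older Hive versions may require the `UNION ALL` to be wrapped in a subquery.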
One possible impact is performance. However, Hive views can be partitioned as well; see https://cwiki.apache.org/confluence/display/Hive/PartitionedViews
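A rough sketch of what a partitioned view could look like, based on my reading of the `PARTITIONED ON` and `ALTER VIEW ... ADD PARTITION` syntax described on that wiki page. The view, table, and column names (`proxy_by_day`, `bluecoat_logs`, `dt`, and so on) are hypothetical, and the code only builds the statements as strings without running them against Hive.

```python
def create_partitioned_view(view_name, table, columns, partition_col):
    """Build a CREATE VIEW statement with a PARTITIONED ON clause.

    The partition column is placed last in the SELECT list, since Hive
    expects view partition columns at the end of the view's columns.
    """
    select_cols = ", ".join(list(columns) + [partition_col])
    return (
        f"CREATE VIEW IF NOT EXISTS {view_name} "
        f"PARTITIONED ON ({partition_col}) AS\n"
        f"SELECT {select_cols} FROM {table}"
    )


def add_view_partition(view_name, partition_col, value):
    """Build the ALTER VIEW statement that registers one partition on the view."""
    return (
        f"ALTER VIEW {view_name} ADD IF NOT EXISTS "
        f"PARTITION ({partition_col}='{value}')"
    )


# Hypothetical names throughout; `dt` is assumed to be a partition column
# of the underlying table.
view_ddl = create_partitioned_view(
    "proxy_by_day", "bluecoat_logs", ["ts", "c_ip", "cs_host"], "dt")
partition_ddl = add_view_partition("proxy_by_day", "dt", "2017-01-01")

print(view_ddl)
print(partition_ddl)
```

Whether this actually addresses the performance concern would need to be measured on real data.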