Type: New Feature
Resolution: Won't Fix
Affects Version/s: None
Fix Version/s: None
In the last 18 months, PigLatin has gained significant popularity within the open source community. Many users like its data flow model, its rich type system, and its ability to work with any data available on HDFS or outside it. We have also heard from many users that having Pig speak SQL would bring in many more users. Having a single system that exports multiple interfaces is a big advantage: it guarantees consistent semantics, allows custom code reuse, and reduces the amount of maintenance. This is especially relevant for projects that use both interfaces for different parts of the system. For instance, in a data warehousing system, you would have an ETL component that brings data into the warehouse and a component that analyzes the data and produces reports. PigLatin is uniquely suited for ETL processing, while SQL might be a better fit for report generation.
To start, it would make sense to implement a subset of the SQL-92 standard and to be as standard-compliant as possible. This would include all the standard constructs: select, from, where, group by + having, order by, limit, and join (inner + outer). Several extensions would also be helpful, such as support for Pig's UDFs and possibly streaming, multiquery, and support for Pig's complex types.
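As a rough illustration of the subset listed above, here is one query that exercises select, from, where, join, group by + having, order by, and limit together. It runs against SQLite through Python's `sqlite3` module purely as a stand-in engine; it does not show how Pig would parse or execute SQL. The `users` and `visits` tables and their contents are hypothetical.

```python
import sqlite3

# Hypothetical warehouse tables; names and data are illustrative only.
# SQLite is used as a stand-in engine, not as a model of Pig's behavior.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE visits (user_id INTEGER, url TEXT, duration INTEGER);
    INSERT INTO users VALUES (1, 'US'), (2, 'DE'), (3, 'US');
    INSERT INTO visits VALUES
        (1, '/a', 30), (1, '/b', 50), (2, '/a', 10),
        (3, '/a', 70), (3, '/c', 5);
""")

# A report-style query covering the constructs named in the proposal:
# select, from, inner join, where, group by + having, order by, limit.
QUERY = """
    SELECT u.country, COUNT(*) AS n, SUM(v.duration) AS total
    FROM users u
    JOIN visits v ON u.id = v.user_id
    WHERE v.duration >= 10
    GROUP BY u.country
    HAVING COUNT(*) > 1
    ORDER BY total DESC
    LIMIT 10
"""

rows = conn.execute(QUERY).fetchall()
for country, n, total in rows:
    print(country, n, total)
```

A report query of this shape is the kind of workload the description suggests SQL may express more naturally than a multi-statement data flow script.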
This work depends on the metadata support outlined in https://issues.apache.org/jira/browse/PIG-823
|Assignee||Thejas M Nair [ thejas ]|
|Status||Open [ 1 ]||Resolved [ 5 ]|
|Resolution||Won't Fix [ 2 ]|