Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Add join support to DataFrame and LogicalPlan.
Logical Plan
My initial thoughts on the design of the LogicalPlan struct would be:
struct InnerJoin {
    left: Box<LogicalPlan>,
    right: Box<LogicalPlan>,
    left_keys: Vec<Expr>,
    right_keys: Vec<Expr>,
}
The left_keys and right_keys vectors must have the same length. Example pseudo-code:
let join = InnerJoin {
    left: Box::new(read_parquet("customers")),
    right: Box::new(read_parquet("orders")),
    left_keys: vec![col("id")],
    right_keys: vec![col("customer_id")],
};
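The equal-length requirement on the key vectors could be enforced when the plan node is built. The following is only a minimal sketch under assumptions: `Expr`, `LogicalPlan`, `col`, `read_parquet`, and the `inner_join` constructor are simplified, hypothetical stand-ins defined here for illustration, not the project's actual types.

```rust
// Hypothetical, simplified stand-ins for the real Expr and LogicalPlan types.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
}

/// Build a column reference expression.
pub fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

#[derive(Debug, Clone, PartialEq)]
pub enum LogicalPlan {
    /// Placeholder scan node so the join has something to join.
    ParquetScan { path: String },
    /// Inner equijoin: left_keys[i] is matched against right_keys[i].
    InnerJoin {
        left: Box<LogicalPlan>,
        right: Box<LogicalPlan>,
        left_keys: Vec<Expr>,
        right_keys: Vec<Expr>,
    },
}

/// Build a placeholder Parquet scan.
pub fn read_parquet(path: &str) -> LogicalPlan {
    LogicalPlan::ParquetScan { path: path.to_string() }
}

/// Build an InnerJoin node, rejecting key vectors of different lengths.
pub fn inner_join(
    left: LogicalPlan,
    right: LogicalPlan,
    left_keys: Vec<Expr>,
    right_keys: Vec<Expr>,
) -> Result<LogicalPlan, String> {
    if left_keys.len() != right_keys.len() {
        return Err(format!(
            "left_keys has {} entries but right_keys has {}",
            left_keys.len(),
            right_keys.len()
        ));
    }
    Ok(LogicalPlan::InnerJoin {
        left: Box::new(left),
        right: Box::new(right),
        left_keys,
        right_keys,
    })
}
```

Validating in a constructor keeps the invariant in one place, so later planning stages could pair the keys by index without re-checking.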
DataFrame
let customer = ctx.read_parquet("customers").alias("c");
let orders = ctx.read_parquet("orders").alias("o");

// generic join method that can support all types of join
let join = customer.join(orders, col("c.id").eq(col("o.customer_id")));

// or we could start with a more specific equijoin method
let join = customer.inner_join(orders, vec![col("id")], vec![col("customer_id")]);
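If the generic join method were chosen, its condition would still need to be reduced to the left_keys/right_keys form used by the logical plan. One possible way to do that is sketched below. Everything in it is an assumption made for illustration: the reduced `Expr` enum, the `eq` and `and` builders, the hypothetical `equijoin_keys` helper, and the convention that a column belongs to a side when its name starts with that side's alias followed by a dot.

```rust
// Hypothetical, reduced expression type: just enough to express equijoin conditions.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Build a column reference expression.
pub fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

impl Expr {
    /// Build an equality expression between two expressions.
    pub fn eq(self, other: Expr) -> Expr {
        Expr::Eq(Box::new(self), Box::new(other))
    }

    /// Build a conjunction of two expressions.
    pub fn and(self, other: Expr) -> Expr {
        Expr::And(Box::new(self), Box::new(other))
    }
}

/// True if `e` is a column qualified with `alias`, e.g. "c.id" for alias "c".
fn belongs_to(e: &Expr, alias: &str) -> bool {
    match e {
        Expr::Column(name) => name
            .strip_prefix(alias)
            .map_or(false, |rest| rest.starts_with('.')),
        _ => false,
    }
}

/// Walk a conjunction of equalities and collect one key pair per equality.
fn collect(
    on: &Expr,
    left: &str,
    right: &str,
    left_keys: &mut Vec<Expr>,
    right_keys: &mut Vec<Expr>,
) -> Result<(), String> {
    match on {
        Expr::And(a, b) => {
            collect(a, left, right, left_keys, right_keys)?;
            collect(b, left, right, left_keys, right_keys)
        }
        Expr::Eq(a, b) if belongs_to(a, left) && belongs_to(b, right) => {
            left_keys.push((**a).clone());
            right_keys.push((**b).clone());
            Ok(())
        }
        // Accept the operands in either order, e.g. o.customer_id = c.id.
        Expr::Eq(a, b) if belongs_to(a, right) && belongs_to(b, left) => {
            left_keys.push((**b).clone());
            right_keys.push((**a).clone());
            Ok(())
        }
        other => Err(format!("not an equijoin condition: {:?}", other)),
    }
}

/// Split a join condition into (left_keys, right_keys) of equal length.
pub fn equijoin_keys(
    on: &Expr,
    left_alias: &str,
    right_alias: &str,
) -> Result<(Vec<Expr>, Vec<Expr>), String> {
    let mut left_keys = Vec::new();
    let mut right_keys = Vec::new();
    collect(on, left_alias, right_alias, &mut left_keys, &mut right_keys)?;
    Ok((left_keys, right_keys))
}
```

Because every equality contributes exactly one entry to each vector, the two results always have the same length. Conditions that are not a conjunction of column equalities are rejected in this sketch, which is one argument for starting with the more specific equijoin method.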