Apache Arrow / ARROW-8197

[Rust] DataFusion "create_physical_plan" returns incorrect schema?

    Description

      I am using DataFusion in a situation where I know there will only be a single file.  DataFusion currently collects all batches into a vector.

      As I am writing the data back out, I want to work with an iterator instead of a vector.

      My code looks roughly like this:

      let plan = ctx.create_logical_plan(&sql).unwrap();
      let plan = ctx.optimize(&plan).unwrap();
      dbg!(plan.schema()); // Logical plan: returns the original field names
      let plan = ctx.create_physical_plan(&plan, batch_size).unwrap();
      dbg!(plan.schema()); // Physical plan: returns c0, c1, etc.

      Maybe this is expected after turning the plan into a physical plan?

      I can change the schema of the returned batches myself. Would this be the recommended way to address this, or is there something in DataFusion I should leverage instead?
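
To make the workaround concrete, here is a minimal sketch of renaming batch fields lazily while iterating, rather than collecting into a vector first. The `Field` and `Batch` types and both function names are hypothetical stand-ins so the example compiles with the standard library alone; they are not the arrow or DataFusion types. With the real crates, the analogous step would presumably be rebuilding each batch with `RecordBatch::try_new` from the logical schema and the batch's existing columns, but that mapping is an assumption here.

```rust
// Stand-in types: NOT the arrow crate's Schema/RecordBatch, just a
// minimal model of "named, typed columns" for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Batch {
    pub fields: Vec<Field>,
    pub columns: Vec<Vec<i64>>,
}

/// Replace the fields of `batch` with the `logical` fields, matched by position.
/// Returns an error if the column count or any column's data type differs,
/// so a renamed batch never claims a type it does not hold.
pub fn rename_fields(batch: Batch, logical: &[Field]) -> Result<Batch, String> {
    if batch.fields.len() != logical.len() {
        return Err(format!(
            "expected {} columns, found {}",
            logical.len(),
            batch.fields.len()
        ));
    }
    for (i, (have, want)) in batch.fields.iter().zip(logical).enumerate() {
        if have.data_type != want.data_type {
            return Err(format!(
                "column {}: type {} does not match {}",
                i, have.data_type, want.data_type
            ));
        }
    }
    // The column data is moved, not copied; only the field metadata changes.
    Ok(Batch {
        fields: logical.to_vec(),
        columns: batch.columns,
    })
}

/// Wrap an iterator of batches so each one is renamed as it is pulled.
/// Nothing is collected into a vector.
pub fn with_logical_names<'a, I>(
    batches: I,
    logical: &'a [Field],
) -> impl Iterator<Item = Result<Batch, String>> + 'a
where
    I: Iterator<Item = Batch> + 'a,
{
    batches.map(move |b| rename_fields(b, logical))
}
```

The adapter yields a `Result` per batch, so a writer consuming it can stop at the first mismatch instead of discovering it after everything has been buffered.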

            People

              andygrove Andy Grove
              paddyhoran Paddy Horan

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h