Apache Arrow / ARROW-17229

[C++] ReadRel is translated to a source node that emits unexpected fields

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: C++

    Description

      Currently, executing a Substrait plan whose RelRoot contains a ReadRel produces extra, unexpected fields in the output, namely __fragment_index et al. These fields are always included by default. There are a few things to be done:

      • ReadRel's base_schema could be converted into a ScanOptions.dataset_schema to limit the fields read. (See also ARROW-15585: these fields should be used for projection pushdown.)
      • The scanner always adds these extra fields; perhaps that should be opt-in instead.
      • There is no way to manually insert a Project to "fix" the output because, as implemented, a Project can only add new columns.
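
The first point above can be illustrated with a toy model. This is only a sketch in plain Python, not Arrow or Substrait code: the functions `scan` and `restrict_to_base_schema` are hypothetical names invented for illustration, dictionaries stand in for record batches, and `base_schema` is reduced to a list of field names. It shows a source that always appends `__fragment_index`, and a step that keeps only the fields named in the base schema.

```python
# Toy model only: plain dicts stand in for record batches.
# Nothing here is the Arrow or Substrait API; all names are hypothetical.


def scan(rows, fragment_index=0):
    """Hypothetical source node: yields each row plus the extra __fragment_index field."""
    for row in rows:
        emitted = dict(row)
        emitted["__fragment_index"] = fragment_index
        yield emitted


def restrict_to_base_schema(batches, base_schema):
    """Keep only the fields named in base_schema, in base_schema order."""
    for batch in batches:
        missing = [name for name in base_schema if name not in batch]
        if missing:
            raise KeyError(f"fields missing from scan output: {missing}")
        yield {name: batch[name] for name in base_schema}


rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
base_schema = ["id", "name"]  # stand-in for the field names in ReadRel's base_schema

# Without the restriction, the consumer sees the extra field.
raw = list(scan(rows))

# With the restriction, only the declared fields are emitted.
trimmed = list(restrict_to_base_schema(scan(rows), base_schema))
```

Selecting by the declared names, rather than filtering out a list of known extra fields, means the step does not need to know which extra fields the scanner adds.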

            People

              Assignee: Unassigned
              Reporter: David Li (lidavidm)