  1. Apache Arrow
  2. ARROW-9788

Handle naming inconsistencies between SQL, DataFrame API and struct names


Details

    Description

      Currently, naming is inconsistent across the SQL API, the DataFrame API, and the plan struct names, which makes things a bit confusing. The typical example at the moment:

      `df.where().to_plan?.explain()` shows a "Selection" node in the plan, even though "selection" in SQL and many other query languages refers to a projection, not a filter.

      Other examples:

      ```
      name: Selection
      SQL: WHERE
      DF: filter
      ```

      ```
      name: Aggregation
      SQL: GROUP BY
      DF: aggregate
      ```

      ```
      name: Projection
      SQL: SELECT
      DF: select, select_columns
      ```
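
      To make the mismatch easier to see, the three mappings above can be written down as a small lookup table. This is only a sketch: `NodeNaming`, `CURRENT_NAMES` and `plan_name_for_dataframe_method` are hypothetical names used for illustration and are not part of the existing code base.

      ```rust
      /// One row of the naming table above: the plan struct name and the
      /// names the SQL and DataFrame APIs use for the same operation.
      /// Hypothetical type, for illustration only.
      #[derive(Debug, PartialEq)]
      pub struct NodeNaming {
          pub plan_name: &'static str,
          pub sql: &'static str,
          pub dataframe: &'static [&'static str],
      }

      /// The three mappings listed in this issue, copied as-is.
      pub static CURRENT_NAMES: [NodeNaming; 3] = [
          NodeNaming { plan_name: "Selection", sql: "WHERE", dataframe: &["filter"] },
          NodeNaming { plan_name: "Aggregation", sql: "GROUP BY", dataframe: &["aggregate"] },
          NodeNaming { plan_name: "Projection", sql: "SELECT", dataframe: &["select", "select_columns"] },
      ];

      /// Returns the plan struct name associated with a DataFrame method,
      /// or `None` if the method is not in the table.
      pub fn plan_name_for_dataframe_method(method: &str) -> Option<&'static str> {
          CURRENT_NAMES
              .iter()
              .find(|row| row.dataframe.iter().any(|m| *m == method))
              .map(|row| row.plan_name)
      }
      ```

      With this table, looking up `"filter"` yields `"Selection"`, which is exactly the surprising pairing described above.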

      I suggest that we adopt a common naming scheme, preferably one consistent with other, more widely used query languages.

      I am assigning this to you, andygrove, as you are probably the only person who can make a decision on this.

            People

              Assignee: andygrove Andy Grove
              Reporter: jorgecarleitao Jorge Leitão


                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 20m