Apache Arrow / ARROW-8314

[Python] Provide a method to select a subset of columns of a Table

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: Python

    Description

      I looked through the open issues and our API, but did not find anything about selecting a subset of the columns of a table.

      Assume you have a table like:

      import pyarrow as pa
      table = pa.table({'a': [1, 2], 'b': [.1, .2], 'c': ['a', 'b']})
      

      You can select a single column with table.column('a') or table['a'] to get a chunked array. You can also add, append, remove and replace columns (with add_column, append_column, remove_column and set_column).
      But there does not seem to be an easy way to get a subset of the columns without manually removing the unwanted ones one by one.

      I would propose something like:

      table.select(['a', 'c'])
      

            People

              Assignee: Joris Van den Bossche (jorisvandenbossche)
              Reporter: Joris Van den Bossche (jorisvandenbossche)
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 3.5h