  1. Apache Arrow
  2. ARROW-5287

[Python] automatic type inference for arrays of tuples


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Python
    • Labels:
      None

      Description

      Arrays of tuples can be converted to either a ListArray or a StructArray if you specify the type explicitly:

      In [6]: pa.array([(1, 2), (3, 4, 5)], type=pa.list_(pa.int64())) 
      Out[6]: 
      <pyarrow.lib.ListArray object at 0x7f1b01a4d408>
      [
        [
          1,
          2
        ],
        [
          3,
          4,
          5
        ]
      ]
      
      In [7]: pa.array([(1, 2), (3, 4)], type=pa.struct([('a', pa.int64()), ('b', pa.int64())]))
      Out[7]: 
      <pyarrow.lib.StructArray object at 0x7f1b01a51b88>
      -- is_valid: all not null
      -- child 0 type: int64
        [
          1,
          3
        ]
      -- child 1 type: int64
        [
          2,
          4
        ]
      

      But the conversion fails when no type is specified:

      In [8]: pa.array([(1, 2), (3, 4)])
      ---------------------------------------------------------------------------
      ArrowInvalid                              Traceback (most recent call last)
      <ipython-input-8-ab2d80c7486d> in <module>
      ----> 1 pa.array([(1, 2), (3, 4)])
      
      ~/scipy/repos/arrow/python/pyarrow/array.pxi in pyarrow.lib.array()
      
      ~/scipy/repos/arrow/python/pyarrow/array.pxi in pyarrow.lib._sequence_to_array()
      
      ~/scipy/repos/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
      
      ArrowInvalid: Could not convert (1, 2) with type tuple: did not recognize Python value type when inferring an Arrow data type
      

      Do we want automatic type inference for tuples as well, defaulting to the ListArray case in the same way that arrays of Python lists are already supported?
      Or was there a specific reason not to support this by default?
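
Until tuples are inferred automatically, one possible workaround is to convert them to lists before calling `pa.array`, since arrays of Python lists are already supported. The sketch below is illustrative only: `tuples_to_lists` is a hypothetical helper, not part of pyarrow, and the pyarrow call is skipped when the library is not installed.

```python
def tuples_to_lists(value):
    """Recursively replace tuples with lists; other values pass through.

    Hypothetical helper for illustration, not a pyarrow API.
    """
    if isinstance(value, (tuple, list)):
        return [tuples_to_lists(item) for item in value]
    return value


data = [(1, 2), (3, 4, 5)]
converted = tuples_to_lists(data)
print(converted)  # [[1, 2], [3, 4, 5]]

# pyarrow is optional here so the sketch still runs without it.
try:
    import pyarrow as pa
except ImportError:
    pa = None

if pa is not None:
    # With plain lists, type inference picks a list type on its own.
    arr = pa.array(converted)
    print(arr.type)
```

This sidesteps inference on tuples entirely; it does not cover the StructArray case, which still needs an explicit `type=pa.struct(...)` as shown in the description.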

              People

              • Assignee:
                Unassigned
                Reporter:
                Joris Van den Bossche (jorisvandenbossche)