Apache Arrow / ARROW-9017

[Python] Refactor the Scalar classes


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: Python

    Description

      The situation regarding scalars in Python is currently not optimal.

      We have two different "types" of scalars:

      • ArrayValue(Scalar) (and subclasses of that for all types): this is used when you access a single element of an array (e.g. arr[0]).
      • ScalarValue(Scalar) (and subclasses of that for some types): this is used when wrapping a C++ scalar into a Python scalar, e.g. when you get back a scalar from a reduction like arr.sum().

      And although we have two versions of scalars, neither of them can easily be used as a scalar, because neither can be constructed from a Python scalar (for example, there is no scalar(1) function to use when calling a kernel).

      I think we should try to unify those scalar classes (which probably means getting rid of the ArrayValue scalar).
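
To make the proposal concrete, here is a minimal pure-Python sketch of what "unified" could mean: one scalar class that is returned by element access, returned by reductions, and constructible directly from a Python value. This is a toy model only, not pyarrow code; the names Scalar, scalar(), infer_type() and ToyArray, and the use of plain strings as type names, are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class Scalar:
    """Hypothetical unified scalar: a type name plus an optional value."""
    type_name: str
    value: Optional[Any]  # None models a null scalar

    def as_py(self) -> Any:
        # Hand back the plain Python value
        return self.value


def infer_type(value: Any) -> str:
    """Toy type inference from a Python scalar (assumed mapping)."""
    # bool must be checked before int, since bool is a subclass of int
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int64"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"cannot infer a type for {value!r}")


def scalar(value: Any, type_name: Optional[str] = None) -> Scalar:
    """Build a Scalar directly from a Python value, i.e. the missing scalar(1)."""
    if type_name is None:
        type_name = "null" if value is None else infer_type(value)
    return Scalar(type_name, value)


class ToyArray:
    """Stand-in for an array; both access paths return the same Scalar class."""

    def __init__(self, values: List[Any], type_name: str):
        self.values = list(values)
        self.type_name = type_name

    def __getitem__(self, i: int) -> Scalar:
        # Element access (the role ArrayValue plays today)
        return Scalar(self.type_name, self.values[i])

    def sum(self) -> Scalar:
        # Reduction result (the role ScalarValue plays today); nulls are skipped
        total = sum(v for v in self.values if v is not None)
        return Scalar(self.type_name, total)
```

With this shape, ToyArray([1, None, 3], "int64")[0], the result of .sum() on that array, and scalar(1) are all instances of the same class, so a kernel that accepts a Scalar would not need to care where the value came from.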

      In addition, there is the issue of re-using the Python scalar <-> Arrow conversion code, as there is also logic for this in python_to_arrow.cc. But this is probably a bigger change. cc kszucs


            People

              Assignee: kszucs Krisztian Szucs
              Reporter: jorisvandenbossche Joris Van den Bossche
              Votes: 0
              Watchers: 4


                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10h 20m