Apache Arrow / ARROW-5858

[Doc] Better document the Tensor classes in the prose documentation

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: C++, Documentation, Python
    • Labels: None

      Description

      From a comment by Wes McKinney on ARROW-2714:

      The Tensor classes are independent from the columnar data structures, though they reuse pieces of metadata, metadata serialization, memory management, and IPC.

      The purpose of adding these to the library was to have in-memory data structures for handling Tensor/ndarray data and metadata that "plug in" to the rest of the Arrow C++ system (Plasma store, IO subsystem, memory pools, buffers, etc.).

      Theoretically you could return a Tensor when creating a non-contiguous slice of an Array; in light of the above, I don't think that would be intuitive.

      When we started the project, our focus was creating an open standard for in-memory columnar data, a hitherto unsolved problem. The project's scope has expanded into peripheral problems in the same domain in the meantime (with the mantra of creating interoperable components, a use-what-you-need development platform for system developers). I think this aspect of the project could be better documented / advertised, since the project's initial focus on the columnar standard has given some the mistaken impression that we are not interested in any work outside of that.

            People

            • Assignee: Unassigned
            • Reporter: Joris Van den Bossche (jorisvandenbossche)
            • Votes: 0
            • Watchers: 1
