[ARROW-1993] [Python] Add function for determining implied Arrow schema from pandas.DataFrame - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.12.0
Component/s: Python
Labels:
- beginner
- pull-request-available

External issue URL:
https://github.com/apache/arrow/issues/17976

Description

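Currently the only way to obtain the Arrow schema for a pandas.DataFrame is to call Table.from_pandas or Array.from_pandas, which performs the full conversion: it does significant unnecessary work and allocates memory for the converted data. If only the schema is of interest, we could do less work and avoid the allocation.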

We should provide a function pyarrow.Schema.from_pandas which takes a DataFrame as input and returns the corresponding Arrow schema. The functionality for determining the schema is already available in the Python code; at the moment it is just very tightly bound to the conversion infrastructure.

Issue Links

links to

GitHub Pull Request #1929

People

Assignee: Krisztian Szucs

Reporter: Wes McKinney

Votes: 0

Watchers: 5

Dates

Created: 12/Jan/18 23:03

Updated: 11/Jan/23 07:18

Resolved: 26/Nov/18 14:29

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 2h 20m