[REEF-1765] Building a Parquet Reader for Potential Integrations with Other ML Frameworks - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: 0.16
Fix Version/s: 0.16
Component/s: REEF.NET IO
Labels:
- features

Description

The Parquet file format is widely used in well-known frameworks such as Hadoop and Spark. Enabling REEF to read Parquet files would open the way to integrating with those frameworks. For now, we want to support only data of non-nested types with a table-like (flat, row-and-column) structure. This restriction lets us transform the data into formats such as RDDs.
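To illustrate the "non-nested, table-like" restriction, here is a minimal sketch in Python. It is not taken from the REEF code base and does not use any Parquet library: the `Field` class, the `PRIMITIVE_TYPES` set, and the `is_flat` and `rows_from_columns` helpers are all hypothetical names introduced for this example. The idea is simply that a schema qualifies when every top-level column is a primitive with no child fields, in which case the columns can be zipped into rows.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Field:
    """Hypothetical, simplified schema field (not a Parquet or REEF type)."""
    name: str
    type_name: str  # e.g. "int64", "string", "struct", "list"
    children: List["Field"] = field(default_factory=list)


# Illustrative set of primitive type names; an assumption for this sketch,
# not the full list of Parquet physical or logical types.
PRIMITIVE_TYPES = {"boolean", "int32", "int64", "float", "double", "string", "binary"}


def is_flat(schema: List[Field]) -> bool:
    """Return True when every column is a primitive with no child fields."""
    return all(f.type_name in PRIMITIVE_TYPES and not f.children for f in schema)


def rows_from_columns(schema: List[Field],
                      columns: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Turn column-oriented data into row dicts, rejecting nested schemas."""
    if not is_flat(schema):
        raise ValueError("only non-nested (flat) schemas are supported")
    names = [f.name for f in schema]
    # zip(*columns) walks the columns in lockstep, yielding one tuple per row.
    return [dict(zip(names, values)) for values in zip(*(columns[n] for n in names))]
```

A schema with only `int64` and `string` columns passes `is_flat`, while one containing a `list` or `struct` column is rejected before any rows are produced. A real reader would derive the schema from the Parquet file footer rather than building it by hand.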

A draft of ParquetReader is available in this pull request: https://github.com/apache/reef/pull/1283

People

Assignee: Unassigned
Reporter: Shouheng Yi
Votes: 0
Watchers: 3

Dates

Created: 04/Apr/17 17:55
Updated: 11/Apr/17 18:37
Resolved: 11/Apr/17 18:11

Time Tracking

Estimated: 336h
Remaining: 336h
Logged: Not Specified