Description
New York TLC has replaced its CSV dataset with a Parquet version, so we should switch the benchmarks to use that.
Since 5/12, the NYC Taxi dataset used in the benchmarks is no longer available as CSV files; it has been replaced with Parquet files.
https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page
On 05/13/2022, we are making the following changes to trip record files: All files will be stored in the Parquet format. Please see the ‘Working With Parquet Format’ under the Data Dictionaries and MetaData section.
Issue Links
- causes: ORC-1696 Fix ClassCastException when reading avro decimal type in benchmark (Closed)