Description
Spot-ml looks for data in day and month partitions whose numbers are zero-padded when they are a single digit:
Exception in thread "main" java.lang.AssertionError: assertion failed: No predefined schema found, and no Parquet data files or summary files found under maprfs:///user/spot/flow/hive/y=2016/m=12/d=01. <<<<---- zero pad
spot-ingest, however, has stored them in Hive without the zero prefix:
hadoop fs -ls /user/spot/flow/hive/y=2016/m=12/
drwxr-xrwx - spot spot 24 2016-12-01 18:12 /user/spot/flow/hive/y=2016/m=12/d=1 <<<---- No zero pad
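The mismatch between the two paths above can be reproduced with a minimal sketch. The helper `partition_path` and its `zero_pad` flag are hypothetical names used only for illustration; they are not part of spot-ml or spot-ingest.

```python
def partition_path(base, year, month, day, zero_pad):
    """Build a Hive-style partition path (hypothetical helper).

    With zero_pad=True, single-digit months and days get a leading zero.
    """
    width = 2 if zero_pad else 1
    return f"{base}/y={year}/m={month:0{width}d}/d={day:0{width}d}"


base = "/user/spot/flow/hive"
# Path form that spot-ml looks for, per the error message above.
expected_by_ml = partition_path(base, 2016, 12, 1, zero_pad=True)
# Path form that exists in HDFS, per the directory listing above.
written_by_ingest = partition_path(base, 2016, 12, 1, zero_pad=False)

print(expected_by_ml)
print(written_by_ingest)
print(expected_by_ml == written_by_ingest)
```

The two strings differ only in the final component (`d=01` versus `d=1`), which is enough for the Parquet lookup to find no files.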
Workaround: if the partition directory is manually renamed in HDFS to the zero-padded form and 'MSCK REPAIR TABLE flow;' is then run manually from Hive, ML will run for that day.
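As a rough sketch of that manual workaround, the snippet below only generates the commands for one partition path; it does not run them. The function names `pad_partition` and `workaround_commands` are hypothetical, and the sketch assumes the `hadoop` and `hive` command-line clients are what would be used to apply the result.

```python
import re


def pad_partition(path):
    """Zero-pad single-digit m= and d= components of a partition path."""
    # Matches "m=5" or "d=1" only when the digit is the whole value,
    # i.e. it is followed by "/" or the end of the string.
    return re.sub(r"\b([md])=(\d)(?=/|$)", r"\1=0\2", path)


def workaround_commands(path, table="flow"):
    """Return the shell commands for the manual fix, or [] if nothing to do."""
    padded = pad_partition(path)
    if padded == path:
        return []
    return [
        f"hadoop fs -mv {path} {padded}",
        f"hive -e 'MSCK REPAIR TABLE {table};'",
    ]


for cmd in workaround_commands("/user/spot/flow/hive/y=2016/m=12/d=1"):
    print(cmd)
```

This covers the case reported here, where only the day directory needs renaming. If the month is also a single digit, the padded parent directory would have to exist before the move, which this sketch does not handle.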
Issue Links
- relates to SPOT-239 Partition columns should be defined as string in create_flow_parquet.hql (Open)