Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
v0.7.1
-
None
Description
There's assumption previous is lookup table should be small to fix into memory. And there's validation rule to check if there's only one HDFS file for that lookup table
But there are too many cases are facing such issue, also there's requirement to support big lookup table.
Exception:
========================================
java.lang.IllegalStateException: Expect 1 and only 1 non-zero file under hdfs://masters/apps/hive/warehouse/d_nw_ne_ecell2, but find 4
at org.apache.kylin.dict.lookup.HiveTable.findOnlyFile(HiveTable.java:123)
at org.apache.kylin.dict.lookup.HiveTable.computeHDFSLocation(HiveTable.java:107)
at org.apache.kylin.dict.lookup.HiveTable.getHDFSLocation(HiveTable.java:83)
at org.apache.kylin.dict.lookup.HiveTable.getFileTable(HiveTable.java:76)
at org.apache.kylin.dict.lookup.HiveTable.getSignature(HiveTable.java:71)
at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:164)
at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:154)
at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:53)
at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
at org.apache.kylin.job.hadoop.dict.CreateDictionaryJob.run(CreateDictionaryJob.java:53)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.kylin.job.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
result code:2