Kylin / KYLIN-889

Support more than one HDFS file for a lookup table

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v0.7.1
    • Fix Version/s: v1.0, v1.4.0
    • Component/s: Job Engine
    • Labels:
      None

      Description

      Previously, the assumption was that a lookup table is small enough to fit into memory, and a validation rule checks that there is only one HDFS file for that lookup table.

      However, too many users are running into this restriction, and there is also a requirement to support big lookup tables.

      Exception:
      ========================================
      java.lang.IllegalStateException: Expect 1 and only 1 non-zero file under hdfs://masters/apps/hive/warehouse/d_nw_ne_ecell2, but find 4
      at org.apache.kylin.dict.lookup.HiveTable.findOnlyFile(HiveTable.java:123)
      at org.apache.kylin.dict.lookup.HiveTable.computeHDFSLocation(HiveTable.java:107)
      at org.apache.kylin.dict.lookup.HiveTable.getHDFSLocation(HiveTable.java:83)
      at org.apache.kylin.dict.lookup.HiveTable.getFileTable(HiveTable.java:76)
      at org.apache.kylin.dict.lookup.HiveTable.getSignature(HiveTable.java:71)
      at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:164)
      at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:154)
      at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:53)
      at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
      at org.apache.kylin.job.hadoop.dict.CreateDictionaryJob.run(CreateDictionaryJob.java:53)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
      at org.apache.kylin.job.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
      at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
      at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      result code:2
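
The difference between the strict check and a relaxed one can be sketched as follows. This is only an illustration, not the actual Kylin fix: it uses `java.nio.file` on a local directory as a stand-in for the HDFS API, and the class `LookupTableFiles` and the methods `findNonEmptyFiles` and `readAllRows` are hypothetical names. Only `findOnlyFile` and its error message are taken from the stack trace above. The sketch still loads every row into memory, so it does not address the big lookup table requirement.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; local files stand in for HDFS files.
class LookupTableFiles {

    // Strict rule, mirroring the message in the exception: exactly one non-empty file.
    public static Path findOnlyFile(Path tableDir) throws IOException {
        List<Path> files = findNonEmptyFiles(tableDir);
        if (files.size() != 1) {
            throw new IllegalStateException("Expect 1 and only 1 non-zero file under "
                    + tableDir + ", but find " + files.size());
        }
        return files.get(0);
    }

    // Relaxed rule (assumption): accept every non-empty regular file in the directory.
    public static List<Path> findNonEmptyFiles(Path tableDir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(tableDir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && Files.size(p) > 0) {
                    result.add(p);
                }
            }
        }
        Collections.sort(result); // stable order, so repeated reads see the same row sequence
        return result;
    }

    // Reads the rows of all non-empty files, one file after another.
    public static List<String> readAllRows(Path tableDir) throws IOException {
        List<String> rows = new ArrayList<>();
        for (Path file : findNonEmptyFiles(tableDir)) {
            rows.addAll(Files.readAllLines(file));
        }
        return rows;
    }
}
```

With a directory that holds several non-empty part files, `findOnlyFile` fails in the same way as the trace above, while `readAllRows` returns the rows of all files in sorted file order and skips zero-length files.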

            People

            • Assignee:
              liyang.gmt8@gmail.com liyang
              Reporter:
              lukehan Luke Han
            • Votes:
              0
              Watchers:
              1