  1. Kylin
  2. KYLIN-889

Support more than one HDFS file per lookup table


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v0.7.1
    • Fix Version/s: v1.0, v1.4.0
    • Component/s: Job Engine
    • Labels: None

    Description

      Previously, Kylin assumed that a lookup table is small enough to fit into memory, and a validation rule checks that there is exactly one HDFS file for that lookup table.

      However, many users are running into this check, and there is also a requirement to support big lookup tables.

      Exception:
      ========================================
      java.lang.IllegalStateException: Expect 1 and only 1 non-zero file under hdfs://masters/apps/hive/warehouse/d_nw_ne_ecell2, but find 4
      at org.apache.kylin.dict.lookup.HiveTable.findOnlyFile(HiveTable.java:123)
      at org.apache.kylin.dict.lookup.HiveTable.computeHDFSLocation(HiveTable.java:107)
      at org.apache.kylin.dict.lookup.HiveTable.getHDFSLocation(HiveTable.java:83)
      at org.apache.kylin.dict.lookup.HiveTable.getFileTable(HiveTable.java:76)
      at org.apache.kylin.dict.lookup.HiveTable.getSignature(HiveTable.java:71)
      at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:164)
      at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:154)
      at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:53)
      at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
      at org.apache.kylin.job.hadoop.dict.CreateDictionaryJob.run(CreateDictionaryJob.java:53)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
      at org.apache.kylin.job.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
      at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
      at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      result code:2
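
      The validation rule described above, and one way it could be relaxed, can be sketched as follows. This is a minimal illustration and not Kylin's actual code: it uses `java.nio.file` on a local directory as a stand-in for the HDFS table location, and the class `LookupTableFiles` and the methods `findNonZeroFiles` and `readAllRows` are hypothetical names. Only the exception message and the method name `findOnlyFile` are taken from the stack trace above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustration only: local files stand in for the HDFS files of a Hive table. */
class LookupTableFiles {

    /** Lists all regular, non-empty files directly under the table directory, sorted by path. */
    static List<Path> findNonZeroFiles(Path tableDir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(tableDir)) {
            for (Path p : entries) {
                if (Files.isRegularFile(p) && Files.size(p) > 0) {
                    result.add(p);
                }
            }
        }
        Collections.sort(result);
        return result;
    }

    /** Strict rule from the description: fail unless exactly one non-empty file exists. */
    static Path findOnlyFile(Path tableDir) throws IOException {
        List<Path> files = findNonZeroFiles(tableDir);
        if (files.size() != 1) {
            throw new IllegalStateException("Expect 1 and only 1 non-zero file under "
                    + tableDir + ", but find " + files.size());
        }
        return files.get(0);
    }

    /** Relaxed rule (assumption): accept one or more non-empty files and read them as one table. */
    static List<String> readAllRows(Path tableDir) throws IOException {
        List<Path> files = findNonZeroFiles(tableDir);
        if (files.isEmpty()) {
            throw new IllegalStateException("No non-zero file under " + tableDir);
        }
        List<String> rows = new ArrayList<>();
        for (Path f : files) {
            rows.addAll(Files.readAllLines(f));
        }
        return rows;
    }
}
```

      With a directory holding several non-empty part files, `findOnlyFile` fails in the same way as the exception above, while `readAllRows` treats the files as one table. Note that this sketch still loads every row into memory, so it only addresses the multiple-file case; supporting a lookup table too big for memory would need a different approach.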


          People

            Assignee: liyang.gmt8@gmail.com liyang
            Reporter: lukehan Luke Han