Kylin / KYLIN-4015

Kylin cube build fails at the "Build UHC Dictionary" step

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v2.5.2
    • Fix Version/s: v2.6.3
    • Component/s: Job Engine
    • Labels:
    • Environment: Fusion Insight

      Description

      Hi All:

      As we know, Kylin builds dimension dictionaries in the Kylin job client. If a cube has ultra-high-cardinality (UHC) dimensions, this costs much more CPU and memory. Kylin can instead build the UHC dictionaries with the MR engine, which reduces the resource consumption of the build engine.

      However, I found that the "Build UHC Dictionary" step, which runs on the MR engine, fails. This is the error reported by YARN:

      org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: hdfs://hacluster/xxx.../xxx/fact_distinct_columns/xxx/FIELD_NAME.dic-r-00001 not a SequenceFile.
      at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java
      at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java
      at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java

      The cause is that the "Extract Fact Table Distinct Columns" step outputs two types of files, ".dci" and ".rldict". The ".dci" file is not a SequenceFile, so the "Build UHC Dictionary" step should filter out ".dci" files when it runs on the MR engine.
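
The filtering described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the submitted patch: the class name `UhcDictInputFilter` and both methods are hypothetical, and the `-r-<number>` reducer suffix is inferred from the file name in the error message. In a real MR job the same check would more likely sit in the job's input path filtering.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: decides which outputs of the distinct-columns step
// may be read as SequenceFiles by the "Build UHC Dictionary" step.
public class UhcDictInputFilter {
    // Suffix named in the report as the non-SequenceFile output.
    static final String COL_INFO_SUFFIX = ".dci";

    // Assumes output names look like "<column><suffix>-r-<partition>".
    static boolean isSequenceFileInput(String fileName) {
        // Strip a trailing "-r-00001" style reducer suffix, if present.
        String base = fileName.replaceFirst("-r-\\d+$", "");
        return !base.endsWith(COL_INFO_SUFFIX);
    }

    // Keeps only the names that pass the check above, preserving order.
    static List<String> selectInputs(List<String> fileNames) {
        return fileNames.stream()
                .filter(UhcDictInputFilter::isSequenceFileInput)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> outputs = Arrays.asList(
                "FIELD_NAME.dci-r-00001",
                "FIELD_NAME.rldict-r-00001");
        // prints [FIELD_NAME.rldict-r-00001]
        System.out.println(selectInputs(outputs));
    }
}
```

Excluding by the ".dci" suffix, rather than including only ".rldict", matches the wording of the report; an allow-list would be the stricter choice if further file types are added later.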

      I have resolved this problem and will submit my code.

              People

              • Assignee: zhao jintao
              • Reporter: zhao jintao
              • Votes: 0
              • Watchers: 5

                  Time Tracking

                  • Original Estimate: 168h
                  • Remaining Estimate: 168h
                  • Time Spent: Not Specified