  1. Kylin
  2. KYLIN-4106

Illegal partition for SelfDefineSortableKey when “Extract Fact Table Distinct Columns”

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: v2.6.1, v2.6.2
    • Fix Version/s: v3.0.0-beta, v2.6.4
    • Component/s: Job Engine
    • Labels:

      Description

      We hit this error in the "Extract Fact Table Distinct Columns" step on Kylin 2.6.1:

       

      Error: java.io.IOException: Illegal partition for org.apache.kylin.engine.mr.steps.SelfDefineSortableKey@6b69761b (254)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1096)
      at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:727)
      at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
      at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
      at org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper.writeFieldValue(FactDistinctColumnsMapper.java:281)
      at org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper.doMap(FactDistinctColumnsMapper.java:186)
      at org.apache.kylin.engine.mr.KylinMapper.map(KylinMapper.java:77)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
      at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
      at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
      

      I traced the problem to the following code in FactDistinctColumnsReducerMapping.java (module engine-mr):

      public int getReducerIdForCol(int colId, Object fieldValue) {
          int begin = colIdToReducerBeginId[colId];
          int span = colIdToReducerBeginId[colId + 1] - begin;
           
          if (span == 1)
              return begin;
           
          int hash = fieldValue == null ? 0 : fieldValue.hashCode();
          return begin + Math.abs(hash) % span;
      }
      

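      For the failing row key, begin=1 and span=5, and the field value's hash is -2147483648 (Integer.MIN_VALUE). Math.abs(-2147483648) returns -2147483648, because +2147483648 does not fit in an int. The code above therefore returns 1 + (-2147483648 % 5) = 1 + (-3) = -2, which is 254 when read as an unsigned byte, matching the "(254)" in the error message.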
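
The arithmetic above can be reproduced outside Kylin. The following is a minimal sketch, not Kylin code: the class name `AbsOverflowDemo` and the method `buggyReducerId` are hypothetical, and `begin` and `span` are passed in directly instead of being read from `colIdToReducerBeginId`.

```java
final class AbsOverflowDemo {
    // Hypothetical stand-in that repeats only the arithmetic of the
    // return statement in getReducerIdForCol.
    static int buggyReducerId(int begin, int span, int hash) {
        return begin + Math.abs(hash) % span;
    }

    public static void main(String[] args) {
        int hash = Integer.MIN_VALUE; // -2147483648

        // Math.abs cannot represent +2147483648 as an int,
        // so it returns the argument unchanged.
        System.out.println(Math.abs(hash));                    // -2147483648

        // begin=1, span=5: 1 + (-3) = -2, i.e. 254 as an unsigned byte.
        System.out.println(buggyReducerId(1, 5, hash));        // -2
        System.out.println(buggyReducerId(1, 5, hash) & 0xFF); // 254

        // begin=1, span=3: 1 + (-2) = -1.
        System.out.println(buggyReducerId(1, 3, hash));        // -1
    }
}
```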

      The same overflow causes the second error below when getReducerIdForCol returns -1 (begin=1, span=3, hash=-2147483648): the value written for a row key column is empty_text, but the reducer with ID -1 expects a non-empty value, so reading it fails.

      Error: java.nio.BufferUnderflowException
      at java.nio.Buffer.nextGetIndex(Buffer.java:500)
      at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:135)
      at org.apache.kylin.measure.hllc.HLLCounter.readRegisters(HLLCounter.java:327)
      at org.apache.kylin.engine.mr.steps.FactDistinctColumnsReducer.doReduce(FactDistinctColumnsReducer.java:145)
      at org.apache.kylin.engine.mr.steps.FactDistinctColumnsReducer.doReduce(FactDistinctColumnsReducer.java:60)
      ...
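
One way to avoid both failures is to keep the offset inside [0, span) for every possible hash. The following is a minimal sketch of that idea using `Math.floorMod` (available since Java 8). It is an assumption about a possible fix, not necessarily the patch that was merged for this issue. The class and method names are hypothetical, and `begin` and `span` are passed in explicitly.

```java
final class SafeReducerId {
    // Hypothetical variant of getReducerIdForCol. Math.floorMod returns a
    // result in [0, span) for any int hash when span > 0, including
    // Integer.MIN_VALUE, so the reducer ID stays in [begin, begin + span).
    static int safeReducerId(int begin, int span, Object fieldValue) {
        if (span == 1)
            return begin;

        int hash = fieldValue == null ? 0 : fieldValue.hashCode();
        return begin + Math.floorMod(hash, span);
    }

    public static void main(String[] args) {
        // Integer.hashCode() returns the int value itself, so this boxed
        // value reproduces the problematic hash.
        Object value = Integer.MIN_VALUE;

        System.out.println(safeReducerId(1, 5, value)); // 3
        System.out.println(safeReducerId(1, 3, value)); // 2
    }
}
```

Note that this maps negative hashes to different reducers than the Math.abs version does. That is harmless as long as the mapper side is the only place that computes the ID.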

       

       

            People

            • Assignee: langdamao
            • Reporter: langdamao