Kylin / KYLIN-1985

SnapshotTable should only keep the columns described in tableDesc

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v1.5.3
    • Fix Version/s: v1.5.4
    • Component/s: Job Engine
    • Labels: None

    Description

      We hit a strange problem: building or refreshing a cube failed with a java.lang.ArrayIndexOutOfBoundsException. The exception stack looks like this:

      java.lang.IllegalStateException: Failed to load lookup table DIM_TABLE_NAME from snapshot /table_snapshot/dim_table_name/5a78a522-6f85-4650-b47d-6a5f5806b7f7.snapshot
      at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:621)
      at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:61)
      at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
      at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
      at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
      at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
      at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:127)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
      Caused by: java.lang.ArrayIndexOutOfBoundsException: 19
      at org.apache.kylin.dict.lookup.LookupStringTable.convertRow(LookupStringTable.java:85)
      at org.apache.kylin.dict.lookup.LookupStringTable.convertRow(LookupStringTable.java:34)
      at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:76)
      at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:67)
      at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
      at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:55)
      at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65)
      at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:619)
      ... 13 more

      and a similar exception occurs when querying by a lookup table dimension:

      ERROR [http-bio-7070-exec-7] controller.QueryController:209 : Exception when execute sql
      at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
      at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
      at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:143)
      at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:186)
      at org.apache.kylin.rest.service.QueryService.execute(QueryService.java:366)
      at org.apache.kylin.rest.service.QueryService.queryWithSqlMassage(QueryService.java:278)
      at org.apache.kylin.rest.service.QueryService.query(QueryService.java:121)
      at org.apache.kylin.rest.service.QueryService$$FastClassByCGLIB$$4957273f.invoke(<generated>)
      at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
      at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:618)
      at org.apache.kylin.rest.service.QueryService$$EnhancerByCGLIB$$315e2079.query(<generated>)
      at org.apache.kylin.rest.controller.QueryController.doQueryWithCache(QueryController.java:192)
      at org.apache.kylin.rest.controller.QueryController.query(QueryController.java:94)
      Caused by: java.lang.ArrayIndexOutOfBoundsException: 19
      at org.apache.kylin.dict.lookup.LookupStringTable.convertRow(LookupStringTable.java:85)
      at org.apache.kylin.dict.lookup.LookupStringTable.convertRow(LookupStringTable.java:34)
      at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:76)
      at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:67)
      at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
      at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:55)
      at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65)
      at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:619)

      Through the exception message, we found that one lookup table had been changed in Hive (columns were added) and had not been synchronized with Kylin. However, the cause of this problem is subtle and not easy to find.
      As for SnapshotTable, checking only 'row.length <= maxIndex' in the takeSnapshot method to detect a 'Bad hive table row ' is not enough, because it catches rows with too few columns but not rows with extra ones. It would be better to encode and store only the data columns described in tableDesc, since columns that are not in tableDesc are not used in any cube definition.
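
To illustrate the idea, here is a minimal sketch of projecting each Hive row onto the described columns before it is stored. It is not the attached patch and not Kylin code: the class SnapshotProjection, the method projectRow and the describedIndexes parameter (zero-based positions of the tableDesc columns within a Hive row) are hypothetical names chosen for this example. It keeps the existing 'row.length <= maxIndex' check for rows that are too short, and simply drops any columns beyond the described ones.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of Kylin. */
class SnapshotProjection {

    /**
     * Returns a copy of the row that contains only the described columns.
     * describedIndexes holds the assumed zero-based positions of the
     * tableDesc columns inside the raw Hive row.
     */
    static String[] projectRow(String[] row, int[] describedIndexes) {
        int maxIndex = -1;
        for (int idx : describedIndexes) {
            maxIndex = Math.max(maxIndex, idx);
        }
        // Same kind of guard as the existing check: the row is too short.
        if (row.length <= maxIndex) {
            throw new IllegalStateException("Bad hive table row " + Arrays.toString(row));
        }
        // Copy only the described columns; extra Hive columns are ignored.
        String[] projected = new String[describedIndexes.length];
        for (int i = 0; i < describedIndexes.length; i++) {
            projected[i] = row[describedIndexes[i]];
        }
        return projected;
    }

    public static void main(String[] args) {
        // The Hive table gained a third column that tableDesc does not know about.
        String[] hiveRow = {"1", "US", "added-later"};
        int[] describedIndexes = {0, 1};
        System.out.println(Arrays.toString(projectRow(hiveRow, describedIndexes))); // prints [1, US]
    }
}
```

With rows trimmed this way, every stored row has exactly as many fields as tableDesc describes, so code that later indexes per-column metadata by row position cannot run past the end of that metadata.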

      Attachments

        1. KYLIN-1985.patch
          2 kB
          zhengdong

          People

            Assignee: zhengdong (zhengd)
            Reporter: zhengdong (zhengd)