Hivemall / HIVEMALL-199

Reduce memory usage of lda_predict

Details

    • Type: Wish
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.7.0
    • None

    Description

      lda_predict does not declare @AggregationType(estimable = true), so the optimizer does not parallelize the reduce phase.

      We should also revise LDAPredictUDAF to use less memory and avoid OOM errors such as the following:

      2018-04-23 04:04:34,081 FATAL [Thread-5] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: GC overhead limit exceeded
          at java.nio.ByteBuffer.wrap(ByteBuffer.java:373)
          at org.apache.hadoop.io.Text.decode(Text.java:389)
          at org.apache.hadoop.io.Text.toString(Text.java:280)
          at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:46)
          at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getString(PrimitiveObjectInspectorUtils.java:823)
          at hivemall.topicmodel.LDAPredictUDAF$Evaluator.iterate(LDAPredictUDAF.java:298)
          at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:184)
          at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
          at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
          at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
          at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
          at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
          at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
          at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
          at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:638)
          at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:651)
          at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:654)
          at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
          at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
          at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:311)
          at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
          at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
          at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
          at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:422)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
          at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
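
      For illustration, here is a minimal sketch of what an estimable aggregation buffer could look like. In Hive the relevant types are GenericUDAFEvaluator.AggregationType and GenericUDAFEvaluator.AbstractAggregationBuffer (with its estimate() method); the sketch declares local stand-ins for both so that it compiles without Hive on the classpath. WordCountBuffer, its fields, and the 64-byte per-entry cost are hypothetical and are not taken from LDAPredictUDAF.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;

// Local stand-in for Hive's GenericUDAFEvaluator.AggregationType annotation,
// declared here only so the sketch compiles without Hive.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface AggregationType {
    boolean estimable() default false;
}

// Local stand-in for GenericUDAFEvaluator.AbstractAggregationBuffer.
abstract class AbstractAggregationBuffer {
    // A negative value means the buffer cannot report its size.
    public int estimate() {
        return -1;
    }
}

// Hypothetical buffer; the fields are NOT those of LDAPredictUDAF.
@AggregationType(estimable = true)
class WordCountBuffer extends AbstractAggregationBuffer {
    // Assumed rough cost of one HashMap<String, Float> entry, in bytes.
    private static final long BYTES_PER_ENTRY = 64L;

    private final Map<String, Float> wordCounts = new HashMap<>();
    private long keyChars = 0L; // total characters held in map keys

    void add(String word, float count) {
        if (!wordCounts.containsKey(word)) {
            keyChars += word.length(); // count each distinct key once
        }
        wordCounts.merge(word, count, Float::sum);
    }

    @Override
    public int estimate() {
        // Per-entry overhead plus 2 bytes per key character, capped at int range.
        long bytes = BYTES_PER_ENTRY * wordCounts.size() + 2L * keyChars;
        return (int) Math.min(Integer.MAX_VALUE, bytes);
    }
}

class EstimableBufferSketch {
    public static void main(String[] args) {
        WordCountBuffer buf = new WordCountBuffer();
        buf.add("apple", 1.0f);
        buf.add("apple", 2.0f);
        buf.add("kiwi", 1.0f);
        System.out.println("estimated bytes: " + buf.estimate());
    }
}
```

      The estimate only needs to be a cheap approximation that grows with the buffer's contents. Whether this alone is enough to avoid the OOM above is untested here; reducing what the buffer stores per row would still be needed.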
      

            People

              Assignee: Makoto Yui (myui)
              Reporter: Makoto Yui (myui)