  1. Mahout
  2. MAHOUT-1226

mahout ssvd Bt-job bug


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.7
    • Fix Version/s: 0.8
    • Component/s: None
    • Labels: None
    • Environment: mahout-0.7, hadoop-0.20.205.0

    Description

      When running the Mahout ssvd job, the Bt-job creates a large number of spills to disk.
      These can be reduced by tuning the Hadoop io.sort.mb parameter.
      However, when io.sort.mb is larger than about 1100 (e.g. 1500), I get the following exception:

      java.io.IOException: Spill failed
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1029)
      at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
      at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
      at org.apache.mahout.math.hadoop.stochasticsvd.BtJob$BtMapper$1.collect(BtJob.java:261)
      at org.apache.mahout.math.hadoop.stochasticsvd.BtJob$BtMapper$1.collect(BtJob.java:255)
      at org.apache.mahout.math.hadoop.stochasticsvd.SparseRowBlockAccumulator.flushBlock(SparseRowBlockAccumulator.java:65)
      at org.apache.mahout.math.hadoop.stochasticsvd.SparseRowBlockAccumulator.collect(SparseRowBlockAccumulator.java:75)
      at org.apache.mahout.math.hadoop.stochasticsvd.BtJob$BtMapper.map(BtJob.java:158)
      at org.apache.mahout.math.hadoop.stochasticsvd.BtJob$BtMapper.map(BtJob.java:102)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
      at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:415)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
      at org.apache.hadoop.mapred.Child.main(Child.java:249)
      Caused by: java.lang.RuntimeException: next value iterator failed
      at org.apache.hadoop.mapreduce.ReduceContext$ValueIterator.next(ReduceContext.java:166)
      at org.apache.mahout.math.hadoop.stochasticsvd.BtJob$OuterProductCombiner.reduce(BtJob.java:322)
      at org.apache.mahout.math.hadoop.stochasticsvd.BtJob$OuterProductCombiner.reduce(BtJob.java:302)
      at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
      at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1502)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1436)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)
      Caused by: java.io.EOFException
      at java.io.DataInputStream.readByte(DataInputStream.java:267)
      at org.apache.mahout.math.Varint.readUnsignedVarInt(Varint.java:159)
      at org.apache.mahout.math.hadoop.stochasticsvd.SparseRowBlockWritable.readFields(SparseRowBlockWritable.java:60)
      at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
      at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
      at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
      at org.apache.hadoop.mapreduce.ReduceContext$ValueIterator.next(ReduceContext.java:163)
      ... 7 more

      By changing this value I have already reduced the number of spills from 100 (with the default io.sort.mb value) to 10, and disk usage dropped from around 7 GB to around 900 MB for my small data set. Fixing this issue might bring big performance improvements.

      I have plenty of free RAM, so this is not a lack-of-memory issue.
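
      For reference, a minimal sketch of how the setting discussed above could be placed in mapred-site.xml. The values are illustrative assumptions, not taken from the attached configuration: 1000 is simply a value below the ~1100 threshold at which the exception appeared, and the -Xmx value is a hypothetical child heap size chosen to be larger than the sort buffer, since the buffer is allocated inside the map task's JVM.

      ```xml
      <configuration>
        <!-- Map-side sort buffer size in MB (assumed value, below the
             ~1100 threshold reported above). -->
        <property>
          <name>io.sort.mb</name>
          <value>1000</value>
        </property>
        <!-- Child task JVM options; the heap must be larger than io.sort.mb.
             The 2048m figure is a hypothetical example. -->
        <property>
          <name>mapred.child.java.opts</name>
          <value>-Xmx2048m</value>
        </property>
      </configuration>
      ```

      This only works around the failure by staying under the threshold; it does not address the exception itself.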

      Attachments

        1. core-site.xml
          0.6 kB
          Jakub
        2. hdfs-site.xml
          0.5 kB
          Jakub
        3. mapred-site.xml
          2 kB
          Jakub


          People

            Assignee: Unassigned
            Reporter: pawloch Jakub
            Votes: 0
            Watchers: 3
