Mahout / MAHOUT-722

Ignore lines with irregular entries in input text files

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.6
    • Component/s: None
    • Labels:
      None

      Description

      RecommenderJob fails when its usersFile/itemsFile contains a newline at the end of the file:

      java.lang.NumberFormatException: For input string: ""
      	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
      	at java.lang.Long.parseLong(Long.java:431)
      	at java.lang.Long.parseLong(Long.java:468)
      	at org.apache.mahout.cf.taste.hadoop.item.UserVectorSplitterMapper.setup(UserVectorSplitterMapper.java:61)
      	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
      	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:629)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:310)
      	at org.apache.hadoop.mapred.Child.main(Child.java:170)
      

      I think lines that cause a parse error should be ignored.
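
      A minimal sketch of that idea follows. It is not the attached ignore.patch and does not reproduce the code in UserVectorSplitterMapper; the class name LenientIdParser and the method parseIds are hypothetical, chosen only for illustration. It assumes one ID per line and drops any line that Long.parseLong rejects, including the empty string seen in the stack trace above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Mahout.
final class LenientIdParser {

    /** Parses one long ID per line, skipping lines that do not parse. */
    static List<Long> parseIds(Iterable<String> lines) {
        List<Long> ids = new ArrayList<Long>();
        for (String line : lines) {
            try {
                ids.add(Long.parseLong(line.trim()));
            } catch (NumberFormatException e) {
                // Irregular entry (for example an empty line): ignore it
                // instead of letting the exception abort the whole task.
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Sample input with an empty line and a non-numeric entry.
        List<String> lines = Arrays.asList("12", "", "34", "abc", "56");
        System.out.println(parseIds(lines)); // prints [12, 34, 56]
    }
}
```

      Silently skipping bad lines keeps the job running, but it also hides malformed input; logging or counting the skipped lines would be a reasonable variation.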

      1. ignore.patch
        3 kB
        Daisuke Miyamoto

        Activity

        Daisuke Miyamoto created issue
        Daisuke Miyamoto made changes
          Field          Original Value               New Value
          Attachment                                  ignore.patch [ 12481446 ]
        Daisuke Miyamoto made changes
          Status         Open [ 1 ]                   Patch Available [ 10002 ]
        Sean Owen made changes
          Status         Patch Available [ 10002 ]    Resolved [ 5 ]
          Assignee                                    Sean Owen [ srowen ]
          Fix Version/s                               0.6 [ 12316364 ]
          Resolution                                  Fixed [ 1 ]
        Sean Owen made changes
          Status         Resolved [ 5 ]               Closed [ 6 ]

          People

          • Assignee:
            Sean Owen
            Reporter:
            Daisuke Miyamoto
          • Votes:
            0
            Watchers:
            0
