Mahout / MAHOUT-354

Make the output of RecommenderJob more readable

    Details

      Description

      Currently, RecommenderMapper emits its output as follows:
      output.collect(userID, new RecommendedItemsWritable(recommendations));

      Could we make it more readable, like this:

      private static final String FIELD_SEPARATOR = ",";
      private final Text user = new Text();
      private final Text recomScore = new Text();

      // Emit one "itemID,score" value per recommendation, keyed by user
      user.set(String.valueOf(userID));
      for (RecommendedItem recommendation : recommendations) {
        recomScore.set(recommendation.getItemID() + FIELD_SEPARATOR + recommendation.getValue());
        output.collect(user, recomScore);
      }

      Users could then read and verify the results more easily, without depending on the Mahout API.
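      For illustration, here is a minimal, Hadoop-free sketch of the flat format proposed above. The class FlatRecommendationOutput and its nested Item type are hypothetical stand-ins, not Mahout classes, and the tab between key and value assumes the default separator that a text output format would use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the proposal; none of these types are Mahout's own.
final class FlatRecommendationOutput {

  static final String FIELD_SEPARATOR = ",";

  /** Minimal stand-in for a recommended item: an item ID plus a score. */
  static final class Item {
    final long itemID;
    final float value;

    Item(long itemID, float value) {
      this.itemID = itemID;
      this.value = value;
    }
  }

  /** Builds one "userID<TAB>itemID,score" line per recommendation, in list order. */
  static List<String> toLines(long userID, List<Item> recommendations) {
    List<String> lines = new ArrayList<>();
    for (Item item : recommendations) {
      lines.add(userID + "\t" + item.itemID + FIELD_SEPARATOR + item.value);
    }
    return lines;
  }

  public static void main(String[] args) {
    List<Item> recs = new ArrayList<>();
    recs.add(new Item(101L, 4.5f));
    recs.add(new Item(205L, 3.0f));
    for (String line : toLines(7L, recs)) {
      System.out.println(line);
    }
  }
}
```

      Each line can be split on the tab and the comma with no library support, which is the convenience the proposal is after.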

      1. screenshot-1.jpg
        322 kB
        Han Hui Wen

        Activity

        Sean Owen added a comment -

        I understand your point, though I'm reluctant to express the output this way. There is one ordered list of recommendations per user. Expressing it this way says they are effectively a "set" of unordered and unrelated recommendations, which isn't quite right.

        It's also not guaranteed that recommendations for one user will appear together this way, which could be confusing.

        Does it make anything meaningfully easier to parse? It's already human-readable, and not a binary format.
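        As a rough sketch of the parsing question, the following reads one per-user line back into an ordered list without any Mahout classes. It assumes each line looks like `userID<TAB>[itemID:value,itemID:value]`; that layout is an assumption about the textual form, not something confirmed in this thread, and PerUserLineParser is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for an ASSUMED line layout: userID<TAB>[itemID:value,...]
final class PerUserLineParser {

  /** One user's recommendations, kept in the order they appeared on the line. */
  static final class UserRecommendations {
    final long userID;
    final List<Long> itemIDs = new ArrayList<>();
    final List<Float> values = new ArrayList<>();

    UserRecommendations(long userID) {
      this.userID = userID;
    }
  }

  static UserRecommendations parse(String line) {
    int tab = line.indexOf('\t');
    UserRecommendations result =
        new UserRecommendations(Long.parseLong(line.substring(0, tab)));
    String list = line.substring(tab + 1).trim();
    // Strip the surrounding square brackets.
    String body = list.substring(1, list.length() - 1);
    if (!body.isEmpty()) {
      for (String pair : body.split(",")) {
        int colon = pair.indexOf(':');
        result.itemIDs.add(Long.parseLong(pair.substring(0, colon)));
        result.values.add(Float.parseFloat(pair.substring(colon + 1)));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    UserRecommendations recs = parse("7\t[101:4.5,205:3.0]");
    for (int i = 0; i < recs.itemIDs.size(); i++) {
      System.out.println(recs.userID + " -> " + recs.itemIDs.get(i) + " @ " + recs.values.get(i));
    }
  }
}
```

        Because the whole list sits on one line, the ranking order and the grouping by user both survive, which is the property the one-line-per-recommendation layout would lose.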

        Han Hui Wen added a comment -

        The output looks like this (see the attached screenshot).

        Sean Owen added a comment -

        You're dumping the compressed file as raw bytes – uncompress it to see the textual representation.
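        One way to see the text behind a gzip-compressed output file is to read it through a decompressing stream. The sketch below uses only the JDK; GzipTextReader is a hypothetical helper, and the sample line in main is made-up data rather than real job output.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: decompresses gzip data and returns its text lines.
final class GzipTextReader {

  /** Reads all UTF-8 text lines from a gzip-compressed stream. */
  static List<String> readLines(InputStream compressed) {
    List<String> lines = new ArrayList<>();
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(
        new GZIPInputStream(compressed), StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        lines.add(line);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return lines;
  }

  /** Compresses text in memory, only to give the demo something to read. */
  static byte[] gzip(String text) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
      out.write(text.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) {
    byte[] compressed = gzip("7\t[101:4.5,205:3.0]\n"); // made-up sample line
    for (String line : readLines(new ByteArrayInputStream(compressed))) {
      System.out.println(line);
    }
  }
}
```

        To read a real file, pass a file input stream to readLines instead of the in-memory bytes.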

        Han Hui Wen added a comment -

        I have already commented out the following line in the file for testing:
        recommenderConf.setClass("mapred.output.compression.codec", GzipCodec.class, CompressionCodec.class);

        JobConf recommenderConf = AbstractJob.prepareJobConf(userVectorPath, outputPath, jarFile,
            SequenceFileInputFormat.class, RecommenderMapper.class, LongWritable.class,
            RecommendedItemsWritable.class, IdentityReducer.class, LongWritable.class,
            RecommendedItemsWritable.class, TextOutputFormat.class);

        I am not sure whether TextOutputFormat is compatible with RecommendedItemsWritable.class.

        Sean Owen added a comment -

        Hm, there's no reason it should not work with a text output format. It implements toString(), and it works for me as intended. Are you sure you're looking at the correct file, one generated by a run where compression is disabled?
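        A quick way to check whether a given output file is still compressed is to look at its first two bytes: gzip data always begins with 0x1f 0x8b. The sketch below is a hypothetical helper (GzipMagicCheck is not part of Hadoop or Mahout) that takes the file path as its only argument.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: reports whether data starts with the gzip magic number.
final class GzipMagicCheck {

  /** Returns true if the bytes begin with the gzip header 0x1f 0x8b. */
  static boolean looksGzipped(byte[] head) {
    return head.length >= 2
        && (head[0] & 0xFF) == 0x1F
        && (head[1] & 0xFF) == 0x8B;
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.err.println("usage: java GzipMagicCheck <file>");
      return;
    }
    byte[] head = new byte[2];
    int n;
    try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
      // A single read is enough for a sketch; it may return fewer than 2 bytes.
      n = in.read(head);
    }
    byte[] seen = Arrays.copyOf(head, Math.max(n, 0));
    System.out.println(looksGzipped(seen) ? "gzip-compressed" : "not gzip (possibly plain text)");
  }
}
```

        If the check reports gzip for a file written after the codec line was commented out, compression is probably still being enabled somewhere else in the job setup.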

        Han Hui Wen added a comment - - edited

        I will try it again

        Han Hui Wen added a comment -

        I ran the test several times. The output is still not readable; it seems to be binary.

        I have already commented out the compression line:

        if (usersFile != null) {
          recommenderConf.set(RecommenderMapper.USERS_FILE, usersFile);
        }
        //recommenderConf.setClass("mapred.output.compression.codec", GzipCodec.class, CompressionCodec.class);
        JobClient.runJob(recommenderConf);

        It's very weird.

        Sean Owen added a comment -

        Try commenting out

        jobConf.setBoolean("mapred.output.compress", true);

        in AbstractJob. I am making this settable from the command line.


          People

          • Assignee:
            Sean Owen
            Reporter:
            Han Hui Wen