Mahout / MAHOUT-484

RecommenderJob exits early; some sub-jobs cannot be run.

    Details

    • Type: Test
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 0.4
    • Fix Version/s: None
    • Labels:
      None

      Description

      I ran a few tests today, and
      the RecommenderJob exited partway through.
      The first time, it exited after finishing RowSimilarityJob-CooccurrencesMapper-SimilarityReducer;
      the second time, it exited after finishing RECOMMENDATION_dogear_bookmark-Mapper-EntriesToVectorsReducer.

      1. hs_err_pid7384.log
        32 kB
        Han Hui Wen
      2. patch-20100820.txt
        14 kB
        Han Hui Wen
      3. patch-20100824_2.txt
        9 kB
        Han Hui Wen
      4. screenshot-1.jpg
        269 kB
        Han Hui Wen
      5. screenshot-2.jpg
        276 kB
        Han Hui Wen

        Activity

        Sebastian Schelter added a comment -

        I ran the current version of RecommenderJob on Elastic MapReduce today. It completed without any problems, so the issue seems to be related to your setup, not to the Mahout code.

        Han Hui Wen added a comment - - edited

        The reason is as follows:

        1) I run the RecommenderJob this way:

        hadoop jar ../../singlejar/mahout-core-0.4-SNAPSHOT.job org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -Dmapred.job.name=RECOMMENDATION_tap_computed -Dmapred.reduce.tasks=80 -Dmapred.input.dir=/in -Dmapred.output.dir=/out -Dmapred.output.compress=false --tempDir /temp --itemsFile /userInvalidItemsFile --userInvalidItemsFile /itemsFile --numRecommendations 10 --booleanData false --similarityClassname SIMILARITY_TANIMOTO_COEFFICIENT --maxPrefsPerUser 10 --maxSimilaritiesPerItem 100 --maxCooccurrencesPerItem 200

        2) When Hadoop runs the RecommenderJob, it opens the jar file mahout-core-0.4-SNAPSHOT.job and reads all of its entries.
        3) When the RowSimilarityJob runs, it opens mahout-core-0.4-SNAPSHOT.job again, although that file is already open.
        This causes the problem.

        For details, see the log file https://issues.apache.org/jira/secure/attachment/12452530/hs_err_pid7384.log

        This issue occurs in many JDK versions:
        http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6390352 (1.4)
        http://forums.sun.com/thread.jspa?threadID=762731 (1.4,1.5)
        http://forums.sun.com/thread.jspa?forumID=546&threadID=5423931

        The JDK does not seem able to avoid this issue, so the safe way is to convert the nested calls into flat calls.

        Sebastian Schelter added a comment -

        Can you find out why this doesn't work for you with nested calling? Seems to work for everyone else and I don't want to change the code without knowing the exact reason.

        Han Hui Wen added a comment -

        With this patch, the job runs correctly. Without it, the JVM crashes; the crash log is in the attachment.

        Sebastian Schelter added a comment -

        Your patch file contains changes that have absolutely nothing to do with the issue reported here, AFAIK. I think they are related to a discussion on the user mailing list about user-specific filtering.

        I cannot see why the RecommenderJob fails for you; it ran without problems on my local machine and on EMR. Did you test the latest version from the repository, or did your test include your own changes?

        Han Hui Wen added a comment -

        Do you plan to change the nested job calls (weights, pairwiseSimilarity, asMatrix) to flat calls?

        Han Hui Wen added a comment -

        When weights, pairwiseSimilarity, and asMatrix are called directly in RecommenderJob, it works.

        The attachment also includes other changes.
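
The difference between the nested and flat calling discussed in this thread can be sketched without Hadoop. This is only an illustrative model, not Mahout's code: the class and method names are hypothetical, the phase names weights, pairwiseSimilarity, and asMatrix are taken from the comments here, and the "prepare" and "recommend" phases are placeholders assumed for the example. It shows that flattening changes which driver launches each phase, not the order in which phases run.

```java
import java.util.ArrayList;
import java.util.List;

public class JobCallShapes {

    // Nested shape: the outer driver hands off to a sub-driver,
    // which then launches its own phases.
    static List<String> runNested() {
        List<String> launched = new ArrayList<>();
        launched.add("prepare");          // hypothetical earlier phase
        runSimilaritySubDriver(launched); // second driver started from inside the first
        launched.add("recommend");        // hypothetical later phase
        return launched;
    }

    // Stands in for a sub-job driver such as RowSimilarityJob.
    static void runSimilaritySubDriver(List<String> launched) {
        launched.add("weights");
        launched.add("pairwiseSimilarity");
        launched.add("asMatrix");
    }

    // Flat shape: a single driver launches every phase itself.
    static List<String> runFlat() {
        List<String> launched = new ArrayList<>();
        launched.add("prepare");
        launched.add("weights");
        launched.add("pairwiseSimilarity");
        launched.add("asMatrix");
        launched.add("recommend");
        return launched;
    }

    public static void main(String[] args) {
        System.out.println("nested: " + runNested());
        System.out.println("flat:   " + runFlat());
    }
}
```

Both shapes launch the same phases in the same order; only the flat one avoids starting a second driver from inside the first.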

        Han Hui Wen added a comment -

        The file is mahout-core-0.4-SNAPSHOT.job.

        Ted Dunning added a comment -

        I would normally concur with Sean, but this does look like it might be work-aroundable.

        Which zip file is being read multiple times?

        Han Hui Wen added a comment -

        See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6390352

        "problem appears to be that multiple instances of ZipFile are simultaneously open on the same underlying file"

        Maybe we can work around it.

        The cause may be that we call jobs in a nested way: RowSimilarityJob is called from within RecommenderJob.
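
The access pattern named in the quoted bug report, two ZipFile instances open on the same underlying file at once, can be sketched as below. This is a minimal illustration under stated assumptions: the class name, the temporary archive, and its single entry are hypothetical stand-ins for the .job file. It only demonstrates the pattern and is not expected to reproduce the crash; opening the same archive twice normally succeeds.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class DoubleOpenZip {

    // Writes a tiny archive with one entry; a stand-in for the .job file.
    static Path writeSampleArchive() throws IOException {
        Path archive = Files.createTempFile("sample", ".job");
        try (OutputStream out = Files.newOutputStream(archive);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("hello.txt"));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return archive;
    }

    // Opens the same file through two ZipFile instances at the same time
    // and returns the entry count seen by each instance.
    static int[] openTwice(Path archive) throws IOException {
        try (ZipFile first = new ZipFile(archive.toFile());
             ZipFile second = new ZipFile(archive.toFile())) {
            return new int[] { first.size(), second.size() };
        }
    }

    public static void main(String[] args) throws IOException {
        Path archive = writeSampleArchive();
        try {
            int[] counts = openTwice(archive);
            System.out.println("entries seen: " + counts[0] + " and " + counts[1]);
        } finally {
            Files.deleteIfExists(archive);
        }
    }
}
```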

        Sean Owen added a comment -

        That's a JVM bug, nothing to do with Mahout.

        Han Hui Wen added a comment -

        10/08/19 11:04:38 INFO common.AbstractJob: Command line arguments: {--endPhase=2147483647, --maxSimilaritiesPerRow=101, --numberOfColumns=132886, --similarityClassname=SIMILARITY_TANIMOTO_COEFFICIENT, --startPhase=0, --tempDir=/steer/item/tap/temp}
        #
        # A fatal error has been detected by the Java Runtime Environment:
        #
        # SIGBUS (0x7) at pc=0x00007f1d761ccf31, pid=7384, tid=139764528719632
        #
        # JRE version: 6.0_20-b02
        # Java VM: Java HotSpot(TM) 64-Bit Server VM (16.3-b01 mixed mode linux-amd64 )
        # Problematic frame:
        # C [libzip.so+0xaf31]
        #
        # An error report file with more information is saved as:
        # /home/randy/app/steer/item/tap/hs_err_pid7384.log
        #
        # If you would like to submit a bug report, please visit:
        # http://java.sun.com/webapps/bugreport/crash.jsp
        # The crash happened outside the Java Virtual Machine in native code.
        # See problematic frame for where to report the bug.
        #
        Aborted

        The file /home/randy/app/steer/item/tap/hs_err_pid7384.log is attached.

        I am not sure whether this is a system problem or a program problem.

        Han Hui Wen added a comment -

        Yes, I see. In future I will discuss this on the mailing list.
        However, there is no log for this.

        Sean Owen added a comment -

        I don't really understand what you're saying. If you're saying it terminated abnormally, then you need to provide more specific information, like logs and stack traces. Screen shots are not useful.

        I think that instead of filing JIRA issues, many of your questions could be better discussed at user@mahout.apache.org. Many of the issues you are creating are really questions, and are not yet completely formed enough to justify a JIRA issue.


          People

          • Assignee:
            Unassigned
            Reporter:
            Han Hui Wen
          • Votes:
            0
            Watchers:
            0
