Mahout / MAHOUT-632

PFPGrowth : Exceeded max jobconf size

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4, 0.5
    • Fix Version/s: 0.6
    • Component/s: None
    • Labels: None

      Description

      I'm getting this error right after startParallelCounting finishes:

      11/03/21 19:06:40 INFO mapred.JobClient: Map output records=164272900
      11/03/21 19:06:40 INFO mapred.JobClient: SPLIT_RAW_BYTES=2860
      11/03/21 19:06:40 INFO mapred.JobClient: Reduce input records=67087840
      11/03/21 19:07:02 INFO pfpgrowth.PFPGrowth: No of Features: 1788471
      11/03/21 19:07:09 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
      11/03/21 19:07:12 INFO input.FileInputFormat: Total input paths to process : 20
      11/03/21 19:07:17 INFO mapred.JobClient: Cleaning up the staging area hdfs://nccc001:54310/mnt/analytics/data/hadoop/tmp/mapred/staging/isapps/.staging/job_201103101218_0287
      Exception in thread "main" org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.io.IOException: Exceeded max jobconf size: 72276915 limit: 52428800
      at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3759)
      at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1416)
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1412)
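
      To put the numbers from the log in proportion, here is a small arithmetic sketch. It is not Hadoop or Mahout code; the class and method names are made up for illustration. The bytes-per-feature figure assumes that almost all of the job configuration is the serialized feature list, which the log does not confirm.

```java
// Sketch only: relates the three numbers reported in the log.
// JobConfBudget and its methods are hypothetical names, not part of Hadoop or Mahout.
final class JobConfBudget {
    // Values copied from the log output.
    static final long CONF_BYTES = 72_276_915L;   // "Exceeded max jobconf size: 72276915"
    static final long LIMIT_BYTES = 52_428_800L;  // "limit: 52428800"
    static final long FEATURES = 1_788_471L;      // "No of Features: 1788471"

    // How far the submitted job configuration is over the limit.
    static long overshootBytes() {
        return CONF_BYTES - LIMIT_BYTES;
    }

    // The limit expressed in MiB (1 MiB = 1024 * 1024 bytes).
    static double limitMiB() {
        return LIMIT_BYTES / (1024.0 * 1024.0);
    }

    // ASSUMPTION: nearly all of the conf is the serialized feature list,
    // so conf size / feature count approximates the cost of one feature.
    static double approxBytesPerFeature() {
        return (double) CONF_BYTES / FEATURES;
    }

    // Rough number of features that would still fit under the limit,
    // under the same assumption.
    static long approxFeaturesThatFit() {
        return (long) (LIMIT_BYTES / approxBytesPerFeature());
    }

    public static void main(String[] args) {
        System.out.printf("limit: %.1f MiB%n", limitMiB());
        System.out.printf("overshoot: %d bytes%n", overshootBytes());
        System.out.printf("approx. bytes per feature: %.1f%n", approxBytesPerFeature());
        System.out.printf("approx. features that fit: %d%n", approxFeaturesThatFit());
    }
}
```

      Under that assumption the per-feature cost is fixed, so any feature count well above the estimate will hit the same error regardless of cluster size.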

      Quoting Robin: "I guess we just hit the limit of storing flist in the conf. Moving it to the distributed cache should fix this."
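
      The idea behind that suggestion can be sketched with the standard library alone: write the bulky frequency list to a side file and keep only its path in the configuration. This is not the committed patch and it does not use Hadoop's DistributedCache API; a plain Map stands in for the job configuration, and the class name, method names, and the key "example.flist.path" are all hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a Map<String, String> stands in for the job configuration.
// FListSideFile and FLIST_PATH_KEY are hypothetical, not Mahout or Hadoop names.
final class FListSideFile {
    static final String FLIST_PATH_KEY = "example.flist.path";

    // Writes one "feature<TAB>count" line per entry and records only the
    // file path in the configuration, so the conf stays small.
    // ASSUMPTION: feature names contain no newline characters.
    static void saveFList(Map<String, Long> fList, Path file, Map<String, String> conf)
            throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> e : fList.entrySet()) {
            lines.add(e.getKey() + "\t" + e.getValue());
        }
        Files.write(file, lines, StandardCharsets.UTF_8);
        conf.put(FLIST_PATH_KEY, file.toString());
    }

    // Reads the list back from the path stored in the configuration,
    // preserving the order in which it was written.
    static Map<String, Long> readFList(Map<String, String> conf) throws IOException {
        Path file = Paths.get(conf.get(FLIST_PATH_KEY));
        Map<String, Long> fList = new LinkedHashMap<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            int tab = line.lastIndexOf('\t'); // the count follows the last tab
            fList.put(line.substring(0, tab), Long.parseLong(line.substring(tab + 1)));
        }
        return fList;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Long> fList = new LinkedHashMap<>();
        fList.put("milk", 42L);
        fList.put("bread", 17L);

        Path file = Files.createTempFile("flist", ".tsv");
        Map<String, String> conf = new HashMap<>();
        saveFList(fList, file, conf);

        System.out.println("conf entries: " + conf.size());
        System.out.println("round trip ok: " + fList.equals(readFList(conf)));
        Files.delete(file);
    }
}
```

      With this layout the configuration grows with the length of one path rather than with the number of features, which is why moving the list out of the conf sidesteps the size limit.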

        Activity

        Vipul Pandey created issue -
        Robin Anil made changes -
        Field Original Value New Value
        Assignee Robin Anil [ robinanil ]
        Robin Anil added a comment -

        Attaching patch. This will read and write frequency list to Distributed Cache.

        Robin Anil made changes -
        Attachment MAHOUT-632.patch [ 12474340 ]
        Robin Anil made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Sean Owen made changes -
        Fix Version/s 0.6 [ 12316364 ]
        Affects Version/s 0.5 [ 12315255 ]
        Sean Owen added a comment -

        I updated Robin's patch after applying. Looks reasonable and tests pass. Committed.

        Sean Owen made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hudson added a comment -

        Integrated in Mahout-Quality #900 (See https://builds.apache.org/job/Mahout-Quality/900/)

        Sean Owen made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Suneel Marthi made changes -
        Component/s Frequent Itemset/Association Rule Mining [ 12313060 ]

          People

          • Assignee: Robin Anil
          • Reporter: Vipul Pandey