Hadoop Map/Reduce / MAPREDUCE-154

Mapper runs out of memory


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Environment: Amazon EC2 Extra Large instances (4 cores, 15 GB RAM), Sun Java 6 (1.6.0_10); 1 master and 4 slaves, all identical; 2 Java processes per instance, each started with "-Xmx700m"

    Description

      The Hadoop job processes 4 directories in HDFS, each containing 15 files. This is a test run on sample data before I move to the full set of 5 directories of about 800 documents each. The mapper reads nearly 200 pages (not files) and then throws an OutOfMemory exception. The largest file is 17 MB.

      If this problem is something on my end and not truly a bug, I apologize. However, after Googling a bit, I did see many threads of people running out of memory with small data sets.

          People

            Assignee: Unassigned
            Reporter: Richard J. Zak (rjzak)
            Votes: 0
            Watchers: 1
