HCatalog / HCATALOG-538

HCatalogStorer fails for 100GB of data with dynamic partitioning (number of partitions is 300)


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4, 0.5
    • Fix Version/s: 0.4.1
    • Component/s: None
    • Labels: None
    • Environment: Hadoop 0.23.4, HCatalog 0.4

Description

    A Hadoop job that stores 100GB of data across 300 dynamic partitions fails. All the map tasks succeed, but the job then fails in commitJob(). This looks like a timeout issue, as commitJob() takes more than 10 minutes. I am running this on hadoop-0.23.4 and am experimenting with yarn.nm.liveness-monitor.expiry-interval-ms, yarn.am.liveness-monitor.expiry-interval-ms, etc. to make it work.
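
    For reference, a driver that sets up this kind of dynamic-partition store could look like the sketch below. The database and table names ("default", "big_table") are placeholders, and the calls follow the 0.4-era org.apache.hcatalog API; passing a null partition-value map to OutputJobInfo.create is what requests dynamic partitioning.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hcatalog.data.DefaultHCatRecord;
import org.apache.hcatalog.data.schema.HCatSchema;
import org.apache.hcatalog.mapreduce.HCatOutputFormat;
import org.apache.hcatalog.mapreduce.OutputJobInfo;

public class DynamicPartitionStore {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "hcat-dynamic-partition-store");
        job.setJarByClass(DynamicPartitionStore.class);

        // A null partition-value map requests dynamic partitioning: the
        // partition-column values in each record decide which of the
        // ~300 partitions it is written to.
        HCatOutputFormat.setOutput(job,
                OutputJobInfo.create("default", "big_table", null));
        HCatSchema schema = HCatOutputFormat.getTableSchema(job);
        HCatOutputFormat.setSchema(job, schema);

        job.setOutputFormatClass(HCatOutputFormat.class);
        job.setOutputKeyClass(WritableComparable.class);
        job.setOutputValueClass(DefaultHCatRecord.class);
        // Input path, mapper, and record construction elided.

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
{code}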

    This JIRA is for optimizing commitJob(), as 10 minutes is too long.
    As a side note, storing 100GB of data without partitioning takes ~12 minutes, while the same amount of data with 300 partitions fails after 45 minutes. These tests were run on a 10-node cluster.
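
    One possible direction, sketched below purely as an assumption (not necessarily what the attached patches do), is to publish the dynamically created partitions to the metastore in a single batched call instead of one round trip per partition; ~300 sequential add_partition() RPCs during commitJob() could account for latency on this order.

{code:java}
import java.util.List;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;

public class BatchedPartitionPublisher {
    /**
     * Registers every dynamically created partition with one metastore
     * call instead of one call per partition.
     */
    static void publishAll(HiveConf conf, List<Partition> partitions)
            throws Exception {
        HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
        try {
            // One RPC for all ~300 partitions, rather than 300
            // sequential add_partition() round trips.
            client.add_partitions(partitions);
        } finally {
            client.close();
        }
    }
}
{code}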

Attachments

    1. HCATALOG-538-branch0.4-0.patch (17 kB, Arup Malakar)
    2. HCATALOG-538-trunk-0.patch (16 kB, Arup Malakar)



People

    Assignee: amalakar (Arup Malakar)
    Reporter: amalakar (Arup Malakar)
    Votes: 0
    Watchers: 6
