  Hama / HAMA-531

Data re-partitioning in BSPJobClient

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.6.1
    • Component/s: None
    • Labels: None

    Description

      Re-partitioning the data is a very expensive operation. Currently, BSPJobClient performs the read/write operations sequentially on the client side using the HDFS API. This can cause "too many open files" errors, incurs HDFS overhead, and results in slow performance.

      We have to find another way to re-partition the data.
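
      As a rough illustration of the sequential, client-side approach described above, the sketch below assigns records to partitions one at a time in a single loop. It is not the code in BSPJobClient: the class and method names are hypothetical, hash-based assignment is an assumption, and in-memory lists stand in for the per-partition HDFS output files. It may still help show why a single client ends up holding one open output per partition while doing all of the I/O itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the Hama implementation. */
final class RepartitionSketch {

  /** Maps a key to a partition index in [0, numPartitions). */
  static int partitionFor(String key, int numPartitions) {
    // Mask the sign bit so negative hash codes still yield a valid index.
    return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
  }

  /** Sequentially assigns each record to one of numPartitions buckets. */
  static List<List<String>> repartition(List<String> records, int numPartitions) {
    List<List<String>> buckets = new ArrayList<>();
    for (int i = 0; i < numPartitions; i++) {
      // Each bucket stands in for one output file kept open by the client.
      buckets.add(new ArrayList<>());
    }
    // A single loop on the client: every record is read and written in turn.
    for (String record : records) {
      buckets.get(partitionFor(record, numPartitions)).add(record);
    }
    return buckets;
  }
}
```

      With real files in place of the lists, the number of simultaneously open outputs grows with the number of partitions, and all reads and writes pass through the one client process.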

      Attachments

        1. patch_v04.txt
          43 kB
          Edward J. Yoon
        2. patch_v03.txt
          42 kB
          Edward J. Yoon
        3. patch_v02.txt
          30 kB
          Edward J. Yoon
        4. patch.txt
          19 kB
          Edward J. Yoon
        5. HAMA-531_final.patch
          41 kB
          Thomas Jungblut
        6. HAMA-531_2.patch
          32 kB
          Thomas Jungblut
        7. HAMA-531_1.patch
          19 kB
          Thomas Jungblut

            People

              Assignee: Edward J. Yoon (udanax)
              Reporter: Edward J. Yoon (udanax)
              Votes: 0
              Watchers: 4
