Hive / HIVE-1486

Optimize estimation of the number of reducers and of local-mode eligibility

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels: None

    Description

      Hive uses file system metadata to estimate the number of reducers and to determine whether a job can be executed locally. It currently looks up the metadata for each input path serially, which can take a long time when the number of files is very high.

      Instead, we can look up only part of the input space and use it to approximate the total size and the other summary values.
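
The idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Hive's implementation: `get_size` is a hypothetical stand-in for a per-path file system metadata lookup, and `sample_size`, `bytes_per_reducer`, `max_reducers`, and `local_max_bytes` are illustrative parameters (the middle two loosely mirror the `hive.exec.reducers.bytes.per.reducer` and `hive.exec.reducers.max` settings). The sketch looks up a uniform random sample of paths, scales the sample mean to the full path count, and derives both decisions from that estimate.

```python
import math
import random
from typing import Callable, Sequence


def estimate_total_size(
    paths: Sequence[str],
    get_size: Callable[[str], int],  # hypothetical metadata lookup, one call per path
    sample_size: int = 100,
    seed: int = 0,
) -> int:
    """Estimate total input bytes by looking up only a random sample of paths."""
    if not paths:
        return 0
    if len(paths) <= sample_size:
        # Few enough paths that an exact, full lookup is cheap.
        return sum(get_size(p) for p in paths)
    rng = random.Random(seed)
    sample = rng.sample(list(paths), sample_size)
    mean_size = sum(get_size(p) for p in sample) / sample_size
    # Scale the sample mean up to the whole input space.
    return int(mean_size * len(paths))


def estimate_reducers(total_bytes: int, bytes_per_reducer: int, max_reducers: int) -> int:
    """One reducer per bytes_per_reducer of input, clamped to [1, max_reducers]."""
    wanted = math.ceil(total_bytes / bytes_per_reducer)
    return max(1, min(max_reducers, wanted))


def can_run_locally(total_bytes: int, local_max_bytes: int) -> bool:
    """Local mode is considered only when the estimated input is small enough."""
    return total_bytes <= local_max_bytes
```

With a uniform sample the estimate is unbiased, but its error grows when file sizes are heavily skewed, so a real implementation would probably want a safety margin before choosing local mode on the basis of an approximate size.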

            People

              Assignee: Unassigned
              Reporter: Joydeep Sen Sarma (jsensarma)
