Hive / HIVE-2309

Incorrect regular expression for extracting task id from filename

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.7.1
    • Fix Version/s: 0.8.0
    • Component/s: Query Processor
    • Labels: None

    Description

      To produce the correct filenames for bucketed tables, a method in Utilities.java extracts the task id from the filename and replaces it with the bucket number. The regex used to extract this value is wrong for attempt numbers >= 10. The optional attempt suffix (_[0-9])? matches only a single digit, so for a two-digit attempt number the first group captures the attempt number instead of the task id:

      >>> import re
      >>> re.match("^.*?([0-9]+)(_[0-9])?(\\..*)?$", 'attempt_201107090429_64965_m_001210_10').group(1)
      '10'
      >>> re.match("^.*?([0-9]+)(_[0-9])?(\\..*)?$", 'attempt_201107090429_64965_m_001210_9').group(1)
      '001210'
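
The behavior above can be reproduced and compared against one possible correction. This is a minimal sketch, not necessarily what the attached patches do: it assumes the attempt suffix may be any number of digits, and the names BUGGY, RELAXED, and extract_task_id are hypothetical helpers introduced only for illustration.

```python
import re

# Pattern as quoted in the report: the optional attempt suffix allows one digit only.
BUGGY = re.compile(r"^.*?([0-9]+)(_[0-9])?(\..*)?$")

# Hypothetical correction (an assumption, not the committed fix):
# let the attempt suffix span one or more digits.
RELAXED = re.compile(r"^.*?([0-9]+)(_[0-9]+)?(\..*)?$")


def extract_task_id(filename, pattern=RELAXED):
    """Return the text captured by group 1, or None if the pattern does not match."""
    match = pattern.match(filename)
    return match.group(1) if match else None


name = "attempt_201107090429_64965_m_001210_10"
print(extract_task_id(name, BUGGY))    # attempt number captured by mistake
print(extract_task_id(name, RELAXED))  # task id
```

With the relaxed suffix, the lazy prefix stops at the task id for both single-digit and two-digit attempt numbers, because the whole attempt suffix can now be consumed by the second group.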
      

      Attachments

        1. HIVE-2309.1.patch
          0.8 kB
          Paul Yang
        2. HIVE-2309.2.patch
          1 kB
          Paul Yang

            People

              Assignee: Paul Yang (pauly)
              Reporter: Paul Yang (pauly)