IMPALA-3895

Matching for HDFS filenames in ERROR sections is broken

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: Impala 2.7.0
    • Component/s: Infrastructure
    • Labels:
      None

      Description

      In a test ERROR section, some tests write:

      file: hdfs://regex:.$

      which makes it look as though a regex after hdfs:// matches everything up to the end of the line. That is not what happens: the test result verifier instead replaces result strings that look like HDFS filenames with file:/ hdfs://regex:.$. The 'regex' is never actually compiled or matched against, and the eagle-eyed reader will note that, if it were, it would match only one-character-long HDFS filenames.
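To illustrate the point, here is a minimal Python sketch. The function name `substitute_hdfs_filename` and the path pattern are hypothetical stand-ins, not the verifier's actual code. It shows that the placeholder is only ever compared as literal text, and that a compiled `.$` would accept a single character and nothing longer.

```python
import re

# The fixed string that ends up in the compared rows (taken from the test files).
PLACEHOLDER = "file: hdfs://regex:.$"

def substitute_hdfs_filename(row):
    # Hypothetical stand-in for the verifier's substitution step: anything that
    # looks like an HDFS file reference is replaced with the fixed string.
    # The real pattern used by the verifier may differ.
    return re.sub(r"file: hdfs://\S+", lambda m: PLACEHOLDER, row)

actual = substitute_hdfs_filename(
    "file: hdfs://localhost:20500/test-warehouse/t/data.txt")

# The comparison is plain string equality; nothing is compiled as a regex.
literal_match = (actual == PLACEHOLDER)

# If the trailing '.$' were compiled, it would accept exactly one character.
one_char = re.fullmatch(r".$", "a") is not None
long_name = re.fullmatch(r".$", "data.txt") is not None
```

Under these assumptions, `literal_match` and `one_char` are true while `long_name` is false.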

      In fact, regex matching is broken in ERROR sections anyway: rows there are compared without regard to order, and our order-insensitive matching algorithm does not work when the expected and actual results are textually different (even if one is a regex match for the other).
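One way this failure can arise is sketched below, assuming the order-insensitive comparison sorts both sides and then compares row by row. That is an assumption made for illustration, as are the `regex:` row prefix and the helper names; the verifier's actual algorithm may differ.

```python
import re

def rows_match(expected, actual):
    # Assumed convention: an expected row starting with 'regex:' is a pattern.
    if expected.startswith("regex:"):
        return re.fullmatch(expected[len("regex:"):], actual) is not None
    return expected == actual

def unordered_match(expected_rows, actual_rows):
    # Naive order-insensitive comparison: sort both sides, then pair them up.
    if len(expected_rows) != len(actual_rows):
        return False
    pairs = zip(sorted(expected_rows), sorted(actual_rows))
    return all(rows_match(e, a) for e, a in pairs)

expected = ["regex:.*bad file.*", "Zero rows read"]
actual = ["Found bad file /tmp/x", "Zero rows read"]

# Compared in the given order, every row matches.
ordered_ok = all(rows_match(e, a) for e, a in zip(expected, actual))

# After sorting, the pattern and the text it matches land in different
# positions because they are textually different, so the pairing breaks.
unordered_ok = unordered_match(expected, actual)
```

Here `ordered_ok` is true but `unordered_ok` is false, even though each expected row has a matching actual row.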

      Instead, let's stop pretending that there are user-controlled regexes here at all, and just use a simple _HDFS_FILENAME_ special string to stand for 'match something that looks like an HDFS filename'. This is far less confusing, and it lets us keep matching text up to the end of the row (rather than substituting it with '.*$', which matches anything).
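A rough sketch of the proposed approach follows. The path pattern and the `normalize` helper are assumptions for illustration, not the fix as committed. Only the filename is replaced by the special string, so the rest of the row is still compared exactly.

```python
import re

HDFS_FILENAME = "_HDFS_FILENAME_"

# Assumed shape of an HDFS path: 'hdfs://' followed by characters up to
# whitespace or a comma. The real verifier's pattern may differ.
HDFS_PATH_RE = re.compile(r"hdfs://[^\s,]+")

def normalize(row):
    # Replace only the filename, leaving the rest of the row intact.
    return HDFS_PATH_RE.sub(HDFS_FILENAME, row)

expected = "Error parsing row: file: _HDFS_FILENAME_, before offset: 1024"

same_tail = normalize(
    "Error parsing row: file: hdfs://localhost:20500/t/data.txt, before offset: 1024")
other_tail = normalize(
    "Error parsing row: file: hdfs://localhost:20500/t/data.txt, before offset: 2048")

matches = (same_tail == expected)
mismatch_detected = (other_tail != expected)
```

Because the text after the filename still takes part in the comparison, a row with a different offset is reported as a mismatch rather than being swallowed by a catch-all pattern.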

              People

              • Assignee: Henry Robinson (henryr)
              • Reporter: Henry Robinson (henryr)