IMPALA-6348

Redact only sensitive fields in runtime profile


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.7.0, Impala 2.8.0, Impala 2.10.0, Impala 2.11.0
    • Fix Version/s: Impala 2.12.0

    Description

      Currently, the redactor is run on every info string in the run-time profile.

      void RuntimeProfile::AddInfoStringInternal(
          const string& key, const string& value, bool append) {
        // Values may contain sensitive data, such as a query.
        const string& info = RedactCopy(value);  // <-- applied to every info string
        lock_guard<SpinLock> l(info_strings_lock_);
        InfoStrings::iterator it = info_strings_.find(key);
        if (it == info_strings_.end()) {
          info_strings_.insert(make_pair(key, info));
          info_strings_display_order_.push_back(key);
        } else {
          if (append) {
            it->second += ", " + value;
          } else {
            it->second = info;
          }
        }
      }
      

      For example, suppose the user configures the following redaction rule, intending to redact all email addresses that appear in the query string. Because the rule is applied to every info string, it also redacts the "User" and "Connected User" fields of the query profile whenever the user name itself matches the pattern.

      {
        "version": 1,
        "rules": [
          {
            "description": "Email addresses",
            "search": "\\b([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\\-\\._]*[A-Za-z0-9])@([A-Za-z0-9\\.]|[A-Za-z\\.][A-Za-z0-9\\-\\.]*[A-Za-z0-9\\.])+\\b",
            "caseSensitive": true,
            "replace": "email@redacted.host"
          }
        ]
      }

      With this rule in place, the resulting query profile is:

      Query (id=e24f32fa563e2c5d:9ddefb2300000000)
        Summary
          Session ID: 634deaf67308fdd0:781af1fe76464ca9
          Session Type: BEESWAX
          Start Time: 2017-12-13 13:34:31.984911000
          End Time: 2017-12-13 13:34:37.781489000
          Query Type: QUERY
          Query State: FINISHED
          Query Status: OK
          Impala Version: impalad version 2.10.0 RELEASE (build 871adff6d6e56b57de33059dec2d7fe38e2366bd)
          User: email@redacted.host <================ not expected
          Connected User: email@redacted.host <====== not expected
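
The side effect can be reproduced outside Impala with a minimal sketch. It assumes a simplified email pattern rather than the exact rule above, and `RedactEmails` is a hypothetical helper, not part of Impala's redactor. It shows that a blanket redaction cannot tell a query string from a user name: anything that looks like an email address is rewritten.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for the "Email addresses" rule. The pattern is a
// simplified assumption, not the exact regex from the rule file.
inline std::string RedactEmails(const std::string& value) {
  static const std::regex kEmail(R"([A-Za-z0-9._-]+@[A-Za-z0-9.-]+)");
  // Every match is replaced with the fixed replacement string.
  return std::regex_replace(value, kEmail, "email@redacted.host");
}
```

Passing a query such as `select 1 from t where owner = 'bob@corp.example'` redacts the literal as intended, but passing a user name of the form `alice@example.com` is rewritten in exactly the same way, which matches the "User" and "Connected User" lines in the profile.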
      

      Expected fix: Redact only the sensitive fields, and leave every other field in the run-time profile untouched.
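
One way to express the expected behavior is to make redaction opt-in per field instead of applying it to every value. The sketch below is illustrative only and is not the patch that resolved this issue: `ProfileSketch`, `RedactSketch`, and both method names are hypothetical, the email pattern is simplified, and locking, the append path, and display order are left out.

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical stand-in for the redactor, using a simplified email pattern.
inline std::string RedactSketch(const std::string& value) {
  static const std::regex kEmail(R"([A-Za-z0-9._-]+@[A-Za-z0-9.-]+)");
  return std::regex_replace(value, kEmail, "email@redacted.host");
}

// Hypothetical, single-threaded sketch of a profile's info strings.
class ProfileSketch {
 public:
  // Stores the value verbatim. Intended for fields such as "User".
  void AddInfoString(const std::string& key, const std::string& value) {
    info_strings_[key] = value;
  }

  // Redacts before storing. Intended for fields that can carry user data,
  // such as the query text.
  void AddInfoStringRedacted(const std::string& key, const std::string& value) {
    info_strings_[key] = RedactSketch(value);
  }

  // Returns the stored value; throws std::out_of_range for an unknown key.
  const std::string& Get(const std::string& key) const {
    return info_strings_.at(key);
  }

 private:
  std::map<std::string, std::string> info_strings_;
};
```

With this split, the caller decides which fields are sensitive: storing the user name through `AddInfoString` keeps it intact, while storing the SQL text through `AddInfoStringRedacted` still applies the rule.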

          People

            Assignee: Bharath Vissapragada (bharathv)
            Reporter: Bharath Vissapragada (bharathv)