IMPALA / IMPALA-4965

EXPLAIN output blocked by Sentry but appears in query profile

Details

    Description

      Scenario:

      • I have a table "my_secret_table" which contains sensitive information
      • I expose this through a view "my_redacted_view" to filter out the sensitive rows
      • I have Sentry privileges to select from the view, but no privileges on the underlying secret table

      First, I try to run EXPLAIN on a SELECT query against the view.

      [ip-x-x-x-x.eu-west-1.compute.internal:21000] > explain select * from my_redacted_view;
      Query: explain select * from my_redacted_view
      ERROR: AuthorizationException: User 'dbeech@DBEECH.COM' does not have privileges to EXPLAIN this statement.
      

      This is (correctly) blocked by Sentry. I now run the SELECT query itself:

      [ip-x-x-x-x.eu-west-1.compute.internal:21000] > select * from my_redacted_view limit 10;
      Query: select * from my_redacted_view limit 10
      Query submitted at: 2017-02-22 15:49:44 (Coordinator: http://ip-x-x-x-x.eu-west-1.compute.internal:25000)
      Query progress can be monitored at: http://ip-x-x-x-x.eu-west-1.compute.internal:25000/query_plan?query_id=2c4ec5d9b8091994:bef02f400000000
      +----------+-------------+-------------+------------------+
      | id       | name        | telno       | address          |
      +----------+-------------+-------------+------------------+
      ... returned results ...
      +----------+-------------+-------------+------------------+
      Fetched 10 row(s) in 5.86s
      

      The problem is that I can now request the query profile. The explain plan that I was blocked from seeing appears there, leaking the name of the secret table and the predicate used to redact the data.

      [ip-x-x-x-x.eu-west-1.compute.internal:21000] > profile;
      Query Runtime Profile:
      Query (id=2c4ec5d9b8091994:bef02f400000000):
        Summary:
      ...
      ----------------
      Estimated Per-Host Requirements: Memory=176.00MB VCores=1
      
      PLAN-ROOT SINK
      |
      01:EXCHANGE [UNPARTITIONED]
      |  limit: 10
      |  hosts=3 per-host-mem=unavailable
      |  tuple-ids=0 row-size=110B cardinality=10
      |
      00:SCAN HDFS [default.my_secret_table, RANDOM]
         partitions=1/1 files=1 size=476.01MB
         predicates: NOT default.is_sensitive('phone_number', my_secret_table.telno)
         table stats: 10000000 rows total
         column stats: all
         limit: 10
         hosts=3 per-host-mem=176.00MB
         tuple-ids=0 row-size=110B cardinality=10
      ----------------
      ...
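
A leak like the one above could be caught by a small regression check. The following is only a sketch: it assumes scan nodes are printed as `NN:SCAN HDFS [db.table, ...]`, as in the excerpt in this report, and the helper name `leaked_tables` is hypothetical, not part of Impala.

```python
import re

# Assumption: scan nodes look like "00:SCAN HDFS [default.my_secret_table, RANDOM]",
# matching the profile excerpt in this report.
SCAN_NODE = re.compile(r"SCAN HDFS \[([\w.]+)")


def leaked_tables(profile_text, unauthorized):
    """Return the unauthorized table names that appear as scan targets in the profile."""
    scanned = set(SCAN_NODE.findall(profile_text))
    return scanned & set(unauthorized)
```

Calling `leaked_tables(profile_text, {"default.my_secret_table"})` on the profile shown above would return a non-empty set, which is the condition this issue reports.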
      

      The plan section of the profile should be redacted if the requesting user does not have permission to EXPLAIN the statement.
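
The requested behaviour can be illustrated with a minimal sketch. It is not Impala's implementation: it assumes the plan sits between two lines consisting only of dashes, as in the excerpt in this report, and the names `redact_plan`, `can_explain`, and `PLACEHOLDER` are hypothetical.

```python
import re

# Assumption: the plan block is enclosed by lines made only of dashes,
# as in the profile excerpt in this report.
DELIMITER = re.compile(r"^\s*-{4,}\s*$")
PLACEHOLDER = "<plan redacted: user lacks privileges to EXPLAIN this statement>"


def redact_plan(profile_text, can_explain):
    """Return the profile with every dash-delimited block replaced by a placeholder."""
    if can_explain:
        return profile_text
    out = []
    inside = False
    for line in profile_text.splitlines():
        if DELIMITER.match(line):
            out.append(line)
            if not inside:
                # Opening delimiter: emit the placeholder in place of the plan.
                out.append(PLACEHOLDER)
            inside = not inside
        elif not inside:
            out.append(line)
        # Lines inside the block are dropped. If the closing delimiter is
        # missing, the rest of the text is dropped too (fails closed).
    return "\n".join(out)
```

The sketch fails closed on purpose: if the plan block is not terminated, nothing after the opening delimiter is shown, on the assumption that over-redacting is safer than leaking table names or predicates.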

          People

            Assignee: dtsirogiannis Dimitris Tsirogiannis
            Reporter: dbeech Dave Beech