IMPALA-12737

Include List of Referenced Columns in Query Log Table


Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved

    Description

      In the Impala query log table, where completed queries are stored, add lists of the columns that were referenced in each query. The purpose is to know which columns appear in each of the following clauses:

      • Select clause
      • Where clause
      • Join clause
      • Aggregate clause
      • Order by clause

      There should be a column for each type of clause, so that decisions can be made based on specific usage or on the union of those clauses.
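
As a rough illustration of the per-clause layout and the union described above, the following Python sketch models one query's record. The class name `ReferencedColumns`, its field names, and the `union` helper are hypothetical and are not part of Impala or of the query log table schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ReferencedColumns:
    """Hypothetical per-query record: one set of columns per clause type."""
    select: set[str] = field(default_factory=set)
    where: set[str] = field(default_factory=set)
    join: set[str] = field(default_factory=set)
    aggregate: set[str] = field(default_factory=set)
    order_by: set[str] = field(default_factory=set)

    def union(self, *clauses: str) -> set[str]:
        """Union of the named clause sets; all clauses when none are named."""
        names = clauses or ("select", "where", "join", "aggregate", "order_by")
        result: set[str] = set()
        for name in names:
            result |= getattr(self, name)
        return result


# Example record for a single query (made-up column names).
refs = ReferencedColumns(
    select={"db1.table1.column1"},
    where={"db1.table1.column2"},
    join={"db1.table1.column2", "db1.table2.column1"},
)

# Decision based on specific usage: only filter and join columns.
filter_and_join = refs.union("where", "join")

# Decision based on the union of all clauses.
all_referenced = refs.union()
```

Keeping the clauses separate lets a consumer pick exactly the usage it cares about, while the union remains cheap to derive on demand.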

      This information will feed the COMPUTE STATS command, so that stats are collected only on the columns actually used in joins, filters, and aggregates, rather than on every column in the table.
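
One way a consumer might turn the collected columns into statements is sketched below. It assumes each column arrives as a plain `db.table.column` string with exactly three dot-separated parts (no quoting, no nested fields) and relies on the optional column list that Impala's COMPUTE STATS accepts. The function name `compute_stats_statements` is hypothetical.

```python
from collections import defaultdict


def compute_stats_statements(columns):
    """Group db.table.column names by table and emit one statement per table."""
    by_table = defaultdict(list)
    for name in columns:
        # Assumption: exactly three dot-separated parts per name.
        db, table, column = name.split(".")
        if column not in by_table[(db, table)]:
            by_table[(db, table)].append(column)
    return [
        f"COMPUTE STATS {db}.{table} ({', '.join(cols)})"
        for (db, table), cols in sorted(by_table.items())
    ]


# Made-up input: columns seen in joins, filters, and aggregates.
statements = compute_stats_statements([
    "db1.table1.column1",
    "db1.table1.column2",
    "db1.table2.column1",
    "db1.table1.column2",  # duplicates are ignored
])
```

The sketch only builds statement text; running the statements and deciding how often to refresh stats are left out.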

      The information can be collected as an array of fully qualified column names, for example:

      [db1.table1.column1,db1.table1.column2]
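
For illustration, a reader of the query log could split such an array back into its parts roughly as follows. This is only a sketch: it assumes the value is stored as text in exactly the bracketed, comma-separated form shown above, which the issue does not confirm, and the names `ColumnRef` and `parse_column_array` are hypothetical.

```python
from typing import NamedTuple


class ColumnRef(NamedTuple):
    """Hypothetical holder for one fully qualified column name."""
    db: str
    table: str
    column: str


def parse_column_array(text):
    """Parse '[db.table.col,db.table.col]' into a list of ColumnRef."""
    stripped = text.strip()
    if not (stripped.startswith("[") and stripped.endswith("]")):
        raise ValueError(f"expected a bracketed array, got: {text!r}")
    inner = stripped[1:-1].strip()
    if not inner:
        return []
    refs = []
    for item in inner.split(","):
        parts = item.strip().split(".")
        # Assumption: every entry is exactly db.table.column.
        if len(parts) != 3:
            raise ValueError(f"expected db.table.column, got: {item!r}")
        refs.append(ColumnRef(*parts))
    return refs


parsed = parse_column_array("[db1.table1.column1,db1.table1.column2]")
```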

       


          People

            jasonmfehr Jason Fehr
            myloginid@gmail.com Manish Maheshwari
