IMPALA-4055: Investigate and fix to_date() slowness


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.6.0, Impala 2.7.0, Impala 2.8.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend

    Description

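      to_date() appears to pay a steep penalty when converting timestamps. On the same table, grouping by to_date(l_shipdate) takes roughly 10x longer than grouping by the raw column or by trunc(l_shipdate,'DD').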

      +-----------------------------------------------------------------------------------------+
      | version()                                                                               |
      +-----------------------------------------------------------------------------------------+
      | impalad version 2.6.0-cdh5.8.0 RELEASE (build 5464d1750381b40a7e7163b12b09f11b891b4de3) |
      | Built on Thu, 16 Jun 2016 12:43:48 PST                                                  |
      +-----------------------------------------------------------------------------------------+
      
      -- single column timestamp parquet table of 100,000,000 rows
      
      select 
        l_shipdate,
        count(*)
      from ts1
      group by 1;
      
      Fetched 2526 row(s) in 11.25s
      
      select 
        trunc(l_shipdate,'DD'),
        count(*)
      from ts1
      group by 1;
      
      Fetched 2526 row(s) in 10.74s
      
      select 
        to_date(l_shipdate),
        count(*)
      from ts1
      group by 1;
      
      Fetched 2526 row(s) in 102.36s  <<< ~10x slower
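
      For a rough sense of scale, the timings above can be turned into a slowdown ratio and an approximate per-row cost. This is only a back-of-the-envelope sketch: it assumes the reported wall-clock times are representative and that the to_date() and trunc() queries differ only in the function call. The variable names are illustrative and not part of Impala.

```python
# Wall-clock timings reported in this issue (seconds) for the
# 100,000,000-row single-column timestamp table.
ROWS = 100_000_000
timings_s = {
    "l_shipdate": 11.25,
    "trunc(l_shipdate,'DD')": 10.74,
    "to_date(l_shipdate)": 102.36,
}

to_date_s = timings_s["to_date(l_shipdate)"]
trunc_s = timings_s["trunc(l_shipdate,'DD')"]

# Ratio of the to_date() query time to the trunc() query time.
slowdown = to_date_s / trunc_s

# Extra time per row attributed to to_date(), under the assumption
# that everything else in the two queries costs the same.
extra_ns_per_row = (to_date_s - trunc_s) / ROWS * 1e9

print(f"slowdown vs trunc(): {slowdown:.1f}x")
print(f"extra cost per row: {extra_ns_per_row:.0f} ns")
```

      Under those assumptions the ratio works out to about 9.5x, consistent with the "~10x" noted above, and the extra cost is a little under one microsecond per row.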
      

          People

            Assignee: alex.behm Alexander Behm
            Reporter: grahn Greg Rahn
