IMPALA / IMPALA-92

Significant performance difference between LIKE 'x' and = 'x'


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 0.6
    • Fix Version/s: Impala 0.7
    • None
    • None

    Description

      I'm running the following two queries. The only difference between them is I'm using "LIKE" in one case and "=" in another, though there is no "%" in the LIKE, so the effect is the same. I was surprised to see approximately a 10x difference in performance between them.

      Query: select v1, c, count(*) FROM xxx b, yyy a  WHERE a.v1 = b.file AND v5 = "hostId" AND v3 = "hosts" GROUP BY v1, c ORDER BY count(*) limit 1000
      Returned 89 row(s) in 10.13s
      
      Query: select v1, c, count(*) FROM xxx b, yyy a  WHERE a.v1 = b.file AND v5 LIKE "hostId" AND v3 = "hosts" GROUP BY v1, c ORDER BY count(*) limit 1000
      Returned 89 row(s) in 93.76s
      

      I'm running

      impalad version 0.6 RELEASE (build e675301a90e370f694d700b395a13f0265b7f09c)
      

      I've attached the two query profiles. The basic difference is in the execution rate:

      -    Averaged Fragment 2:(1m27s 0.00%)
      -      completion times: min:1m19s  max:1m32s  mean: 1m28s  stddev:4s545ms
      -      execution rates: min:35.33 MB/sec  max:41.00 MB/sec  mean:37.37 MB/sec  stddev:1.90 MB/sec
      +         - RowsReturnedRate: 9.00 /sec
      +    Averaged Fragment 2:(7s906ms 0.00%)
      +      completion times: min:7s620ms  max:9s495ms  mean: 8s056ms  stddev:653ms
      +      execution rates: min:342.95 MB/sec  max:436.42 MB/sec  mean:409.84 MB/sec  stddev:31.25 MB/sec
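
To make the "approximately 10x" figure concrete, the ratios can be worked out from the numbers quoted in this report. The short calculation below is only illustrative arithmetic: the variable names are made up for this example, and the values are copied from the query timings and the mean execution rates of Averaged Fragment 2 above.

```python
# Figures copied from the report; variable names are illustrative only.
slow_query_seconds = 93.76   # wall-clock time of the slower run
fast_query_seconds = 10.13   # wall-clock time of the faster run

slow_rate_mb_per_sec = 37.37    # mean execution rate, slower profile
fast_rate_mb_per_sec = 409.84   # mean execution rate, faster profile

# How many times longer the slow query took end to end.
runtime_ratio = slow_query_seconds / fast_query_seconds

# How many times higher the fragment's mean throughput was in the fast run.
rate_ratio = fast_rate_mb_per_sec / slow_rate_mb_per_sec

print(f"runtime ratio: {runtime_ratio:.2f}x")
print(f"execution rate ratio: {rate_ratio:.2f}x")
```

The end-to-end runtime differs by a little over 9x and the fragment's mean execution rate by roughly 11x, which is consistent with the "approximately 10x" description.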
      

      Obviously I've fixed my query.
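
The equivalence this report relies on (a LIKE pattern with no wildcard behaves like "=") can be sketched in a few lines. The Python below is a hypothetical illustration only: it is not taken from Impala's source or from the attached patches, and the function names `literal_of`, `like_to_regex`, and `like_matches` are invented for this example. It assumes the usual SQL wildcards `%` and `_` and a backslash escape character.

```python
import re


def literal_of(pattern, escape="\\"):
    """Return the plain string a LIKE pattern stands for if it contains no
    unescaped wildcard, otherwise None."""
    chars = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            chars.append(pattern[i + 1])  # escaped character is taken literally
            i += 2
        elif ch in "%_":
            return None  # a real wildcard: not a plain literal
        else:
            chars.append(ch)
            i += 1
    return "".join(chars)


def like_to_regex(pattern, escape="\\"):
    """Translate a LIKE pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            parts.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch == "%":
            parts.append(".*")  # any run of characters
            i += 1
        elif ch == "_":
            parts.append(".")   # exactly one character
            i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts), re.DOTALL)


def like_matches(value, pattern):
    """Evaluate `value LIKE pattern`, using string equality when the pattern
    has no wildcard and a regex match otherwise."""
    literal = literal_of(pattern)
    if literal is not None:
        return value == literal  # fast path: same result as "="
    return like_to_regex(pattern).fullmatch(value) is not None


print(like_matches("hostId", "hostId"))   # wildcard-free pattern
print(like_matches("hostId", "host%"))    # pattern with a wildcard
```

With a check like `literal_of`, a pattern such as "hostId" can be compared by plain equality, so both forms of the query above would take the same code path for that predicate.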

      Attachments

        1. like-predicate.cc.patch
          1 kB
          Zuo Wang
        2. like-predicate.h.patch
          0.5 kB
          Jim Apple
        3. like-predicate.cc.patch
          1 kB
          Jim Apple
        4. z
          15 kB
          Jim Apple
        5. like-predicate.h.patch
          0.2 kB
          Jim Apple
        6. z2
          15 kB
          Jim Apple
        7. like-predicate.cc.patch
          0.7 kB
          Jim Apple
        8. like-predicate.h.patch
          0.5 kB
          Jim Apple


          People

            skye Skye Wanderman-Milne
            philip Philip Martin
            Votes: 0
            Watchers: 0
