IMPALA-7412: width_bucket() function overflows too easily


Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • Impala 3.1.0
    • None
    • None

    Description

      I looked at the failing query:

      select width_bucket(cast(-0.51111065802795359821573 as decimal(28,28)), cast(-7.247919472444366553466758690723 as decimal(31,30)), cast(2.6 as decimal(22,17)), 189748626);

      and the problem is that in WidthBucketImpl() we want to cast num_buckets to a decimal with ARG_TYPE_PRECISION and ARG_TYPE_SCALE.

      ARG_TYPE_PRECISION and ARG_TYPE_SCALE are determined by the Frontend in

      FunctionCallExpr.analyzeImpl()->castForFunctionCall()->getResolvedWildCardType()

      getResolvedWildCardType() only looks at the arguments that correspond to wildcard decimal parameters:

      if (!fnArgs[ix].isWildcardDecimal()) continue;

      Therefore it doesn't take the INT argument (num_buckets) into account and determines a decimal with precision 35 and scale 30.
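The precision 35 and scale 30 above can be reproduced with a small sketch. It assumes the resolved type is the smallest decimal that holds every decimal argument (largest scale plus largest count of integer digits); the function name `resolve_wildcard_decimal` is hypothetical and is not the Frontend code.

```python
def resolve_wildcard_decimal(arg_types):
    """Return the smallest (precision, scale) that can hold every argument.

    arg_types is a list of (precision, scale) pairs. This is an illustrative
    assumption about the resolution rule, not Impala's implementation.
    """
    max_scale = max(scale for _, scale in arg_types)
    max_int_digits = max(precision - scale for precision, scale in arg_types)
    return (max_int_digits + max_scale, max_scale)


# Only the three decimal arguments of the failing query are considered;
# the INT argument (num_buckets) is skipped.
decimal_args = [(28, 28), (31, 30), (22, 17)]
resolved = resolve_wildcard_decimal(decimal_args)
print(resolved)  # (35, 30)
```

The 5 integer digits come from decimal(22,17) and the scale of 30 from decimal(31,30).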

      We could modify getResolvedWildCardType() to consider the other arguments as well, but this query would still fail: INT needs 10 digits of precision with scale 0, so the resolved decimal would need precision 40 instead of 35. That is an error because Impala decimals have a maximum precision of 38.

      A better approach for this case would be to derive the exact number of digits from the literal expression: 189748626 has 9 digits. However, that would also fail because the resolved decimal would still need precision 39.
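The arithmetic behind the two failing alternatives can be checked with a short sketch. It assumes the same "largest integer part plus largest scale" rule as the description; `required_precision` is a hypothetical helper, not an Impala function.

```python
MAX_PRECISION = 38  # maximum decimal precision, as stated in the description


def required_precision(decimal_args, int_digits):
    """Precision needed to hold all decimal arguments plus an integer
    argument with int_digits digits (illustrative assumption)."""
    max_scale = max(scale for _, scale in decimal_args)
    int_parts = [precision - scale for precision, scale in decimal_args]
    return max(int_parts + [int_digits]) + max_scale


decimal_args = [(28, 28), (31, 30), (22, 17)]

# Treat num_buckets as a generic INT: 10 digits.
with_int_type = required_precision(decimal_args, 10)

# Use the literal's actual digit count: 189748626 has 9 digits.
with_literal = required_precision(decimal_args, len(str(189748626)))

print(with_int_type, with_int_type > MAX_PRECISION)  # 40 True
print(with_literal, with_literal > MAX_PRECISION)    # 39 True
```

Both variants exceed the limit of 38, which is why casting num_buckets to the common decimal type cannot succeed for this query.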

      If we want to cast num_buckets to a decimal type we cannot make this query successful without information loss.

       

      The other approach is to modify WidthBucketImpl() to interpret its decimal parameters as integers, which works because all of them have the same byte size, precision, and scale.
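A minimal sketch of the integer-based idea, not the actual WidthBucketImpl() change: once all decimal arguments share one scale, their unscaled values can be compared and subtracted as plain integers, and num_buckets never has to be cast to a decimal. The helper names and the bucket convention (0 below the range, num_buckets + 1 at or above it) are assumptions made for illustration.

```python
from fractions import Fraction


def to_unscaled(text, scale):
    """Exact unscaled integer of a decimal literal at the given scale."""
    scaled = Fraction(text) * 10 ** scale
    if scaled.denominator != 1:
        raise ValueError("literal has more fractional digits than the scale")
    return scaled.numerator


def width_bucket_unscaled(expr, min_val, max_val, num_buckets):
    """Bucket index computed only from unscaled integers sharing one scale.

    Assumed convention: 0 below the range, num_buckets + 1 at or above it.
    """
    if num_buckets <= 0 or min_val >= max_val:
        raise ValueError("invalid bucket definition")
    if expr < min_val:
        return 0
    if expr >= max_val:
        return num_buckets + 1
    # Python integers are arbitrary precision, so this product cannot overflow.
    return (expr - min_val) * num_buckets // (max_val - min_val) + 1


SCALE = 30  # common scale of the resolved decimal type in the failing query
expr = to_unscaled("-0.51111065802795359821573", SCALE)
low = to_unscaled("-7.247919472444366553466758690723", SCALE)
high = to_unscaled("2.6", SCALE)
bucket = width_bucket_unscaled(expr, low, high, 189748626)
print(bucket)
```

In a fixed-width implementation the intermediate product `(expr - min_val) * num_buckets` may need more bits than the operands themselves, so it would have to be computed in a wider type or otherwise guarded.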

          People

            boroknagyz Zoltán Borók-Nagy
            boroknagyz Zoltán Borók-Nagy