SPARK-19035

rand() in a CASE WHEN expression causes a GROUP BY query to fail

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.0.0, 2.0.1, 2.0.2
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:

      Description

      The following query fails:
      select
      case when a=1 then 1 else concat(a,cast(rand() as string)) end b,count(1)
      from
      yuanfeng1_a
      group by
      case when a=1 then 1 else concat(a,cast(rand() as string)) end;

      It throws this error:
      Error in query: expression 'yuanfeng1_a.`a`' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;;
      Aggregate CASE WHEN (a#2075 = 1) THEN cast(1 as string) ELSE concat(cast(a#2075 as string), cast(rand(519367429988179997) as string)) END, CASE WHEN (a#2075 = 1) THEN cast(1 as string) ELSE concat(cast(a#2075 as string), cast(rand(8090243936131101651) as string)) END AS b#2074
      +- MetastoreRelation default, yuanfeng1_a
      The query `select case when a=1 then 1 else rand() end b,count(1) from yuanfeng1_a group by case when a=1 then rand() end` produces the same error.
      Note: if rand() is replaced with 1, the query works.

      A simpler way to reproduce this bug: `SELECT a + rand() FROM t GROUP BY a + rand()`.
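The plan in the error message shows two different seeds for the two occurrences of rand(), which suggests the SELECT expression and the GROUP BY expression are no longer treated as the same expression. The sketch below is a toy model of that idea in plain Python, not Spark's analyzer: the classes `Col`, `Lit`, `Rand`, `Add` and the helpers `a_plus_rand` and `matches_group_by` are hypothetical names invented for illustration, and the assumption that each occurrence of rand() draws its own seed is inferred from the reported plan only.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Rand:
    seed: int  # seed fixed when this occurrence of rand() is resolved


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def a_plus_rand(rng: random.Random) -> Add:
    # Assumption: every textual occurrence of rand() draws its own seed,
    # mirroring the two different seeds visible in the reported plan.
    return Add(Col("a"), Rand(rng.getrandbits(63)))


def matches_group_by(select_expr, group_exprs) -> bool:
    # Toy check: a select expression is accepted only if it is
    # structurally equal to one of the grouping expressions.
    return any(select_expr == g for g in group_exprs)


rng = random.Random(0)
select_expr = a_plus_rand(rng)  # models SELECT a + rand()
group_expr = a_plus_rand(rng)   # models GROUP BY a + rand()
rand_case_ok = matches_group_by(select_expr, [group_expr])

# Replacing rand() with the literal 1 makes both sides identical.
literal_case_ok = matches_group_by(
    Add(Col("a"), Lit(1)), [Add(Col("a"), Lit(1))]
)

print("a + rand() matches GROUP BY:", rand_case_ok)
print("a + 1 matches GROUP BY:", literal_case_ok)
```

In this model the rand() variant is rejected because the two trees differ only in their seeds, while the literal variant matches, which is consistent with the note that replacing rand() with 1 makes the query work.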

            People

            • Assignee: Unassigned
            • Reporter: Feng Yuan