[IMPALA-11411] The random() function got wrong result in "with block". - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: Impala 2.12.0
Fix Version/s: None
Component/s: Distributed Exec
Labels:
- patch
Environment:
Windows 10, Cloudera Data Science Workbench. Base Image v13, docker.repository.cloudera.com/cdsw/engine:13

Flags:

Patch, Important
Target Version:

Impala 2.12.0
Epic Color:
ghx-label-9
Language:
- C++
- Java

Description

In the impala-shell version 2.12, the following script will got wrong result, the value of the rk1 and rk2 are totally different, but the inner join can be applied.

[code]

with
idx_dt_list as (

/* There are 304 unique index date can be found */
select distinct to_date(idx_dt) as idx_dt
from B2848551_step1_unq_pats
order by idx_dt
),
idx_dt_with_rk as (

/* assign the number 1 ~ 304 to those index dates */
select idx_dt, rank() over(order by idx_dt) as rk
from idx_dt_list
),
pat_with_random_num as (

There are 31,808 patients in the table, assign a

random number 1 ~ 304 to each patients

select patient_id, ceil(random(1) * 304) as rk
from B2848551_step6_nonbaom
),
wanted as (
select a.patient_id, b.idx_dt, a.rk as rk1, b.rk as rk2
from pat_with_random_num a, idx_dt_with_rk b

/* The condition is a.rk = b.rk */
where a.rk = b.rk
)
select *
from wanted
limit 5
;

[/code]

[result : the columns rk1 and rk2 are different]
----------------------------+

patient_id

idx_dt

rk1

rk2

----------------------------+

xxxxxxx1	2022-03-22	127	295
xxxxxxx2	2021-08-29	140	90
xxxxxxx3	2021-07-17	183	47
xxxxxxx4	2021-06-08	87	8
xxxxxxx5	2021-07-17	295	47

----------------------------+

[/result]

Attachments

Activity

People

Assignee:: Unassigned

Reporter:: Guangyu Hu

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 04/Jul/22 07:13

Updated:: 04/Jul/22 07:13