Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-4810 DECIMAL datatype changes for Impala 2.9
  3. IMPALA-4370

DECIMAL divide result type (Impala TPC-DS query 11 result lost one row)

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Frontend
    • Labels:

      Description

      The result of TPC-DS query 11 in Impala is wrong. It lost a record :
      "AAAAAAAAOPDLAAAA | Ann | Pence N"

      I'm using tpcds 1g validation dataset generated by command:
      "./dsdgen -sc 1 -terminate N -validate Y"
      with tpc-ds-2.3.0

      Attachments are the query string, result of Impala and tpc-ds answer set.

      Note: I have also run this query in PostgreSQL and it gives the same result as tpc-ds result set.

      1. impala-tpcds-query11.res
        8 kB
        Yuanhao Luo
      2. tpcds-answerset-query11.res
        4 kB
        Yuanhao Luo
      3. tpcds-query11.sql
        3 kB
        Yuanhao Luo

        Issue Links

          Activity

          Hide
          grahn Greg Rahn added a comment -

          Yuanhao Luo - Can you provide more details on how to use the validation files?
          It seems there may be a bug with the data as the vld files contain an extra field (looks like column 1 is repeated).

          head -1 customer.vld | awk -F '|' '{print NF}'
          19
          
          head -1 customer.dat | awk -F '|' '{print NF}'
          18
          
          head /tmp/customer.vld
          9436|9436|AAAAAAAAMNECAAAA||3455|6420|||Mrs.||Cohen|||||KYRGYZSTAN||Anthony.Cohen@sUmj1lL50J.edu|
          94128|94128|AAAAAAAAALPGBAAA|960764|3107|33850|2452529|2452499|Dr.|Daniel|Morales|N|23|5|1936|FRENCH POLYNESIA||Daniel.Morales@OO5nCDIx4oK.com|2452584
          8269|8269|AAAAAAAANEACAAAA|630096||26828|2452667|2452637||Keith|Austin|Y|28|12||DENMARK|||
          72977|72977|AAAAAAAABBNBBAAA|373778|3055|332|2451386|2451356|Mr.|Gerald|Kennedy|Y|5|11|1930|TRINIDAD AND TOBAGO||Gerald.Kennedy@0oz4vcXRsOJzyoI5VH.edu|2452617
          17816|17816|AAAAAAAAIJFEAAAA|363271|553|3482|2449274|2449244|Dr.|Ronald|Currier|Y|18|12|1944|ANTIGUA AND BARBUDA||Ronald.Currier@Mx5ERM7p.org|2452638
          45305|45305|AAAAAAAAJPALAAAA|1722556|856|37956|2449662|2449632|Dr.|Tommie|Ford|Y|13|5|1988|GUATEMALA||Tommie.Ford@nDJNoD4.com|2452467
          76890|76890|AAAAAAAAKFMCBAAA|1476487|3685|13683|2450434|2450404|Mr.|Teddy|Vargas|N|17|7|1969|AUSTRALIA||Teddy.Vargas@6R9llreRRkjX4X.com|2452490
          92394|92394|AAAAAAAAKOIGBAAA|1352259|5812|39812|2452455|2452425|Dr.|Felicitas|Esquivel|N|19|9|1973|MAYOTTE||Felicitas.Esquivel@gJ.edu|2452568
          20614|20614|AAAAAAAAGIAFAAAA|765547|2032|32117|2449364|2449334|Ms.|Jeanne|Burris|N|16|9|1960|WESTERN SAHARA||Jeanne.Burris@o1jQbUy.edu|2452445
          63848|63848|AAAAAAAAIGJPAAAA|1525057|2441|46672|2452134|2452104|Dr.|Sabrina|Mckay|N|26|10|1944|TURKMENISTAN||Sabrina.Mckay@MH.edu|2452614
          

          Also, I do not see the record you cite missing in the customer.vld file.

          grep AAAAAAAAOPDLAAAA /tmp/customer.vld
          

          That said, if I just use the data from scale 1, the results from Impala 2.6.0-cdh5.8.0 RELEASE (build 5464d1750381b40a7e7163b12b09f11b891b4de3) match those from Netezza Release 7.2.1.0 Build 46322.

          Show
          grahn Greg Rahn added a comment - Yuanhao Luo - Can you provide more details on how to use the validation files? It seems there may be a bug with the data as the vld files contain an extra field (looks like column 1 is repeated). head -1 customer.vld | awk -F '|' '{print NF}' 19 head -1 customer.dat | awk -F '|' '{print NF}' 18 head /tmp/customer.vld 9436|9436|AAAAAAAAMNECAAAA||3455|6420|||Mrs.||Cohen|||||KYRGYZSTAN||Anthony.Cohen@sUmj1lL50J.edu| 94128|94128|AAAAAAAAALPGBAAA|960764|3107|33850|2452529|2452499|Dr.|Daniel|Morales|N|23|5|1936|FRENCH POLYNESIA||Daniel.Morales@OO5nCDIx4oK.com|2452584 8269|8269|AAAAAAAANEACAAAA|630096||26828|2452667|2452637||Keith|Austin|Y|28|12||DENMARK||| 72977|72977|AAAAAAAABBNBBAAA|373778|3055|332|2451386|2451356|Mr.|Gerald|Kennedy|Y|5|11|1930|TRINIDAD AND TOBAGO||Gerald.Kennedy@0oz4vcXRsOJzyoI5VH.edu|2452617 17816|17816|AAAAAAAAIJFEAAAA|363271|553|3482|2449274|2449244|Dr.|Ronald|Currier|Y|18|12|1944|ANTIGUA AND BARBUDA||Ronald.Currier@Mx5ERM7p.org|2452638 45305|45305|AAAAAAAAJPALAAAA|1722556|856|37956|2449662|2449632|Dr.|Tommie|Ford|Y|13|5|1988|GUATEMALA||Tommie.Ford@nDJNoD4.com|2452467 76890|76890|AAAAAAAAKFMCBAAA|1476487|3685|13683|2450434|2450404|Mr.|Teddy|Vargas|N|17|7|1969|AUSTRALIA||Teddy.Vargas@6R9llreRRkjX4X.com|2452490 92394|92394|AAAAAAAAKOIGBAAA|1352259|5812|39812|2452455|2452425|Dr.|Felicitas|Esquivel|N|19|9|1973|MAYOTTE||Felicitas.Esquivel@gJ.edu|2452568 20614|20614|AAAAAAAAGIAFAAAA|765547|2032|32117|2449364|2449334|Ms.|Jeanne|Burris|N|16|9|1960|WESTERN SAHARA||Jeanne.Burris@o1jQbUy.edu|2452445 63848|63848|AAAAAAAAIGJPAAAA|1525057|2441|46672|2452134|2452104|Dr.|Sabrina|Mckay|N|26|10|1944|TURKMENISTAN||Sabrina.Mckay@MH.edu|2452614 Also, I do not see the record you cite missing in the customer.vld file. grep AAAAAAAAOPDLAAAA /tmp/customer.vld That said, if I just use the data from scale 1, the results from Impala 2.6.0-cdh5.8.0 RELEASE (build 5464d1750381b40a7e7163b12b09f11b891b4de3) match those from Netezza Release 7.2.1.0 Build 46322.
          Hide
          yhluo_impala_39a4 Yuanhao Luo added a comment -

          Hello, grahn, I generated dataset by running below command with tpcds-2.3.0

          ./dsdgen -sc 1 -terminate N -dir /home/luoyuanhao/tpcds-data/1g_validate/ -verbose y -validate y
          

          and there are only 18 columns in customer.dat, value of first column ranges from 1 to 10000 in asc order. I also find the record mentioned before as below:

          [luoyuanhao@nobida208 Impala]$ head -1 /home/luoyuanhao/tpcds-data/1g_validate/customer.dat |awk -F '|' '{print NF}'
          18
          [luoyuanhao@nobida208 Impala]$
          [luoyuanhao@nobida208 Impala]$
          [luoyuanhao@nobida208 Impala]$ head -10 /home/luoyuanhao/tpcds-data/1g_validate/customer.dat
          1|AAAAAAAABAAAAAAA|980124|7135|32946|2452238|2452208|Mr.|Javier|Lewis|Y|9|12|1936|CHILE||Javier.Lewis@VFAxlnZEvOx.org|2452508
          2|AAAAAAAACAAAAAAA|819667|1461|31655|2452318|2452288|Dr.|Amy|Moses|Y|9|4|1966|TOGO||Amy.Moses@Ovk9KjHH.com|2452318
          3|AAAAAAAADAAAAAAA|1473522|6247|48572|2449130|2449100|Miss|Latisha|Hamilton|N|18|9|1979|NIUE||Latisha.Hamilton@V.com|2452313
          4|AAAAAAAAEAAAAAAA|1703214|3986|39558|2450030|2450000|Dr.|Michael|White|N|7|6|1983|MEXICO||Michael.White@i.org|2452361
          5|AAAAAAAAFAAAAAAA|953372|4470|36368|2449438|2449408|Sir|Robert|Moran|N|8|5|1956|FIJI||Robert.Moran@Hh.edu|2452469
          6|AAAAAAAAGAAAAAAA|213219|6374|27082|2451883|2451853|Ms.|Brunilda|Sharp|N|4|12|1925|SURINAME||Brunilda.Sharp@T3pylZEUQjm.org|2452430
          7|AAAAAAAAHAAAAAAA|68377|3219|44814|2451438|2451408|Ms.|Fonda|Wiles|Y|24|4|1985|GAMBIA||Fonda.Wiles@S9KnyEtz9hv.org|2452360
          8|AAAAAAAAIAAAAAAA|1215897|2471|16598|2449406|2449376|Sir|Ollie|Shipman|N|26|12|1938|KOREA, REPUBLIC OF||Ollie.Shipman@be.org|2452334
          9|AAAAAAAAJAAAAAAA|1168667|1404|49388|2452275|2452245|Sir|Karl|Gilbert|N|26|10|1966|MONTSERRAT||Karl.Gilbert@Crg5KyP2IxX9C4d6.edu|2452454
          10|AAAAAAAAKAAAAAAA|1207553|5143|19580|2451353|2451323|Ms.|Albert|Brunson|N|15|10|1973|JORDAN||Albert.Brunson@62.com|2452641
          [luoyuanhao@nobida208 Impala]$
          [luoyuanhao@nobida208 Impala]$
          [luoyuanhao@nobida208 Impala]$ grep AAAAAAAAOPDLAAAA /home/luoyuanhao/tpcds-data/1g_validate/customer.dat 
          46078|AAAAAAAAOPDLAAAA|1328646|6756|35219|2450592|2450562|Miss|Ann|Pence|N|22|4|1955|JAMAICA||Ann.Pence@NnFib4HYa89ouq7CRC.org|2452423
          [luoyuanhao@nobida208 Impala]$ 
          

          and version of my Impala is:

          impalad version 2.8.0-SNAPSHOT DEBUG (build 2bd9eccebb550811dd25f348ac8f70d57b4341d0) 
          

          What's more, how you get the file /tmp/customer.vld? I have never see it before.

          Show
          yhluo_impala_39a4 Yuanhao Luo added a comment - Hello, grahn, I generated dataset by running below command with tpcds-2.3.0 ./dsdgen -sc 1 -terminate N -dir /home/luoyuanhao/tpcds-data/1g_validate/ -verbose y -validate y and there are only 18 columns in customer.dat, value of first column ranges from 1 to 10000 in asc order. I also find the record mentioned before as below: [luoyuanhao@nobida208 Impala]$ head -1 /home/luoyuanhao/tpcds-data/1g_validate/customer.dat |awk -F '|' '{print NF}' 18 [luoyuanhao@nobida208 Impala]$ [luoyuanhao@nobida208 Impala]$ [luoyuanhao@nobida208 Impala]$ head -10 /home/luoyuanhao/tpcds-data/1g_validate/customer.dat 1|AAAAAAAABAAAAAAA|980124|7135|32946|2452238|2452208|Mr.|Javier|Lewis|Y|9|12|1936|CHILE||Javier.Lewis@VFAxlnZEvOx.org|2452508 2|AAAAAAAACAAAAAAA|819667|1461|31655|2452318|2452288|Dr.|Amy|Moses|Y|9|4|1966|TOGO||Amy.Moses@Ovk9KjHH.com|2452318 3|AAAAAAAADAAAAAAA|1473522|6247|48572|2449130|2449100|Miss|Latisha|Hamilton|N|18|9|1979|NIUE||Latisha.Hamilton@V.com|2452313 4|AAAAAAAAEAAAAAAA|1703214|3986|39558|2450030|2450000|Dr.|Michael|White|N|7|6|1983|MEXICO||Michael.White@i.org|2452361 5|AAAAAAAAFAAAAAAA|953372|4470|36368|2449438|2449408|Sir|Robert|Moran|N|8|5|1956|FIJI||Robert.Moran@Hh.edu|2452469 6|AAAAAAAAGAAAAAAA|213219|6374|27082|2451883|2451853|Ms.|Brunilda|Sharp|N|4|12|1925|SURINAME||Brunilda.Sharp@T3pylZEUQjm.org|2452430 7|AAAAAAAAHAAAAAAA|68377|3219|44814|2451438|2451408|Ms.|Fonda|Wiles|Y|24|4|1985|GAMBIA||Fonda.Wiles@S9KnyEtz9hv.org|2452360 8|AAAAAAAAIAAAAAAA|1215897|2471|16598|2449406|2449376|Sir|Ollie|Shipman|N|26|12|1938|KOREA, REPUBLIC OF||Ollie.Shipman@be.org|2452334 9|AAAAAAAAJAAAAAAA|1168667|1404|49388|2452275|2452245|Sir|Karl|Gilbert|N|26|10|1966|MONTSERRAT||Karl.Gilbert@Crg5KyP2IxX9C4d6.edu|2452454 10|AAAAAAAAKAAAAAAA|1207553|5143|19580|2451353|2451323|Ms.|Albert|Brunson|N|15|10|1973|JORDAN||Albert.Brunson@62.com|2452641 [luoyuanhao@nobida208 Impala]$ [luoyuanhao@nobida208 Impala]$ [luoyuanhao@nobida208 Impala]$ grep AAAAAAAAOPDLAAAA /home/luoyuanhao/tpcds-data/1g_validate/customer.dat 46078|AAAAAAAAOPDLAAAA|1328646|6756|35219|2450592|2450562|Miss|Ann|Pence|N|22|4|1955|JAMAICA||Ann.Pence@NnFib4HYa89ouq7CRC.org|2452423 [luoyuanhao@nobida208 Impala]$ and version of my Impala is: impalad version 2.8.0-SNAPSHOT DEBUG (build 2bd9eccebb550811dd25f348ac8f70d57b4341d0) What's more, how you get the file /tmp/customer.vld? I have never see it before.
          Hide
          grahn Greg Rahn added a comment -

          Yuanhao Luo - Just as an FYI, the flag values are case sensitive. e.g. -force y does nothing -force Y will overwrite.

          In the directory that your flat files reside, can you run an md5sum *.dat and add the results just so I can validate we have the same data?
          I'll also have to use an impala version that matches yours.

          Here are the validation files:

          grahn@ubuntu:~/v2.3.0/tools$ ./dsdgen -sc 1 -terminate N -dir /tmp/ds -verbose Y -validate Y -force Y
          dsdgen Population Generator (Version 2.3.0)
          Copyright Transaction Processing Performance Council (TPC) 2001 - 2016
          Warning: This scale factor is valid for QUALIFICATION ONLY
          
          grahn@ubuntu:~/v2.3.0/tools$ cd /tmp/ds
          
          grahn@ubuntu:/tmp/ds$ ls -l
          total 160
          -rw-rw-r-- 1 grahn grahn  1997 Oct 27 09:51 call_center.vld
          -rw-rw-r-- 1 grahn grahn  7237 Oct 27 09:51 catalog_page.vld
          -rw-rw-r-- 1 grahn grahn   582 Oct 27 09:51 catalog_returns.vld
          -rw-rw-r-- 1 grahn grahn 10118 Oct 27 09:51 catalog_sales.vld
          -rw-rw-r-- 1 grahn grahn  5800 Oct 27 09:51 customer_address.vld
          -rw-rw-r-- 1 grahn grahn  2457 Oct 27 09:51 customer_demographics.vld
          -rw-rw-r-- 1 grahn grahn  6891 Oct 27 09:51 customer.vld
          -rw-rw-r-- 1 grahn grahn  7303 Oct 27 09:51 date_dim.vld
          -rw-rw-r-- 1 grahn grahn  1249 Oct 27 09:51 household_demographics.vld
          -rw-rw-r-- 1 grahn grahn   363 Oct 27 09:51 income_band.vld
          -rw-rw-r-- 1 grahn grahn  1367 Oct 27 09:51 inventory.vld
          -rw-rw-r-- 1 grahn grahn 14247 Oct 27 09:51 item.vld
          -rw-rw-r-- 1 grahn grahn  6349 Oct 27 09:51 promotion.vld
          -rw-rw-r-- 1 grahn grahn  1390 Oct 27 09:51 reason.vld
          -rw-rw-r-- 1 grahn grahn  1155 Oct 27 09:51 ship_mode.vld
          -rw-rw-r-- 1 grahn grahn   784 Oct 27 09:51 store_returns.vld
          -rw-rw-r-- 1 grahn grahn  6746 Oct 27 09:51 store_sales.vld
          -rw-rw-r-- 1 grahn grahn  3445 Oct 27 09:51 store.vld
          -rw-rw-r-- 1 grahn grahn  3159 Oct 27 09:51 time_dim.vld
          -rw-rw-r-- 1 grahn grahn   640 Oct 27 09:51 warehouse.vld
          -rw-rw-r-- 1 grahn grahn  4887 Oct 27 09:51 web_page.vld
          -rw-rw-r-- 1 grahn grahn   559 Oct 27 09:51 web_returns.vld
          -rw-rw-r-- 1 grahn grahn 10221 Oct 27 09:51 web_sales.vld
          -rw-rw-r-- 1 grahn grahn  8797 Oct 27 09:51 web_site.vld
          
          grahn@ubuntu:/tmp/ds$ wc -l *
               6 call_center.vld
              50 catalog_page.vld
               4 catalog_returns.vld
              50 catalog_sales.vld
              50 customer_address.vld
              50 customer_demographics.vld
              50 customer.vld
              50 date_dim.vld
              50 household_demographics.vld
              20 income_band.vld
              50 inventory.vld
              50 item.vld
              50 promotion.vld
              35 reason.vld
              20 ship_mode.vld
               7 store_returns.vld
              50 store_sales.vld
              12 store.vld
              50 time_dim.vld
               5 warehouse.vld
              50 web_page.vld
               4 web_returns.vld
              50 web_sales.vld
              30 web_site.vld
             843 total
          
          Show
          grahn Greg Rahn added a comment - Yuanhao Luo - Just as an FYI, the flag values are case sensitive. e.g. -force y does nothing -force Y will overwrite. In the directory that your flat files reside, can you run an md5sum *.dat and add the results just so I can validate we have the same data? I'll also have to use an impala version that matches yours. Here are the validation files: grahn@ubuntu:~/v2.3.0/tools$ ./dsdgen -sc 1 -terminate N -dir /tmp/ds -verbose Y -validate Y -force Y dsdgen Population Generator (Version 2.3.0) Copyright Transaction Processing Performance Council (TPC) 2001 - 2016 Warning: This scale factor is valid for QUALIFICATION ONLY grahn@ubuntu:~/v2.3.0/tools$ cd /tmp/ds grahn@ubuntu:/tmp/ds$ ls -l total 160 -rw-rw-r-- 1 grahn grahn 1997 Oct 27 09:51 call_center.vld -rw-rw-r-- 1 grahn grahn 7237 Oct 27 09:51 catalog_page.vld -rw-rw-r-- 1 grahn grahn 582 Oct 27 09:51 catalog_returns.vld -rw-rw-r-- 1 grahn grahn 10118 Oct 27 09:51 catalog_sales.vld -rw-rw-r-- 1 grahn grahn 5800 Oct 27 09:51 customer_address.vld -rw-rw-r-- 1 grahn grahn 2457 Oct 27 09:51 customer_demographics.vld -rw-rw-r-- 1 grahn grahn 6891 Oct 27 09:51 customer.vld -rw-rw-r-- 1 grahn grahn 7303 Oct 27 09:51 date_dim.vld -rw-rw-r-- 1 grahn grahn 1249 Oct 27 09:51 household_demographics.vld -rw-rw-r-- 1 grahn grahn 363 Oct 27 09:51 income_band.vld -rw-rw-r-- 1 grahn grahn 1367 Oct 27 09:51 inventory.vld -rw-rw-r-- 1 grahn grahn 14247 Oct 27 09:51 item.vld -rw-rw-r-- 1 grahn grahn 6349 Oct 27 09:51 promotion.vld -rw-rw-r-- 1 grahn grahn 1390 Oct 27 09:51 reason.vld -rw-rw-r-- 1 grahn grahn 1155 Oct 27 09:51 ship_mode.vld -rw-rw-r-- 1 grahn grahn 784 Oct 27 09:51 store_returns.vld -rw-rw-r-- 1 grahn grahn 6746 Oct 27 09:51 store_sales.vld -rw-rw-r-- 1 grahn grahn 3445 Oct 27 09:51 store.vld -rw-rw-r-- 1 grahn grahn 3159 Oct 27 09:51 time_dim.vld -rw-rw-r-- 1 grahn grahn 640 Oct 27 09:51 warehouse.vld -rw-rw-r-- 1 grahn grahn 4887 Oct 27 09:51 web_page.vld -rw-rw-r-- 1 grahn grahn 559 Oct 27 09:51 web_returns.vld -rw-rw-r-- 1 grahn grahn 10221 Oct 27 09:51 web_sales.vld -rw-rw-r-- 1 grahn grahn 8797 Oct 27 09:51 web_site.vld grahn@ubuntu:/tmp/ds$ wc -l * 6 call_center.vld 50 catalog_page.vld 4 catalog_returns.vld 50 catalog_sales.vld 50 customer_address.vld 50 customer_demographics.vld 50 customer.vld 50 date_dim.vld 50 household_demographics.vld 20 income_band.vld 50 inventory.vld 50 item.vld 50 promotion.vld 35 reason.vld 20 ship_mode.vld 7 store_returns.vld 50 store_sales.vld 12 store.vld 50 time_dim.vld 5 warehouse.vld 50 web_page.vld 4 web_returns.vld 50 web_sales.vld 30 web_site.vld 843 total
          Hide
          yhluo_impala_39a4 Yuanhao Luo added a comment -

          Hello, Greg Rahn, you are right and with -validate y option, we didn't generate validate dataset, but 1g dataset.

          Information of tpcds1g dataset shows below:

          [luoyuanhao@nobida208 Impala]$ cd /home/luoyuanhao/tpcds-data/1g_validate/
          [luoyuanhao@nobida208 1g_validate]$ ls -l
          total 1217636
          -rw-rw-r-- 1 luoyuanhao luoyuanhao      1885 Oct 20 10:07 call_center.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao   1620074 Oct 20 10:07 catalog_page.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao  21234304 Oct 20 10:08 catalog_returns.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao 294468836 Oct 20 10:08 catalog_sales.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao   5452165 Oct 20 10:08 customer_address.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao  13110282 Oct 20 10:19 customer.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao  78739296 Oct 20 10:08 customer_demographics.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao  13109372 Oct 20 10:08 customer-gbk.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao  10244389 Oct 20 10:08 date_dim.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao       115 Oct 20 10:09 dbgen_version.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao    144453 Oct 20 10:08 household_demographics.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao       308 Oct 20 10:08 income_band.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao 224675139 Oct 20 10:08 inventory.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao   5033899 Oct 20 10:08 item.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao     36933 Oct 20 10:08 promotion.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao      1304 Oct 20 10:08 reason.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao      1093 Oct 20 10:08 ship_mode.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao      3143 Oct 20 10:08 store.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao  32422491 Oct 20 10:09 store_returns.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao 385565005 Oct 20 10:09 store_sales.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao   5021380 Oct 20 10:09 time_dim.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao       580 Oct 20 10:09 warehouse.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao      5716 Oct 20 10:09 web_page.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao   9734473 Oct 20 10:09 web_returns.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao 146158290 Oct 20 10:09 web_sales.dat
          -rw-rw-r-- 1 luoyuanhao luoyuanhao      8741 Oct 20 10:09 web_site.dat
          [luoyuanhao@nobida208 1g_validate]$ wc -l *
                   6 call_center.dat
               11718 catalog_page.dat
              144067 catalog_returns.dat
             1441548 catalog_sales.dat
               50000 customer_address.dat
              100000 customer.dat
             1920800 customer_demographics.dat
              100000 customer-gbk.dat
               73049 date_dim.dat
                   1 dbgen_version.dat
                7200 household_demographics.dat
                  20 income_band.dat
            11745000 inventory.dat
               18000 item.dat
                 300 promotion.dat
                  35 reason.dat
                  20 ship_mode.dat
                  12 store.dat
              287514 store_returns.dat
             2880404 store_sales.dat
               86400 time_dim.dat
                   5 warehouse.dat
                  60 web_page.dat
               71763 web_returns.dat
              719384 web_sales.dat
                  30 web_site.dat
            19657336 total
          [luoyuanhao@nobida208 1g_validate]$ md5sum *.dat
          86db117a0bb48668acbe63c473e85d96  call_center.dat
          0dc5471e075cd575247a50fea50db9b0  catalog_page.dat
          0dc685e8a08bba187674a85ca83bb08f  catalog_returns.dat
          a3096155410e91fac758411e72b6acfe  catalog_sales.dat
          f05bcf7c4b3f814f6703b7a5789ed763  customer_address.dat
          a8a791088442a7bf33fe781657c807c0  customer.dat
          fe0219784eba694625ccc451da0f1a9e  customer_demographics.dat
          0a7b88a8e9372069b4a332397cd60b8a  customer-gbk.dat
          f4ef03663ab568ddeb16309f493896c0  date_dim.dat
          2a2d9b33af3ad8f3742e1ef31580b6b2  dbgen_version.dat
          8bca3ec214a03ea6df0c45427539ae93  household_demographics.dat
          9dd6b98d416b4d60c98c00b01dfde9ae  income_band.dat
          01b0e4a716740dc80b0e2ee728ba417a  inventory.dat
          0797b4ad4c4766521f6198af29c8bf80  item.dat
          fd198fe8ca20b2f6691fab6f9b77648e  promotion.dat
          9859851c77e846a55dca73f4cbdc2b3d  reason.dat
          7a8209780bb2ad3d706f82227ca3946c  ship_mode.dat
          e747938eaa338ea37970988c0c54f925  store.dat
          724eae386fbd92818930f89d68c82f75  store_returns.dat
          758f9fce7f97c14142767674019af8d0  store_sales.dat
          4bc3c76437eed9abde0b1507a1380a9c  time_dim.dat
          b54252167f4e5dbdb42e163d82ba8c3d  warehouse.dat
          fd9a04d6056363a4a262deb625fafebb  web_page.dat
          28a5b4eeeffb9ce3c1d9116a2972d802  web_returns.dat
          fe063eda4d4d052736fcd6fb245d3693  web_sales.dat
          2880740623d1ea3710af2eb51cc4b49a  web_site.dat
          [luoyuanhao@nobida208 1g_validate]$ cat dbgen_version.dat 
          2.3.0|2016-10-20|10:09:32|-sc 1 -terminate N -dir /home/luoyuanhao/tpcds-data/1g_validate/ -verbose y -validate y 
          
          
          Show
          yhluo_impala_39a4 Yuanhao Luo added a comment - Hello, Greg Rahn, you are right and with -validate y option, we didn't generate validate dataset, but 1g dataset. Information of tpcds1g dataset shows below: [luoyuanhao@nobida208 Impala]$ cd /home/luoyuanhao/tpcds-data/1g_validate/ [luoyuanhao@nobida208 1g_validate]$ ls -l total 1217636 -rw-rw-r-- 1 luoyuanhao luoyuanhao 1885 Oct 20 10:07 call_center.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 1620074 Oct 20 10:07 catalog_page.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 21234304 Oct 20 10:08 catalog_returns.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 294468836 Oct 20 10:08 catalog_sales.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 5452165 Oct 20 10:08 customer_address.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 13110282 Oct 20 10:19 customer.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 78739296 Oct 20 10:08 customer_demographics.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 13109372 Oct 20 10:08 customer-gbk.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 10244389 Oct 20 10:08 date_dim.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 115 Oct 20 10:09 dbgen_version.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 144453 Oct 20 10:08 household_demographics.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 308 Oct 20 10:08 income_band.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 224675139 Oct 20 10:08 inventory.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 5033899 Oct 20 10:08 item.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 36933 Oct 20 10:08 promotion.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 1304 Oct 20 10:08 reason.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 1093 Oct 20 10:08 ship_mode.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 3143 Oct 20 10:08 store.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 32422491 Oct 20 10:09 store_returns.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 385565005 Oct 20 10:09 store_sales.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 5021380 Oct 20 10:09 time_dim.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 580 Oct 20 10:09 warehouse.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 5716 Oct 20 10:09 web_page.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 9734473 Oct 20 10:09 web_returns.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 146158290 Oct 20 10:09 web_sales.dat -rw-rw-r-- 1 luoyuanhao luoyuanhao 8741 Oct 20 10:09 web_site.dat [luoyuanhao@nobida208 1g_validate]$ wc -l * 6 call_center.dat 11718 catalog_page.dat 144067 catalog_returns.dat 1441548 catalog_sales.dat 50000 customer_address.dat 100000 customer.dat 1920800 customer_demographics.dat 100000 customer-gbk.dat 73049 date_dim.dat 1 dbgen_version.dat 7200 household_demographics.dat 20 income_band.dat 11745000 inventory.dat 18000 item.dat 300 promotion.dat 35 reason.dat 20 ship_mode.dat 12 store.dat 287514 store_returns.dat 2880404 store_sales.dat 86400 time_dim.dat 5 warehouse.dat 60 web_page.dat 71763 web_returns.dat 719384 web_sales.dat 30 web_site.dat 19657336 total [luoyuanhao@nobida208 1g_validate]$ md5sum *.dat 86db117a0bb48668acbe63c473e85d96 call_center.dat 0dc5471e075cd575247a50fea50db9b0 catalog_page.dat 0dc685e8a08bba187674a85ca83bb08f catalog_returns.dat a3096155410e91fac758411e72b6acfe catalog_sales.dat f05bcf7c4b3f814f6703b7a5789ed763 customer_address.dat a8a791088442a7bf33fe781657c807c0 customer.dat fe0219784eba694625ccc451da0f1a9e customer_demographics.dat 0a7b88a8e9372069b4a332397cd60b8a customer-gbk.dat f4ef03663ab568ddeb16309f493896c0 date_dim.dat 2a2d9b33af3ad8f3742e1ef31580b6b2 dbgen_version.dat 8bca3ec214a03ea6df0c45427539ae93 household_demographics.dat 9dd6b98d416b4d60c98c00b01dfde9ae income_band.dat 01b0e4a716740dc80b0e2ee728ba417a inventory.dat 0797b4ad4c4766521f6198af29c8bf80 item.dat fd198fe8ca20b2f6691fab6f9b77648e promotion.dat 9859851c77e846a55dca73f4cbdc2b3d reason.dat 7a8209780bb2ad3d706f82227ca3946c ship_mode.dat e747938eaa338ea37970988c0c54f925 store.dat 724eae386fbd92818930f89d68c82f75 store_returns.dat 758f9fce7f97c14142767674019af8d0 store_sales.dat 4bc3c76437eed9abde0b1507a1380a9c time_dim.dat b54252167f4e5dbdb42e163d82ba8c3d warehouse.dat fd9a04d6056363a4a262deb625fafebb web_page.dat 28a5b4eeeffb9ce3c1d9116a2972d802 web_returns.dat fe063eda4d4d052736fcd6fb245d3693 web_sales.dat 2880740623d1ea3710af2eb51cc4b49a web_site.dat [luoyuanhao@nobida208 1g_validate]$ cat dbgen_version.dat 2.3.0|2016-10-20|10:09:32|-sc 1 -terminate N -dir /home/luoyuanhao/tpcds-data/1g_validate/ -verbose y -validate y
          Hide
          grahn Greg Rahn added a comment -

          Alexander Behm - I've narrowed this down to some issue related to DECIMAL. When year_total is cast as decimal(18,2), no rows are returned (wrong result). When year_total is cast as decimal(17,2), one row is returned (correct)

          with year_total as (
              select 
              customer_id,
              customer_first_name,
              customer_last_name,
              customer_preferred_cust_flag,
              dyear,
              cast(year_total as decimal(18,2)) as year_total, -- broken
              -- cast(year_total as decimal(17,2)) as year_total, -- works
              sale_type
          from (
              values 
              ('AAAAAAAAOPDLAAAA' as CUSTOMER_ID,
               'Ann' as CUSTOMER_FIRST_NAME,
               'Pence' as CUSTOMER_LAST_NAME,
               'N' as CUSTOMER_PREFERRED_CUST_FLAG,
               1999 as DYEAR,
               84313.66 as YEAR_TOTAL,
               's' as SALE_TYPE),
              ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2000,60928.23,'s'),
              ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2001,81982.13,'s'),
              ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2002,43259.06,'s'),
              ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2001,29053.32,'w'),
              ('AAAAAAAAOPDLAAAA','Ann','Pence','N',1998,47684.53,'s'),
              ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2002,15338.34,'w')
              ) t
          )
          select 
                t_s_secyear.customer_id
               ,t_s_secyear.customer_first_name
               ,t_s_secyear.customer_last_name
               ,t_s_secyear.customer_preferred_cust_flag
           from year_total t_s_firstyear
               ,year_total t_s_secyear
               ,year_total t_w_firstyear
               ,year_total t_w_secyear
           where t_s_secyear.customer_id = t_s_firstyear.customer_id
               and t_s_firstyear.customer_id = t_w_secyear.customer_id
               and t_s_firstyear.customer_id = t_w_firstyear.customer_id
               and t_s_firstyear.sale_type = 's'
               and t_w_firstyear.sale_type = 'w'
               and t_s_secyear.sale_type = 's'
               and t_w_secyear.sale_type = 'w'
               and t_s_firstyear.dyear = 2001
               and t_s_secyear.dyear = 2001+1
               and t_w_firstyear.dyear = 2001
               and t_w_secyear.dyear = 2001+1
               and t_s_firstyear.year_total > 0
               and t_w_firstyear.year_total > 0
               and case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else 0.0 end
                   > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else 0.0 end
          ;
          
          Show
          grahn Greg Rahn added a comment - Alexander Behm - I've narrowed this down to some issue related to DECIMAL. When year_total is cast as decimal(18,2), no rows are returned (wrong result). When year_total is cast as decimal(17,2), one row is returned (correct) with year_total as ( select customer_id, customer_first_name, customer_last_name, customer_preferred_cust_flag, dyear, cast (year_total as decimal(18,2)) as year_total, -- broken -- cast (year_total as decimal(17,2)) as year_total, -- works sale_type from ( values ('AAAAAAAAOPDLAAAA' as CUSTOMER_ID, 'Ann' as CUSTOMER_FIRST_NAME, 'Pence' as CUSTOMER_LAST_NAME, 'N' as CUSTOMER_PREFERRED_CUST_FLAG, 1999 as DYEAR, 84313.66 as YEAR_TOTAL, 's' as SALE_TYPE), ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2000,60928.23,'s'), ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2001,81982.13,'s'), ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2002,43259.06,'s'), ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2001,29053.32,'w'), ('AAAAAAAAOPDLAAAA','Ann','Pence','N',1998,47684.53,'s'), ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2002,15338.34,'w') ) t ) select t_s_secyear.customer_id ,t_s_secyear.customer_first_name ,t_s_secyear.customer_last_name ,t_s_secyear.customer_preferred_cust_flag from year_total t_s_firstyear ,year_total t_s_secyear ,year_total t_w_firstyear ,year_total t_w_secyear where t_s_secyear.customer_id = t_s_firstyear.customer_id and t_s_firstyear.customer_id = t_w_secyear.customer_id and t_s_firstyear.customer_id = t_w_firstyear.customer_id and t_s_firstyear.sale_type = 's' and t_w_firstyear.sale_type = 'w' and t_s_secyear.sale_type = 's' and t_w_secyear.sale_type = 'w' and t_s_firstyear.dyear = 2001 and t_s_secyear.dyear = 2001+1 and t_w_firstyear.dyear = 2001 and t_w_secyear.dyear = 2001+1 and t_s_firstyear.year_total > 0 and t_w_firstyear.year_total > 0 and case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else 0.0 end > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else 0.0 end ;
          Hide
          grahn Greg Rahn added a comment -

          This looks like the cause.

          with year_total as (
              select 
              customer_id,
              customer_first_name,
              customer_last_name,
              customer_preferred_cust_flag,
              dyear,
              -- cast(year_total as decimal(18,2)) as year_total, -- broken
              cast(year_total as decimal(17,2)) as year_total, -- works
              sale_type
          from (
              values 
              ('AAAAAAAAOPDLAAAA' as CUSTOMER_ID,
               'Ann' as CUSTOMER_FIRST_NAME,
               'Pence' as CUSTOMER_LAST_NAME,
               'N' as CUSTOMER_PREFERRED_CUST_FLAG,
               1999 as DYEAR,
               84313.66 as YEAR_TOTAL,
               's' as SALE_TYPE),
              ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2000,60928.23,'s'),
              ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2001,81982.13,'s'),
              ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2002,43259.06,'s'),
              ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2001,29053.32,'w'),
              ('AAAAAAAAOPDLAAAA','Ann','Pence','N',1998,47684.53,'s'),
              ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2002,15338.34,'w')
              ) t
          )
          select 
            t_w_firstyear.year_total,
            t_w_secyear.year_total,
            t_s_firstyear.year_total,
            t_s_secyear.year_total,
            t_w_secyear.year_total / t_w_firstyear.year_total t1,
            t_s_secyear.year_total / t_s_firstyear.year_total t2,
            case 
              when t_w_firstyear.year_total > 0 
              then t_w_secyear.year_total / t_w_firstyear.year_total 
              else 0.0 
            end
            > 
            case 
               when t_s_firstyear.year_total > 0 
               then t_s_secyear.year_total / t_s_firstyear.year_total 
               else 0.0 
            end
            as x   
           from year_total t_s_firstyear
               ,year_total t_s_secyear
               ,year_total t_w_firstyear
               ,year_total t_w_secyear
          
          where
          (
          t_w_firstyear.year_total = 81982.13 and
          t_w_secyear.year_total = 15338.34 and
          t_s_firstyear.year_total =  84313.66 and
          t_s_secyear.year_total = 15338.34
          )
          or
          (
          t_w_firstyear.year_total = 29053.32 and
          t_w_secyear.year_total = 15338.34 and  
          t_s_firstyear.year_total =  81982.13 and  
          t_s_secyear.year_total = 43259.06
          )
          or
          (
          t_w_firstyear.year_total = 43259.06   and
          t_w_secyear.year_total = 15338.34   and
          t_s_firstyear.year_total =  81982.13   and
          t_s_secyear.year_total = 29053.32
          )
          or
          (
          t_w_firstyear.year_total = 29053.32   and
          t_w_secyear.year_total = 81982.13   and
          t_s_firstyear.year_total =  15338.34   and
          t_s_secyear.year_total = 43259.06
          )
          or
          (
          t_w_firstyear.year_total = 43259.06   and
          t_w_secyear.year_total = 81982.13   and
          t_s_firstyear.year_total =  15338.34   and
          t_s_secyear.year_total = 29053.32
          )
          ;
          

          incorrect result

          cast(year_total as decimal(18,2)) as year_total, -- incorrect result
          +------------+------------+------------+------------+------+------+-------+
          | year_total | year_total | year_total | year_total | t1   | t2   | x     |
          +------------+------------+------------+------------+------+------+-------+
          | 81982.13   | 15338.34   | 84313.66   | 15338.34   | 0.18 | 0.18 | false |
          | 29053.32   | 15338.34   | 81982.13   | 43259.06   | 0.52 | 0.52 | false |
          | 43259.06   | 15338.34   | 81982.13   | 29053.32   | 0.35 | 0.35 | false |
          | 29053.32   | 81982.13   | 15338.34   | 43259.06   | 2.82 | 2.82 | false |
          | 43259.06   | 81982.13   | 15338.34   | 29053.32   | 1.89 | 1.89 | false |
          +------------+------------+------------+------------+------+------+-------+
          

          correct result

          cast(year_total as decimal(17,2)) as year_total, -- correct result
          +------------+------------+------------+------------+------------------------+------------------------+------+
          | year_total | year_total | year_total | year_total | t1                     | t2                     | x    |
          +------------+------------+------------+------------+------------------------+------------------------+------+
          | 81982.13   | 15338.34   | 84313.66   | 15338.34   | 0.18709369956599078360 | 0.18191998781692076942 | true |
          | 29053.32   | 15338.34   | 81982.13   | 43259.06   | 0.52793759886993982099 | 0.52766450444749361842 | true |
          | 43259.06   | 15338.34   | 81982.13   | 29053.32   | 0.35456942430094412592 | 0.35438601070745539302 | true |
          | 29053.32   | 81982.13   | 15338.34   | 43259.06   | 2.82178181357586671678 | 2.82032214698591894559 | true |
          | 43259.06   | 81982.13   | 15338.34   | 29053.32   | 1.89514358379493220610 | 1.89416325365065580760 | true |
          +------------+------------+------------+------------+------------------------+------------------------+------+
          

          Netezza result

           T_W_FIRSTYEAR | T_W_SECYEAR | T_S_FIRSTYEAR | T_S_SECYEAR | T_W_SECYEAR_DIV_T_W_FIRSTYEAR | T_S_SECYEAR_DIV_T_S_FIRSTYEAR | X
          ---------------+-------------+---------------+-------------+-------------------------------+-------------------------------+---
                29053.32 |    81982.13 |      15338.34 |    43259.06 |                      2.821782 |                      2.820322 | t
                29053.32 |    15338.34 |      81982.13 |    43259.06 |                      0.527938 |                      0.527665 | t
                43259.06 |    81982.13 |      15338.34 |    29053.32 |                      1.895144 |                      1.894163 | t
                43259.06 |    15338.34 |      81982.13 |    29053.32 |                      0.354569 |                      0.354386 | t
                81982.13 |    15338.34 |      84313.66 |    15338.34 |                      0.187094 |                      0.181920 | t
          
          Show
          grahn Greg Rahn added a comment - This looks like the cause. with year_total as ( select customer_id, customer_first_name, customer_last_name, customer_preferred_cust_flag, dyear, -- cast (year_total as decimal(18,2)) as year_total, -- broken cast (year_total as decimal(17,2)) as year_total, -- works sale_type from ( values ('AAAAAAAAOPDLAAAA' as CUSTOMER_ID, 'Ann' as CUSTOMER_FIRST_NAME, 'Pence' as CUSTOMER_LAST_NAME, 'N' as CUSTOMER_PREFERRED_CUST_FLAG, 1999 as DYEAR, 84313.66 as YEAR_TOTAL, 's' as SALE_TYPE), ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2000,60928.23,'s'), ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2001,81982.13,'s'), ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2002,43259.06,'s'), ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2001,29053.32,'w'), ('AAAAAAAAOPDLAAAA','Ann','Pence','N',1998,47684.53,'s'), ('AAAAAAAAOPDLAAAA','Ann','Pence','N',2002,15338.34,'w') ) t ) select t_w_firstyear.year_total, t_w_secyear.year_total, t_s_firstyear.year_total, t_s_secyear.year_total, t_w_secyear.year_total / t_w_firstyear.year_total t1, t_s_secyear.year_total / t_s_firstyear.year_total t2, case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else 0.0 end > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else 0.0 end as x from year_total t_s_firstyear ,year_total t_s_secyear ,year_total t_w_firstyear ,year_total t_w_secyear where ( t_w_firstyear.year_total = 81982.13 and t_w_secyear.year_total = 15338.34 and t_s_firstyear.year_total = 84313.66 and t_s_secyear.year_total = 15338.34 ) or ( t_w_firstyear.year_total = 29053.32 and t_w_secyear.year_total = 15338.34 and t_s_firstyear.year_total = 81982.13 and t_s_secyear.year_total = 43259.06 ) or ( t_w_firstyear.year_total = 43259.06 and t_w_secyear.year_total = 15338.34 and t_s_firstyear.year_total = 81982.13 and t_s_secyear.year_total = 29053.32 ) or ( t_w_firstyear.year_total = 29053.32 and t_w_secyear.year_total = 81982.13 and t_s_firstyear.year_total = 15338.34 and t_s_secyear.year_total = 43259.06 ) or ( t_w_firstyear.year_total = 43259.06 and t_w_secyear.year_total = 81982.13 and t_s_firstyear.year_total = 15338.34 and t_s_secyear.year_total = 29053.32 ) ; incorrect result cast (year_total as decimal(18,2)) as year_total, -- incorrect result +------------+------------+------------+------------+------+------+-------+ | year_total | year_total | year_total | year_total | t1 | t2 | x | +------------+------------+------------+------------+------+------+-------+ | 81982.13 | 15338.34 | 84313.66 | 15338.34 | 0.18 | 0.18 | false | | 29053.32 | 15338.34 | 81982.13 | 43259.06 | 0.52 | 0.52 | false | | 43259.06 | 15338.34 | 81982.13 | 29053.32 | 0.35 | 0.35 | false | | 29053.32 | 81982.13 | 15338.34 | 43259.06 | 2.82 | 2.82 | false | | 43259.06 | 81982.13 | 15338.34 | 29053.32 | 1.89 | 1.89 | false | +------------+------------+------------+------------+------+------+-------+ correct result cast (year_total as decimal(17,2)) as year_total, -- correct result +------------+------------+------------+------------+------------------------+------------------------+------+ | year_total | year_total | year_total | year_total | t1 | t2 | x | +------------+------------+------------+------------+------------------------+------------------------+------+ | 81982.13 | 15338.34 | 84313.66 | 15338.34 | 0.18709369956599078360 | 0.18191998781692076942 | true | | 29053.32 | 15338.34 | 81982.13 | 43259.06 | 0.52793759886993982099 | 0.52766450444749361842 | true | | 43259.06 | 15338.34 | 81982.13 | 29053.32 | 0.35456942430094412592 | 0.35438601070745539302 | true | | 29053.32 | 81982.13 | 15338.34 | 43259.06 | 2.82178181357586671678 | 2.82032214698591894559 | true | | 43259.06 | 81982.13 | 15338.34 | 29053.32 | 1.89514358379493220610 | 1.89416325365065580760 | true | +------------+------------+------------+------------+------------------------+------------------------+------+ Netezza result T_W_FIRSTYEAR | T_W_SECYEAR | T_S_FIRSTYEAR | T_S_SECYEAR | T_W_SECYEAR_DIV_T_W_FIRSTYEAR | T_S_SECYEAR_DIV_T_S_FIRSTYEAR | X ---------------+-------------+---------------+-------------+-------------------------------+-------------------------------+--- 29053.32 | 81982.13 | 15338.34 | 43259.06 | 2.821782 | 2.820322 | t 29053.32 | 15338.34 | 81982.13 | 43259.06 | 0.527938 | 0.527665 | t 43259.06 | 81982.13 | 15338.34 | 29053.32 | 1.895144 | 1.894163 | t 43259.06 | 15338.34 | 81982.13 | 29053.32 | 0.354569 | 0.354386 | t 81982.13 | 15338.34 | 84313.66 | 15338.34 | 0.187094 | 0.181920 | t
          Hide
          grahn Greg Rahn added a comment -

          Also

          select (cast(15338.34 as decimal(17,2)) / cast(84313.66 as decimal(17,2))) as c;
          +------------------------+
          | c                      |
          +------------------------+
          | 0.18191998781692076942 |
          +------------------------+
          
          select typeof(cast(15338.34 as decimal(17,2)) / cast(84313.66 as decimal(17,2))) as c;
          +----------------+
          | c              |
          +----------------+
          | DECIMAL(37,20) |
          +----------------+
          
          select (cast(15338.34 as decimal(18,2)) / cast(84313.66 as decimal(18,2))) as c;
          +------+
          | c    |
          +------+
          | 0.18 |
          +------+
          
          select typeof(cast(15338.34 as decimal(18,2)) / cast(84313.66 as decimal(18,2))) as c;
          +---------------+
          | c             |
          +---------------+
          | DECIMAL(38,2) |
          +---------------+
          
          Show
          grahn Greg Rahn added a comment - Also select ( cast (15338.34 as decimal(17,2)) / cast (84313.66 as decimal(17,2))) as c; +------------------------+ | c | +------------------------+ | 0.18191998781692076942 | +------------------------+ select typeof( cast (15338.34 as decimal(17,2)) / cast (84313.66 as decimal(17,2))) as c; +----------------+ | c | +----------------+ | DECIMAL(37,20) | +----------------+ select ( cast (15338.34 as decimal(18,2)) / cast (84313.66 as decimal(18,2))) as c; +------+ | c | +------+ | 0.18 | +------+ select typeof( cast (15338.34 as decimal(18,2)) / cast (84313.66 as decimal(18,2))) as c; +---------------+ | c | +---------------+ | DECIMAL(38,2) | +---------------+
          Hide
          grahn Greg Rahn added a comment -

          Appears that TPC-DS query31 also has this same issue (1 row missing) for the same reason (case/when with division in predicate).

          Show
          grahn Greg Rahn added a comment - Appears that TPC-DS query31 also has this same issue (1 row missing) for the same reason (case/when with division in predicate).
          Show
          grahn Greg Rahn added a comment - This looks like where the calculation is done: https://github.com/apache/incubator-impala/blob/master/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java#L177
          Hide
          grahn Greg Rahn added a comment -

          Here are some notes on how Netezza calculates precision/scale for division.
          http://www-01.ibm.com/support/docview.wss?uid=swg21669116

          Show
          grahn Greg Rahn added a comment - Here are some notes on how Netezza calculates precision/scale for division. http://www-01.ibm.com/support/docview.wss?uid=swg21669116
          Hide
          dhecht Dan Hecht added a comment -

          Splitting the multiply case to IMPALA-4940.

          Show
          dhecht Dan Hecht added a comment - Splitting the multiply case to IMPALA-4940 .
          Hide
          dhecht Dan Hecht added a comment -

          commit a53eeb2068970ff7575f4f3fdf3c383861a75f2d
          Author: Dan Hecht <dhecht@cloudera.com>
          Date: Tue Feb 7 12:08:54 2017 -0800

          IMPALA-4370: Divide and modulo result types for DECIMAL version V2

          Implement the new DECIMAL return type rules for divide and modulo
          expressions, active when query option DECIMAL_V2=1. See the comment
          in the code for more details. A couple of examples that show why new
          return type rules for divide are desirable.

          For modulo, the return types are actually equivalent, though the
          rules are expressed differently to have consistency with how
          precision fixups are handled for each version.

          DECIMAL Version 1:

          -------------------------------------------------------

          cast(1 as decimal(20,0)) / cast(3 as decimal(20,0))

          -----------------------------------------------------

          0

          -------------------------------------------------------

          DECIMAL Version 2:

          -------------------------------------------------------

          cast(1 as decimal(20,0)) / cast(3 as decimal(20,0))

          -----------------------------------------------------

          0.333333333333333333

          -------------------------------------------------------

          DECIMAL Version 1:

          -------------------------------------------------------

          cast(1 as decimal(6,0)) / cast(0.1 as decimal(38,38))

          -------------------------------------------------------

          NULL

          -------------------------------------------------------
          WARNINGS: UDF WARNING: Expression overflowed, returning NULL

          DECIMAL Version 2:

          -------------------------------------------------------

          cast(1 as decimal(6,0)) / cast(0.1 as decimal(38,38))

          -------------------------------------------------------

          10.000000

          -------------------------------------------------------

          Change-Id: I83e7f7787edfa4b4bddc25945090542a0e90881b
          Reviewed-on: http://gerrit.cloudera.org:8080/5952
          Reviewed-by: Dan Hecht <dhecht@cloudera.com>
          Tested-by: Impala Public Jenkins

          Show
          dhecht Dan Hecht added a comment - commit a53eeb2068970ff7575f4f3fdf3c383861a75f2d Author: Dan Hecht <dhecht@cloudera.com> Date: Tue Feb 7 12:08:54 2017 -0800 IMPALA-4370 : Divide and modulo result types for DECIMAL version V2 Implement the new DECIMAL return type rules for divide and modulo expressions, active when query option DECIMAL_V2=1. See the comment in the code for more details. A couple of examples that show why new return type rules for divide are desirable. For modulo, the return types are actually equivalent, though the rules are expressed differently to have consistency with how precision fixups are handled for each version. DECIMAL Version 1: ------------------------------------------------------- cast(1 as decimal(20,0)) / cast(3 as decimal(20,0)) ----------------------------------------------------- 0 ------------------------------------------------------- DECIMAL Version 2: ------------------------------------------------------- cast(1 as decimal(20,0)) / cast(3 as decimal(20,0)) ----------------------------------------------------- 0.333333333333333333 ------------------------------------------------------- DECIMAL Version 1: ------------------------------------------------------- cast(1 as decimal(6,0)) / cast(0.1 as decimal(38,38)) ------------------------------------------------------- NULL ------------------------------------------------------- WARNINGS: UDF WARNING: Expression overflowed, returning NULL DECIMAL Version 2: ------------------------------------------------------- cast(1 as decimal(6,0)) / cast(0.1 as decimal(38,38)) ------------------------------------------------------- 10.000000 ------------------------------------------------------- Change-Id: I83e7f7787edfa4b4bddc25945090542a0e90881b Reviewed-on: http://gerrit.cloudera.org:8080/5952 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Impala Public Jenkins

            People

            • Assignee:
              dhecht Dan Hecht
              Reporter:
              yhluo_impala_39a4 Yuanhao Luo
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development