Details
- Bug
- Status: Closed
- Minor
- Resolution: Fixed
- 1.2.0
- None
- None
- spark 1.6
Description
A column declared as int is reported as a long (bigint) datatype in CarbonData on a cluster.
Steps to reproduce Bug:
In CarbonData:
Create Table:
create table myvmall (imei String,uuid String,MAC String,device_color String,device_shell_color String,device_name String,product_name String,ram String,rom String,cpu_clock String,series String,check_date String,check_year int,check_month int ,check_day int,check_hour int,bom String,inside_name String,packing_date String,packing_year String,packing_month String,packing_day String,packing_hour String,customer_name String,deliveryAreaId String,deliveryCountry String,deliveryProvince String,deliveryCity String,deliveryDistrict String,packing_list_no String,order_no String,Active_check_time String,Active_check_year int,Active_check_month int,Active_check_day int,Active_check_hour int,ActiveAreaId String,ActiveCountry String,ActiveProvince String,Activecity String,ActiveDistrict String,Active_network String,Active_firmware_version String,Active_emui_version String,Active_os_version String,Latest_check_time String,Latest_check_year int,Latest_check_month int,Latest_check_day int,Latest_check_hour int,Latest_areaId String,Latest_country String,Latest_province String,Latest_city String,Latest_district String,Latest_firmware_version String,Latest_emui_version String,Latest_os_version String,Latest_network String,site String,site_desc String,product String,product_desc String) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ('DICTIONARY_INCLUDE'='check_year,check_month,check_day,check_hour,Active_check_year,Active_check_month,Active_check_day,Active_check_hour,Latest_check_year,Latest_check_month,Latest_check_day')
Load Data:
LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/100_VMALL_1_Day_DATA_2015-09-15.csv' INTO table myvmall options('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='imei,uuid,MAC,device_color,device_shell_color,device_name,product_name,ram,rom,cpu_clock,series,check_date,check_year,check_month,check_day,check_hour,bom,inside_name,packing_date,packing_year,packing_month,packing_day,packing_hour,customer_name,deliveryAreaId,deliveryCountry,deliveryProvince,deliveryCity,deliveryDistrict,packing_list_no,order_no,Active_check_time,Active_check_year,Active_check_month,Active_check_day,Active_check_hour,ActiveAreaId,ActiveCountry,ActiveProvince,Activecity,ActiveDistrict,Active_network,Active_firmware_version,Active_emui_version,Active_os_version,Latest_check_time,Latest_check_year,Latest_check_month,Latest_check_day,Latest_check_hour,Latest_areaId,Latest_country,Latest_province,Latest_city,Latest_district,Latest_firmware_version,Latest_emui_version,Latest_os_version,Latest_network,site,site_desc,product,product_desc')
Table description in CarbonData:
--------------------------------------------+
col_name | data_type | comment |
--------------------------------------------+
imei | string | |
uuid | string | |
mac | string | |
device_color | string | |
device_shell_color | string | |
device_name | string | |
product_name | string | |
ram | string | |
rom | string | |
cpu_clock | string | |
series | string | |
check_date | string | |
check_year | int | |
check_month | int | |
check_day | int | |
check_hour | int | |
bom | string | |
inside_name | string | |
packing_date | string | |
packing_year | string | |
packing_month | string | |
packing_day | string | |
packing_hour | string | |
customer_name | string | |
deliveryareaid | string | |
deliverycountry | string | |
deliveryprovince | string | |
deliverycity | string | |
deliverydistrict | string | |
packing_list_no | string | |
order_no | string | |
active_check_time | string | |
active_check_year | int | |
active_check_month | int | |
active_check_day | int | |
active_check_hour | int | |
activeareaid | string | |
activecountry | string | |
activeprovince | string | |
activecity | string | |
activedistrict | string | |
active_network | string | |
active_firmware_version | string | |
active_emui_version | string | |
active_os_version | string | |
latest_check_time | string | |
latest_check_year | int | |
latest_check_month | int | |
latest_check_day | int | |
latest_check_hour | bigint | |
latest_areaid | string | |
latest_country | string | |
latest_province | string | |
latest_city | string | |
latest_district | string | |
latest_firmware_version | string | |
latest_emui_version | string | |
latest_os_version | string | |
latest_network | string | |
site | string | |
site_desc | string | |
product | string | |
product_desc | string | |
--------------------------------------------+
Query Executed:
select imei,latest_check_hour from myvmall where myvmall.latest_check_hour IN (10,14) and myvmall.imei IN ('imeiA009945257','imeiA009945258') and myvmall.check_year IN (2015)
Result in CarbonData:
-----------------------------------
imei | latest_check_hour |
-----------------------------------
imeiA009945257 | 14 |
imeiA009945258 | 14 |
-----------------------------------
In Hive:
Create Table:
create table hivevmall (imei String,uuid String,MAC String,device_color String,device_shell_color String,device_name String,product_name String,ram String,rom String,cpu_clock String,series String,check_date String,check_year int,check_month int ,check_day int,check_hour int,bom String,inside_name String,packing_date String,packing_year String,packing_month String,packing_day String,packing_hour String,customer_name String,deliveryAreaId String,deliveryCountry String,deliveryProvince String,deliveryCity String,deliveryDistrict String,packing_list_no String,order_no String,Active_check_time String,Active_check_year int,Active_check_month int,Active_check_day int,Active_check_hour int,ActiveAreaId String,ActiveCountry String,ActiveProvince String,Activecity String,ActiveDistrict String,Active_network String,Active_firmware_version String,Active_emui_version String,Active_os_version String,Latest_check_time String,Latest_check_year int,Latest_check_month int,Latest_check_day int,Latest_check_hour int,Latest_areaId String,Latest_country String,Latest_province String,Latest_city String,Latest_district String,Latest_firmware_version String,Latest_emui_version String,Latest_os_version String,Latest_network String,site String,site_desc String,product String,product_desc String)ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
Load data:
load data local inpath '/opt/Carbon/CarbonData/TestData/Data/100_VMALL_1_Day_DATA_2015-09-15.csv' OVERWRITE INTO TABLE hivevmall
Table description in Hive:
--------------------------------------------+
col_name | data_type | comment |
--------------------------------------------+
imei | string | NULL |
uuid | string | NULL |
mac | string | NULL |
device_color | string | NULL |
device_shell_color | string | NULL |
device_name | string | NULL |
product_name | string | NULL |
ram | string | NULL |
rom | string | NULL |
cpu_clock | string | NULL |
series | string | NULL |
check_date | string | NULL |
check_year | int | NULL |
check_month | int | NULL |
check_day | int | NULL |
check_hour | int | NULL |
bom | string | NULL |
inside_name | string | NULL |
packing_date | string | NULL |
packing_year | string | NULL |
packing_month | string | NULL |
packing_day | string | NULL |
packing_hour | string | NULL |
customer_name | string | NULL |
deliveryareaid | string | NULL |
deliverycountry | string | NULL |
deliveryprovince | string | NULL |
deliverycity | string | NULL |
deliverydistrict | string | NULL |
packing_list_no | string | NULL |
order_no | string | NULL |
active_check_time | string | NULL |
active_check_year | int | NULL |
active_check_month | int | NULL |
active_check_day | int | NULL |
active_check_hour | int | NULL |
activeareaid | string | NULL |
activecountry | string | NULL |
activeprovince | string | NULL |
activecity | string | NULL |
activedistrict | string | NULL |
active_network | string | NULL |
active_firmware_version | string | NULL |
active_emui_version | string | NULL |
active_os_version | string | NULL |
latest_check_time | string | NULL |
latest_check_year | int | NULL |
latest_check_month | int | NULL |
latest_check_day | int | NULL |
latest_check_hour | int | NULL |
latest_areaid | string | NULL |
latest_country | string | NULL |
latest_province | string | NULL |
latest_city | string | NULL |
latest_district | string | NULL |
latest_firmware_version | string | NULL |
latest_emui_version | string | NULL |
latest_os_version | string | NULL |
latest_network | string | NULL |
site | string | NULL |
site_desc | string | NULL |
product | string | NULL |
product_desc | string | NULL |
--------------------------------------------+
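The two describe outputs differ in a single column. A minimal Python sketch of how that difference could be found mechanically, assuming each row has the `col_name | data_type | comment |` shape shown above. The helper names `parse_describe` and `diff_types` are hypothetical and are not part of CarbonData or Hive; the input is only an excerpt of the tables in this report.

```python
def parse_describe(text):
    """Map column name -> data type from `col_name | data_type | comment |` rows."""
    types = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Skip border lines, blank lines, and the header row.
        if len(parts) < 2 or not parts[0] or parts[0] == "col_name":
            continue
        types[parts[0]] = parts[1]
    return types


def diff_types(left, right):
    """Return {column: (left_type, right_type)} for columns whose types differ."""
    return {
        col: (left[col], right[col])
        for col in left
        if col in right and left[col] != right[col]
    }


# Excerpt of the two describe outputs from this report.
carbon_desc = """
latest_check_day | int | |
latest_check_hour | bigint | |
latest_areaid | string | |
"""
hive_desc = """
latest_check_day | int | NULL |
latest_check_hour | int | NULL |
latest_areaid | string | NULL |
"""

mismatches = diff_types(parse_describe(carbon_desc), parse_describe(hive_desc))
print(mismatches)
```

For this excerpt the only mismatch is `latest_check_hour`, which is `bigint` in CarbonData and `int` in Hive.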
Query Executed:
select imei,latest_check_hour from hivevmall where hivevmall.latest_check_hour IN (10,14) and hivevmall.imei IN ('imeiA009945257','imeiA009945258') and hivevmall.check_year IN (2015)
Result in Hive:
-----------------------------------
imei | latest_check_hour |
-----------------------------------
imeiA009945257 | 14 |
imeiA009945258 | 14 |
-----------------------------------
Result On Automation:
1) In CarbonData:
imei StringType,latest_check_hour LongType
imeiA009945257,14
imeiA009945258,14
2) In Hive:
imei StringType,latest_check_hour IntegerType
imeiA009945257,14
imeiA009945258,14
The comparison fails because of the datatype mismatch (latest_check_hour is int in Hive and bigint in CarbonData).
Test case IDs:
1) Pushup_filter_myvmall_tc013
2) Pushup_filter_myvmall_tc019
3) Pushup_filter_myvmall_tc021
4) Pushup_filter_myvmall_tc022
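The automation results above contain identical rows but different column types. A minimal Python sketch of a comparison that reports the two separately, assuming each result is available as a list of `(name, type)` pairs plus a list of row tuples. `compare_results` is a hypothetical helper for illustration and is not the actual automation framework code.

```python
def compare_results(schema_a, rows_a, schema_b, rows_b):
    """Compare two query results, reporting type and value mismatches separately."""
    type_mismatches = [
        (name_a, type_a, type_b)
        for (name_a, type_a), (name_b, type_b) in zip(schema_a, schema_b)
        if name_a == name_b and type_a != type_b
    ]
    # Sort so the value check does not depend on row order.
    rows_match = sorted(rows_a) == sorted(rows_b)
    return type_mismatches, rows_match


# Schemas and rows copied from the automation output in this report.
carbon_schema = [("imei", "StringType"), ("latest_check_hour", "LongType")]
hive_schema = [("imei", "StringType"), ("latest_check_hour", "IntegerType")]
carbon_rows = [("imeiA009945257", 14), ("imeiA009945258", 14)]
hive_rows = [("imeiA009945257", 14), ("imeiA009945258", 14)]

type_mismatches, rows_match = compare_results(
    carbon_schema, carbon_rows, hive_schema, hive_rows
)
print(type_mismatches)
print(rows_match)
```

Here the rows match and the only reported difference is the type of `latest_check_hour`, which is consistent with the failure described in this report.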