Details
Bug
Status: Resolved
Major
Resolution: Invalid
0.4.0, 0.5.0
None
None
Description
The %sql interpreter drops leading zeros from values in columns with the String data type.
Consider the following test data:
0123,zero one two three
1230,one two three zero
1010,one zero one zero
Created as an external table in Hive using:
create external table lz_test (
  id String,
  description String
)
row format delimited fields terminated by ','
location '/pathTo/leadingZero_test';
and accessed using the following Scala (%livy) code:
val lzDF = sql("select * from lz_Test")
lzDF.createOrReplaceTempView("LZT")
lzDF.printSchema
lzDF.show(false)
and the following sql in the same notebook:
%sql select * from LZT
The result is the following (note the missing zero on the first record):
The output of the Scala code, however, does display the leading zero.
Also note the data types in the printSchema output: id is a String.
lzDF: org.apache.spark.sql.DataFrame = [id: string, description: string]
root
 |-- id: string (nullable = true)
 |-- description: string (nullable = true)

+----+------------------+
|id  |description       |
+----+------------------+
|0123|zero one two three|
|1230|one two three zero|
|1010|one zero one zero |
+----+------------------+
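The Scala output above shows the stored value is still the string 0123, so the zero appears to be lost somewhere in how the %sql result is rendered. The report does not identify the cause. As a hedged illustration only, the plain Scala sketch below shows how a display layer that coerces numeric-looking text to a number would produce this symptom. `LeadingZeroSketch`, `renderVerbatim`, and `renderCoerced` are hypothetical names invented for this example; they are not Zeppelin or Livy code.

```scala
// Hypothetical illustration; this is not Zeppelin or Livy code.
object LeadingZeroSketch {
  // Render a cell verbatim, as expected for a string-typed column.
  def renderVerbatim(cell: String): String = cell

  // Render a cell as a display layer might if it converts
  // all-digit text to a number before printing (an assumption).
  def renderCoerced(cell: String): String =
    if (cell.nonEmpty && cell.forall(_.isDigit)) BigInt(cell).toString
    else cell

  // The id values from the test data in this report.
  val ids: Seq[String] = Seq("0123", "1230", "1010")

  def main(args: Array[String]): Unit =
    ids.foreach { id =>
      println(s"${renderVerbatim(id)} -> ${renderCoerced(id)}")
    }
}
```

Under that assumption only the first id changes, which matches the symptom described in the report: the second and third ids have no leading zero to lose.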