Description
When reading the data below with a user-defined schema, the expected exception is not thrown for a malformed numeric value. Details:
Data:
'PatientID','PatientName','TotalBill'
'1000','Patient1','10u000'
'1001','Patient2','30000'
'1002','Patient3','40000'
'1003','Patient4','50000'
'1004','Patient5','60000'
Source code:
Dataset<Row> dataset = sparkSession.read()
    .schema(schema)
    .option("inferSchema", "true")  // ignored when an explicit schema is supplied
    .option("delimiter", ",")
    .option("quote", "\"")
    .option("mode", "PERMISSIVE")
    .csv(sourceFile);
When we collect the dataset:
dataset.collectAsList();
Schema1:
[StructField(PatientID,IntegerType,true), StructField(PatientName,StringType,true), StructField(TotalBill,IntegerType,true)]
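For reference, Schema1 can be constructed in Java roughly as follows (a minimal sketch; the variable name `schema` is assumed to be the one passed to `schema(...)` above):

```java
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class Schema1Example {
    public static StructType buildSchema1() {
        // Schema1: TotalBill declared as IntegerType
        return DataTypes.createStructType(new StructField[] {
            DataTypes.createStructField("PatientID", DataTypes.IntegerType, true),
            DataTypes.createStructField("PatientName", DataTypes.StringType, true),
            DataTypes.createStructField("TotalBill", DataTypes.IntegerType, true)
        });
    }
}
```

Schema2 differs only in declaring `TotalBill` as `DataTypes.DoubleType`.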
*Result*: Throws NumberFormatException
Caused by: java.lang.NumberFormatException: For input string: "10u000"
Schema2:
[StructField(PatientID,IntegerType,true), StructField(PatientName,StringType,true), StructField(TotalBill,DoubleType,true)]
Actual Result: no exception is thrown; the first row is read as
"PatientID": 1000,
"NumberOfVisits": "400",
"TotalBill": 10
Expected Result: A NumberFormatException should be thrown for input string "10u000", as with Schema1.
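As a sanity check (a plain-Java sketch, not Spark's internal CSV parser), both standard numeric parsers reject the malformed value, which is why the reporter expects the same exception regardless of whether the column is IntegerType or DoubleType:

```java
public class MalformedNumberCheck {
    // Returns true if the standard Java integer parser rejects the value.
    static boolean rejectedByInteger(String s) {
        try { Integer.parseInt(s); return false; }
        catch (NumberFormatException e) { return true; }
    }

    // Returns true if the standard Java double parser rejects the value.
    static boolean rejectedByDouble(String s) {
        try { Double.parseDouble(s); return false; }
        catch (NumberFormatException e) { return true; }
    }

    public static void main(String[] args) {
        // "10u000" is not a valid number in either representation.
        System.out.println(rejectedByInteger("10u000")); // true
        System.out.println(rejectedByDouble("10u000"));  // true
        // A well-formed value such as "30000" parses fine.
        System.out.println(rejectedByDouble("30000"));   // false
    }
}
```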