  1. Spark
  2. SPARK-21646

Add new type coercion rules to be compatible with Hive


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      How to reproduce:
      hive:

      $ hive -S
      hive> create table spark_21646(c1 string, c2 string);
      hive> insert into spark_21646 values('92233720368547758071', 'a');
      hive> insert into spark_21646 values('21474836471', 'b');
      hive> insert into spark_21646 values('10', 'c');
      hive> select * from spark_21646 where c1 > 0;
      92233720368547758071	a
      10	c
      21474836471	b
      hive>
      

      spark-sql:

      $ spark-sql -S
      spark-sql> select * from spark_21646 where c1 > 0;
      10      c                                                                       
      spark-sql> select * from spark_21646 where c1 > 0L;
      21474836471	b
      10	c
      spark-sql> explain select * from spark_21646 where c1 > 0;
      == Physical Plan ==
      *Project [c1#14, c2#15]
      +- *Filter (isnotnull(c1#14) && (cast(c1#14 as int) > 0))
         +- *FileScan parquet spark_21646[c1#14,c2#15] Batched: true, Format: Parquet, Location: InMemoryFileIndex[viewfs://cluster4/user/hive/warehouse/spark_21646], PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: struct<c1:string,c2:string>
      spark-sql> 
      

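      As the physical plan shows, Spark implicitly casts c1 to int. When a value is outside the int range, the cast yields NULL and the row is filtered out, so the result differs from Hive. Comparing against 0L recovers 21474836471 but still misses 92233720368547758071, which exceeds the bigint range.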
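
The difference can be illustrated with a small model of the three comparisons. This is only a sketch in plain Python, not Spark or Hive code: the helper names (`cast_to_integral`, `cast_to_double`, `filter_positive`) are hypothetical, the NULL-on-overflow behaviour is inferred from the output above, and the double comparison is an assumption about how Hive coerces a string compared with a numeric literal.

```python
from typing import Callable, List, Optional, Tuple

INT_MIN, INT_MAX = -2**31, 2**31 - 1      # 32-bit int range
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1    # 64-bit bigint range

Row = Tuple[str, str]


def cast_to_integral(s: str, lo: int, hi: int) -> Optional[int]:
    """Model a string-to-integer cast that returns NULL (None) on overflow."""
    try:
        value = int(s.strip())
    except ValueError:
        return None
    return value if lo <= value <= hi else None


def cast_to_double(s: str) -> Optional[float]:
    """Model a string-to-double cast (assumed Hive-style coercion)."""
    try:
        return float(s.strip())
    except ValueError:
        return None


def filter_positive(rows: List[Row], cast: Callable[[str], Optional[float]]) -> List[Row]:
    """Keep rows where cast(c1) > 0; a NULL comparison result drops the row."""
    kept = []
    for row in rows:
        value = cast(row[0])
        if value is not None and value > 0:
            kept.append(row)
    return kept


rows = [("92233720368547758071", "a"), ("21474836471", "b"), ("10", "c")]

# c1 > 0  with c1 cast to int: only the value that fits in 32 bits survives.
as_int = filter_positive(rows, lambda s: cast_to_integral(s, INT_MIN, INT_MAX))
# c1 > 0L with c1 cast to bigint: the 20-digit value still overflows.
as_long = filter_positive(rows, lambda s: cast_to_integral(s, LONG_MIN, LONG_MAX))
# c1 > 0  with c1 cast to double: all three rows survive.
as_double = filter_positive(rows, cast_to_double)

print(as_int)
print(as_long)
print(as_double)
```

Under these assumptions the model keeps the same rows as the three queries in the reproduction (row order aside), which suggests that coercing the string side to double rather than to the literal's integral type would bring Spark's result in line with Hive's.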

            People

              Assignee: Unassigned
              Reporter: Yuming Wang (yumwang)
              Votes: 1
              Watchers: 7