[SPARK-22113] Dataset shows in Hive is inconsistent with JDBC - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Problem
Affects Version/s: 2.2.0
Fix Version/s: None
Component/s: SQL
Labels: None
Environment: version 2.2.0

Description

I am trying to query data from Hive in Spark. According to the Spark SQL documentation, there are two ways to do this.
The first way is to initialize the session with enableHiveSupport:

SparkSession session = SparkSession.builder().enableHiveSupport().getOrCreate();
session.sql("select dw_date from tfdw.dwd_dim_date limit 10").show();

The dataset shows the correct result.

The second way is through JDBC:

Dataset<Row> ds = session.read()
                  .format("jdbc")
                  .option("driver", "org.apache.hive.jdbc.HiveDriver")
                  .option("url", "jdbc:hive2://iZ11syxr6afZ:21050/;auth=noSasl")
                  .option("dbtable", "tfdw.dwd_dim_date")
                  .load();
ds.select("dw_date").limit(10).show();

But the dataset shows only the column name in the result rather than the data in the column.

I think the two results should be consistent. Is there anything I missed? Many thanks!
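
A possible explanation, not confirmed anywhere in this report: Spark's generic JDBC dialect wraps column names in double quotes when it builds the query it sends over JDBC, while HiveQL reads a double-quoted token as a string literal and expects backticks around identifiers. If that is what happens here, the server returns the literal text dw_date for every row. The sketch below only illustrates the two quoting styles with plain string building; the class and method names are hypothetical and it does not call Spark or Hive.

```java
// Illustrative only: mimics how a projection query could be rendered under
// two identifier-quoting styles. All names here are hypothetical.
final class IdentifierQuoting {

    // Double-quote style, as used by Spark's generic JDBC dialect.
    static String quoteAnsi(String column) {
        return "\"" + column + "\"";
    }

    // Backtick style, which HiveQL parses as an identifier.
    static String quoteHive(String column) {
        return "`" + column + "`";
    }

    // Builds a simple projection query from an already-quoted column name.
    static String select(String quotedColumn, String table) {
        return "SELECT " + quotedColumn + " FROM " + table;
    }

    public static void main(String[] args) {
        String table = "tfdw.dwd_dim_date";
        // HiveQL would read "dw_date" as a string constant, not a column.
        System.out.println(select(quoteAnsi("dw_date"), table));
        // HiveQL would read `dw_date` as the column.
        System.out.println(select(quoteHive("dw_date"), table));
    }
}
```

Under that assumption, a commonly cited workaround is to register a custom org.apache.spark.sql.jdbc.JdbcDialect for jdbc:hive2 URLs that overrides quoteIdentifier to use backticks, via JdbcDialects.registerDialect, before calling load(). Whether it applies to this exact setup is untested here.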

People

Assignee: Unassigned

Reporter: Michael Fu

Votes: 0

Watchers: 2

Dates

Created: 25/Sep/17 04:15

Updated: 26/Sep/17 06:50

Resolved: 25/Sep/17 07:24