Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
Scenario:
- Create a CSV file from bug data.
- Upload this file to HDFS.
- Use the spark2 interpreter to read the CSV file and convert it to a DataFrame.
Steps to reproduce:
para1:
%sh
hdfs dfs -rmr -skipTrash /tmp/jira
hdfs dfs -mkdir /tmp/jira
hdfs dfs -put /tmp/jira/jira.com-2.csv /tmp/jira
hdfs dfs -ls /tmp/jira

para2:
%spark2
import org.apache.spark.sql.SQLContext
val sqlContext = new SQLContext(sc)
val df = sqlContext.read
  .format("csv")
  .option("header", "true") // Use first line of all files as header
  .option("inferSchema", "true") // Automatically infer data types
  .load("/tmp/jira/jira.com-2.csv")
df.show()
When the same CSV data is converted to a DataFrame in spark-shell, the shell prints the following WARN message:
scala> val df = sqlContext.read.format("csv").option("header", "true").option("inferSchema", "true").load("/tmp/jira/jira.com-2.csv")
17/02/17 00:09:12 WARN DataSource: Error while looking for metadata directory.
df: org.apache.spark.sql.DataFrame = [Summary: string, Issue key: string ... 24 more fields]
The message "WARN DataSource: Error while looking for metadata directory." is not shown in the output of the Zeppelin notebook. This warning is key to debugging the notebook failure.
Since Zeppelin does not print this message, it is hard for a user to debug the issue from Zeppelin.
I also tried sc.setLogLevel("DEBUG"), but the message was still not printed.
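Until the notebook surfaces the warning, one possible workaround is to search for the message in the interpreter's log file on the Zeppelin host. The following is only a sketch: it assumes the Spark interpreter writes its log output to a readable file, and the function name find_datasource_warnings is hypothetical. It matches on the message text rather than on the "WARN DataSource:" prefix, because the log layout in that file may differ from the spark-shell console layout.

```shell
#!/bin/sh
# Hypothetical helper (not part of Zeppelin or Spark): print matching lines,
# with their line numbers, from a log file.
#   $1 = path to the log file (an assumption; depends on the installation)
#   $2 = optional text to search for; defaults to the message from this report
find_datasource_warnings() {
    log_file="$1"
    pattern="${2:-Error while looking for metadata directory}"

    if [ ! -r "$log_file" ]; then
        echo "cannot read log file: $log_file" >&2
        return 2
    fi

    # -n prefixes each match with its line number, -F treats the pattern
    # as a fixed string instead of a regular expression.
    grep -n -F -- "$pattern" "$log_file"
}
```

The function would be called with the path to the Spark interpreter log, which depends on the installation. It returns 0 when at least one match is found, 1 when there is none, and 2 when the file cannot be read.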