Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Incomplete
2.3.0
None
Description
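The following spark-shell transcript reproduces the problem: the time taken by each append to the Hive table grows from a few hundred milliseconds to more than 22 seconds as the loop runs.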
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0-SNAPSHOT
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_91)
Type in expressions to have them evaluated.
Type :help for more information.

scala> def printTimeTaken(str: String, f: () => Unit) {
     |   val start = System.nanoTime()
     |   f()
     |   val end = System.nanoTime()
     |   val timetaken = end - start
     |   import scala.concurrent.duration._
     |   println(s"Time taken for $str is ${timetaken.nanos.toMillis}\n")
     | }
printTimeTaken: (str: String, f: () => Unit)Unit

scala> for(i <- 1 to 100000) {printTimeTaken("time to append to hive:", () => { Seq(1, 2).toDF().write.mode("append").saveAsTable("t1"); })}
Time taken for time to append to hive: is 284
Time taken for time to append to hive: is 211
...
...
Time taken for time to append to hive: is 2615
...
Time taken for time to append to hive: is 3055
...
Time taken for time to append to hive: is 22425
....
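To quantify the slowdown rather than read it off the console, the timing helper could return the measured durations instead of printing them. The following is only a minimal sketch under that assumption: `timeTaken` and `collectTimings` are hypothetical names that do not appear in the original report, and the Spark write is passed in as an ordinary function so the helper itself has no Spark dependency.

```scala
import scala.concurrent.duration._

// Hypothetical helper (not from the original report):
// runs `f` once and returns the elapsed wall-clock time.
def timeTaken(f: () => Unit): FiniteDuration = {
  val start = System.nanoTime()
  f()
  (System.nanoTime() - start).nanos
}

// Hypothetical helper: runs `f` n times and returns each run's
// duration in milliseconds, in execution order.
def collectTimings(n: Int)(f: () => Unit): Vector[Long] =
  Vector.fill(n)(timeTaken(f).toMillis)
```

In a spark-shell session like the one above, something along the lines of `collectTimings(1000)(() => Seq(1, 2).toDF().write.mode("append").saveAsTable("t1"))` would yield a vector whose first and last entries can be compared directly.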
Why does it matter?
In a streaming job, where appends happen continuously, this steadily growing write time makes it impractical to append to Hive using this DataFrame operation.