Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Incomplete
- Affects Version/s: 2.2.1
- Fix Version/s: None
Description
The records count that FileInputDStream reports to the Streaming UI should not be hard-coded to the default value of 0; it should be the total number of rows in the newly found files.
In FileInputDStream.scala:
{code:scala}
val inputInfo = StreamInputInfo(id, 0, metadata) // numRecords is set to the default value of 0
ssc.scheduler.inputInfoTracker.reportInfo(validTime, inputInfo)

case class StreamInputInfo(
    inputStreamId: Int,
    numRecords: Long,
    metadata: Map[String, Any] = Map.empty)
{code}
In DirectKafkaInputDStream.scala:
{code:scala}
val inputInfo = StreamInputInfo(id, rdd.count, metadata) // numRecords is set to rdd.count
ssc.scheduler.inputInfoTracker.reportInfo(validTime, inputInfo)

case class StreamInputInfo(
    inputStreamId: Int,
    numRecords: Long,
    metadata: Map[String, Any] = Map.empty)
{code}
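A minimal sketch of the kind of change being requested, not an actual patch: report the row count of the new files instead of the hard-coded 0, analogous to what DirectKafkaInputDStream does with rdd.count. Note one caveat this sketch glosses over: Kafka can compute numRecords cheaply from offset ranges, whereas counting file rows forces the new files to be read.

{code:scala}
// Hypothetical sketch inside FileInputDStream.compute (names follow the
// snippets above; filesToRDD is assumed to produce the RDD for the new files).
val rdd = filesToRDD(newFiles)
// Assumption: an eager count() is acceptable here, even though it triggers
// a read of the new files before the batch job itself runs.
val numRecords = rdd.count()
val inputInfo = StreamInputInfo(id, numRecords, metadata)
ssc.scheduler.inputInfoTracker.reportInfo(validTime, inputInfo)
{code}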
Test method:
./bin/spark-submit --class org.apache.spark.examples.streaming.HdfsWordCount examples/jars/spark-examples_2.11-2.4.0-SNAPSHOT.jar /spark/tmp/