SPARK-25377: Spark SQL DataFrame cache is invalid


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.3.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels:
    • Environment:

      spark version 2.3.0

      scala version 2.11.8

      Description

        When I use a Spark SQL DataFrame in my application, dataframe.cache appears to have no effect: the first action, such as count(), took about 40 seconds, and the second run of the same action took just as long. When I use dataframe.rdd.cache instead, the second execution is faster than the first. I think this is a bug in Spark SQL DataFrames.

         Below are my code and the console log. The result DataFrame had already been cached before this code ran.

       This is my code:

      logger.info("start to consuming result count")
      logger.info(s"consuming ${result.count} output records")
      //result.show(false)
      logger.info("starting go to MysqlSink")
      logger.info(s"consuming ${result.count} output records")
      logger.info("starting go to MysqlSink")

       

      The console log is below:

      18/09/08 14:15:17 INFO MySQLRiskScenarioRunner: start to consuming result count
      18/09/08 14:15:49 INFO MySQLRiskScenarioRunner: consuming 5 output records
      18/09/08 14:15:49 INFO MySQLRiskScenarioRunner: starting go to MysqlSink
      18/09/08 14:16:22 INFO MySQLRiskScenarioRunner: consuming 5 output records
      18/09/08 14:16:22 INFO MySQLRiskScenarioRunner: starting go to MysqlSink
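
      The report does not show how result is built or where cache is called, so the following is only a minimal sketch of one way to check whether a DataFrame cache is in effect. The object name CacheCheck, the helpers timed and isMarkedCached, the local master setting, and the stand-in result DataFrame are all hypothetical and not taken from the reporter's application. It relies on cache() being lazy: the call only marks the plan for caching, and the data is materialized by the first action.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.storage.StorageLevel

object CacheCheck {
  // Runs `body` once and returns its value with the elapsed time in milliseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val out = body
    (out, (System.nanoTime() - start) / 1000000L)
  }

  // True if this DataFrame has been marked for caching via cache()/persist().
  def isMarkedCached(df: DataFrame): Boolean =
    df.storageLevel != StorageLevel.NONE

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cache-check") // hypothetical application name
      .master("local[*]")     // assumption: local mode, for illustration only
      .getOrCreate()

    // Hypothetical stand-in for the `result` DataFrame in the report.
    val result = spark.range(0L, 5000000L)
      .selectExpr("id % 5 AS key", "id AS value")
      .groupBy("key")
      .count()

    result.cache() // lazy: nothing is computed or stored yet
    println(s"marked as cached: ${isMarkedCached(result)}")

    val (n1, ms1) = timed(result.count()) // first action materializes the cache
    val (n2, ms2) = timed(result.count()) // second action can read from the cache
    println(s"first count:  $n1 rows in $ms1 ms")
    println(s"second count: $n2 rows in $ms2 ms")

    // Prints the physical plan; an in-memory scan node in the output
    // suggests that the cached data is being used.
    result.explain()

    result.unpersist()
    spark.stop()
  }
}
```

      If the storage level is NONE, or the plan shows no in-memory scan, one possible explanation is that cache() was called on a different DataFrame than the one the action runs on. That cannot be confirmed from the code shown in this report.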

       

       

       

       

       

       
