Spark / SPARK-18700

getCached in HiveMetastoreCatalog is not thread-safe, causing driver OOM


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.1, 2.0.0, 2.1.1
    • Fix Version/s: 2.0.3, 2.1.1, 2.2.0
    • Component/s: SQL
    • Labels: None

    Description

      On our Spark SQL platform, every query shares the same HiveContext and runs in its own thread. New data is appended to tables as new partitions every 30 minutes. After a new partition is added to table T, we must call refreshTable to clear T's entry in cachedDataSourceTables so that the new partition becomes queryable.
      For a table with many partitions and files (far more than spark.sql.sources.parallelPartitionDiscovery.threshold), the next query on table T starts a job to fetch every FileStatus in the listLeafFiles function. Because of the huge number of files, that job runs for several seconds. During that time, every new query on table T also starts its own job to fetch FileStatus, because getCached is not thread-safe. The accumulated jobs eventually cause a driver OOM.
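
      The race described above is a check-then-load pattern on a shared cache. Below is a minimal Java sketch of one way to serialize the load per table. It is not Spark's code and not necessarily how the issue was fixed. TableListingCache, its loader function and simulateConcurrentQueries are hypothetical names invented for illustration; only getCached and refreshTable echo names from the report. It relies on ConcurrentHashMap.computeIfAbsent, which applies the mapping function at most once per absent key.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical stand-in for a per-table file-listing cache; not Spark's actual code. */
final class TableListingCache {
    private final ConcurrentHashMap<String, List<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, List<String>> loader; // assumed expensive (file listing)
    private final AtomicInteger loadCount = new AtomicInteger();

    TableListingCache(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    /** Returns the cached listing, running the loader at most once per absent table. */
    List<String> getCached(String table) {
        // Concurrent callers for the same absent key wait for the single
        // in-flight load instead of each starting their own.
        return cache.computeIfAbsent(table, t -> {
            loadCount.incrementAndGet();
            return loader.apply(t);
        });
    }

    /** Drops the cached entry so the next lookup lists the table again. */
    void refreshTable(String table) {
        cache.remove(table);
    }

    int loadCount() {
        return loadCount.get();
    }

    /** Fires concurrent lookups for one table and returns how many listings ran. */
    static int simulateConcurrentQueries(int threads) {
        TableListingCache c = new TableListingCache(t -> {
            try {
                Thread.sleep(200); // simulate a listing job that takes a while
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return Arrays.asList(t + "/part-0", t + "/part-1");
        });
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                try {
                    start.await(); // release all simulated queries together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                c.getCached("T");
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return c.loadCount();
    }

    public static void main(String[] args) {
        System.out.println("listings run: " + simulateConcurrentQueries(8));
    }
}
```

      One caveat of this sketch: computeIfAbsent can block updates to other keys that hash to the same bin while the loader runs, so for long-running loads a per-table lock or a dedicated loading cache may be a better fit.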


          People

            Assignee: XuanYuan Yuanjian Li
            Reporter: XuanYuan Yuanjian Li
            Votes: 0
            Watchers: 5

            Dates

              Created:
              Updated:
              Resolved:
