SPARK-9921: Too many open files in Spark SQL

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.5.0
    • Component/s: SQL
    • Labels: None
    • Environment: OS X

      Description

      The data is a table with 300K rows and 16 columns covering a single year, so there are 12 months and 365 days, each with a roughly similar number of rows (each row is a scheduled flight).

      The error is:

      Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ",  : 
        Unable to retrieve JDBC result set for SELECT `year`, `month`, `flights`
      FROM (select `year`, `month`, sum(`flights`) as `flights`
      from (select `year`, `month`, `day`, count(*) as `flights`
      from `flights`
      group by `year`, `month`, `day`) as `_w21`
      group by `year`, `month`) AS `_w22`
      LIMIT 10 (org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 237.0 failed 1 times, most recent failure: Lost task 0.0 in stage 237.0 (TID 8634, localhost): java.io.FileNotFoundException: /user/hive/warehouse/flights/file11ce460c958e (Too many open files)
      	at java.io.FileInputStream.open0(Native Method)
      	at java.io.FileInputStream.open(FileInputStream.java:195)
      	at java.io.FileInputStream.<init>(FileInputStream.java:138)
      	at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.<init>(RawLocalFileSystem.java:103)
      	at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:195)
      	at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<i
      
      

      As you can see, the query is not something one would easily write by hand, because it is computer generated, but it makes perfect sense: it is a count of flights by month. It could be done without the nested query, but that's not the point.
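
      For illustration, the equivalence between the nested query and a flat one can be checked on toy data. This is a minimal sketch that uses an in-memory SQLite database as a stand-in for Spark SQL; the four sample rows and the alias `per_day` are made up for the example and are not taken from the real table.

```python
import sqlite3

# Hypothetical toy data: one (year, month, day) tuple per scheduled flight.
rows = [(2013, 1, 1), (2013, 1, 1), (2013, 1, 2), (2013, 2, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (year INTEGER, month INTEGER, day INTEGER)")
conn.executemany("INSERT INTO flights VALUES (?, ?, ?)", rows)

# Same shape as the generated query: count per day, then sum per month.
nested = """
SELECT year, month, SUM(flights) AS flights
FROM (SELECT year, month, day, COUNT(*) AS flights
      FROM flights
      GROUP BY year, month, day) AS per_day
GROUP BY year, month
ORDER BY year, month
"""

# Hand-written equivalent: count per month directly.
flat = """
SELECT year, month, COUNT(*) AS flights
FROM flights
GROUP BY year, month
ORDER BY year, month
"""

nested_result = conn.execute(nested).fetchall()
flat_result = conn.execute(flat).fetchall()
print(nested_result)
print(flat_result)
```

      Both queries return the same per-month counts on this sample. Whether the flat form also avoids the error in Spark 1.5 is not something this sketch can show.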

      This query used to work on 1.4 but doesn't on 1.5. There has also been an OS upgrade to Yosemite in the meantime, so it's hard to separate the effects of the two. Following suggestions that the default system limits for open files are too low for Spark to work properly, I increased the hard and soft limits to 32k. For some reason, the error happens when Java has about 10250 open files, as reported by lsof. It is not clear to me where that limit is coming from. The total number of open files is 16k. If this is not a bug, I would like to ask what a safe number of allowed open files is and whether there are other configurations that need to be tuned.
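
      As a rough way to see which limits a process actually runs with, the soft and hard open-file limits can be read from inside the process. This is a minimal sketch using Python's standard `resource` module on a Unix-like system; the helper names `fd_limits` and `count_open_fds` are made up for the example. It reports the limits of the Python process itself, which are inherited from the launching shell. The JVM running Spark may end up with different effective limits, so its descriptors still have to be counted externally, for example with lsof as above.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds():
    """Count this process's open descriptors, or return None if unsupported.

    /dev/fd exists on OS X and Linux; /proc/self/fd is Linux only.
    Listing the directory briefly opens one extra descriptor itself.
    """
    for path in ("/dev/fd", "/proc/self/fd"):
        if os.path.isdir(path):
            return len(os.listdir(path))
    return None


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit:", soft)
    print("hard limit:", hard)
    print("open descriptors:", count_open_fds())
```

      Comparing the soft limit printed here with the descriptor count at which the failure occurs can help tell whether the shell-level setting is the one actually in effect.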

            People

            • Assignee: Davies Liu (davies)
            • Reporter: Antonio Piccolboni (piccolbo)
