[SPARK-9921] Too many open files in Spark SQL - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: 1.5.0
Fix Version/s: 1.5.0
Component/s: SQL
Labels:
None
Environment:

OS X

Description

The data is a table with 300K rows and 16 columns. It covers a single year, so there are 12 months and 365 days, each with a roughly similar number of rows (each row is a scheduled flight).

The error is:

Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ",  : 
  Unable to retrieve JDBC result set for SELECT `year`, `month`, `flights`
FROM (select `year`, `month`, sum(`flights`) as `flights`
from (select `year`, `month`, `day`, count(*) as `flights`
from `flights`
group by `year`, `month`, `day`) as `_w21`
group by `year`, `month`) AS `_w22`
LIMIT 10 (org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 237.0 failed 1 times, most recent failure: Lost task 0.0 in stage 237.0 (TID 8634, localhost): java.io.FileNotFoundException: /user/hive/warehouse/flights/file11ce460c958e (Too many open files)
	at java.io.FileInputStream.open0(Native Method)
	at java.io.FileInputStream.open(FileInputStream.java:195)
	at java.io.FileInputStream.<init>(FileInputStream.java:138)
	at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.<init>(RawLocalFileSystem.java:103)
	at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:195)
	at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<i

As you can see, the query is not something one would easily write by hand, because it is computer generated, but it makes perfect sense: it is a count of flights by month. It could be done without the nested query, but that is not the point.
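To illustrate the remark that the nested query is not needed, here is a minimal sketch that runs both forms against an in-memory SQLite table and compares the results. It uses SQLite rather than Spark SQL, drops the `LIMIT 10`, and the sample rows and the helper name `monthly_counts` are made up for illustration; only the `year`, `month`, `day` columns and the shape of the query come from the report.

```python
import sqlite3

# Nested form, following the generated query in the report (LIMIT omitted).
NESTED = """
SELECT year, month, SUM(flights) AS flights
FROM (SELECT year, month, day, COUNT(*) AS flights
      FROM flights
      GROUP BY year, month, day) AS w21
GROUP BY year, month
ORDER BY year, month
"""

# Flat form: count rows per month directly.
FLAT = """
SELECT year, month, COUNT(*) AS flights
FROM flights
GROUP BY year, month
ORDER BY year, month
"""


def monthly_counts(rows):
    """Load (year, month, day) rows into SQLite and run both queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE flights (year INTEGER, month INTEGER, day INTEGER)")
    conn.executemany("INSERT INTO flights VALUES (?, ?, ?)", rows)
    nested = conn.execute(NESTED).fetchall()
    flat = conn.execute(FLAT).fetchall()
    conn.close()
    return nested, flat


# Hypothetical sample data: one tuple per scheduled flight.
sample = [(2013, 1, 1), (2013, 1, 1), (2013, 1, 2), (2013, 2, 1)]
nested, flat = monthly_counts(sample)
```

Both queries return the same per-month counts on this sample, so the failure reported here does not depend on anything the nested form computes differently.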

This query used to work on 1.4 but does not on 1.5. There has also been an OS upgrade to Yosemite in the meantime, so it is hard to separate the effects of the two. Following suggestions that the default system limits for open files are too low for Spark to work properly, I increased the hard and soft limits to 32k. For some reason, the error happens when Java has about 10250 open files, as reported by lsof. It is not clear to me where that limit is coming from. The total number of open files is 16k. If this is not a bug, I would like to ask what a safe number of allowed open files is and whether there are other configurations that need to be tuned.
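When checking whether raised limits actually reached a process, it can help to read the soft and hard limits from inside that process. The sketch below does this for a Python process using the standard `resource` module. It is only an illustration: it reports the limits of the Python interpreter that runs it, not of the JVM that Spark uses, and the helper names `open_file_limits` and `count_open_fds` are hypothetical.

```python
import os
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds():
    """Approximate the number of descriptors this process has open.

    Lists /dev/fd (OS X, Linux) or /proc/self/fd (Linux). The count includes
    the descriptor used for the listing itself. Returns None if neither
    directory exists.
    """
    for path in ("/dev/fd", "/proc/self/fd"):
        if os.path.isdir(path):
            return len(os.listdir(path))
    return None


soft, hard = open_file_limits()
print("soft limit:", soft, "hard limit:", hard, "open now:", count_open_fds())
```

If the soft limit printed here is lower than the value that was configured, the new setting has not reached this process, for example because the shell or service that started it was launched before the change.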

Issue Links

is duplicated by

SPARK-9827 Too many open files in TungstenExchange

Resolved

People

Assignee: Davies Liu

Reporter: Antonio Piccolboni

Votes: 0

Watchers: 1

Dates

Created: 12/Aug/15 23:19

Updated: 13/Aug/15 23:32

Resolved: 13/Aug/15 04:59