Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 4.0.0
-
None
-
None
-
None
Description
IMPALA-9400 added initial support for Ozone (o3fs/ofs) by treating all Ozone I/O as remote, which is a valid assumption.
However, Impala's internal logic assigns the I/O to a single local disk I/O thread, severely limiting I/O parallelism. This is evident when running the debug build, which fails at the following check:
Log file created at: 2022/07/18 18:15:02
Running on machine: rhel05.ozone.cisco.local
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
F0718 18:15:02.269232 105827 disk-io-mgr.cc:561] 004a5f3dd4a34435:f2e0cb9b00000030] Check failed: !IsOzonePath(file)
F0718 18:15:02.269235 105832 disk-io-mgr.cc:561] 004a5f3dd4a34435:f2e0cb9b00000003] Check failed: !IsOzonePath(file)
F0718 18:15:02.269273 105834 disk-io-mgr.cc:561] 004a5f3dd4a34435:f2e0cb9b00000014] Check failed: !IsOzonePath(file)
F0718 18:15:02.269235 105832 disk-io-mgr.cc:561] 004a5f3dd4a34435:f2e0cb9b00000003] Check failed: !IsOzonePath(file)
F0718 18:15:02.269273 105834 disk-io-mgr.cc:561] 004a5f3dd4a34435:f2e0cb9b00000014] Check failed: !IsOzonePath(file)
The is_remote parameter of a scan range is always false for Ozone:
TScanRangeParams {
  01: scan_range (struct) = TScanRange {
    01: hdfs_file_split (struct) = THdfsFileSplit {
      01: relative_path (string) = "base_1/2b4595d335caddf5-4c9efd320000000e_1114488364_data.0.parq",
      02: offset (i64) = 0,
      03: length (i64) = 15982993,
      04: partition_id (i64) = 172,
      05: file_length (i64) = 15982993,
      06: file_compression (i32) = 0,
      07: mtime (i64) = 1657968669139,
      08: is_erasure_coded (bool) = false,
      09: partition_path_hash (i32) = -343315716,
    },
  },
  02: volume_id (i32) = 65535,
  03: try_hdfs_cache (bool) = false,
  04: is_remote (bool) = false,
Because Ozone does not yet have short-circuit read support, I think a quick fix is to always force Ozone I/O onto the remote I/O thread group assigned to it.
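To make the proposal concrete, here is a minimal sketch of the routing I have in mind. It is not the actual disk-io-mgr.cc code: QueueLayout, IsOzonePathSketch and AssignQueueSketch are hypothetical names, and the queue numbering (local disks first, then one remote Ozone queue, then one queue for other remote filesystems) is an assumption made only for illustration. The point is that the Ozone check happens before the is_remote hint is consulted, so a scan range that arrives with is_remote = false still lands on the remote Ozone queue.

```cpp
#include <cassert>
#include <string>

// Hypothetical queue layout (an assumption for this sketch): local disk
// queues occupy [0, num_local_disks), followed by one queue for remote Ozone
// reads and one queue for all other remote filesystems.
struct QueueLayout {
  int num_local_disks;
  int RemoteOzoneQueue() const { return num_local_disks; }
  int RemoteOtherQueue() const { return num_local_disks + 1; }
};

// Returns true if the path starts with an Ozone scheme (o3fs:// or ofs://).
bool IsOzonePathSketch(const std::string& path) {
  return path.rfind("o3fs://", 0) == 0 || path.rfind("ofs://", 0) == 0;
}

// Picks a queue for a scan range. Ozone paths are checked first, so they are
// routed to the remote Ozone queue regardless of the is_remote hint.
int AssignQueueSketch(const QueueLayout& layout, const std::string& path,
                      int volume_id, bool is_remote) {
  assert(layout.num_local_disks > 0);
  if (IsOzonePathSketch(path)) return layout.RemoteOzoneQueue();
  if (is_remote) return layout.RemoteOtherQueue();
  // Local read: map a known volume id onto a local disk queue, and fall back
  // to queue 0 when the volume id is unknown (negative).
  if (volume_id < 0) return 0;
  return volume_id % layout.num_local_disks;
}
```

With four local disks in this sketch, an ofs:// path with is_remote = false maps to queue 4 (the remote Ozone queue) instead of one of the local disk queues 0 to 3.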
Attachments
Issue Links
- is caused by: IMPALA-9400 Impala Ozone Support (Resolved)
- is fixed by: IMPALA-11457 Ozone parallelism reduced when backends are co-located (Resolved)
- is related to: IMPALA-11457 Ozone parallelism reduced when backends are co-located (Resolved)
- relates to: IMPALA-10375 Lock down which filesystem types use the file handle cache (Resolved)