Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
This Jira tracks the effort to improve the interaction between Hadoop and Windows Server.
- Move away from an external process (winutils.exe) for native code:
  - Replace it with native Java APIs where they exist (e.g., for symlinks);
  - Replace the rest with something like JNI;
- Fix the build system to fully leverage cmake instead of msbuild;
- Possible other improvements:
  - Memory and handle leaks.
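As a rough illustration of the "native Java APIs" item, the sketch below creates a symbolic link with `java.nio.file.Files.createSymbolicLink` instead of spawning an external process. The class and method names (`SymlinkSketch`, `createSymlink`, `roundTrip`) are hypothetical and are not taken from the Hadoop code base. It also assumes the JVM is allowed to create symbolic links; on Windows the call may fail with an `IOException` if the process lacks the required privilege.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: names are illustrative, not part of Hadoop.
final class SymlinkSketch {

    /** Creates a symbolic link at {@code link} pointing to {@code target}, in-process. */
    static Path createSymlink(Path link, Path target) {
        try {
            // No child process is started; the JDK calls the OS API directly.
            return Files.createSymbolicLink(link, target);
        } catch (IOException e) {
            // Covers missing privileges and already-existing links, among others.
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a link in a temporary directory and checks that it points to the target. */
    static boolean roundTrip() {
        try {
            Path dir = Files.createTempDirectory("symlink-sketch");
            Path target = Files.createFile(dir.resolve("target.txt"));
            Path link = createSymlink(dir.resolve("link.txt"), target);
            return Files.isSymbolicLink(link)
                    && Files.readSymbolicLink(link).equals(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("symlink round trip ok: " + roundTrip());
    }
}
```

Whether this path is acceptable for Hadoop depends on how symlink privileges are configured on the target Windows hosts, which this sketch does not address.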
I did a quick investigation of the performance of WinUtils in YARN. On average, the NodeManager (NM) calls WinUtils 4.76 times per second and 65.51 times per container.
Operation | Requests | Requests/sec | Requests/min | Requests/container |
Sum [WinUtils] | 135354 | 4.761 | 286.160 | 65.51 |
[WinUtils] Execute -help | 4148 | 0.145 | 8.769 | 2.007 |
[WinUtils] Execute -ls | 2842 | 0.0999 | 6.008 | 1.37 |
[WinUtils] Execute -systeminfo | 9153 | 0.321 | 19.35 | 4.43 |
[WinUtils] Execute -symlink | 115096 | 4.048 | 243.33 | 57.37 |
[WinUtils] Execute -task isAlive | 4115 | 0.144 | 8.699 | 2.05 |
Interval: 7 hours, 53 minutes and 48 seconds
Each execution of WinUtils does around 140 IO ops, of which 130 are DDL ops.
This means 666.58 IO ops/second due to WinUtils.
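The 666.58 figure follows from the total request count, the measurement interval, and the roughly 140 IO operations per invocation quoted above. The short sketch below redoes that arithmetic. The class and method names are hypothetical, and the inputs are only the numbers reported in this issue.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper that redoes the arithmetic from the measurements above.
final class WinUtilsLoad {

    /** Average number of requests per second over the given interval. */
    static double requestsPerSecond(long requests, Duration interval) {
        return (double) requests / interval.getSeconds();
    }

    /** IO operations per second, given the IO operations per WinUtils execution. */
    static double ioOpsPerSecond(long requests, Duration interval, int ioOpsPerExecution) {
        return requestsPerSecond(requests, interval) * ioOpsPerExecution;
    }

    public static void main(String[] args) {
        long totalRequests = 135_354;  // "Sum [WinUtils]" row
        Duration interval = Duration.ofHours(7).plusMinutes(53).plusSeconds(48);
        int ioOpsPerExecution = 140;   // approximate figure from the description

        System.out.println(String.format(Locale.ROOT, "requests/sec: %.3f",
                requestsPerSecond(totalRequests, interval)));
        System.out.println(String.format(Locale.ROOT, "IO ops/sec:   %.2f",
                ioOpsPerSecond(totalRequests, interval, ioOpsPerExecution)));
    }
}
```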
We should start considering removing WinUtils from Hadoop and creating a JNI interface instead.
Attachments
Issue Links
- incorporates
  - HADOOP-17839 LocalFS to support ability to disable permission get/set; remove need for winutils (Open)
- is duplicated by
  - HDFS-13708 change Files instead of NativeIO (Open)