Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
If parfor does not consume all the available parallelism, we propagate the remaining parallelism down to individual operations with slight (max 50%) over-provisioning. For example, with 80 vcores and parfor assigned k=47, we still assign k=2 to individual operations, which allows up to 47 x 2 = 94 threads on 80 vcores.
However, with native DNN operations this causes JVM crashes as follows:
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGFPE (0x8) at pc=0x00007f5de21902d6, pid=335027, tid=0x00007f5df8bcb700
#
# JRE version: OpenJDK Runtime Environment (8.0_161-b14) (build 1.8.0_161-b14)
# Java VM: OpenJDK 64-Bit Server VM (25.161-b14 mixed mode linux-amd64 )
# Problematic frame:
# C  [libmkl_avx512.so+0x206d2d6][thread 140041622857472 also had an error]  mkl_dnn_avx512_bkdGemmDirectConv_F64+0x276
Hence, when native BLAS or DNN libraries are loaded, we should be more conservative and not over-provision at all.
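The two policies can be contrasted in a minimal Java sketch. The class and method names (OpParallelism, perOperationK) are hypothetical and not taken from the code base. The sketch assumes the over-provisioning comes from rounding the ratio vcores/k to the nearest integer, which reproduces the k=2 from the example above; the actual computation may differ. The conservative variant rounds down, so parfor workers times per-operation threads never exceeds the available vcores.

```java
// Hypothetical sketch; names and rounding rules are assumptions for illustration.
final class OpParallelism {

    /**
     * Degree of parallelism for individual operations inside a parfor body.
     *
     * @param vcores       available virtual cores
     * @param parforK      degree of parallelism assigned to parfor
     * @param nativeLoaded true if a native BLAS or DNN library is loaded
     */
    static int perOperationK(int vcores, int parforK, boolean nativeLoaded) {
        if (vcores < 1 || parforK < 1) {
            throw new IllegalArgumentException("vcores and parforK must be positive");
        }
        double ratio = (double) vcores / parforK;
        // Assumed default: round to nearest, which may slightly over-provision.
        // With native libraries: round down, so parforK * k <= vcores whenever k > 1.
        int k = nativeLoaded ? (int) Math.floor(ratio) : (int) Math.round(ratio);
        // Every operation needs at least one thread.
        return Math.max(1, k);
    }

    public static void main(String[] args) {
        System.out.println(perOperationK(80, 47, false)); // over-provisioning policy
        System.out.println(perOperationK(80, 47, true));  // conservative policy
    }
}
```

For the example in the description (80 vcores, parfor k=47), the first call yields 2 threads per operation and the second yields 1.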