The KMS client appears to have no retry logic – at all. It's completely decoupled from the IPC retry logic. This matters whenever the KMS is unreachable for any reason, including but not limited to network connection issues, timeouts, the .
The major ramifications:
- Jobs may fail to submit, although Oozie's resubmit logic should mask it
- Non-Oozie launchers may experience higher failure rates if they do not already have retry logic
- Tasks reading EZ files will fail, though this will probably be masked by framework reattempts
- EZ file creation fails after creating a 0-length file – the client receives the EDEK in the create response, then fails when decrypting the EDEK
- Bulk hadoop fs copies, and maybe distcp, will fail prematurely
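
To illustrate the kind of handling that is missing, here is a minimal sketch of a bounded retry with exponential backoff around a KMS-style call. It is not the actual KMS client code or a proposed patch: the class name KmsRetrySketch, the method callWithRetry, and the parameters maxAttempts and baseDelayMs are all hypothetical, and it assumes that every IOException is transient and safe to retry, which a real fix would need to refine.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical sketch only; none of these names exist in Hadoop.
public class KmsRetrySketch {

    /**
     * Runs op, retrying on IOException up to maxAttempts times in total.
     * The delay doubles after each failed attempt, starting at baseDelayMs.
     * Assumption: every IOException is transient and safe to retry.
     */
    public static <T> T callWithRetry(Callable<T> op, int maxAttempts, long baseDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Back off before the next attempt: base, 2*base, 4*base, ...
                    Thread.sleep(baseDelayMs << (attempt - 1));
                }
            }
        }
        // All attempts failed; surface the most recent failure to the caller.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulated EDEK decryption that is unreachable for the first two calls.
        int[] calls = {0};
        String key = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("KMS unreachable (simulated)");
            }
            return "decrypted-key";
        }, 5, 10L);
        System.out.println(key + " after " + calls[0] + " attempts");
    }
}
```

With something like this around the EDEK decryption call, a brief KMS outage would delay the create or read rather than fail it. Whether such retries belong in the KMS client itself or should reuse the existing IPC retry policies is a design decision this sketch does not address.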