Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
None
Description
When creating placeholder pods, names are currently generated in the following format:
tg-{tgID:20}-{appID:28}-{placeholderIndex}
This has several drawbacks:
- Duplicates can occur if multiple applications share a common 28-character prefix, or if task group IDs within an app share a common 20-character prefix.
- It is theoretically possible to ask for enough placeholders to overflow the maximum label length of 63 characters.
- The task group ID is probably less significant than appID when attempting to locate placeholder pods in listings.
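The first two drawbacks can be reproduced with a minimal sketch of the current format. This is not the project's actual code: `oldPlaceholderName` and `firstN` are hypothetical helpers, the application and task group IDs are made up, and IDs are assumed to be ASCII.

```go
package main

import "fmt"

// firstN returns at most n bytes of s (IDs are assumed to be ASCII).
func firstN(s string, n int) string {
	if len(s) > n {
		return s[:n]
	}
	return s
}

// oldPlaceholderName is a hypothetical helper that mirrors the format
// tg-{tgID:20}-{appID:28}-{placeholderIndex}.
func oldPlaceholderName(tgID, appID string, index int64) string {
	return fmt.Sprintf("tg-%s-%s-%d", firstN(tgID, 20), firstN(appID, 28), index)
}

func main() {
	// Two made-up application IDs that differ only after character 28.
	a := oldPlaceholderName("executor", "spark-application-2024-01-01-aaaa", 1)
	b := oldPlaceholderName("executor", "spark-application-2024-01-01-bbbb", 1)
	fmt.Println(a == b) // true: both are tg-executor-spark-application-2024-01-01-1

	// Full-length IDs plus an 11-digit index exceed 63 characters.
	long := oldPlaceholderName("task-group-number-01", "spark-application-2024-01-01-aaaa", 10000000000)
	fmt.Println(len(long)) // 64
}
```

With both IDs at their maximum truncated length, the fixed part of the name takes 53 characters, so any index of 11 or more digits pushes the name past 63.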
If we introduce a random suffix instead of the placeholder index, we can ensure that we do not have name collisions or overflows:
tg-{appID:28}-{tgID:20}-{suffix:10}
This ensures a total length of at most 63 characters (3 + 28 + 1 + 20 + 1 + 10) and places the appID first in the name. If each character of the suffix is drawn from [a-z0-9], there are 36^10 (roughly 3.7 × 10^15) possible suffixes, which makes a collision extremely unlikely.
So instead of:
tg-driver-sparkapp-abcde-1
tg-executor-sparkapp-abcde-2
tg-executor-sparkapp-abcde-3
We would have:
tg-sparkapp-abcde-driver-38sh40fk58
tg-sparkapp-abcde-executor-2sg93hal23
tg-sparkapp-abcde-executor-a03xh5dl39
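The proposed format could be generated along the following lines. This is a sketch under stated assumptions, not the implementation that was merged: `placeholderName`, `randomSuffix`, and `truncate` are hypothetical names, IDs are assumed to be ASCII, and `crypto/rand` is just one possible source of randomness.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

const (
	maxAppIDLen = 28 // appID is truncated to 28 characters
	maxTGIDLen  = 20 // task group ID is truncated to 20 characters
	suffixLen   = 10 // length of the random suffix
	suffixChars = "abcdefghijklmnopqrstuvwxyz0123456789"
)

// truncate returns at most n bytes of s (IDs are assumed to be ASCII).
func truncate(s string, n int) string {
	if len(s) > n {
		return s[:n]
	}
	return s
}

// randomSuffix returns n characters drawn uniformly from suffixChars.
func randomSuffix(n int) (string, error) {
	buf := make([]byte, n)
	limit := big.NewInt(int64(len(suffixChars)))
	for i := range buf {
		idx, err := rand.Int(rand.Reader, limit)
		if err != nil {
			return "", err
		}
		buf[i] = suffixChars[idx.Int64()]
	}
	return string(buf), nil
}

// placeholderName is a hypothetical helper that builds
// tg-{appID:28}-{tgID:20}-{suffix:10}.
func placeholderName(appID, tgID string) (string, error) {
	suffix, err := randomSuffix(suffixLen)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("tg-%s-%s-%s",
		truncate(appID, maxAppIDLen), truncate(tgID, maxTGIDLen), suffix), nil
}

func main() {
	for _, tg := range []string{"driver", "executor", "executor"} {
		name, err := placeholderName("sparkapp-abcde", tg)
		if err != nil {
			panic(err)
		}
		fmt.Println(name, len(name)) // name never exceeds 63 characters
	}
}
```

Because the suffix has a fixed length, the name length no longer depends on how many placeholders are requested.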
We should also document that placeholders always have unique names starting with "tg-" and may contain portions of the application ID and task group ID, but that the specific format is subject to change.
Issue Links
- causes: YUNIKORN-2517 [Yunikorn] Incorrect Placeholder Count for Duplicate Task Groups in Gang scheduling (Open)
- relates to: YUNIKORN-1964 Fix missing GetPlaceholderNames function (Closed)
- links to