[HADOOP-18396] Issues running in dynamic / managed environments - ASF JIRA

Details

Type: Improvement
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.4.0, 3.3.5, 3.3.4
Fix Version/s: None
Component/s: None
Labels:
None
Environment:

Running an HA configuration in Kubernetes, using Java 11.

Target Version/s:

3.3.9

Description

Running in dynamic or managed environments is a challenge because we can't assume that all services will have DNS entries, will be started in a specific order, will maintain constant IP addresses, etc. I'm using the following assumptions to guide the changes necessary to operate in this kind of environment:

The configuration files are an expression of desired state
If a referenced service instance is not resolvable or reachable at a given moment, it eventually will be, and it should be able to participate in the future as if it had been there originally, without requiring manual intervention
IP address changes should be handled in a way that not only allows distributed calls to continue to function, but also avoids having to re-resolve the address over and over
Code that requires resolved names (Kerberos and DataNode registration) should fall back to DNS reverse lookups to work around temporary issues caused by caching. Example: The DataNode registration is only performed at startup, and yet the extra check that allows it to succeed in registering with the NameNode isn’t performed
If an HA system is supposed to only require a quorum, then we shouldn’t require the full set, allowing the called service to bring the remaining instances into compliance
Managing a service should be independent of other services. Example: You should be able to perform a rolling restart of JournalNodes without worrying about causing an issue with NameNodes as long as a quorum is present.
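
The assumptions about unresolved hosts and changing IP addresses could be sketched in Java roughly as follows. This is only an illustration, not code from any of the linked patches: the class `ReResolvingAddress`, its methods, and the injected `resolver` function are hypothetical names chosen for the example. It assumes a peer is kept even when it cannot be resolved at startup, is re-resolved only while unresolved or after a connection failure, and that a changed address is reported once.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical helper (not a Hadoop class): keeps a configured peer and re-resolves it on demand. */
class ReResolvingAddress {
  private final String host;
  private final int port;
  // Injected lookup so that tests can supply their own name resolution (an assumption of this sketch).
  private final Function<String, Optional<InetAddress>> resolver;
  private InetSocketAddress current;
  private int changesLogged = 0;

  ReResolvingAddress(String host, int port, Function<String, Optional<InetAddress>> resolver) {
    this.host = host;
    this.port = port;
    this.resolver = resolver;
    // An unresolvable peer is kept as an unresolved address instead of being dropped.
    this.current = lookup();
  }

  private InetSocketAddress lookup() {
    return resolver.apply(host)
        .map(ip -> new InetSocketAddress(ip, port))
        .orElseGet(() -> InetSocketAddress.createUnresolved(host, port));
  }

  /** Returns the cached address; a new lookup happens only while the peer is still unresolved. */
  synchronized InetSocketAddress get() {
    if (current.isUnresolved()) {
      current = lookup();
    }
    return current;
  }

  /** Call after a connection failure; returns true (and logs once) only if the address changed. */
  synchronized boolean refreshAfterFailure() {
    InetSocketAddress fresh = lookup();
    if (fresh.isUnresolved() || fresh.equals(current)) {
      return false;
    }
    System.out.println("Address for " + host + " changed: " + current + " -> " + fresh);
    current = fresh;
    changesLogged++;
    return true;
  }

  synchronized int changesLogged() {
    return changesLogged;
  }
}
```

A caller would use `get()` on the normal path and `refreshAfterFailure()` only when a connection attempt fails, so a stable address is never looked up repeatedly. Injecting the resolver is a design choice made here to keep the sketch testable without real DNS.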

A proof of these concepts would be the ability to:

Start with fewer than the full replica count of a service: as long as the required quorum or minimum count is available, the cluster should still start and function. Example: 2 out of 3 configured JournalNodes should still allow the NameNode to format, function, roll over to the standby, etc.
Introduce missing instances, which should join the existing cluster without manual intervention. Example: Starting the 3rd JournalNode should cause it to be automatically formatted and brought up to date
Perform rolling restarts of individual services without negatively impacting other services (causing failures, restarts, etc.). Example: Rolling restarts of JournalNodes shouldn't cause problems in NameNodes; Rolling restarts of NameNodes shouldn't cause problems with DataNodes
Report an updated IP address in the logs only once (per dependent), avoiding costly re-resolution
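
The "2 out of 3 configured JournalNodes" example comes down to a majority check. A minimal sketch in Java, assuming the quorum is a simple majority of the configured instances; `QuorumSketch` and its methods are hypothetical names for illustration and are not Hadoop's actual quorum code:

```java
/** Hypothetical sketch (not Hadoop code): majority-quorum arithmetic for a configured set of instances. */
class QuorumSketch {
  /** Smallest number of instances that forms a majority of the configured set. */
  static int majority(int configured) {
    return configured / 2 + 1;
  }

  /** True if enough configured instances are currently responding to proceed. */
  static boolean hasQuorum(int configured, int responding) {
    if (configured <= 0 || responding < 0 || responding > configured) {
      throw new IllegalArgumentException("invalid instance counts");
    }
    return responding >= majority(configured);
  }
}
```

Under this assumption, `hasQuorum(3, 2)` is true and `hasQuorum(3, 1)` is false, which matches the expectation that a NameNode should be able to format and run with 2 of 3 JournalNodes and let the third catch up later.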

Issue Links

is a parent of

HDFS-4043 Namenode Kerberos Login does not use proper hostname for host qualified hdfs principal name.

Resolved

HDFS-16685 DataNode registration fails because getHostName returns an IP address

In Progress

HDFS-16688 Unresolved Hosts during startup are not synced by JournalNodes

In Progress

HDFS-16691 Use quorum instead of requiring full JN set for NN format

In Progress

HADOOP-18365 Updated addresses are still accessed using the old IP address

Resolved

HDFS-16684 Exclude self from JournalNodeSyncer when using a bind host

Resolved

HDFS-16686 GetJournalEditServlet fails to authorize valid Kerberos request

Resolved

HDFS-16690 Automatically format new unformatted JournalNodes using JournalNodeSyncer

Resolved

Activity

Steve Loughran added a comment - 09/Aug/22 11:08

funny
https://www.slideshare.net/steve_l/farming-hadoop-inthecloud

Steve Vaughan added a comment - 09/Aug/22 11:41

stevel@apache.org Did you have any comments about the individual changes? Even in environments intended to be static, there are circumstances where unplanned changes are required (e.g. hardware failures). In addition, having servers silently ignoring configuration (dropping unresolved servers) because of a hiccup during startup can lead to unexpected behaviors.

Nick Dimiduk added a comment - 09/Aug/22 17:39

stevel@apache.org your slide "Hadoop's Assumptions" looks like a nice list of milestones. Why didn't you file this in 2010? ::smile::

Steve Loughran added a comment - 09/Aug/22 17:43

"Why didn't you file this in 2010?"

nobody cared at that point

Steve Loughran added a comment - 10/Aug/22 10:23

oh, and the whole set of YARN service lifecycle classes is a simplification of the SmartFrog distributed component architecture, where you push out a declarative spec of the components to deploy and where to bind them, and the system ensures that the requirements are met. as well as predefined config options, e.g. ports, components could publish their own state, which could then be lazily evaluated by others

https://dl.acm.org/doi/10.1145/1496909.1496915

while that project is dead, my copy of the source is all there: https://github.com/steveloughran/smartfrog

I'm not going to advocate adoption, as in a k8s-first world the units of deployment are now containers, not processes and/or components in processes. but the whole dynamic deployment problem is still there

Steve Loughran added a comment - 10/Aug/22 10:37

Looking at the list: as well as the need to cope with moving IP addresses, the sole bit of the stack that supports dynamic discovery today is based on ZooKeeper. Note that the YARN registry is designed to support dynamic service discovery but, again, it is ZK-based. It would be interesting to see if the same registry/lookup mechanisms could be supported by other back ends such as DynamoDB. Remember, the registry itself can support DNS lookup, so it would be a matter of changing what it binds to. (oh look! someone has coded in a lot of the dynamicness people need in cloud and nobody else has noticed!). It might be interesting to see if the lower-level ZK APIs, directly or through Curator, would support a back end which worked with persistent cloud databases.

Steve Vaughan added a comment - 10/Aug/22 14:26

I noticed some of those enhancements last night, and was thinking through how these changes could be updated accordingly. Given the investment teams have in existing configuration controls, I think it would aid in adoption if they were able to take advantage of the dynamic updates without committing to changes in configuration controls. This would provide an easier migration path.

I'll look to update the name-based patches today to take advantage of the naming abstraction (~~HDFS-4043~~ and HDFS-16685). The changes will also make unit testing easier since I'll be able to provide a test-specific lookup mechanism.

I believe that several of the other patches will be unaffected (~~HADOOP-18365~~, ~~HDFS-16684~~, ~~HDFS-16686~~, and HDFS-16688), since they don't directly address how lookups are performed.

Nick Dimiduk added a comment - 20/Oct/22 10:24

We had a cluster trip on the old IP address issue: after a maintenance incident, the DataNode is listed by its old IP in the dead servers list.

People

Assignee: Steve Vaughan

Reporter: Steve Vaughan

Votes: 0

Watchers: 6

Dates

Created: 08/Aug/22 18:54

Updated: 26/Feb/23 17:58

Hadoop Common