Whirr
  1. Whirr
  2. WHIRR-246

Single place to store/load cluster state

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: None
    • Labels:
      None

      Description

      Right now cluster state is written and read from the ~/.whirr/cluster-name/instances file and logic to write and update it is spread between the Service and DestroyInstanceCommand (that I know of).

      Since for WHIRR-214 the file must be updated (in another class) and for WHIRR-238 an altogether different method of storing cluster state will be required I think it might be time to move the read/write state logic to its own file (or factory in the future).

      I'll attach a very preliminary patch just to get some feedback.

      1. WHIRR-246.patch
        7 kB
        David Alves
      2. WHIRR-246.patch
        11 kB
        David Alves
      3. WHIRR-246.patch
        17 kB
        Andrei Savu
      4. WHIRR-246.patch
        27 kB
        Andrei Savu

        Issue Links

          Activity

          Hide
          David Alves added a comment -

          A very preliminary patch without tests just to get some feedback

          Show
          David Alves added a comment - A very preliminary patch without tests just to get some feedback
          Hide
          Andrei Savu added a comment -

          I like the idea of moving the logic for state persistence in a single place. How about storing this info outside the local machine by using the BlobStore API and displaying it as output for ./bin/whirr list-cluster? Should we rename ClusterStore to ClusterState? (more self-explanatory to me)

          Show
          Andrei Savu added a comment - I like the idea of moving the logic for state persistence in a single place. How about storing this info outside the local machine by using the BlobStore API and displaying it as output for ./bin/whirr list-cluster? Should we rename ClusterStore to ClusterState? (more self-explanatory to me)
          Hide
          David Alves added a comment -

          I've been wrestling with names (started with ClusterStateStore), but ClusterState sounds good.

          My reasoning was that first we could start by abstracting state persistence and later on we could introduce a factory that would allow multiple implementations (local file, blobstore, the nodes themselves, etc.).

          What do you think? Simply use the blobstore as the only implementation of have multiple ones?

          Show
          David Alves added a comment - I've been wrestling with names (started with ClusterStateStore), but ClusterState sounds good. My reasoning was that first we could start by abstracting state persistence and later on we could introduce a factory that would allow multiple implementations (local file, blobstore, the nodes themselves, etc.). What do you think? Simply use the blobstore as the only implementation of have multiple ones?
          Hide
          Tom White added a comment -

          +1 to the direction. I think we'll always want to support local files, as not all compute clouds have associated blob stores, and because it's very convenient for short lived clusters to simply use the filesystem. So I think we'll need an abstraction to support local files and blob stores (and others).

          One idea that occurred to me was to implement this as a Service wrapper. Could this work/is it desirable?

          Show
          Tom White added a comment - +1 to the direction. I think we'll always want to support local files, as not all compute clouds have associated blob stores, and because it's very convenient for short lived clusters to simply use the filesystem. So I think we'll need an abstraction to support local files and blob stores (and others). One idea that occurred to me was to implement this as a Service wrapper. Could this work/is it desirable?
          Hide
          Adrian Cole added a comment -

          When you say local files, do you mean local to the machines in the cluster? If so, it is a sort of ssh-based blobstore, right? If you meant local files to the whirr call, blobstore has a persistent "filesystem" provider that can be used to avoid unnecessary code duplication.

          thoughts?

          Show
          Adrian Cole added a comment - When you say local files, do you mean local to the machines in the cluster? If so, it is a sort of ssh-based blobstore, right? If you meant local files to the whirr call, blobstore has a persistent "filesystem" provider that can be used to avoid unnecessary code duplication. thoughts?
          Hide
          Tom White added a comment -

          I meant local to the machine running Whirr, so yes, using the "filesystem" provider would be a simplification.

          Show
          Tom White added a comment - I meant local to the machine running Whirr, so yes, using the "filesystem" provider would be a simplification.
          Hide
          David Alves added a comment -

          When you say local files, do you mean local to the machines in the cluster?

          I'm pretty sure Tom meant local files as they are now (stored locally in the command-issuing machine).

          I envision at least three alternatives, storing them locally, storing them in a blobstore (e.g. S3) storing them in the nodes themselves (the ssh-based blob-store).

          I think we could start by introducing indirection and the local storage facilities and introduce blobstore and/or node storage later on when required.

          One idea that occurred to me was to implement this as a Service wrapper. Could this work/is it desirable?

          What do you mean by this Tom?

          Show
          David Alves added a comment - When you say local files, do you mean local to the machines in the cluster? I'm pretty sure Tom meant local files as they are now (stored locally in the command-issuing machine). I envision at least three alternatives, storing them locally, storing them in a blobstore (e.g. S3) storing them in the nodes themselves (the ssh-based blob-store). I think we could start by introducing indirection and the local storage facilities and introduce blobstore and/or node storage later on when required. One idea that occurred to me was to implement this as a Service wrapper. Could this work/is it desirable? What do you mean by this Tom?
          Hide
          David Alves added a comment - - edited

          Btw I just remembered something that might be relevant.

          When updating a cluster we will do so based on the current state, up to now using the local filesystem was not a problem because no decisions (that I know of) were made based on this file. This will change with WHIRR-214 and so a problem might arise where if the cluster is started on one machine and updated on another, update decisions might be based on a non-existing or stale state.

          Show
          David Alves added a comment - - edited Btw I just remembered something that might be relevant. When updating a cluster we will do so based on the current state, up to now using the local filesystem was not a problem because no decisions (that I know of) were made based on this file. This will change with WHIRR-214 and so a problem might arise where if the cluster is started on one machine and updated on another, update decisions might be based on a non-existing or stale state.
          Hide
          Adrian Cole added a comment -

          agreed. when there are 2 heads, local filesystem doesn't seem to help. However, in the case that there aren't, it should be ok.

          Show
          Adrian Cole added a comment - agreed. when there are 2 heads, local filesystem doesn't seem to help. However, in the case that there aren't, it should be ok.
          Hide
          David Alves added a comment -

          Another patch.

          • Changed names.
          • Introduced Factory.
          • Moved to o.a.w.cluster.

          Still preliminary

          Show
          David Alves added a comment - Another patch. Changed names. Introduced Factory. Moved to o.a.w.cluster. Still preliminary
          Hide
          David Alves added a comment -

          Another issue I just remembered is locking. StateStores should probably lock state pending updates so that no concurrent updates are performed on stale state.

          Show
          David Alves added a comment - Another issue I just remembered is locking. StateStores should probably lock state pending updates so that no concurrent updates are performed on stale state.
          Hide
          Tom White added a comment -

          What I meant by the Service wrapper was to have an implementation of service that takes a Service and persists its cluster state for it. This was just one way of implementing it, and I'm not wedded to the idea. Another thing to consider is that some users of the API may not need any persistence of state at all (e.g. when running a single VM test - like the integration tests), so we should make it optional, either by having a no-op ClusterState, or by not using a wrapper.

          Show
          Tom White added a comment - What I meant by the Service wrapper was to have an implementation of service that takes a Service and persists its cluster state for it. This was just one way of implementing it, and I'm not wedded to the idea. Another thing to consider is that some users of the API may not need any persistence of state at all (e.g. when running a single VM test - like the integration tests), so we should make it optional, either by having a no-op ClusterState, or by not using a wrapper.
          Hide
          Andrei Savu added a comment -

          Updated patch for the current trunk and tested on EC2. Next: create a blob store aware cluster state store.

          Show
          Andrei Savu added a comment - Updated patch for the current trunk and tested on EC2. Next: create a blob store aware cluster state store.
          Hide
          Tom White added a comment -

          I like this change. A few comments, mainly to do with naming and organization:

          • The new by ID instance predicates really belong in RolePredicates with the other ones.
          • I would add ClusterStateStore in the org.apache.whirr.service package. It seems odd having a org.apache.whirr.cluster without the Cluster class being in it.
          • Change ClusterStateStore to be an abstract class (like Service is) so we have more flexibility over evolving it in the future.
          • For consistency with ClusterControllerFactory, ClusterStateStoreFactory should not have static methods. This will make it easier to use the service loader pattern, which we'll need when there are more implementations of ClusterStateStore (to be done in later JIRA?).
          Show
          Tom White added a comment - I like this change. A few comments, mainly to do with naming and organization: The new by ID instance predicates really belong in RolePredicates with the other ones. I would add ClusterStateStore in the org.apache.whirr.service package. It seems odd having a org.apache.whirr.cluster without the Cluster class being in it. Change ClusterStateStore to be an abstract class (like Service is) so we have more flexibility over evolving it in the future. For consistency with ClusterControllerFactory, ClusterStateStoreFactory should not have static methods. This will make it easier to use the service loader pattern, which we'll need when there are more implementations of ClusterStateStore (to be done in later JIRA?).
          Hide
          Andrei Savu added a comment -

          Updated patch and added the missing test for WHIRR-173. I will open a new JIRA for adding blob store state persistence.

          Show
          Andrei Savu added a comment - Updated patch and added the missing test for WHIRR-173 . I will open a new JIRA for adding blob store state persistence.
          Hide
          Andrei Savu added a comment -

          Tom, did you also took a look at this code while reviewing WHIRR-289? (I want to commit to simplify patch merging)

          Show
          Andrei Savu added a comment - Tom, did you also took a look at this code while reviewing WHIRR-289 ? (I want to commit to simplify patch merging)
          Hide
          Tom White added a comment -

          Looks good. +1

          Couple of nits:

          • The withIds() method on RolePredicates has a method that takes an @Nullable argument, but doesn't check the case for it being null.
          • MemoryClusterStateStore might be useful in the main tree, e.g. for clients that use Whirr to run tests from a single JVM. But there's no mechanism to hook it up yet, so could be left until later.
          Show
          Tom White added a comment - Looks good. +1 Couple of nits: The withIds() method on RolePredicates has a method that takes an @Nullable argument, but doesn't check the case for it being null. MemoryClusterStateStore might be useful in the main tree, e.g. for clients that use Whirr to run tests from a single JVM. But there's no mechanism to hook it up yet, so could be left until later.
          Hide
          Andrei Savu added a comment -

          > The withIds() method on RolePredicates has a method that takes an @Nullable argument, but doesn't check the case for it being null.

          I will fix it before committing.

          > MemoryClusterStateStore might be useful in the main tree

          I agree. Let's handle this in WHIRR-288.

          Show
          Andrei Savu added a comment - > The withIds() method on RolePredicates has a method that takes an @Nullable argument, but doesn't check the case for it being null. I will fix it before committing. > MemoryClusterStateStore might be useful in the main tree I agree. Let's handle this in WHIRR-288 .
          Hide
          Andrei Savu added a comment -

          I've just committed this. Thanks Tom for reviewing.

          Show
          Andrei Savu added a comment - I've just committed this. Thanks Tom for reviewing.

            People

            • Assignee:
              Andrei Savu
              Reporter:
              David Alves
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development