Uploaded image for project: 'Mesos'
  1. Mesos
  2. MESOS-4997

Current approach to protobuf enums does not support upgrades.

    XMLWordPrintableJSON

Details

    • Epic
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • None
    • None
    • None
    • Enum Upgrade

    Description

      Some users were opting in to the recently introduced TASK_KILLING_STATE capability introduced in 0.28.0. When the scheduler ties to register with the TASK_KILLING_STATE capability against a 0.27.0 master, the master drops the message and prints the following:

      [libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message of type "mesos.scheduler.Call" because it is missing required fields: subscribe.framework_info.capabilities[0].type
      

      It turns out that our approach to enums in general does not allow for backwards compatibility. For example:

        message Capability {
          enum Type {
            REVOCABLE_RESOURCES = 1;
            TASK_KILLING_STATE = 2; // New!
          }
          required Type type = 1;
        }
      

      Using a required enum is problematic because protobuf will strip unknown enum values during de-serialization:

      https://developers.google.com/protocol-buffers/docs/proto#updating

      enum is compatible with int32, uint32, int64, and uint64 in terms of wire format (note that values will be truncated if they don't fit), but be aware that client code may treat them differently when the message is deserialized. Notably, unrecognized enum values are discarded when the message is deserialized, which makes the field's has.. accessor return false and its getter return the first value listed in the enum definition. However, an integer field will always preserve its value. Because of this, you need to be very careful when upgrading an integer to an enum in terms of receiving out of bounds enum values on the wire.

      The suggestion on the protobuf mailing list is to use optional enum fields and include an UNKNOWN value as the first entry in the enum list (and/or explicitly specifying it as the default):

      https://groups.google.com/forum/#!msg/protobuf/NhUjBfDyGmY/pf294zMi2bIJ

      The updated version of Capability would be:

        message Capability {
          enum Type {
            UNKNOWN = 0;
            REVOCABLE_RESOURCES = 1;
            TASK_KILLING_STATE = 2; // New!
          }
          optional Type type = 1;
        }
      

      Note that the first entry in an enum list is the default value, even if it's number is not the lowest (unless [default = <value>] is explicitly specified).


      Clarification on the intent of the epic: we hope to fix all required enum}}s in Mesos. When doing the change (incrementally or sweeping) we need to be careful to make sure {{switch statements are using when enumerating them to make sure we do not miss default UNKNOWN cases in Mesos logic and fix any logic that implicitly rely on a valid default value (change it to be explicit).

      Attachments

        Activity

          People

            Unassigned Unassigned
            bmahler Benjamin Mahler
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: