Mesos / MESOS-3771

Mesos JSON API creates invalid JSON due to lack of binary data / non-ASCII handling

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.24.1, 0.26.0
    • Fix Version/s: 0.26.0
    • Component/s: HTTP API
    • Sprint: Mesosphere Sprint 21
    • Story Points: 2

    Description

      Spark encodes some binary data into the ExecutorInfo.data field. This field is sent as a "bytes" Protobuf value, which can contain arbitrary non-UTF-8 data.

      If you have such a field, it seems that it is splatted out into JSON without any regard for proper character encoding:

      0006b0b0  2e 73 70 61 72 6b 2e 65  78 65 63 75 74 6f 72 2e  |.spark.executor.|
      0006b0c0  4d 65 73 6f 73 45 78 65  63 75 74 6f 72 42 61 63  |MesosExecutorBac|
      0006b0d0  6b 65 6e 64 22 7d 2c 22  64 61 74 61 22 3a 22 ac  |kend"},"data":".|
      0006b0e0  ed 5c 75 30 30 30 30 5c  75 30 30 30 35 75 72 5c  |.\u0000\u0005ur\|
      0006b0f0  75 30 30 30 30 5c 75 30  30 30 66 5b 4c 73 63 61  |u0000\u000f[Lsca|
      0006b100  6c 61 2e 54 75 70 6c 65  32 3b 2e cc 5c 75 30 30  |la.Tuple2;..\u00|
      

      I suspect this is because the HTTP api emits the executorInfo.data directly:

      JSON::Object model(const ExecutorInfo& executorInfo)
      {
        JSON::Object object;
        object.values["executor_id"] = executorInfo.executor_id().value();
        object.values["name"] = executorInfo.name();
        object.values["data"] = executorInfo.data();
        object.values["framework_id"] = executorInfo.framework_id().value();
        object.values["command"] = model(executorInfo.command());
        object.values["resources"] = model(executorInfo.resources());
        return object;
      }
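
      One possible direction (an assumption on my part, not necessarily what the eventual fix does) is to encode the bytes before they reach the JSON layer, for example as base64, so that only ASCII is ever written into the "data" member. A self-contained sketch follows; base64Encode is a hypothetical standalone function written for illustration, not a stout API.

```cpp
#include <cstddef>
#include <string>

// Hypothetical standalone encoder (not a Mesos/stout API): standard
// base64 alphabet with '=' padding. The output is pure ASCII and contains
// no characters that need escaping inside a JSON string.
std::string base64Encode(const std::string& in)
{
  static const char table[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

  std::string out;
  out.reserve(((in.size() + 2) / 3) * 4);

  std::size_t i = 0;

  // Encode each full group of 3 input bytes as 4 output characters.
  for (; i + 2 < in.size(); i += 3) {
    const unsigned long v =
      (static_cast<unsigned long>(static_cast<unsigned char>(in[i])) << 16) |
      (static_cast<unsigned long>(static_cast<unsigned char>(in[i + 1])) << 8) |
      static_cast<unsigned long>(static_cast<unsigned char>(in[i + 2]));
    out += table[(v >> 18) & 0x3F];
    out += table[(v >> 12) & 0x3F];
    out += table[(v >> 6) & 0x3F];
    out += table[v & 0x3F];
  }

  // Encode the 1 or 2 leftover bytes, padding the group with '='.
  const std::size_t rem = in.size() - i;
  if (rem == 1) {
    const unsigned long v =
      static_cast<unsigned long>(static_cast<unsigned char>(in[i])) << 16;
    out += table[(v >> 18) & 0x3F];
    out += table[(v >> 12) & 0x3F];
    out += "==";
  } else if (rem == 2) {
    const unsigned long v =
      (static_cast<unsigned long>(static_cast<unsigned char>(in[i])) << 16) |
      (static_cast<unsigned long>(static_cast<unsigned char>(in[i + 1])) << 8);
    out += table[(v >> 18) & 0x3F];
    out += table[(v >> 12) & 0x3F];
    out += table[(v >> 6) & 0x3F];
    out += '=';
  }

  return out;
}
```

      With something like this, the model() function above would assign base64Encode(executorInfo.data()) instead of the raw bytes. The trade-off is that anything consuming the endpoint has to know to decode the field.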
      

      I think this may be because the custom JSON processing library in stout seems to not have any idea of what a byte array is. I'm guessing that some implicit conversion makes it get written as a String instead, but:

      inline std::ostream& operator<<(std::ostream& out, const String& string)
      {
        // TODO(benh): This escaping DOES NOT handle unicode, it encodes as ASCII.
        // See RFC4627 for the JSON string specificiation.
        return out << picojson::value(string.value).serialize();
      }
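
      An alternative would be to fix this at the serialization layer by escaping every byte outside printable ASCII. The sketch below treats each byte as a code point in the range U+0000 to U+00FF (a Latin-1 style mapping, which is my assumption and not something the current code does) and writes it as \u00XX. escapeBytes is a hypothetical function for illustration and is not the stout or picojson implementation.

```cpp
#include <cstdio>
#include <string>

// Hypothetical escaper (not the stout/picojson implementation): writes
// `in` as a JSON string literal containing only printable ASCII.
// ASSUMPTION: each byte is mapped to the code point with the same value
// (U+0000..U+00FF), so a consumer must reverse that mapping to recover
// the original bytes.
std::string escapeBytes(const std::string& in)
{
  std::string out = "\"";

  for (const char ch : in) {
    const unsigned char c = static_cast<unsigned char>(ch);

    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\b': out += "\\b";  break;
      case '\f': out += "\\f";  break;
      case '\n': out += "\\n";  break;
      case '\r': out += "\\r";  break;
      case '\t': out += "\\t";  break;
      default:
        if (c < 0x20 || c > 0x7E) {
          // Control characters, DEL, and all bytes >= 0x80 become \u00XX.
          char buf[7];
          std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned int>(c));
          out += buf;
        } else {
          out += ch;
        }
    }
  }

  out += '"';
  return out;
}
```

      For the bytes at the start of the dump (ac ed 00 05 followed by "ur"), this yields "\u00ac\u00ed\u0000\u0005ur", which is valid JSON and valid UTF-8, at the cost of an implicit byte-to-code-point convention.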
      

      Thank you for any assistance here. Our cluster is currently entirely down: the frameworks cannot parse the invalid JSON produced (it is not even valid UTF-8).

            People

              Assignee: Joseph Wu (kaysoky)
              Reporter: Steven Schlansker (stevenschlansker)
              Shepherd: Benjamin Mahler