I don't understand how this will fix the bug. Calling tmp.string() adds the field as a string. JsonXContentGenerator will then call jackson.JsonGenerator.writeString, which escapes the content, so the value will no longer be JSON but an escaped string. The jackson.JsonGenerator.writeString documentation is clear:
"Method for outputting a String value. Depending on context this means either array element, (object) field value or a stand alone String; but in all cases, String will be surrounded in double quotes, and contents will be properly escaped as required by Json specification."
I tried the patch and this is exactly what I get. For example, when the body of an event is
, the resulting document in ES will contain "
". ES views the field as plain text, not as JSON.
I really don't understand what the Elasticsearch sink is trying to do. If it detects that the field is JSON, it parses it to make sure it's valid, but then adds it as plain text. That's almost the same as adding every field with the addSimpleField method, minus the JSON validation! The original code would have been fine if the ES Java API documentation were right. It says: "By the way, the field method accepts many object types. You can directly pass numbers, dates and even other XContentBuilder objects". But looking at the source code, this is clearly wrong: there's no field method that accepts an XContentBuilder as a value. To get around this issue, I think the sink should call rawField when it detects that a field is JSON. This will ensure that the string isn't escaped and that ES treats it as a JSON field.
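To illustrate the difference I mean, here is a minimal, library-free sketch. It does not use the Jackson or ES classes; escapeJson, asStringField and asRawField are hypothetical helpers that only mimic "write as an escaped string" versus "write the bytes as-is", and the sample body is made up for the illustration.

```java
public class RawFieldSketch {

    // Hypothetical helper: escapes a value the way any JSON string writer has to.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become \\uXXXX escapes.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Mimics adding the field as a string: the value is quoted and escaped.
    static String asStringField(String name, String value) {
        return "{\"" + escapeJson(name) + "\":\"" + escapeJson(value) + "\"}";
    }

    // Mimics adding the field raw: the value is embedded untouched,
    // so it must already be valid JSON.
    static String asRawField(String name, String rawJson) {
        return "{\"" + escapeJson(name) + "\":" + rawJson + "}";
    }

    public static void main(String[] args) {
        String body = "{\"foo\":\"bar\"}"; // made-up event body

        // prints {"body":"{\"foo\":\"bar\"}"}
        System.out.println(asStringField("body", body));

        // prints {"body":{"foo":"bar"}}
        System.out.println(asRawField("body", body));
    }
}
```

With the string variant the body ends up as one opaque text value; with the raw variant it stays a nested object, which is the behaviour I would expect from the sink for JSON fields.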
Does that make sense, or am I missing something here?