Dave, I missed your comment with the exception trace, sorry about that.
After seeing Jeremy's comment I tested the JAX-RS server and can confirm it all works as expected.
Note that "curl -T somefile targetURI" does not set Content-Type, which explains the exception you are seeing. TikaServer has two resource methods accepting PUT payloads on the same path: one specifically for multipart/form-data, and another for all other payload types, which uses a wildcard to match every possible type. As a result, the method with the more specific JAX-RS Consumes value (multipart/form-data) is chosen when no Content-Type is available. The error mentions octet-stream because the spec says that if no Content-Type is available, application/octet-stream is used when reading the stream, and that happens after method selection has completed.
Two fixes are possible:
1. Use the -H curl parameter. For example, I started a server (using the newly added -Pserver profile) and sent a pom.xml to it, adding '-H "Content-Type: text/xml"', and all worked fine. So the actual 'fix' is to update the docs and recommend setting Content-Type when multiparts are not used.
2. Have the TikaServer resource method that accepts multiparts listen on a unique path, say "http://localhost:9998/tika/form".
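For clients that are not curl, option 1 amounts to setting the header explicitly on the PUT. Here is a minimal sketch using the JDK's java.net.http client (Java 11+). The class name TikaPutExample and the helper buildPut are made up for illustration, and the target "http://localhost:9998/tika" is an assumption based on the path mentioned above; adjust it to wherever your server actually listens.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

public class TikaPutExample {

    // Hypothetical helper: builds a PUT request that always carries an
    // explicit Content-Type, so the server does not have to pick a
    // resource method without one.
    static HttpRequest buildPut(URI target, Path file, String contentType) throws IOException {
        return HttpRequest.newBuilder(target)
                .header("Content-Type", contentType)
                .PUT(HttpRequest.BodyPublishers.ofFile(file))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumed endpoint and file; neither is confirmed by the server docs.
        URI target = URI.create("http://localhost:9998/tika");
        HttpRequest request = buildPut(target, Path.of("pom.xml"), "text/xml");

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

This is the programmatic equivalent of adding '-H "Content-Type: text/xml"' to the curl command line.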
Option 2 is less 'disruptive', but option 1 is marginally cleaner IMHO, as clients PUT-ing something into the server are expected to set Content-Type.
I'm fine with implementing option 2 too, though. Perhaps it can be done regardless, but users should still be encouraged to set content types: this can optimize the parsing, i.e. avoid doing the detection at the parser level and optionally use the supplied Content-Type instead.
So, shall we add "/form" to the resource method accepting multipart/form-data, or keep things as is?