>> I created separate corpora and qrels for the test and train parts of the original collection.
I am not familiar with this collection, but from your README and the original file naming it looks like this is the right thing to do?
>> the Mesh and OHSU topics are very different - e.g. from my experience Mesh topics converted to Lucene queries must include the description, because quite often the most relevant docs don't contain the Mesh term itself. This however makes for very long queries ...
OK, I think the best way to handle this is to instead make it easier to run T, T+D, T+D+N, etc. queries from the benchmark package. I'll open an issue with an initial patch for you to look over (but I don't think this is an ORP problem, just that the benchmark package is really only set up to run Title queries right now).
>> AFAIU the definition of the filtering track is that qrels are NOT ranked, they just list relevant docs in random order. For calculation of metrics that depend on position (such as NDCG) this needs to be taken into account, e.g. by first sorting the qrels by relevance and calculating an Ideal DCG@N, where N is the number of available qrels.
I thought DCG etc. were based only on the '2' versus '1' values in the qrels? I am only vaguely familiar with these metrics, so I could be wrong.
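For what it's worth, here's a rough sketch of how I understand the two points fit together: the grades in the qrels (1 vs 2) are the gains, and the "ideal" ranking is obtained by just sorting the qrel grades descending, since the filtering-track qrels carry no order of their own. The function names and the linear-gain formulation are mine, not anything from trec_eval:

```python
import math

def ndcg_at_n(ranked_doc_ids, qrels, n):
    """NDCG@N for a single topic.

    ranked_doc_ids: the system's ranking, best first.
    qrels: dict mapping doc_id -> relevance grade (e.g. 1 or 2);
           unjudged docs are treated as grade 0.
    Ideal DCG is computed from the qrel grades sorted descending,
    because the qrels themselves are unordered.
    """
    def dcg(gains):
        # standard DCG: gain discounted by log2 of (rank + 1), ranks from 1
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

    gains = [qrels.get(d, 0) for d in ranked_doc_ids[:n]]
    ideal = sorted(qrels.values(), reverse=True)[:n]
    idcg = dcg(ideal)
    return dcg(gains) / idcg if idcg > 0 else 0.0
```

So the '2' vs '1' grades do drive the metric, but the position of each graded doc in the system's ranking matters too, which is why the ideal ordering has to be reconstructed by sorting.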