Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: Collections
    • Labels:
      None
    • Attachments:
      ohsumed.patch (26 kB, Andrzej Bialecki)

      Activity

      Robert Muir added a comment -

      Thanks Andrzej for your work here. I opened LUCENE-2254 for the lucene benchmark issue.

      Andrzej Bialecki added a comment -

      >> OK, I think the best way to handle this is to instead make it easier to run T, T+D, T+D+N, etc queries from the benchmark package.

      That would be cool - yes, it's a Lucene benchmark issue.

      >> I thought DCG etc were only based on the '2' versus '1' value in the qrels? I am only vaguely familiar with these so I could be wrong?

      See http://en.wikipedia.org/wiki/Discounted_Cumulative_Gain: DCG, unlike plain Cumulative Gain, discounts the importance of a result by its position in the list of results (its rank).

      >> I guess the latest version does not support this metric, are people using this patch or is there some other NDCG calculator that does not do this sort???

      No, I stumbled upon this issue when implementing NDCG myself for another project.

      Ok, I'll add these remarks and commit. Thanks!

      Robert Muir added a comment -

      >> For calculation of metrics that depend on position (such as NDCG) this needs to be taken into account, e.g. by first sorting the qrels by relevance and calculating an Ideal DCG@N, where N is the number of available qrels.

      Andrzej, I looked at a patch to trec_eval to support NDCG and it appears to do this sort itself: http://cio.nist.gov/esd/emaildir/lists/ireval/msg00037.html
      I guess the latest version does not support this metric, are people using this patch or is there some other NDCG calculator that does not do this sort???

      Robert Muir added a comment -

      >> I created separate corpora and qrels for the test and train parts of the original collection.

      I am not familiar with this collection, but from your README and the original file naming this appears to be the right thing to do?

      >> the Mesh and OHSU topics are very different - e.g. from my experience Mesh topics converted to Lucene queries must include the description, because quite often the most relevant docs don't contain the Mesh term itself. This however makes for very long queries ...

      OK, I think the best way to handle this is to instead make it easier to run T, T+D, T+D+N, etc queries from the benchmark package. I'll open an issue with an initial patch for you to look over (but I don't think this is an ORP problem, just a problem that the benchmark package is really only set up to run Title queries right now).

      >> AFAIU the definition of the filtering track is that qrels are NOT ranked, they just list relevant docs in random order. For calculation of metrics that depend on position (such as NDCG) this needs to be taken into account, e.g. by first sorting the qrels by relevance and calculating an Ideal DCG@N, where N is the number of available qrels.

      I thought DCG etc were only based on the '2' versus '1' value in the qrels? I am only vaguely familiar with these so I could be wrong?
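The T, T+D, T+D+N run modes mentioned above can be sketched as simple concatenation of topic fields; this is an illustrative sketch, not the benchmark package's actual API, and the topic field names and contents here are assumptions.

```python
# Hypothetical TREC topic record; field names and contents are assumptions
# for illustration, not the benchmark package's actual data model.
topic = {
    "title": "angiotensin",
    "desc": "effects of angiotensin on blood pressure",
    "narr": "relevant documents discuss measured blood pressure changes",
}

# Map each run mode to the topic fields it should include.
MODES = {
    "T": ["title"],
    "T+D": ["title", "desc"],
    "T+D+N": ["title", "desc", "narr"],
}

def build_query_text(topic, mode):
    """Concatenate the selected topic fields into one query string."""
    return " ".join(topic[f] for f in MODES[mode])

print(build_query_text(topic, "T+D"))
```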

      Andrzej Bialecki added a comment -

      Sure, why not. But there are some points that I'm not sure about yet:

      • I created separate corpora and qrels for the test and train parts of the original collection.
      • the Mesh and OHSU topics are very different - e.g. from my experience Mesh topics converted to Lucene queries must include the description, because quite often the most relevant docs don't contain the Mesh term itself. This however makes for very long queries ...
      • AFAIU the definition of the filtering track is that qrels are NOT ranked, they just list relevant docs in random order. For calculation of metrics that depend on position (such as NDCG) this needs to be taken into account, e.g. by first sorting the qrels by relevance and calculating an Ideal DCG@N, where N is the number of available qrels.

      I could add these remarks to the README.
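The Ideal DCG@N point above can be sketched as follows; this is an illustrative NDCG implementation (using a log2 rank discount), not code from the patch. The qrels map doc IDs to graded relevance values (the '1' and '2' grades discussed in this thread), and the ideal ranking is obtained by sorting those grades in descending order, as described.

```python
import math

def dcg(gains):
    """Discounted cumulative gain: each gain is discounted by log2(rank + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg_at(ranked_doc_ids, qrels, n):
    """NDCG@n for a ranked result list against graded qrels."""
    # Gains of the retrieved docs, in ranked order (0 for unjudged docs).
    gains = [qrels.get(doc, 0) for doc in ranked_doc_ids[:n]]
    # Ideal DCG@n: qrels sorted by relevance grade, descending.
    ideal = sorted(qrels.values(), reverse=True)[:n]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

# Example with graded qrels: 2 = definitely relevant, 1 = possibly relevant.
qrels = {"doc1": 2, "doc2": 1, "doc3": 2}
print(ndcg_at(["doc3", "doc5", "doc1"], qrels, 3))
```

Note that without the descending sort of the qrels, the "ideal" ranking would depend on the arbitrary order of the qrels file, which is exactly the problem raised above.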

      Robert Muir added a comment -

      +1 (built and ran evaluation with training corpus/qrels)

      Andrzej, wanna commit this?

      Andrzej Bialecki added a comment -

      This patch adds support for creating collections from TREC9 / OHSUMED corpus, queries and qrels.


        People

        • Assignee:
          Unassigned
        • Reporter:
          Andrzej Bialecki
        • Votes:
          0
        • Watchers:
          0

        Dates

        • Created:
        • Updated:
