Lucene - Core / LUCENE-2347

Dump WordNet to SOLR Synonym format


    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 3.0.1
    • Fix Version/s: 3.4, 4.0-ALPHA
    • Component/s: modules/analysis
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      This enhancement dumps WordNet 2.0 to the SOLR synonym format, so that the full set of WordNet synonyms can be loaded into SOLR easily.

      1. You can load all synonyms from WordNet V2 (http://wordnetcode.princeton.edu/2.0/) into SOLR, starting from the Syns2Index program:
      http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/wordnet/Syns2Index.html

      Get WNprolog from http://wordnetcode.princeton.edu/2.0/

      2. We modified this program to work with SOLR (see the attached Syns2Solr.java). On amidev.kaango.com the source is in /vol/src/lucene/contrib/wordnet:
      vi /vol/src/lucene/contrib/wordnet/src/java/org/apache/lucene/wordnet/Syns2Solr.java

      3. Run ant to build the jar used in the next step.

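      4. Run the modified program against the WordNet prolog file and redirect its output to a synonyms file:
      java -classpath /vol/src/lucene/build/contrib/wordnet/lucene-wordnet-3.1-dev.jar org.apache.lucene.wordnet.Syns2Solr prolog/wn_s.pl solr > index_synonyms.txt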
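
      The conversion in the steps above can be sketched roughly as follows. This is not the attached Syns2Solr.java, whose exact output is not shown in this issue; it is a minimal standalone illustration. It assumes that wn_s.pl contains Prolog facts of the form s(synset_id,w_num,'word',ss_type,sense_number,tag_count). and that one comma-separated line per synset is the desired SOLR synonym layout. The class name WordNetToSolrSketch and the method toSolrLines are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: groups WordNet prolog facts by synset and emits
// one comma-separated synonym line per synset with at least two words.
final class WordNetToSolrSketch {

    // Assumed fact layout: s(100001740,1,'entity',n,1,11).
    // Group 1 = synset id, group 2 = word (apostrophes doubled, Prolog style).
    // The trailing tag count is treated as optional.
    private static final Pattern FACT =
        Pattern.compile("^s\\((\\d+),\\d+,'(.*)',[a-z],\\d+(?:,\\d+)?\\)\\.\\s*$");

    static List<String> toSolrLines(Iterable<String> prologLines) {
        // LinkedHashMap/LinkedHashSet keep the file order and drop duplicate words.
        Map<String, Set<String>> synsets = new LinkedHashMap<>();
        for (String line : prologLines) {
            Matcher m = FACT.matcher(line);
            if (!m.matches()) {
                continue; // skip anything that is not an s(...) fact
            }
            String word = m.group(2)
                .replace("''", "'")    // undo Prolog quote doubling
                .replace(",", "\\,");  // backslash-escape commas for the synonym file
            synsets.computeIfAbsent(m.group(1), k -> new LinkedHashSet<>()).add(word);
        }
        List<String> out = new ArrayList<>();
        for (Set<String> words : synsets.values()) {
            if (words.size() > 1) {    // a single word has no synonyms to list
                out.add(String.join(", ", words));
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // ISO-8859-1 decodes any byte sequence, so odd bytes cannot abort the run.
        List<String> lines =
            Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        for (String synonymLine : toSolrLines(lines)) {
            System.out.println(synonymLine);
        }
    }
}
```

      Saved as WordNetToSolrSketch.java, the sketch could be compiled with javac and run as java WordNetToSolrSketch prolog/wn_s.pl > index_synonyms.txt. Each output line lists the members of one synset, for example: breathe, take a breath, respire. Whether equivalent lists like this or explicit "=>" mappings suit a given SOLR setup is a design choice the sketch leaves open.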

        Attachments

        1. Syns2Solr.java
          9 kB
          Bill Bell


              People

              • Assignee:
                Unassigned
                Reporter:
                billnbell Bill Bell
