Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-7785

Move dictionary for Ukrainian analyzer to external dependency

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.0, 6.6
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Currently the dictionary for Ukrainian analyzer is a blob in the source tree. We should move it out to external dependency, this allows:

      • to have less binaries in the source
      • easier to update the dictionary and track updates

        Issue Links

          Activity

          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user arysin opened a pull request:

          https://github.com/apache/lucene-solr/pull/187

          LUCENE-7785: Move dictionary for Ukrainan analyzer to external dependency

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/arysin/lucene-solr master

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/lucene-solr/pull/187.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #187


          commit f7291006b378e11cb67d9078efe52e9ace9e3f47
          Author: Andriy Rysin <arysin@gmail.com>
          Date: 2017-03-25T22:15:56Z

          LUCENE-7785: Move dictionary for Ukrainan analyzer to external dependency


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user arysin opened a pull request: https://github.com/apache/lucene-solr/pull/187 LUCENE-7785 : Move dictionary for Ukrainan analyzer to external dependency You can merge this pull request into a Git repository by running: $ git pull https://github.com/arysin/lucene-solr master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/lucene-solr/pull/187.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #187 commit f7291006b378e11cb67d9078efe52e9ace9e3f47 Author: Andriy Rysin <arysin@gmail.com> Date: 2017-03-25T22:15:56Z LUCENE-7785 : Move dictionary for Ukrainan analyzer to external dependency
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user arysin commented on the issue:

          https://github.com/apache/lucene-solr/pull/187

          It would be also nice to merge this into 6x branch

          Show
          githubbot ASF GitHub Bot added a comment - Github user arysin commented on the issue: https://github.com/apache/lucene-solr/pull/187 It would be also nice to merge this into 6x branch
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user dweiss commented on a diff in the pull request:

          https://github.com/apache/lucene-solr/pull/187#discussion_r111553388

          — Diff: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java —
          @@ -107,11 +107,18 @@ public UkrainianMorfologikAnalyzer(CharArraySet stopwords, CharArraySet stemExcl
          @Override
          protected Reader initReader(String fieldName, Reader reader) {
          NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder();
          + // different apostrophes
          builder.add("\u2019", "'");
          + builder.add("\u0218", "'");
          builder.add("\u02BC", "'");
          + builder.add("`", "'");
          + builder.add("´", "'");
          + // ignored characters
          builder.add("\u0301", "");

          • NormalizeCharMap normMap = builder.build();
            + builder.add("\u00AD", "");
            + builder.add("\uFEFF", "");
              • End diff –

          byte order mark shouldn't be replaced to nothing... if you have a byte order mark in your character input (reader) then your conversion from bytes is screwed up somewhere before.

          Show
          githubbot ASF GitHub Bot added a comment - Github user dweiss commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/187#discussion_r111553388 — Diff: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java — @@ -107,11 +107,18 @@ public UkrainianMorfologikAnalyzer(CharArraySet stopwords, CharArraySet stemExcl @Override protected Reader initReader(String fieldName, Reader reader) { NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder(); + // different apostrophes builder.add("\u2019", "'"); + builder.add("\u0218", "'"); builder.add("\u02BC", "'"); + builder.add("`", "'"); + builder.add("´", "'"); + // ignored characters builder.add("\u0301", ""); NormalizeCharMap normMap = builder.build(); + builder.add("\u00AD", ""); + builder.add("\uFEFF", ""); End diff – byte order mark shouldn't be replaced to nothing... if you have a byte order mark in your character input (reader) then your conversion from bytes is screwed up somewhere before.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user dweiss commented on a diff in the pull request:

          https://github.com/apache/lucene-solr/pull/187#discussion_r111553610

          — Diff: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java —
          @@ -145,7 +152,7 @@ protected TokenStreamComponents createComponents(String fieldName) {

          private static Dictionary getDictionary() {
          try {

          • return Dictionary.read(UkrainianMorfologikAnalyzer.class.getResource("ukrainian.dict"));
            + return Dictionary.read(UkrainianMorfologikAnalyzer.class.getResource("/ua/net/nlp/ukrainian.dict"));
              • End diff –

          This shouldn't be a class-relative read looking for an absolute resource. Rather, I think, it should be classloader-relative UkrainianMorfologikAnalyzer.class.getClassLoader().getResource("ua/net/nlp/ukrainian.dict").

          Show
          githubbot ASF GitHub Bot added a comment - Github user dweiss commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/187#discussion_r111553610 — Diff: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java — @@ -145,7 +152,7 @@ protected TokenStreamComponents createComponents(String fieldName) { private static Dictionary getDictionary() { try { return Dictionary.read(UkrainianMorfologikAnalyzer.class.getResource("ukrainian.dict")); + return Dictionary.read(UkrainianMorfologikAnalyzer.class.getResource("/ua/net/nlp/ukrainian.dict")); End diff – This shouldn't be a class-relative read looking for an absolute resource. Rather, I think, it should be classloader-relative UkrainianMorfologikAnalyzer.class.getClassLoader().getResource("ua/net/nlp/ukrainian.dict").
          Hide
          dweiss Dawid Weiss added a comment - - edited

          Hi Andriy. The patch looks good overall, but run ant precommit; most likely you're missing jar checksums and licensing information (see top-level ant tasks and regenerate).

          A separate patch file for master and 6x would be handy.

          Show
          dweiss Dawid Weiss added a comment - - edited Hi Andriy. The patch looks good overall, but run ant precommit ; most likely you're missing jar checksums and licensing information (see top-level ant tasks and regenerate). A separate patch file for master and 6x would be handy.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user sarowe commented on a diff in the pull request:

          https://github.com/apache/lucene-solr/pull/187#discussion_r111574446

          — Diff: lucene/ivy-versions.properties —
          @@ -203,6 +203,8 @@ org.carrot2.morfologik.version = 2.1.1
          /org.carrot2/morfologik-polish = $

          {org.carrot2.morfologik.version}
          /org.carrot2/morfologik-stemming = ${org.carrot2.morfologik.version}

          +ua.net.nlp.morfologik-ukrainian-search.version = 3.7.4
          — End diff –

          This is fine, but you're missing the `/org/name=version` thing that's required to be used in `ivy.xml` files.

          If you look at the rest of this file, you can see that some `/org/name` keys share the same version, and so a variable is introduced to simplify maintenance. In your case, when there is only one `/org/name` key using this version, you should IMO eliminate the `ua.net.nlp.morfologik-ukrainian-search.version` variable entirely and add `/ua.net.nlp/morfologik-ukrainian-search = 3.7.4`.

          Show
          githubbot ASF GitHub Bot added a comment - Github user sarowe commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/187#discussion_r111574446 — Diff: lucene/ivy-versions.properties — @@ -203,6 +203,8 @@ org.carrot2.morfologik.version = 2.1.1 /org.carrot2/morfologik-polish = $ {org.carrot2.morfologik.version} /org.carrot2/morfologik-stemming = ${org.carrot2.morfologik.version} +ua.net.nlp.morfologik-ukrainian-search.version = 3.7.4 — End diff – This is fine, but you're missing the `/org/name=version` thing that's required to be used in `ivy.xml` files. If you look at the rest of this file, you can see that some `/org/name` keys share the same version, and so a variable is introduced to simplify maintenance. In your case, when there is only one `/org/name` key using this version, you should IMO eliminate the `ua.net.nlp.morfologik-ukrainian-search.version` variable entirely and add `/ua.net.nlp/morfologik-ukrainian-search = 3.7.4`.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user sarowe commented on a diff in the pull request:

          https://github.com/apache/lucene-solr/pull/187#discussion_r111573637

          — Diff: lucene/analysis/morfologik/ivy.xml —
          @@ -25,6 +25,7 @@
          <dependency org="org.carrot2" name="morfologik-polish" rev="$

          {/org.carrot2/morfologik-polish}

          " conf="compile"/>
          <dependency org="org.carrot2" name="morfologik-fsa" rev="$

          {/org.carrot2/morfologik-fsa}

          " conf="compile"/>
          <dependency org="org.carrot2" name="morfologik-stemming" rev="$

          {/org.carrot2/morfologik-stemming}

          " conf="compile"/>
          + <dependency org="ua.net.nlp" name="morfologik-ukrainian-search" rev="$

          {ua.net.nlp.morfologik-ukrainian-search.version}

          " conf="compile"/>
          — End diff –

          The `rev` attribute value has to be of the form `"$

          {/org/name}

          "` - this will not pass precommit. You have to make such an entry in `ivy-versions.properties` - look at all the other examples there.

          Show
          githubbot ASF GitHub Bot added a comment - Github user sarowe commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/187#discussion_r111573637 — Diff: lucene/analysis/morfologik/ivy.xml — @@ -25,6 +25,7 @@ <dependency org="org.carrot2" name="morfologik-polish" rev="$ {/org.carrot2/morfologik-polish} " conf="compile"/> <dependency org="org.carrot2" name="morfologik-fsa" rev="$ {/org.carrot2/morfologik-fsa} " conf="compile"/> <dependency org="org.carrot2" name="morfologik-stemming" rev="$ {/org.carrot2/morfologik-stemming} " conf="compile"/> + <dependency org="ua.net.nlp" name="morfologik-ukrainian-search" rev="$ {ua.net.nlp.morfologik-ukrainian-search.version} " conf="compile"/> — End diff – The `rev` attribute value has to be of the form `"$ {/org/name} "` - this will not pass precommit. You have to make such an entry in `ivy-versions.properties` - look at all the other examples there.
          Hide
          steve_rowe Steve Rowe added a comment -

          -1 to commit this change.

          The licenses referenced in the POM for ua.net.nlp:morfologik-ukrainian-search:3.7.4 are:

           <licenses>
              <license>
                <name>GNU Lesser General Public License</name>
                <url>http://www.gnu.org/licenses/lgpl.txt</url>
              </license>
              <license>
                <name>Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0)</name>
                <url>http://creativecommons.org/licenses/by-nc-sa/4.0</url>
              </license>
            </licenses>
          

          Neither of those licenses are acceptable for inclusion in an Apache software distribution.

          Specifically, the link http://creativecommons.org/licenses/by-nc-sa/4.0 is to the non-commercial CC-BY-SA variant, which is not among those listed as acceptable here: https://www.apache.org/legal/resolved.html#cc-sa.

          Show
          steve_rowe Steve Rowe added a comment - -1 to commit this change. The licenses referenced in the POM for ua.net.nlp:morfologik-ukrainian-search:3.7.4 are: <licenses> <license> <name> GNU Lesser General Public License </name> <url> http://www.gnu.org/licenses/lgpl.txt </url> </license> <license> <name> Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) </name> <url> http://creativecommons.org/licenses/by-nc-sa/4.0 </url> </license> </licenses> Neither of those licenses are acceptable for inclusion in an Apache software distribution. Specifically, the link http://creativecommons.org/licenses/by-nc-sa/4.0 is to the non-commercial CC-BY-SA variant, which is not among those listed as acceptable here: https://www.apache.org/legal/resolved.html#cc-sa .
          Hide
          dweiss Dawid Weiss added a comment -

          Steven is right – we can't accept these licenses, Andriy.

          Show
          dweiss Dawid Weiss added a comment - Steven is right – we can't accept these licenses, Andriy.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user arysin commented on a diff in the pull request:

          https://github.com/apache/lucene-solr/pull/187#discussion_r111595706

          — Diff: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java —
          @@ -107,11 +107,18 @@ public UkrainianMorfologikAnalyzer(CharArraySet stopwords, CharArraySet stemExcl
          @Override
          protected Reader initReader(String fieldName, Reader reader) {
          NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder();
          + // different apostrophes
          builder.add("\u2019", "'");
          + builder.add("\u0218", "'");
          builder.add("\u02BC", "'");
          + builder.add("`", "'");
          + builder.add("´", "'");
          + // ignored characters
          builder.add("\u0301", "");

          • NormalizeCharMap normMap = builder.build();
            + builder.add("\u00AD", "");
            + builder.add("\uFEFF", "");
              • End diff –

          That was from the note [Wikimedia guys suggested](https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Ukrainian_Morfologik_Analysis#Recommendations_.26_Plan), but agree it does not make sense here, I'll remove it

          Show
          githubbot ASF GitHub Bot added a comment - Github user arysin commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/187#discussion_r111595706 — Diff: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java — @@ -107,11 +107,18 @@ public UkrainianMorfologikAnalyzer(CharArraySet stopwords, CharArraySet stemExcl @Override protected Reader initReader(String fieldName, Reader reader) { NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder(); + // different apostrophes builder.add("\u2019", "'"); + builder.add("\u0218", "'"); builder.add("\u02BC", "'"); + builder.add("`", "'"); + builder.add("´", "'"); + // ignored characters builder.add("\u0301", ""); NormalizeCharMap normMap = builder.build(); + builder.add("\u00AD", ""); + builder.add("\uFEFF", ""); End diff – That was from the note [Wikimedia guys suggested] ( https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Ukrainian_Morfologik_Analysis#Recommendations_.26_Plan ), but agree it does not make sense here, I'll remove it
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user arysin commented on a diff in the pull request:

          https://github.com/apache/lucene-solr/pull/187#discussion_r111596732

          — Diff: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java —
          @@ -145,7 +152,7 @@ protected TokenStreamComponents createComponents(String fieldName) {

          private static Dictionary getDictionary() {
          try {

          • return Dictionary.read(UkrainianMorfologikAnalyzer.class.getResource("ukrainian.dict"));
            + return Dictionary.read(UkrainianMorfologikAnalyzer.class.getResource("/ua/net/nlp/ukrainian.dict"));
              • End diff –

          I am a bit confused on why for class loader getResource() we need relative path (surprisingly the full one starting with "/" does not work for me), but otherwise it makes sense so I'll push the fix shortly.

          Show
          githubbot ASF GitHub Bot added a comment - Github user arysin commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/187#discussion_r111596732 — Diff: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java — @@ -145,7 +152,7 @@ protected TokenStreamComponents createComponents(String fieldName) { private static Dictionary getDictionary() { try { return Dictionary.read(UkrainianMorfologikAnalyzer.class.getResource("ukrainian.dict")); + return Dictionary.read(UkrainianMorfologikAnalyzer.class.getResource("/ua/net/nlp/ukrainian.dict")); End diff – I am a bit confused on why for class loader getResource() we need relative path (surprisingly the full one starting with "/" does not work for me), but otherwise it makes sense so I'll push the fix shortly.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user arysin commented on the issue:

          https://github.com/apache/lucene-solr/pull/187

          Thanks for the prompt feedback guys, I think I've addressed all the issues with the latest commits but please let me know if something is still wrong.
          Once we're good with master branch I'll prepare the 6x changes.

          Show
          githubbot ASF GitHub Bot added a comment - Github user arysin commented on the issue: https://github.com/apache/lucene-solr/pull/187 Thanks for the prompt feedback guys, I think I've addressed all the issues with the latest commits but please let me know if something is still wrong. Once we're good with master branch I'll prepare the 6x changes.
          Hide
          steve_rowe Steve Rowe added a comment -

          I see that the POM for ua.net.nlp:morfologik-ukrainian-search:3.7.5 adds the v2 Apache license. I hereby remove my -1.

          Please let me know if something is still wrong.

          I'm pretty sure the current patch won't pass ant precommit - have you tried running that yet Andriy? There are missing files in lucene/licenses/ for the new dependency: a checksum file (*.sha1), a LICENSE file and a NOTICE file. You can generate the checksum file using ant jar-checksums, but you'll need to manually create the other two. You can create morfologik-ukrainian-search.LICENSE-ASL.txt by copying from another file in that directory that contains a clean ALv2 license, e.g. commons-codec-LICENSE-ASL.txt. I'm fuzzy on the requirements for the NOTICE file, but look at the other NOTICE files in that dir to get an idea of the range of things they contain. Some info here: http://www.apache.org/dev/licensing-howto.html#overview-of-files and here: http://www.apache.org/licenses/LICENSE-2.0.html#redistribution.

          Show
          steve_rowe Steve Rowe added a comment - I see that the POM for ua.net.nlp:morfologik-ukrainian-search:3.7.5 adds the v2 Apache license. I hereby remove my -1. Please let me know if something is still wrong. I'm pretty sure the current patch won't pass ant precommit - have you tried running that yet Andriy? There are missing files in lucene/licenses/ for the new dependency: a checksum file ( *.sha1 ), a LICENSE file and a NOTICE file. You can generate the checksum file using ant jar-checksums , but you'll need to manually create the other two. You can create morfologik-ukrainian-search.LICENSE-ASL.txt by copying from another file in that directory that contains a clean ALv2 license, e.g. commons-codec-LICENSE-ASL.txt . I'm fuzzy on the requirements for the NOTICE file, but look at the other NOTICE files in that dir to get an idea of the range of things they contain. Some info here: http://www.apache.org/dev/licensing-howto.html#overview-of-files and here: http://www.apache.org/licenses/LICENSE-2.0.html#redistribution .
          Hide
          arysin Andriy Rysin added a comment -

          Ok, thanks for the suggestions, I was able to run `ant precommit` and I've added/adjusted the files to make it happier.
          It still fails for me but dues to some issue with `/org.ccil.cowan.tagsoup/tagsoup`, hopefully files related to this issue are good now.

          Show
          arysin Andriy Rysin added a comment - Ok, thanks for the suggestions, I was able to run `ant precommit` and I've added/adjusted the files to make it happier. It still fails for me but dues to some issue with `/org.ccil.cowan.tagsoup/tagsoup`, hopefully files related to this issue are good now.
          Hide
          steve_rowe Steve Rowe added a comment -

          It still fails for me but dues to some issue with `/org.ccil.cowan.tagsoup/tagsoup`, hopefully files related to this issue are good now.

          Looking at your changes to ivy-versions.properties, I think the new dependency is in the wrong order (as the comment at the top of the file says, the /org/name keys must be lexically ordered) - likely the error you're seeing is telling you that. If I'm right, perhaps the error message should be fixed. Please post it here.

          Show
          steve_rowe Steve Rowe added a comment - It still fails for me but dues to some issue with `/org.ccil.cowan.tagsoup/tagsoup`, hopefully files related to this issue are good now. Looking at your changes to ivy-versions.properties , I think the new dependency is in the wrong order (as the comment at the top of the file says, the /org/name keys must be lexically ordered) - likely the error you're seeing is telling you that. If I'm right, perhaps the error message should be fixed. Please post it here.
          Hide
          arysin Andriy Rysin added a comment -

          Here's what I get:
          check-lib-versions:
          [echo] Lib versions check under: /home/master/work/ukr/spelling/lucene-workspace/lucene-solr/lucene/..
          [libversions] :: loading settings :: file = /home/master/work/ukr/spelling/lucene-workspace/lucene-solr/lucene/top-level-ivy-settings.xml
          [libversions] OUT-OF-ORDER coordinate key '/org.ccil.cowan.tagsoup/tagsoup' in ivy-versions.properties
          [libversions] Checked that ivy-versions.properties and ivy-ignore-conflicts.properties have lexically sorted '/org/name' keys and no duplicates or orphans.
          [libversions] Scanned 46 ivy.xml files for rev="$

          {/org/name}

          " format.
          [libversions] Found 0 indirect dependency version conflicts.
          [libversions] Completed in 1.24s., 1 error(s).

          Show
          arysin Andriy Rysin added a comment - Here's what I get: check-lib-versions: [echo] Lib versions check under: /home/master/work/ukr/spelling/lucene-workspace/lucene-solr/lucene/.. [libversions] :: loading settings :: file = /home/master/work/ukr/spelling/lucene-workspace/lucene-solr/lucene/top-level-ivy-settings.xml [libversions] OUT-OF-ORDER coordinate key '/org.ccil.cowan.tagsoup/tagsoup' in ivy-versions.properties [libversions] Checked that ivy-versions.properties and ivy-ignore-conflicts.properties have lexically sorted '/org/name' keys and no duplicates or orphans. [libversions] Scanned 46 ivy.xml files for rev="$ {/org/name} " format. [libversions] Found 0 indirect dependency version conflicts. [libversions] Completed in 1.24s., 1 error(s).
          Hide
          steve_rowe Steve Rowe added a comment -

          I think the error message is okay - it's hard to be precise about the cause of out-of-order problems. The out-of-order checker can only tell that there is a problem after seeing the key following your new one. Here's your patch in context:

           +ua.net.nlp.morfologik-ukrainian-search.version = 3.7.5
           +/ua.net.nlp/morfologik-ukrainian-search = ${ua.net.nlp.morfologik-ukrainian-search.version}
           +
            /org.ccil.cowan.tagsoup/tagsoup = 1.2.1
          

          FYI, we work pretty hard to make sure that ant precommit works properly all the time, so if you see a failure, it's very likely due to your changes.

          Show
          steve_rowe Steve Rowe added a comment - I think the error message is okay - it's hard to be precise about the cause of out-of-order problems. The out-of-order checker can only tell that there is a problem after seeing the key following your new one. Here's your patch in context: +ua.net.nlp.morfologik-ukrainian-search.version = 3.7.5 +/ua.net.nlp/morfologik-ukrainian-search = ${ua.net.nlp.morfologik-ukrainian-search.version} + /org.ccil.cowan.tagsoup/tagsoup = 1.2.1 FYI, we work pretty hard to make sure that ant precommit works properly all the time, so if you see a failure, it's very likely due to your changes.
          Hide
          arysin Andriy Rysin added a comment -

          Ahh, I see what you mean, I'll push the fix for the order once my `ant precommit` succeeds.

          Show
          arysin Andriy Rysin added a comment - Ahh, I see what you mean, I'll push the fix for the order once my `ant precommit` succeeds.
          Hide
          arysin Andriy Rysin added a comment -

          `ant precommit` is happy now

          Show
          arysin Andriy Rysin added a comment - `ant precommit` is happy now
          Hide
          steve_rowe Steve Rowe added a comment -

          +1, LGTM

          Show
          steve_rowe Steve Rowe added a comment - +1, LGTM
          Hide
          elyograg Shawn Heisey added a comment -

          Interesting. I could have sworn I had read that LGPL was OK for binary dependencies, but I can see it right there on the legal page that it isn't. Learn something new every day.

          Show
          elyograg Shawn Heisey added a comment - Interesting. I could have sworn I had read that LGPL was OK for binary dependencies, but I can see it right there on the legal page that it isn't. Learn something new every day.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user dweiss commented on a diff in the pull request:

          https://github.com/apache/lucene-solr/pull/187#discussion_r111922455

          — Diff: lucene/licenses/morfologik-ukrainian-search-NOTICE.txt —
          @@ -0,0 +1,6 @@
          +morfologik-ukrainian-search is a POS tag dictionary in morfolgik format adjusted for searching.
          — End diff –

          typo in morfolgik

          Show
          githubbot ASF GitHub Bot added a comment - Github user dweiss commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/187#discussion_r111922455 — Diff: lucene/licenses/morfologik-ukrainian-search-NOTICE.txt — @@ -0,0 +1,6 @@ +morfologik-ukrainian-search is a POS tag dictionary in morfolgik format adjusted for searching. — End diff – typo in morfolgik
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user dweiss commented on a diff in the pull request:

          https://github.com/apache/lucene-solr/pull/187#discussion_r111922745

          — Diff: lucene/licenses/morfologik-ukrainian-search-NOTICE.txt —
          @@ -0,0 +1,6 @@
          +morfologik-ukrainian-search is a POS tag dictionary in morfolgik format adjusted for searching.
          +It's part of dict_uk project (https://github.com/brown-uk/dict_uk)
          +
          +Note: to better fit into full-text search model this dictionary has all word forms in lower case but keeps lemmas for proper nouns in upper case.
          +
          +Licensed under GPL/LGPL, CC BY-NC-SA 4.0, and Apache License 2.0.
          — End diff –

          Hmm; not a lawyer, but this should better state "Licensed under Apache License 2.0"... Those multiple licenses don't agree with each other – at least it should state or, not and.

          Show
          githubbot ASF GitHub Bot added a comment - Github user dweiss commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/187#discussion_r111922745 — Diff: lucene/licenses/morfologik-ukrainian-search-NOTICE.txt — @@ -0,0 +1,6 @@ +morfologik-ukrainian-search is a POS tag dictionary in morfolgik format adjusted for searching. +It's part of dict_uk project ( https://github.com/brown-uk/dict_uk ) + +Note: to better fit into full-text search model this dictionary has all word forms in lower case but keeps lemmas for proper nouns in upper case. + +Licensed under GPL/LGPL, CC BY-NC-SA 4.0, and Apache License 2.0. — End diff – Hmm; not a lawyer, but this should better state "Licensed under Apache License 2.0"... Those multiple licenses don't agree with each other – at least it should state or , not and .
          Hide
          dweiss Dawid Weiss added a comment -

          Will merge this in.

          Show
          dweiss Dawid Weiss added a comment - Will merge this in.
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 230b3e6e4d27e2b17764814e461c7d96e947f762 in lucene-solr's branch refs/heads/branch_6x from Dawid Weiss
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=230b3e6 ]

          LUCENE-7785: Move dictionary for Ukrainian analyzer to external dependency. (Andriy Rysin via Dawid Weiss)

          Show
          jira-bot ASF subversion and git services added a comment - Commit 230b3e6e4d27e2b17764814e461c7d96e947f762 in lucene-solr's branch refs/heads/branch_6x from Dawid Weiss [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=230b3e6 ] LUCENE-7785 : Move dictionary for Ukrainian analyzer to external dependency. (Andriy Rysin via Dawid Weiss)
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit e62a3ff46c855f2b0fab7c1b78895118514e23f4 in lucene-solr's branch refs/heads/master from Dawid Weiss
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e62a3ff ]

          LUCENE-7785: Move dictionary for Ukrainian analyzer to external dependency. (Andriy Rysin via Dawid Weiss)

          Show
          jira-bot ASF subversion and git services added a comment - Commit e62a3ff46c855f2b0fab7c1b78895118514e23f4 in lucene-solr's branch refs/heads/master from Dawid Weiss [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e62a3ff ] LUCENE-7785 : Move dictionary for Ukrainian analyzer to external dependency. (Andriy Rysin via Dawid Weiss)
          Hide
          dweiss Dawid Weiss added a comment -

          Thanks Andriy, Steve!

          Show
          dweiss Dawid Weiss added a comment - Thanks Andriy, Steve!
          Hide
          arysin Andriy Rysin added a comment -

          Thanks Dawid! Thanks everybody for your help and feedback!

          Show
          arysin Andriy Rysin added a comment - Thanks Dawid! Thanks everybody for your help and feedback!

            People

            • Assignee:
              dweiss Dawid Weiss
              Reporter:
              arysin Andriy Rysin
            • Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development