Details

    • Type: Improvement
    • Status: Open
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Lucene Fields:
      New

      Description

      Adds a new Arabic Snowball stemmer based on https://github.com/snowballstem/snowball/blob/master/algorithms/arabic.sbl

      This change also includes an Arabic test dataset in `TestSnowballVocabData.zip`, taken from snowball-data and generated from the input file available here: https://github.com/snowballstem/snowball-data/tree/master/arabic

      https://github.com/ibnmalik/golden-corpus-arabic/blob/develop/core/words.txt
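As a rough illustration of how a vocabulary dataset like this is usually consumed, the sketch below compares a stemmer's output against expected stems, one word per line. It assumes the snowball-data convention of two parallel lists (input words and expected stems). `VocabCheck`, `mismatches`, and the toy suffix-stripping stemmer are hypothetical names used only for this example; this is not the Lucene test class or the Arabic stemmer itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

public class VocabCheck {

    /**
     * Stems every word and returns one message per word whose stem
     * differs from the expected stem at the same position.
     */
    public static List<String> mismatches(List<String> words,
                                          List<String> expected,
                                          UnaryOperator<String> stemmer) {
        if (words.size() != expected.size()) {
            throw new IllegalArgumentException("word list and expected list differ in length");
        }
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            String actual = stemmer.apply(words.get(i));
            if (!actual.equals(expected.get(i))) {
                problems.add(words.get(i) + ": expected " + expected.get(i) + " but got " + actual);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Toy stand-in stemmer (assumption): strips a trailing "s".
        // A real run would plug in the generated Arabic stemmer instead.
        UnaryOperator<String> toyStemmer =
            w -> w.endsWith("s") ? w.substring(0, w.length() - 1) : w;

        List<String> words = Arrays.asList("cats", "dog");
        List<String> expected = Arrays.asList("cat", "dog");

        List<String> problems = mismatches(words, expected, toyStemmer);
        System.out.println(problems.isEmpty() ? "all stems match" : String.join("\n", problems));
    }
}
```

Passing the stemmer in as a function keeps the comparison logic independent of any particular stemmer class, so the same check could be reused for other languages in the dataset.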

       

      It also updates the ant patch-snowball target to be compatible with
      the Java classes generated by the latest Snowball version (tree:
      1964ce688cbeca505263c8f77e16ed923296ce7a). The ant patch-snowball target
      remains backward compatible with the version of the Snowball stemmers used in
      Lucene 7.x and skips classes that are already patched.
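The "ignores already patched classes" behaviour amounts to making the patch idempotent. The sketch below shows one minimal way to get that property; it does not reproduce the actual ant target or its regular expressions. `IdempotentPatch`, `patch`, and the source snippets are hypothetical and chosen only to make the double-application problem visible.

```java
public class IdempotentPatch {

    /**
     * Replaces oldSnippet with newSnippet, unless the source already
     * contains newSnippet, in which case it is returned unchanged.
     */
    public static String patch(String source, String oldSnippet, String newSnippet) {
        if (source.contains(newSnippet)) {
            return source; // already patched: leave the class alone
        }
        return source.replace(oldSnippet, newSnippet);
    }

    public static void main(String[] args) {
        // Hypothetical snippets: the replacement contains the original text,
        // so a naive second replace would yield "final final int x;".
        String oldSnippet = "int x;";
        String newSnippet = "final int x;";

        String once = patch("class S { int x; }", oldSnippet, newSnippet);
        String twice = patch(once, oldSnippet, newSnippet);

        System.out.println(once);
        System.out.println(once.equals(twice)); // second run changes nothing
    }
}
```

The guard matters precisely when the patched text still contains the unpatched text, which is why re-running a target without such a check can corrupt classes that were patched earlier.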

       

      Link to the corresponding Github PR:
      https://github.com/apache/lucene-solr/pull/449

       Edited: updated the corpus link, PR link and description

       

              People

              • Assignee: Unassigned
              • Reporter: ryadh Ryadh Dahimene
              • Votes: 0
              • Watchers: 2

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h