Details

    • Lucene Fields:
      Patch Available
    • Flags:
      Patch

      Description

      Currently, the Lucene Classification module classifies an input text, using a Lucene index as the trained model.

      This improvement adds a set of components to the module that provide Document classification (where the Document is a Lucene Document).
      All selected fields of the Document take part in the classification, each analyzed with the proper Analyzer for that field.
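
To make the idea above concrete, here is a minimal toy sketch in plain Java. It is not the code from the patch and uses no Lucene classes. The class name `FieldKnnSketch`, the field names `text` and `author`, the tokenizer functions standing in for per-field Analyzers, and the overlap-count similarity are all assumptions made for illustration. It only shows how several fields, each analyzed in its own way and optionally boosted, can contribute to a k-nearest-neighbour vote.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Toy illustration only: plain Java, no Lucene classes involved. */
class FieldKnnSketch {

    /** A labelled training document: field name -> raw text, plus its class. */
    static final class Example {
        final Map<String, String> fields;
        final String label;

        Example(Map<String, String> fields, String label) {
            this.fields = fields;
            this.label = label;
        }
    }

    // Hypothetical stand-in for per-field Analyzers: field name -> tokenizer.
    private final Map<String, Function<String, List<String>>> analyzers;
    // Per-field boost; a field without an entry counts with boost 1.0.
    private final Map<String, Double> boosts;
    private final List<Example> training;
    private final int k;

    FieldKnnSketch(Map<String, Function<String, List<String>>> analyzers,
                   Map<String, Double> boosts, List<Example> training, int k) {
        this.analyzers = analyzers;
        this.boosts = boosts;
        this.training = training;
        this.k = k;
    }

    /** Returns the majority label of the k most similar examples, or null if there are none. */
    String classify(Map<String, String> doc) {
        // Analyze each selected field of the input document once, with its own tokenizer.
        Map<String, Set<String>> seed = new HashMap<>();
        for (Map.Entry<String, Function<String, List<String>>> e : analyzers.entrySet()) {
            String text = doc.get(e.getKey());
            if (text != null) {
                seed.put(e.getKey(), new HashSet<>(e.getValue().apply(text)));
            }
        }
        // Score each training example: boost-weighted count of shared tokens per field.
        final double[] scores = new double[training.size()];
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < training.size(); i++) {
            order.add(i);
            for (Map.Entry<String, Set<String>> e : seed.entrySet()) {
                String other = training.get(i).fields.get(e.getKey());
                if (other == null) {
                    continue;
                }
                Set<String> shared = new HashSet<>(analyzers.get(e.getKey()).apply(other));
                shared.retainAll(e.getValue());
                scores[i] += boosts.getOrDefault(e.getKey(), 1.0) * shared.size();
            }
        }
        order.sort((a, b) -> Double.compare(scores[b], scores[a]));
        // Majority vote among the k best-scoring neighbours (ties go to the higher-ranked label).
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        for (int i : order.subList(0, Math.min(k, order.size()))) {
            String label = training.get(i).label;
            int count = votes.merge(label, 1, Integer::sum);
            if (best == null || count > votes.get(best)) {
                best = label;
            }
        }
        return best;
    }

    static Map<String, String> doc(String text, String author) {
        Map<String, String> fields = new HashMap<>();
        fields.put("text", text);
        fields.put("author", author);
        return fields;
    }

    /** Classifies one fixed input document with the given boost on the author field. */
    static String demo(double authorBoost) {
        Map<String, Function<String, List<String>>> analyzers = new HashMap<>();
        analyzers.put("text", s -> Arrays.asList(s.toLowerCase().split("\\s+"))); // word tokens
        analyzers.put("author", Collections::singletonList);                      // whole value as one token
        Map<String, Double> boosts = new HashMap<>();
        boosts.put("author", authorBoost);
        List<Example> training = Arrays.asList(
            new Example(doc("inverted index query scoring", "Ann Lee"), "search"),
            new Example(doc("boil water add salt", "Bob Ray"), "cooking"));
        return new FieldKnnSketch(analyzers, boosts, training, 1)
            .classify(doc("index query scoring", "Bob Ray"));
    }

    public static void main(String[] args) {
        System.out.println(demo(1.0)); // prints "search": three shared text tokens beat one author match
        System.out.println(demo(5.0)); // prints "cooking": the boosted author match now outweighs the text
    }
}
```

The two calls in `main` differ only in the boost on the hypothetical `author` field, which is enough to flip the nearest neighbour and therefore the assigned class.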

      1. LUCENE-6631.patch
        68 kB
        Alessandro Benedetti
      2. LUCENE-6631.patch
        72 kB
        Alessandro Benedetti

          Activity

          alessandro.benedetti Alessandro Benedetti added a comment -

          I will provide the patch in a couple of days.
          It's almost ready; just a few modifications are needed.

          The algorithms supported for Document Classification will be:

          • KNearestNeighborClassifier
          • SimpleNaiveBayesClassifier
          alessandro.benedetti Alessandro Benedetti added a comment -

          Introduced:

          • field boosting for knnClassifier (text)
          • a module for Document Classification, including
            - knnDocumentClassifier
            - SimpleNaiveBayesDocumentClassifier
          alessandro.benedetti Alessandro Benedetti added a comment -

          A first introduction to this extended module: http://alexbenedetti.blogspot.co.uk/2015/07/lucene-document-classification.html

          Hide
          alessandro.benedetti Alessandro Benedetti added a comment -

          JavaDocs fixed

          alessandro.benedetti Alessandro Benedetti added a comment -
          • improved seed document analysis so that it happens only once
          alessandro.benedetti Alessandro Benedetti added a comment -

          Tommaso Teofili, when you are back in business, please take a look at this patch.

          Cheers

          alessandro.benedetti Alessandro Benedetti added a comment -

          Any feedback on this, guys?

          Cheers

          teofili Tommaso Teofili added a comment -

          Alessandro, thanks for your patch, I'll take a look and let you know shortly.

          alessandro.benedetti Alessandro Benedetti added a comment -

          Any news on this part as well?

          Cheers

          alessandro.benedetti Alessandro Benedetti added a comment -

          Patch updated to current trunk

          alessandro.benedetti Alessandro Benedetti added a comment -

          I just updated the patch to comply with the current trunk.
          Any feedback is welcome.

          Cheers

          jira-bot ASF subversion and git services added a comment -

          Commit 1709522 from Tommaso Teofili in branch 'dev/trunk'
          [ https://svn.apache.org/r1709522 ]

          LUCENE-6631 - added document classification api and impls

          teofili Tommaso Teofili added a comment -

          I've committed the updated patch for document classification, thanks Alessandro Benedetti for your contribution.

          jira-bot ASF subversion and git services added a comment -

          Commit 1709529 from Tommaso Teofili in branch 'dev/trunk'
          [ https://svn.apache.org/r1709529 ]

          LUCENE-6631 - added missing javadoc for kNN classifier

          teofili Tommaso Teofili added a comment -

          One minor comment, Alessandro Benedetti: please make sure to run ant precommit or ant documentation-lint on your patches, so that the javadocs and the build in general are still fine once your changes are included (I had to add some javadoc to the kNN (document) classifier to keep the build from failing).

          jira-bot ASF subversion and git services added a comment -

          Commit 1709533 from Tommaso Teofili in branch 'dev/trunk'
          [ https://svn.apache.org/r1709533 ]

          LUCENE-6631 - added missing javadoc for kNN classifier

          jira-bot ASF subversion and git services added a comment -

          Commit 1709536 from Tommaso Teofili in branch 'dev/trunk'
          [ https://svn.apache.org/r1709536 ]

          LUCENE-6631 - added entry in CHANGES.txt

          jira-bot ASF subversion and git services added a comment -

          Commit 1709563 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1709563 ]

          LUCENE-6631: Added missing ASL header

          jira-bot ASF subversion and git services added a comment -

          Commit 1709575 from Tommaso Teofili in branch 'dev/trunk'
          [ https://svn.apache.org/r1709575 ]

          LUCENE-6631 - fixed svn eol-style

          alessandro.benedetti Alessandro Benedetti added a comment -

          Thanks, Tommaso Teofili. I thought I had fixed the javadocs problem months ago; I probably didn't check again when I updated the patch. Thanks for the observation: for future contributions I will make sure all the javadocs are fine before submitting the patch.

          Cheers


            People

            • Assignee:
              teofili Tommaso Teofili
              Reporter:
              alessandro.benedetti Alessandro Benedetti
            • Votes:
              0
              Watchers:
              4