SOLR-9221: Remove Solr contribs: map-reduce, morphlines-core and morphlines-cell

    Details

      Description

      The Solr contribs map-reduce, morphlines-cell and morphlines-core contain tests that are not being fixed: SOLR-6489 and SOLR-9220.

      (Some subset of?) these components live in the Kite SDK: http://kitesdk.org - why are they also hosted in Solr?

      1. SOLR-9221.patch
        2.94 MB
        Steve Rowe
      2. SOLR-9221.patch
        2.20 MB
        Steve Rowe
      3. SOLR-9221.patch
        3.02 MB
        Steve Rowe
      4. SOLR-9221-review.patch
        114 kB
        Steve Rowe

          Activity

          steve_rowe Steve Rowe added a comment - - edited

          Ridiculously large patch to remove these contribs.

          Precommit passes.

          thetaphi Uwe Schindler added a comment -

          +1 to remove it. I think Noble Paul already said in another issue that KiteSDK includes the whole Solr support, so we don't need the contribs. It just brings problems:

          • We cannot change some Solr APIs (removing deprecations), because the code in morphlines uses those Solr APIs (circular dependency).
          • We cannot update TIKA, because it depends on a really antique version of it.
          • There seems to be no maintainer of this contrib anymore. There are tons of issues open, and none of them are fixed. I have no idea how to fix them, as the whole thing (Kite Morphlines) is a black box.
          markrmiller@gmail.com Mark Miller added a comment -

          I'm still -1 on removing. People email me and use it. If you have an issue you want resolved and it's not, ping me, but I remain a veto on removal.

          markrmiller@gmail.com Mark Miller added a comment -

          I will remove my veto if we decide not to put the effort in to fix the issues, but it remains until that's determined.

          noble.paul Noble Paul added a comment -

          I'm not sure morphlines stuff should be a part of Solr. It should be a part of morphlines and not Solr. There are so many tools/frameworks that provide Solr integration. If we put that code in Solr contribs it is too painful for us to manage. Most of the Solr devs have no idea about the correctness of that code because we don't know those frameworks.

          +1 to remove these

          thetaphi Uwe Schindler added a comment -

          Hi,
          I see the following possibilities:

          1. Include the whole source code of morphlines into Solr (not just a part of it). I don't like that option, because we lack the expertise. I am also sure that Cloudera would not be happy to donate all the code.
          2. Fix morphlines to not depend on Solr and TIKA directly. We have no control over this, but I am sure Mark might be able to open issues about that on kitesdk. If we then needed to add the TIKA adaptors into Solr, I would be fine with that. But the current dependency hell is a no-go.
          3. Remove the contrib and ship the (full) Solr support with Kitesdk. I'd prefer this. The three contribs are just some client for solr, so why does it need to be inside Solr's repository?

          I still think that #3 is the only way to go.

          dsmiley David Smiley added a comment -

          3. Remove the contrib and ship the (full) Solr support with Kitesdk. I'd prefer this. The three contribs are just some client for solr, so why does it need to be inside Solr's repository?

          +1. There are plenty of systems out there that integrate with Solr... why should Morphlines in particular have this here?

          markrmiller@gmail.com Mark Miller added a comment -

          Sorry, a tool to build indexes on hdfs is a perfect Solr contrib. Won't remove my veto for flimsy arguments. I will if we are not going to put effort into the current issues.

          noble.paul Noble Paul added a comment -

          a tool to build indexes on hdfs is a perfect Solr contrib.

          As long as at least one committer takes ownership of maintaining a component, we can keep it around. It is a problem when it is not actively maintained.

          markrmiller@gmail.com Mark Miller added a comment -

          If you go look at all the JIRAs I've worked on, I don't think you can say it's not actively maintained. Not every issue may get addressed, or addressed on your timeline, but that is perfectly fine and it's BS that it's not maintained.

          I have worked a bit on a couple of the latest issues.

          We have discussed what we want to do with the contrib and it seems most likely we will either pull the tika integration contrib (morphlines-cell) or pull both morphlines contribs and have a more generic plugin point.

          markrmiller@gmail.com Mark Miller added a comment -

          Looks like no progress on this, so I am removing my veto to this issue.

          steve_rowe Steve Rowe added a comment -

          Here's an up-to-date removal patch. I also made a (much shorter but non-git apply-able) review patch with git diff -D: SOLR-9221-review.patch.

          I've run a few checks with the patch so far: ant resolve, ant idea, ant get-maven-poms, ant check-lib-versions, and ant check-licenses. I'll run ant nightly-smoke shortly - it runs all the validation stuff and unit tests.

          If there are no objections (and assuming no problems come up with further testing), I'll commit 2 days from now.

          steve_rowe Steve Rowe added a comment -

          git apply-able patch (the previous patch couldn't remove binary files; this one was generated with git diff --binary and applies for me.)
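The difference between the two patch styles mentioned above can be tried out on a throwaway repository. What follows is only a minimal sketch, not the commands used for this issue: the directory `contrib/demo`, the file names, and the `demo` committer identity are all hypothetical. It relies on two documented `git diff` options: `-D` (`--irreversible-delete`), which prints only the header for deleted files, and `--binary`, which encodes binary content so that `git apply` can use the patch.

```shell
#!/bin/sh
set -eu

# Scratch repository and output directory (hypothetical names, illustration only).
repo=$(mktemp -d)
out=$(mktemp -d)
cd "$repo"
git init -q .

mkdir -p contrib/demo
printf 'line one\nline two\n' > contrib/demo/notes.txt
# NUL bytes make git classify this file as binary.
printf '\000\001\002\003' > contrib/demo/blob.bin
git add contrib
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m 'add demo contrib'

# Stage the removal of the whole directory.
git rm -q -r contrib/demo

# Review patch: -D (--irreversible-delete) prints only headers for deleted files.
git diff --cached -D > "$out/review.patch"
# Full patch: --binary also encodes binary content so that git apply can use it.
git diff --cached --binary > "$out/full.patch"

# Restore the files, then check that the full patch would apply cleanly.
git reset -q --hard HEAD
if git apply --check "$out/full.patch"; then full_ok=yes; else full_ok=no; fi

review_size=$(($(wc -c < "$out/review.patch")))
full_size=$(($(wc -c < "$out/full.patch")))
echo "full patch applies: $full_ok"
echo "review patch bytes: $review_size, full patch bytes: $full_size"
```

The review patch is the one to read, because it leaves out the deleted file contents; the `--binary` patch is the one to apply.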

          steve_rowe Steve Rowe added a comment -

          No objections => committing removal patch shortly.

          jira-bot ASF subversion and git services added a comment -

          Commit 53e5f34f66d264c8f0ea2861e77389902b2a36c4 in lucene-solr's branch refs/heads/master from Steve Rowe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=53e5f34 ]

          SOLR-9221: Remove Solr contribs: map-reduce, morphlines-core and morphlines-cell

          steve_rowe Steve Rowe added a comment -

          The git bot failed to post the branch_6x commit. Here it is from the commit notification email:

          SOLR-9221: Remove Solr contribs: map-reduce, morphlines-core and morphlines-cell
          
          Conflicts:
          	solr/contrib/map-reduce/src/java/org/apache/solr/hadoop/TreeMergeOutputFormat.java
          	solr/contrib/map-reduce/src/test/org/apache/solr/hadoop/MorphlineMapperTest.java
          	solr/contrib/map-reduce/src/test/org/apache/solr/hadoop/MorphlineReducerTest.java
          	solr/contrib/morphlines-core/src/java/org/apache/solr/morphlines/solr/LoadSolrBuilder.java
          	solr/contrib/morphlines-core/src/test/org/apache/solr/morphlines/solr/AbstractSolrMorphlineTestBase.java
          	solr/contrib/morphlines-core/src/test/org/apache/solr/morphlines/solr/AbstractSolrMorphlineZkTestBase.java
          
          
          Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
          Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/ac221b96
          Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/ac221b96
          Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/ac221b96
          
          Branch: refs/heads/branch_6x
          Commit: ac221b96b6d16569ca17e37cbebe717f7b86484c
          Parents: 2adbd76
          Author: Steve Rowe <sarowe@apache.org>
          Authored: Fri Mar 24 12:31:16 2017 -0400
          Committer: Steve Rowe <sarowe@apache.org>
          Committed: Fri Mar 24 12:35:46 2017 -0400
          

            People

            • Assignee: Steve Rowe
            • Reporter: Steve Rowe
            • Votes: 2
            • Watchers: 8