Details

    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved
    • ManifoldCF 1.5
    • ManifoldCF next
    • None

    Description

      Support for writing documents crawled by ManifoldCF to RabbitMQ. This will decouple ManifoldCF from any search engine and provide a natural integration point for a content processing framework. This output connector has been written with Logstash in mind.

      Attachments

        1. CONNECTORS-818.patch
          17 kB
          Christian R
        2. CONNECTORS-818.patch
          47 kB
          Christian R

        Activity

          kwright@metacarta.com Karl Wright added a comment -

          r1544167 creates branches/CONNECTORS-818 to work on this issue.

          kwright@metacarta.com Karl Wright added a comment -

          Hi Christian,

          Please feel free to submit patches against the current version of the CONNECTORS-818 branch at any time, and I will commit them.

          Thanks!

          kwright@metacarta.com Karl Wright added a comment -

          Also, FWIW, a connector like this does not invalidate at all earlier discussions we've had about including a pipeline in ManifoldCF itself. But this may be a good way to assess how important such efforts will be to people in the long run.

          christianmr Christian R added a comment -

          Is there a quick tutorial on submitting patches somewhere?

          christianmr Christian R added a comment -

          Also, FWIW, this approach lets us use other non-ManifoldCF connectors and point them at the same RabbitMQ instance, so Logstash is the only place where we do content processing.

          kwright@metacarta.com Karl Wright added a comment - - edited

          There is stuff in the wiki, but basically all you do is the following: at the root level of your svn checkout, do this:

          svn diff >CONNECTORS-818.patch

          ... and then attach that file to the ticket under the "More" pulldown in Jira.

          Obviously, you'd first make all the changes you want to include in the patch in that work area, including adding files and directories (svn add, svn mkdir), etc.

          christianmr Christian R added a comment -

          Initial commit; hopefully I included all the necessary files.
          Still missing: better configuration support, deletes, error handling, tests, and general improvements.

          kwright@metacarta.com Karl Wright added a comment - - edited

          Patch committed. I won't have much of a chance to review it though until the weekend.

          gseaton Graeme Seaton added a comment -

          Hi Karl/Christian,

          I have developed a custom output connector which connects to a repository AND adds messages to Rabbit. Is there an issue related to creating a pipeline in Manifold?

          kwright@metacarta.com Karl Wright added a comment -

          Hi Graeme,

          See CONNECTORS-475. That's for Hydra, which is yet another pipeline framework. But there is no ticket for an internal MCF pipeline at this time.

          If you have a Rabbit-based internal MCF implementation, or something similar, please create a ticket and attach code. I have some ideas about how an internal pipeline should operate, but it's always good to have a practical implementation available.

          christianmr Christian R added a comment -

          Removed a lot of unused code left over from earlier copy-paste.
          Added some TODOs.
          Added the operation (delete/add) to the document sent to RabbitMQ.
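
          As an illustration only, a message carrying such an operation marker might look like the sketch below. Nothing here is taken from the attached patches: the class name IndexMessage, the JSON field names ("operation", "uri", "fields") and the choice of JSON itself are assumptions, and the RabbitMQ publish call is left out so the example needs no external library.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical message envelope; none of these names come from the patch.
public class IndexMessage {
    enum Operation { ADD, DELETE }

    // Escape a string for use inside a JSON string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                // Control characters become six-character hex escapes.
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Build the message body; "fields" is only written for ADD.
    static String toJson(Operation op, String uri, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{\"operation\":\"")
            .append(op.name().toLowerCase(Locale.ROOT))
            .append("\",\"uri\":\"").append(escape(uri)).append('"');
        if (op == Operation.ADD && fields != null) {
            sb.append(",\"fields\":{");
            boolean first = true;
            for (Map.Entry<String, String> e : fields.entrySet()) {
                if (!first) sb.append(',');
                first = false;
                sb.append('"').append(escape(e.getKey())).append("\":\"")
                  .append(escape(e.getValue())).append('"');
            }
            sb.append('}');
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "Annual report");
        System.out.println(toJson(Operation.ADD, "file:/docs/a.txt", fields));
        System.out.println(toJson(Operation.DELETE, "file:/docs/a.txt", null));
    }
}
```

          With an envelope like this, a downstream consumer such as Logstash could branch on the operation value to decide whether to index or remove the document identified by the URI.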

          kwright@metacarta.com Karl Wright added a comment -

          Second patch committed.

          gseaton Graeme Seaton added a comment -

          Hi Karl,

          Would be far too embarrassed to publish the code - basically involves hard-coded steps for various operations. Hydra looks interesting (the main trick would be removing the MongoDB dependency).

          Our current architecture involves reading the repository file, parsing it, and then writing the results to a permanent store (can't be more specific than that). We then carry out additional manipulation in separate agents which are triggered by messages in Rabbit (which we can scale à la Hydra). For me, one of the dependencies for a pipeline in MCF (from a scalability perspective) is CONNECTORS-780.

          More than happy to discuss further (but outside of JIRA).

          Regards,

          Graeme

          kwright@metacarta.com Karl Wright added a comment -

          Hi Graeme,

          As you may have noted, CONNECTORS-781 is actually complete (aside from connector-specific throttling, which I've put in a separate ticket for resolution). At the moment, that's exhausted my time available for major architectural changes, but I do hope you guys give the new code the workout it needs. Please keep me informed.

          kwright@metacarta.com Karl Wright added a comment -

          I am going to rebase this branch, since I believe it is now schema-incompatible with trunk. Rebasing will bring it up to date with the schema that will ship with MCF 1.5.

          christianmr Christian R added a comment -

          That's fine. I was sidetracked by another customer, so this project fell to the side. I fully plan to get it done, even though progress is a bit slow right now.

          gseaton Graeme Seaton added a comment -

          Hi Karl,

          Really chuffed with what has been achieved and planning on testing today - will let you know how I get on (will be using an existing Zookeeper cluster though).

          Regards,

          Graeme

          kwright@metacarta.com Karl Wright added a comment -

          Ok, CONNECTORS-818 branch rebased - 1.5 schema compatible.

          kwright@metacarta.com Karl Wright added a comment -

          This ticket has been awfully quiet. Any news?

          christianmr Christian R added a comment -

          The intended project never materialized and I have been assigned to something else. Since an old ticket is a nuisance to everybody, I will find the time to get this done. By when do you need to close this ticket if it is to be part of the 1.5.1 release (or 1.6)?

          kwright@metacarta.com Karl Wright added a comment -

          You have 2 months if you want it to be part of 1.6. But as long as the ticket is active, it won't be closed; I'll move it forward from release to release.

          kwright@metacarta.com Karl Wright added a comment -

          Rebased the CONNECTORS-818 branch to make it compatible with 1.6 schema and build structure.

          kwright@metacarta.com Karl Wright added a comment -

          Rebased the CONNECTORS-818 branch to make it compatible with 1.7 schema.

          kwright@metacarta.com Karl Wright added a comment -

          Rebased again for multiple outputs.

          nederhrj Rene Nederhand added a comment -

          I am looking into adding RabbitMQ as an output connector. What's the current status on this issue? Are there any particular things that need to be done before inclusion in the next release of ManifoldCF?

          kwright@metacarta.com Karl Wright added a comment -

          This connector was never implemented; the contributor essentially created a directory hierarchy with a stub connector and got no further than that.

          I don't have experience with RabbitMQ, but if you know enough about it to understand how to use whatever APIs are available, I can certainly help with the details of how to write a ManifoldCF connector.

          kwright@metacarta.com Karl Wright added a comment -

          Postponing due to lack of interest


          People

            kwright@metacarta.com Karl Wright
            christianmr Christian R
            Votes: 0
            Watchers: 4