Nutch / NUTCH-978

A plugin for extracting specific elements of a web page during HTML parsing.

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.2
    • Fix Version/s: 2.4
    • Component/s: parser
    • Labels:
    • Environment: Ubuntu Linux 10.10; JDK 1.6; NetBeans 6.9

      Description

      Nutch uses the parse-html plugin to parse web pages. It processes page content by removing HTML tags and components such as JavaScript and CSS, leaving the extracted text to be stored in the index. By default, Nutch has no capability to select individual elements of an HTML page, such as specific tags, specific content, or particular sections of the page.

      An HTML page has a tree-like XML structure, with HTML tags as its branches and text as its leaf nodes. These branches and nodes can be extracted using XPath, which allows us to select a particular branch or node of an XML document and therefore to extract specific information and treat it differently based on its content and the user's requirements. Furthermore, pages within a single web domain, such as a news website, usually share the same HTML structure. Pages with the same structure can be parsed with the same XPath query to retrieve the same content elements. All of the XPath queries for selecting the various content elements can be stored in an XPath configuration file.
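      As a hedged illustration of the idea (not code from the attached plugin), extracting a node from a well-formed page with the JDK's built-in XPath support might look like the following. The class name and the sample markup are hypothetical:

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class XPathSketch {

    /** Parses a well-formed (X)HTML string and evaluates an XPath expression against it. */
    public static String extract(String xml, String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expr, doc);
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><h1>Headline</h1><p>Body text</p></body></html>";
        // Select only the headline text, ignoring the rest of the page.
        System.out.println(extract(page, "/html/body/h1/text()")); // prints "Headline"
    }
}
```

      Real crawled HTML is rarely well-formed XML, so in practice a tidying step (e.g. a lenient HTML parser producing a DOM) would be needed before XPath evaluation.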

      Nutch is intended for diverse web sources, and not every page retrieved from those sources shares the same HTML structure, so each must be treated differently using the correct XPath configuration. The correct configuration can be selected automatically by matching the page's URL against the valid URL regex pattern for each XPath configuration.
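      A minimal sketch of that URL-based selection, with hypothetical pattern-to-file mappings (the real plugin's configuration format may differ):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class ConfigSelector {

    // Hypothetical mapping: URL regex pattern -> XPath configuration file name.
    private static final Map<Pattern, String> CONFIGS = new LinkedHashMap<Pattern, String>();
    static {
        CONFIGS.put(Pattern.compile("^https?://www\\.guardian\\.co\\.uk/.*"), "guardian-xpath.xml");
        CONFIGS.put(Pattern.compile("^https?://news\\.example\\.com/.*"), "example-news-xpath.xml");
    }

    /** Returns the first matching configuration file name, or null to fall back to parse-html. */
    public static String select(String url) {
        for (Map.Entry<Pattern, String> e : CONFIGS.entrySet()) {
            if (e.getKey().matcher(url).matches()) {
                return e.getValue();
            }
        }
        return null;
    }
}
```

      With this shape, `select(...)` returning null is the signal to fall through to Nutch's default parsing mechanism.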

      This automatic mechanism allows a Nutch user to process diverse web pages and extract only the information the user wants, making the index more accurate and its content more flexible.

      A component implementing this idea has been tested on Nutch 1.2, selecting specific elements from various news websites for the purpose of document clustering. It includes a configuration editor application built with the NetBeans 6.9 Application Framework, though it still needs some debugging.

      http://dl.dropbox.com/u/2642087/For_GSoC/for_GSoc.zip

      1. [Nutch-GSoC-2011-Proposal]Web_Page_Scrapper_Parser_Plugin.pdf
        51 kB
        Ammar Shadiq
      2. app_guardian_ivory_coast_news_exmpl.png
        199 kB
        Ammar Shadiq
      3. app_screenshoot_configuration_result_anchor.png
        323 kB
        Ammar Shadiq
      4. app_screenshoot_configuration_result.png
        200 kB
        Ammar Shadiq
      5. app_screenshoot_source_view.png
        157 kB
        Ammar Shadiq
      6. app_screenshoot_url_regex_filter.png
        205 kB
        Ammar Shadiq
      7. for_GSoc.zip
        2.22 MB
        Lewis John McGibbney
      8. version_alpha2.zip
        7.80 MB
        Ammar Shadiq

        Activity

        Ammar Shadiq added a comment - edited

        Proposal for Google Summer of Code 2011
        http://www.google-melange.com/gsoc/homepage/google/gsoc2011
        Markus Jelsma added a comment -

        If a mentor has been identified then please assign the issue to that mentor.

        http://community.apache.org/guide-to-being-a-mentor.html
        Ammar Shadiq added a comment - edited

        Wow, thank you very much Mr. Jelsma
        Ammar Shadiq added a comment -

        Proposal updated
        Thomas Koch added a comment -

        If it is about main text extraction then there's a plugin in Tika for this (boilerpipe), and there's the readability bookmarklet, which has an alternative algorithm to determine the main text.
        Ammar Shadiq added a comment -

        Hi Thomas, thank you for the question and for the pointers to boilerpipe and the readability bookmarklet.

        I think this is different.

        It's not just about main text extraction, but also about specific information: the values of certain <meta> tags, the illustration image link of a news article (a news article usually has only one picture), or specific links (in the form of anchors) for the next crawling iteration that you want to fetch and process. I think this kind of specific configuration would be useful.

        It also doesn't use any training. From what I read in the boilerpipe paper, boilerpipe uses training data for its algorithm and focuses mainly on word counts to determine the main text. I wonder what happens if the input pages are Japanese or Chinese; perhaps they developed a custom tokenizer for that, but I haven't explored the component thoroughly, so I'm not sure. I use the component I'm working on to parse pages in Bahasa Indonesia.
        Julien Nioche added a comment -

        Can you please explain how your proposal differs from the HTMLParseFilter mechanism that Nutch already has?
        Ammar Shadiq added a comment - edited

        Please correct me if I'm wrong.
        In my limited understanding, Nutch uses a plugin system, and some of those plugins are for parsing HTML pages (the HTMLParseFilter class); the appropriate plugin is later selected based on the configuration and run.

        Inside parse-html, the main things it extracts are: Content, Title, and Outlinks.

        The problem I'm trying to solve is adding custom fields, as in http://sujitpal.blogspot.com/2009/07/nutch-custom-plugin-to-parse-and-add.html, for various types of content on various sites, and adding them to the index fields. Instead of creating a new plugin for each site, a Nutch user could simply create an XPath configuration file and put it in the configuration folder, and the parsing of custom fields would happen automatically without writing or compiling any code.

        In addition, the user could also override Content, Title, and Outlinks with a different result. For example, to set the title of a page from a news site (example: http://www.guardian.co.uk/world/2011/apr/08/ivory-coast-horror-recounted):

        instead of the value of <head><title>:
        =Ivory Coast horror recounted by victims and perpetrators | World news | The Guardian

        get only the title, using the XPath /html/body/div[@id='wrapper']/div[@id='box']/div[@id='article-header']/div[@id='main-article-info']/h1/text(), and get:
        =Ivory Coast horror recounted by victims and perpetrators

        or follow only the outlinks of related news, ignoring the rest:

        -Ouattara calls for Ivory Coast sanctions to be lifted
        -Ivory Coast crisis: Q&A
        -After Gbagbo, what next for Ivory Coast?
        -Ivory Coast: The final battle

        as in the screenshot here: https://issues.apache.org/jira/secure/attachment/12475860/app_guardian_ivory_coast_news_exmpl.png

        Since the default parser is parse-html, I added the handler there as a kind of if-else bypass: if the parsed page's URL matches one of the configurations, the page is parsed by it; if no configuration matches the URL, the default parser mechanism is used.

        I'm sorry for my English and if I'm not presenting my idea well enough.
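        To make the idea concrete, a hypothetical XPath configuration file for the Guardian example above might look like the following. The element and attribute names here are illustrative only, not the attached plugin's actual format:

```xml
<!-- Hypothetical xpath-configuration, selected when a page URL matches urlPattern -->
<xpath-configuration urlPattern="^https?://www\.guardian\.co\.uk/.*">
  <!-- Override the page title with the article headline only -->
  <field name="title"
         xpath="/html/body/div[@id='wrapper']/div[@id='box']/div[@id='article-header']/div[@id='main-article-info']/h1/text()"/>
  <!-- Follow only related-news outlinks; the container id is a guess for illustration -->
  <field name="outlinks"
         xpath="//div[@id='related']//a/@href"/>
</xpath-configuration>
```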
        Lewis John McGibbney added a comment -

        If there has been a plugin written for this, would it be possible to get the code added to the wiki? As we have both parse-tika and parse-html for text and outlink extraction from HTML, I don't think that this plugin serves much purpose for the average user of Nutch. It really only adds value for users looking for a solution to the specific problem addressed... this is rare.

        It would be disappointing if we were not able to harness and share the code from this small project with other members of the Nutch community via the wiki.
        Lewis John McGibbney added a comment -

        Hi Chris, did you mentor this project through GSoC? I've downloaded the .zip available in the description (which I've also attached in case the link goes AWOL) and I'm going to play about with it. I'll attach it as a patch if I get anywhere.
        Lewis John McGibbney added a comment -

        In its present form this is quite literally all over the place and is merely for safekeeping.
        Chris A. Mattmann added a comment -

        Hey Lewis,

        I didn't end up mentoring this project because the proposal came too late and the GSoC Apache folks mentioned the program was already over by that time.

        +1 to continuing work on it though!

        Cheers,
        Chris
        Ammar Shadiq added a comment -

        Hi Lewis,

        Since the proposal was not accepted, I used my summer to work on my undergrad thesis. I graduated from college recently and my time has freed up, so I'd love to help, and it would be awesome if we could collaborate.

        thanks,
        Ammar
        Lewis John McGibbney added a comment -

        I think it's best if we talk off-list for the time being; please get in touch with me at lewismc@apache.org and we can take this forward. GSoC expressions of interest need to be made by the end of the month, and this would be great as a project for Nutch.
        Ammar Shadiq added a comment -

        I'll send you an email.
        Chris A. Mattmann added a comment -

        Guys, I think it's fine to keep the conversation on list; in fact, I'd favor it unless there is a specific reason to take it elsewhere?
        Lewis John McGibbney added a comment -

        No bother Chris. So far the questions that have been asked are:
        1. Provide a quick rundown of the issue, summarizing all of the above.
        2. What were the motivations, purpose and technical challenges encountered whilst working on it?
        3. Why did the issue drop away, and what do you think is required to get it back on track and possibly into the codebase?
        Lewis John McGibbney added a comment -

        Replies:

        1 & 2. The main motivation for this issue was processing the news documents required for my undergrad thesis on clustering Bahasa Indonesia news text; it needed preprocessing to extract the title, news content, date, and related news links separately.

        2. The biggest technical challenge for me was processing the web page so it could be parsed as an XML document and queried with XPath.

        3. The issue dropped away because, with a small tweak, I could get it working for "only" my thesis requirements. I haven't tested it with web pages other than the ones I used for my thesis, so I think it's not nearly finished yet. And since the proposal was not accepted as a GSoC project, I lost the motivation to continue working on this issue and decided to work on my thesis instead.

        related issue: https://issues.apache.org/jira/browse/NUTCH-185
        Lewis John McGibbney added a comment -

        Generally speaking the plugin sounds really useful. The only problem I see is that it is very specific, and for it to be integrated into the codebase we usually need to make it specific enough to address some given task fully, in a well-defined and well-justified manner, but also general enough to be used in many different contexts. This increases usability and user feedback as well as engagement.

        4. With regards to the biggest technical challenge being the processing of web pages, how far did you get with this? Were you able to process them with enough precision to satisfy your requirements?

        5. How were you querying it with XPath? You cannot query with XPath, but instead with XQuery. Do you maybe mean that this enabled you to navigate the document and address various parts of it with XPath?

        6. OK, I understand why it has crumbled slightly, but I think if the code is there it would be a huge waste if we didn't try to revive it, possibly getting it integrated into the codebase, or maybe added as a contrib component without shipping it within the core codebase if the former is not a viable option.

        I've had a look at NUTCH-185, but I think we can discard it as it was addressed a very long time ago and is already integrated into the codebase. I was referring more to Jira issues which are currently open, which we could maybe merge or combine to give this a more general and possibly better-justified argument for inclusion in the codebase... what do you think? Does NUTCH-585 fit this?
        Ammar Shadiq added a comment -

        >> 4. With regards to the biggest technical challenge being the processing of web pages, how far did you get with this? Were you able to process them with enough precision to satisfy your requirements?

        I got it to work for my text clustering algorithm; application screenshots are provided here: http://www.facebook.com/media/set/?set=a.2075564646205.124550.1157621543&type=3&l=7313965254. Yes, it's quite satisfactory.

        >> 5. How were you querying it with XPath? You cannot query with XPath, but instead with XQuery. Do you maybe mean that this enabled you to navigate the document and address various parts of it with XPath?

        In my understanding there are three ways to query an XML document: XPath, XQuery, and XSLT; I'm sorry if I got that wrong. For navigating the various parts of the page I use a Java HTML parse listener extending HTMLEditorKit.ParserCallback and display the result in the editor application (something like Chromium's Inspect Element); this makes the web page structure visible and thus makes the XPath expressions easier to write.

        >> 6. OK, I understand why it has crumbled slightly, but I think if the code is there it would be a huge waste if we didn't try to revive it, possibly getting it integrated into the codebase, or maybe added as a contrib component.

        I totally agree.

        As for NUTCH-585, I think it's different in that NUTCH-585 tries to block certain parts. This idea instead retrieves only certain parts and, in addition, stores them in specific Lucene fields (I haven't looked into the Solr implementation yet), thus automatically discarding the rest.
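        A hedged sketch of the ParserCallback approach mentioned above, printing an indented tag outline with the JDK's Swing HTML parser; the class name and output format are made up for illustration:

```java
import java.io.StringReader;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class TagWalker extends HTMLEditorKit.ParserCallback {

    private final StringBuilder out = new StringBuilder();
    private int depth = 0;

    // Called by the parser for each opening tag; record it with indentation.
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        for (int i = 0; i < depth; i++) out.append("  ");
        out.append(t).append('\n');
        depth++;
    }

    public void handleEndTag(HTML.Tag t, int pos) {
        depth--;
    }

    /** Parses (possibly messy) HTML leniently and returns an indented tag outline. */
    public static String outline(String html) throws Exception {
        TagWalker walker = new TagWalker();
        new ParserDelegator().parse(new StringReader(html), walker, true);
        return walker.out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(outline("<html><body><h1>Hi</h1></body></html>"));
    }
}
```

        Unlike the strict XML route, the Swing parser tolerates real-world HTML (it may also insert implied tags), which is what makes it usable for visualizing page structure in an editor.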
        Ammar Shadiq added a comment -

        Uploaded the latest version; it works on 1.2.
        Lewis John McGibbney added a comment -

        Great Ammar. Are you wanting to add this as a GSoC 2012 project? I am already mentoring one project, and time/work restrictions mean that I can't step up to take on another mentoring role. If you don't wish to make this a project this year, at least the code is on here for people to pick up in the future.
        Ammar Shadiq added a comment -

        I don't think I can participate in this year's GSoC, Lewis; I'm no longer a student. I put it here so it can be freely used and developed further by anyone.

        cheers,
        Ammar
        Lewis John McGibbney added a comment -

        This is as I thought. Look, I've marked it for this year's GSoC; students can apply up until April 6th IIRC, so if there is any interest then we can progress with it. Thanks Ammar
        Ammar Shadiq added a comment -

        Sweet, thanks Lewis.
        Lewis John McGibbney added a comment -

        Set and Classify
        Emmanuel Colin added a comment -

        Hi everyone,

        I was interested in such a plugin and found a working implementation, described here: http://www.atlantbh.com/precise-data-extraction-with-apache-nutch/

        Maybe it would be a good candidate for integration into the main distribution? (I think addressing this problem would be pretty useful: as soon as one needs to crawl a specific subset - say, an intranet - one wants to index specific information for better search.)
        Lewis John McGibbney added a comment -

        Hi Emmanuel, do you wish to address this issue? From memory, the dilemma we are having with this issue is making it configurable enough for general use. Do you have some suggestion(s)?
        Emmanuel Colin added a comment -

        Hi Lewis, thanks for your reply.

        In fact I was suggesting that - in my opinion - the issue has been (very well) addressed by the filter-xpath plugin described in the blog entry I pointed to. Its configuration approach, leveraging the power of XPath queries, is pretty flexible (and well described by comments in the config file). I also tested the plugin and it works fine.

        What I would add for some more flexibility in the 'field' configuration markup is: a 'locale' attribute to enable parsing of non-English date formats; a 'fixedValue' attribute enabling fixed values to be indexed into Solr for better categorization when indexing specific datasets; and 'regexPattern' and 'regexReplace' attributes to enable regex-based postprocessing of the extracted data (for example to remove an unwanted prefix before the interesting data).

        With this, the plugin would be pretty powerful and useful to anyone wanting to extract information from semistructured websites without writing their own plugin.
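        For instance, the suggested 'regexPattern'/'regexReplace' pair could be applied to an extracted field value with java.util.regex; the attribute names come from the comment above, and the helper class below is hypothetical:

```java
import java.util.regex.Pattern;

public class FieldPostprocessor {

    /**
     * Applies a hypothetical regexPattern/regexReplace pair to an extracted
     * field value, e.g. to strip an unwanted prefix before the useful data.
     */
    public static String postprocess(String value, String regexPattern, String regexReplace) {
        return Pattern.compile(regexPattern).matcher(value).replaceAll(regexReplace);
    }

    public static void main(String[] args) {
        // Strip a "Published: " prefix before a date of interest.
        System.out.println(postprocess("Published: 8 April 2011", "^Published:\\s*", ""));
        // prints "8 April 2011"
    }
}
```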
        Lewis John McGibbney added a comment -

        It sounds like this is nearly ready for review. I would suggest you incorporate your suggestions into a fresh patch against the head version you are working on. Is it the 2.x branch or 1.x trunk?
        As you said, it always seems that a plugin of this nature is sought after, therefore a reasonably documented and easily configurable implementation would be very welcome.
        Emmanuel Colin added a comment - edited

        It looks like there is a small misunderstanding; I must not have expressed myself very clearly: I am not the author of this plugin.
        The author of the code is the writer of the blog post I pointed to, so I would not presume to submit his code.
        What I can do is leave a comment on his blog suggesting that he submit his code, since from our exchange it looks like such a submission would be welcome. (EDIT: done)
        Lewis John McGibbney added a comment -

        +1 Emmanuel

          People

          • Assignee: Chris A. Mattmann
          • Reporter: Ammar Shadiq
          • Votes: 3
          • Watchers: 3

          Dates

          • Created:
          • Updated:

          Time Tracking

          • Original Estimate: 1,680h
          • Remaining Estimate: 1,680h
          • Time Spent: Not Specified