Nutch / NUTCH-841

Create a Wicket-based Web Application for Nutch

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: nutchgora
    • Fix Version/s: 2.4
    • Component/s: web gui
    • Labels:
    • Environment:

      Should work in both Nutch trunk and 2.0 branches.

      Description

      In light of the conversation on NUTCH-837, we are removing the old Nutch webapp and will replace it with a 2.0 one that works with GORA + Solr.

      Apache Nutch versions prior to 1.3 shipped with a web application that allowed basic search and browsing of the information captured in the Nutch index. As of 1.3, we deprecated and removed the webapp, mainly because the segment API changed (we moved to Solr) and because we didn't want to maintain a webapp whose JSPs were a pain.

      I am going to propose having a Nutch web application using Apache Wicket http://wicket.apache.org/. This would be very cool and since I know Wicket, I'm willing to help maintain it.

      The webapp should implement all of the old web pages and functionality, support the basic views, and connect to Solr instead of Lucene. It should also work on both the trunk branch and the 2.0 branch (Gora based).
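A minimal sketch of what "connect to Solr instead of Lucene" could look like from the webapp side, assuming the search page simply builds a request against Solr's standard select handler. The class name, method name, host and core name (`nutch`) are hypothetical and do not come from this issue; `q`, `start`, `rows` and `wt` are ordinary Solr query parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the URL a search page would request from Solr.
final class SolrSearchUrl {

    private SolrSearchUrl() {
    }

    // coreUrl   base URL of the Solr core (assumed, e.g. http://localhost:8983/solr/nutch)
    // userQuery raw text typed by the user
    // page      zero-based result page
    // pageSize  number of hits per page
    static String build(String coreUrl, String userQuery, int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        // Drop a single trailing slash so the handler path is not doubled.
        String base = coreUrl.endsWith("/")
                ? coreUrl.substring(0, coreUrl.length() - 1)
                : coreUrl;
        String encodedQuery;
        try {
            // Form-style encoding: spaces become '+', reserved characters are escaped.
            encodedQuery = URLEncoder.encode(userQuery, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
        int start = page * pageSize; // offset of the first hit on this page
        return base + "/select?q=" + encodedQuery
                + "&start=" + start
                + "&rows=" + pageSize
                + "&wt=json";
    }
}
```

A Wicket page (or any other front end) would fetch this URL and render the returned hits; a production webapp would more likely use a Solr client library than hand-built URLs.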

      I'm putting this out there as a GSoC project for 2013.

          Activity

          Lewis John McGibbney added a comment -

          Hey Fjodor Vershinin, this is excellent news. Chris A. Mattmann and I are interested in kicking this project off, so we need to decide between ourselves who takes the mentoring position. We will do this and get back to you.

          ...where I can start to explore Nutch

          Please sign up to the user@nutch and dev@nutch mailing lists. This will keep you up-to-date with everything that is going on.
          Finally, please see the official tutorial [0] for running Nutch and learning about its design and caveats.

          [0] https://wiki.apache.org/nutch/NutchTutorial

          Fjodor Vershinin added a comment -

          Hello all!
          My name is Fedor Vershinin, and I study computer science at Tallinn University of Technology in Estonia.
          My brother shared some info about Google Summer of Code and I see that the ASF takes part in it, so I decided to get in touch. I am quite interested in development using the Apache Wicket framework, and even more interested in contributing to open-source projects.
          My background: some Java and Python development; I can use git/mercurial, Eclipse, Maven, REST, HTML, SQL and so on. I don't have much experience, but I have the desire to learn.
          Also, my brother said he can help me with the initial setup, a brief picture of the whole project and some architectural decisions.
          So it would be nice if you could give me some advice on where I can start to explore Nutch, and I wish to join your community in the future.
          Best regards,
          Fedor

          Chris A. Mattmann added a comment -

          Yuan Yun: yes, we should expose and leverage the Nutch REST APIs, and extend them using JAX-RS.

          Chris A. Mattmann added a comment -

          Thanks Ivan. Unfortunately the deadline to participate in GSoC 2013 is behind us.

          However, if you are still interested in the project, you are welcome to work on it, just not through GSoC.

          Ivan Vershinin added a comment -

          Hi Chris,
          I am a student from Estonia (Tartu University). I have experience in Java web application development.
          Tools and frameworks: Wicket, Spring, JUnit, Mockito, RESTful services, git, mercurial, Linux.
          I am looking forward to participating in this project during Google Summer of Code 2013.
          Could you give me some advice on the next steps for my proposal?

          Thanks,
          Ivan

          yuanyun.cn added a comment -

          Usually Nutch and the web application that manages it are not on the same machine. So could we expose a REST API to tell Nutch to crawl web pages or run other tasks, much like calling the nutch/crawl scripts remotely? Users should be able to enable or disable these functions.

          Thanks...
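One possible shape for the enable/disable switch requested above, as a framework-free sketch. It is not Nutch's REST API: the class name, method names and status-code choices are assumptions made for illustration, and a real endpoint (for example a JAX-RS resource) would wrap logic like this and hand accepted seeds to the crawl scripts.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical gate in front of a remote "start crawl" operation.
final class RemoteCrawlGate {

    static final int ACCEPTED = 202;    // request queued for crawling
    static final int BAD_REQUEST = 400; // missing or blank seed URL
    static final int FORBIDDEN = 403;   // remote triggering is switched off

    private boolean enabled;
    private final List<String> queuedSeeds = new ArrayList<>();

    RemoteCrawlGate(boolean enabled) {
        this.enabled = enabled;
    }

    // Lets an administrator turn remote triggering on or off at runtime.
    synchronized void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }

    // Returns the HTTP-style status a REST layer could send back to the caller.
    synchronized int submit(String seedUrl) {
        if (!enabled) {
            return FORBIDDEN;
        }
        if (seedUrl == null || seedUrl.trim().isEmpty()) {
            return BAD_REQUEST;
        }
        queuedSeeds.add(seedUrl.trim());
        return ACCEPTED;
    }

    // Read-only snapshot of the seeds accepted so far.
    synchronized List<String> queuedSeeds() {
        return Collections.unmodifiableList(new ArrayList<>(queuedSeeds));
    }
}
```

Checking the switch before validating the input means a disabled installation reveals nothing about which requests would otherwise have been accepted.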

          Chris A. Mattmann added a comment -

          Yep not a blocker!

          Lewis John McGibbney added a comment -

          Unfortunately, the links at the bottom of the wiki entry (which would have been terribly outdated anyway) either return 404s or point to tar files that seem to be corrupted. Quite a shame. However, although this is a blocker, I think it will be one of the later tasks that needs to be addressed prior to a 2.0 release.

          Chris A. Mattmann added a comment -

          Yep, agreed. Thanks for the reminder Lewis!

          Lewis John McGibbney added a comment -

          Seems to be the best existing resource from our wiki.

          http://wiki.apache.org/nutch/NutchAdministrationUserInterface


            People

            • Assignee:
              Chris A. Mattmann
              Reporter:
              Chris A. Mattmann
            • Votes:
              6
              Watchers:
              12
