As per the above discussion, I have further developed the object-to-HBase mapping code that was once part of NutchBase and moved it into a new project.
The project is named Gora, and it is hosted at GitHub.
A short design document is at http://wiki.github.com/enis/gora/design, and a quick start guide is at http://wiki.github.com/enis/gora/quick-start.
You can check out the code using:
$ git clone git://github.com/enis/gora.git
What does it mean for Nutch?
Gora started as part of Dogacan's NutchBase implementation, but the project's goals are clearly different. Even so, Gora is developed primarily to handle Nutch's use cases. Specifically, Gora will handle the HBase integration layer for NutchBase, and persistence based on Hadoop MapFile or TFile will be developed later.
In the short term, we plan to use Gora's artifacts as a library in Nutch. Either Dogacan or I will switch the current NutchBase branch to Gora shortly.
Gora is still in its very early stages and needs your support. We would be more than happy if the Nutch community could share comments, feedback, use cases, feature requests, or even patches. I suppose we can use this issue or the mailing list for that.