Description
See also: https://issues.apache.org/jira/browse/NUTCH-907
The StorageUtils class exposes a createDataStore method which uses the default schema for a persistent class specified in the Gora configuration.
This method ignores Nutch's storage.schema property and the notion of a crawlId.
Two tools use this method instead of the createWebStore method (which does support the storage.schema property and a crawlId):
o.a.n.indexer.IndexerReducer (IndexerJob)
o.a.n.util.domain.DomainStatistics
I propose that these two tools start using the createWebStore method and that we remove the createDataStore method from StorageUtils.
Also, these two tools should support the crawlId command line parameter.
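To make the difference concrete, here is a minimal, self-contained sketch of the two schema-resolution behaviours described above. It does not use the real StorageUtils or Gora APIs: the class SchemaNameSketch, its methods, the "storage.crawl.id" key, the "-crawlId" flag spelling, and the "<crawlId>_" prefix convention are all assumptions made for illustration. Only the storage.schema property and the crawlId concept come from this issue.

```java
import java.util.Properties;

public class SchemaNameSketch {
    // Property named in the issue.
    static final String SCHEMA_KEY = "storage.schema";
    // Hypothetical key used here to carry the crawl id.
    static final String CRAWL_ID_KEY = "storage.crawl.id";

    /** Models createDataStore as described: always the Gora default schema. */
    static String defaultSchema(String goraDefault) {
        return goraDefault;
    }

    /** Models createWebStore as described: honours storage.schema and the crawl id. */
    static String webStoreSchema(Properties conf, String goraDefault) {
        String schema = conf.getProperty(SCHEMA_KEY, goraDefault);
        String crawlId = conf.getProperty(CRAWL_ID_KEY, "");
        // Assumption: the crawl id is applied as a "<crawlId>_" prefix.
        return crawlId.isEmpty() ? schema : crawlId + "_" + schema;
    }

    /** Copies a "-crawlId <id>" argument pair into the configuration. */
    static void applyCrawlIdArg(String[] args, Properties conf) {
        for (int i = 0; i + 1 < args.length; i++) {
            if ("-crawlId".equals(args[i])) {
                conf.setProperty(CRAWL_ID_KEY, args[i + 1]);
            }
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(SCHEMA_KEY, "pages");
        applyCrawlIdArg(new String[] {"-crawlId", "testcrawl"}, conf);

        // A tool using the default-schema path ignores both settings.
        System.out.println(defaultSchema("webpage"));        // prints "webpage"
        // A tool using the web-store path picks up both settings.
        System.out.println(webStoreSchema(conf, "webpage")); // prints "testcrawl_pages"
    }
}
```

With the same configuration, the two paths resolve to different stores. That is why IndexerJob and DomainStatistics can end up reading a different table from the rest of the crawl.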
Attachments
Issue Links
- is superseded by: NUTCH-882 Design a Host table in GORA (Closed)