Details
Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
1.19
None
Description
There is a large degree of variability in how we set the job name, for example:
Job job = NutchJob.getInstance(getConf());
job.setJobName("read " + segment);
Some examples mention the job name, others don't. Some use upper case, others don't, etc.
I think we can standardize the NutchJob job names. This would also help when filtering jobs in the YARN ResourceManager UI.
I propose we implement the following convention:
- Nutch (mandatory) - static value prepended to the job name; it distinguishes the job as a NutchJob and makes it easy to find.
- ${ClassName} (mandatory) - literally the name of the class in which the job is defined
- ${additional info} (optional) - value could further distinguish the type of job (LinkRank Counter, LinkRank Initializer, LinkRank Inverter, etc.)
Nutch ${ClassName}: ${additional info}
Examples:
- Nutch LinkRank: Inverter
- Nutch CrawlDb: + $crawldb
- Nutch LinkDbReader: + $linkdb
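The convention could be enforced in one place instead of being repeated in every job class. The following is only a minimal sketch of that idea: the JobNames class and its jobName method are hypothetical and not part of Nutch, and the nested LinkRank class is an empty stand-in so the example compiles without Nutch on the classpath.

```java
/** Hypothetical helper (not part of Nutch) that builds job names per the proposed convention. */
final class JobNames {

  /** Static prefix that marks the job as a Nutch job. */
  private static final String PREFIX = "Nutch";

  /** Empty stand-in for a real Nutch class, used only for illustration. */
  static final class LinkRank {}

  private JobNames() {}

  /**
   * Returns "Nutch ${ClassName}" or, when additional info is given,
   * "Nutch ${ClassName}: ${additional info}".
   */
  static String jobName(Class<?> clazz, String additionalInfo) {
    String name = PREFIX + " " + clazz.getSimpleName();
    if (additionalInfo == null || additionalInfo.trim().isEmpty()) {
      return name; // the optional part is omitted entirely
    }
    return name + ": " + additionalInfo.trim();
  }

  public static void main(String[] args) {
    System.out.println(jobName(LinkRank.class, "Inverter")); // Nutch LinkRank: Inverter
    System.out.println(jobName(LinkRank.class, null));       // Nutch LinkRank
  }
}
```

A job class would then call something like job.setJobName(JobNames.jobName(getClass(), segment)) rather than assembling the string by hand.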
Thanks for any suggestions/comments.