Nutch / NUTCH-3014

Standardize Job names

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.19
    • Fix Version/s: 1.20
    • Component/s: configuration, runtime
    • Labels: None

    Description

      There is a large degree of variability in how we set the job name, for example:

       

      Job job = NutchJob.getInstance(getConf());

      job.setJobName("read " + segment);

       

      Some examples mention the job name, others don't. Some use upper case, others don't, etc.

      I think we can standardize the NutchJob job names. This would help when filtering jobs in YARN ResourceManager UI as well.

      I propose we implement the following convention:

      • Nutch (mandatory) - static value which prepends the job name, assists with distinguishing the Job as a NutchJob and making it easily findable.
      • ${ClassName} (mandatory) - literally the name of the Class the job is encoded in
      • ${additional info} (optional) - value could further distinguish the type of job (LinkRank Counter, LinkRank Initializer, LinkRank Inverter, etc.)

      Nutch ${ClassName}: ${additional info}

      Examples:

      • Nutch LinkRank: Inverter
      • Nutch CrawlDb: $crawldb
      • Nutch LinkDbReader: $linkdb
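
The convention above could be centralized in a small helper so that every job builds its name the same way. The following is only a minimal sketch: the class NutchJobNames and its of(...) methods are hypothetical names chosen for illustration, not existing Nutch API, and the sketch assumes the class's simple name is the desired ${ClassName} value.

```java
/**
 * Hypothetical helper (not part of Nutch) that formats job names as
 * "Nutch ${ClassName}: ${additional info}".
 */
final class NutchJobNames {

    /** Static, mandatory prefix that marks the job as a Nutch job. */
    private static final String PREFIX = "Nutch";

    private NutchJobNames() {
        // utility class, no instances
    }

    /** Builds "Nutch ${ClassName}" for jobs that need no extra detail. */
    static String of(Class<?> jobClass) {
        return of(jobClass, null);
    }

    /**
     * Builds "Nutch ${ClassName}: ${additional info}".
     * The additional info is optional: null or blank values are omitted,
     * along with the ": " separator.
     */
    static String of(Class<?> jobClass, String additionalInfo) {
        String name = PREFIX + " " + jobClass.getSimpleName();
        if (additionalInfo == null || additionalInfo.trim().isEmpty()) {
            return name;
        }
        return name + ": " + additionalInfo.trim();
    }
}
```

With a helper like this, a call such as job.setJobName(NutchJobNames.of(LinkRank.class, "Inverter")) would produce the job name "Nutch LinkRank: Inverter", and a typo in the prefix or separator could no longer creep into individual job classes.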

      Thanks for any suggestions/comments.

            People

              Assignee: Lewis John McGibbney (lewismc)
              Reporter: Lewis John McGibbney (lewismc)