Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.18
    • Fix Version/s: 1.19
    • Component/s: dmoz, generator, indexer, segment
    • Labels:
      None

      Description

      In class org.apache.nutch.crawl.Generator
      In method org.apache.nutch.crawl.Generator.partitionSegment(Path, Path, int)
      Called method java.util.Random.nextInt()
      At Generator.java:[line 1016]
      Random object created and used only once in org.apache.nutch.crawl.Generator.partitionSegment(Path, Path, int)

      This code creates a java.util.Random object, uses it to generate one random number, and then discards the Random object. This produces mediocre quality random numbers and is inefficient. If possible, rewrite the code so that the Random object is created once and saved, and each time a new random number is required invoke a method on the existing Random object to obtain it.
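A minimal sketch of the suggested rewrite, assuming the random value is only used to build a temporary name. The class `PartitionNamer`, the method `tempDirName`, and the `generate-temp` prefix are hypothetical and are not taken from the Nutch source; only the pattern of holding one shared `Random` reflects the recommendation above.

```java
import java.util.Random;

/** Hypothetical illustration; not the actual Generator.partitionSegment code. */
class PartitionNamer {
    // Created once and reused for every call, instead of "new Random()" per call.
    private static final Random RANDOM = new Random();

    /** Builds a name with a non-negative random suffix (hypothetical helper). */
    static String tempDirName(String prefix) {
        // nextInt(bound) returns a value in [0, bound), so the suffix is never negative.
        return prefix + "-" + RANDOM.nextInt(Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        // Both calls draw from the same generator instance.
        System.out.println(tempDirName("generate-temp"));
        System.out.println(tempDirName("generate-temp"));
    }
}
```

java.util.Random is documented as thread-safe, so a shared static instance is acceptable here; under heavy contention, java.util.concurrent.ThreadLocalRandom may be a better fit.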

      If it is important that the generated Random numbers not be guessable, you must not create a new Random for each random number; the values are too easily guessable. You should strongly consider using a java.security.SecureRandom instead (and avoid allocating a new SecureRandom for each random number needed).
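Where unpredictability does matter, the same reuse pattern applies to java.security.SecureRandom. The following is a sketch under that assumption; `TokenSource` and `nextToken` are hypothetical names, and nothing in this issue states that the affected Nutch classes need secure random values.

```java
import java.security.SecureRandom;

/** Hypothetical illustration of reusing a single SecureRandom instance. */
class TokenSource {
    // Seeding a SecureRandom can be costly, so allocate it once and share it.
    private static final SecureRandom SECURE_RANDOM = new SecureRandom();

    /** Returns numBytes random bytes encoded as lowercase hex. */
    static String nextToken(int numBytes) {
        byte[] bytes = new byte[numBytes];
        SECURE_RANDOM.nextBytes(bytes);
        StringBuilder hex = new StringBuilder(numBytes * 2);
        for (byte b : bytes) {
            // Mask to 0..255 so each byte always renders as two hex digits.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(nextToken(16));
    }
}
```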

      The same bad practice also affects the following classes:

      org.apache.nutch.indexer.IndexingJob since first historized release
      org.apache.nutch.segment.SegmentReader since first historized release
      org.apache.nutch.tools.DmozParser$RDFProcessor since first historized release

              People

              • Assignee:
                lewismc Lewis John McGibbney
                Reporter:
                lewismc Lewis John McGibbney
              • Votes:
                0
                Watchers:
                3
