Nutch / NUTCH-1074

topN is ignored with maxNumSegments

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 1.4
    • Component/s: generator
    • Labels:
      None

      Description

      When generating segments with both topN and maxNumSegments, topN is not respected. It looks like the first generated segment contains topN * maxNumSegments URLs; at least, the number of map input records roughly matches that figure.
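To make the reported discrepancy concrete, here is a minimal sketch of the arithmetic. It assumes the intended semantics are that each generated segment holds at most topN URLs and that at most maxNumSegments segments are produced. The function `split_into_segments` and the example URLs are hypothetical and are not taken from Nutch's Generator code.

```python
def split_into_segments(urls, top_n, max_num_segments):
    """Illustrative only (not Nutch code): cap each segment at top_n URLs
    and produce at most max_num_segments segments."""
    if top_n <= 0 or max_num_segments <= 0:
        raise ValueError("top_n and max_num_segments must be positive")
    # Select at most top_n * max_num_segments URLs in total ...
    selected = urls[: top_n * max_num_segments]
    # ... then cut the selection into consecutive chunks of top_n URLs.
    return [selected[i:i + top_n] for i in range(0, len(selected), top_n)]


# Hypothetical, already score-sorted candidate URLs.
urls = [f"http://example.com/page{i}" for i in range(10)]

# Assumed intended behavior: two segments of three URLs each.
expected_segments = split_into_segments(urls, top_n=3, max_num_segments=2)

# Behavior described in this report: the first segment alone receives
# roughly top_n * max_num_segments URLs.
reported_first_segment = urls[: 3 * 2]

print([len(s) for s in expected_segments], len(reported_first_segment))
```

Under these assumptions, the per-segment sizes in `expected_segments` are what a user passing both options would expect, while `reported_first_segment` mirrors the size observed in the bug.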

        Issue Links
        • This issue is related to NUTCH-762

          Activity

          Markus Jelsma created issue
          Markus Jelsma made changes
            Field: Link; Original Value: (empty); New Value: This issue is related to NUTCH-762 [ NUTCH-762 ]
          Markus Jelsma made changes
            Field: Assignee; Original Value: (empty); New Value: Markus Jelsma [ markus17 ]
          Robert Thomson made changes
            Field: Attachment; Original Value: (empty); New Value: generator_fix.patch [ 12496218 ]
          Markus Jelsma made changes
            Field: Status; Original Value: Open [ 1 ]; New Value: Resolved [ 5 ]
            Field: Resolution; Original Value: (empty); New Value: Fixed [ 1 ]
          Markus Jelsma made changes
            Field: Status; Original Value: Resolved [ 5 ]; New Value: Closed [ 6 ]

            People

            • Assignee:
              Markus Jelsma
              Reporter:
              Markus Jelsma
            • Votes:
              0
              Watchers:
              0

              Dates

              • Created:
                Updated:
                Resolved:
