Nutch / NUTCH-2337

urlnormalizer-basic to strip empty port

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.3.1, 1.12
    • Fix Version/s: 2.4, 1.13
    • Component/s: plugin
    • Labels: None
    • Patch Info: Patch Available
    • Flags: Patch

    Description

      The basic URL normalizer should strip an empty port from the URL, but currently it does not:

      echo "http://example.com:/" \
         | nutch plugin urlnormalizer-basic org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer
      http://example.com:/
      

      The result should be http://example.com/
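
      The expected behavior can be illustrated with a minimal Java sketch. This is not the patch attached to this issue and not the BasicURLNormalizer code; the class name EmptyPortStripper, the method stripEmptyPort, and the regular expression are all hypothetical and only show one way a trailing ":" at the end of the authority could be dropped.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only (not part of Nutch).
final class EmptyPortStripper {

    // Group 1: scheme, "://", and the authority up to a trailing ':'.
    // The ':' must be followed directly by '/', '?', '#', or end of input,
    // i.e. it introduces an empty port.
    private static final Pattern EMPTY_PORT =
        Pattern.compile("^([A-Za-z][A-Za-z0-9+.-]*://[^/?#]*):(?=[/?#]|$)");

    // Returns the URL without the empty port; other URLs are returned unchanged.
    static String stripEmptyPort(String url) {
        return EMPTY_PORT.matcher(url).replaceFirst("$1");
    }

    public static void main(String[] args) {
        System.out.println(stripEmptyPort("http://example.com:/"));     // http://example.com/
        System.out.println(stripEmptyPort("http://example.com:8080/")); // http://example.com:8080/
    }
}
```

      A non-empty port such as :8080 is left alone because the colon is then followed by digits rather than by a path, query, or fragment delimiter.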

            People

              Assignee: Sebastian Nagel (snagel)
              Reporter: Sebastian Nagel (snagel)