Nutch / NUTCH-2772

Debugging parse filter to show serialized DOM tree


    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: 1.16
    • Fix Version/s: 1.17
    • Component/s: parser, plugin
    • Labels: None
    • Patch Info: Patch Available

      Description

      A tool to show the DOM tree (e.g., serialized as XML/HTML) might be helpful for debugging; see, e.g., NUTCH-2769. The DOM tree is available in the parse plugins and is also passed to the HtmlParseFilter plugins. We could provide a parsefilter-debug plugin which logs the DOM tree and adds the serialized string representation to the parse data.
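
The serialization step could look roughly like the sketch below. It uses only the JDK's `javax.xml.transform` API and is not the code of the proposed plugin: the class name `DomSerializerSketch`, the helper `serialize`, and the sample fragment are hypothetical names introduced for illustration. It assumes the filter receives the tree as an `org.w3c.dom.DocumentFragment`; wiring into Nutch (logging, adding the string to the parse data) is left out.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical helper: turns a DOM node (or fragment) into an XML string. */
public class DomSerializerSketch {

  /** Serializes a node; a fragment is serialized child by child. */
  public static String serialize(Node node) throws Exception {
    // Identity transformer: copies the DOM to the output unchanged.
    Transformer t = TransformerFactory.newInstance().newTransformer();
    t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    // Force XML output; otherwise a root <html> element selects the HTML method.
    t.setOutputProperty(OutputKeys.METHOD, "xml");

    StringWriter out = new StringWriter();
    if (node.getNodeType() == Node.DOCUMENT_FRAGMENT_NODE) {
      // A fragment may hold several top-level nodes, so write them one by one.
      for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
        t.transform(new DOMSource(c), new StreamResult(out));
      }
    } else {
      t.transform(new DOMSource(node), new StreamResult(out));
    }
    return out.toString();
  }

  /** Builds a tiny stand-in for the DOM tree a parser would produce. */
  public static DocumentFragment sampleFragment() throws Exception {
    Document doc =
        DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    DocumentFragment frag = doc.createDocumentFragment();
    Element html = doc.createElement("html");
    Element body = doc.createElement("body");
    Element p = doc.createElement("p");
    p.setTextContent("hello");
    body.appendChild(p);
    html.appendChild(body);
    frag.appendChild(html);
    return frag;
  }

  public static void main(String[] args) throws Exception {
    // A debugging filter would log this string instead of printing it.
    System.out.println(serialize(sampleFragment()));
  }
}
```

A debugging parse filter would call such a helper on the fragment it is handed, write the result to the log, and store the same string under a metadata key of its choosing.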


            People

            • Assignee: Sebastian Nagel (snagel)
            • Reporter: Sebastian Nagel (snagel)
