  1. Apache Any23 (Retired)
  2. ANY23-291

JSON-LD should be looked up in entire HTML document, not just in <head>

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.2
    • Fix Version/s: 2.2
    • Component/s: extractors
    • Labels: None

    Description

      In org.apache.any23.extractor.html.EmbeddedJSONLDExtractor.extractJSONLDScript(), I think this line:

      List<Node> scriptNodes = DomUtils.findAll(in, "/HTML/HEAD/SCRIPT");

      is too restrictive. Scripts containing JSON-LD can be placed anywhere in the page, and in practice some CMS and WordPress plugins that insert JSON-LD generate their output in the body, not in the head.
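
      To illustrate the difference, here is a minimal sketch that uses only the JDK's DOM and XPath APIs rather than Any23's own DomUtils. The class name JsonLdScriptLookup, its helper methods, and the sample page are hypothetical, and the lowercase element names assume a well-formed XHTML input (the line quoted above uses uppercase names). It is not the fix that was applied in Any23; it only shows that a head-only path misses a JSON-LD script placed in the body, while a descendant-axis path finds it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Any23.
final class JsonLdScriptLookup {

    // Head-only lookup, mirroring the restriction described in the issue.
    static final String HEAD_ONLY = "/html/head/script[@type='application/ld+json']";

    // Whole-document lookup: matches script elements at any depth.
    static final String ANYWHERE = "//script[@type='application/ld+json']";

    // Parses a well-formed XHTML string into a DOM document (assumption:
    // the input is valid XML; real-world HTML needs a lenient HTML parser).
    static Document parse(String xhtml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
    }

    // Returns the trimmed text content of every node matched by the XPath.
    static List<String> findJsonLd(Document doc, String xpath) throws Exception {
        NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Sample page with the JSON-LD script in the body, as some plugins emit it.
        String page = "<html><head><title>t</title></head><body>"
                + "<script type=\"application/ld+json\">{\"@type\":\"Person\"}</script>"
                + "</body></html>";
        Document doc = parse(page);
        System.out.println("head only: " + findJsonLd(doc, HEAD_ONLY).size());
        System.out.println("anywhere:  " + findJsonLd(doc, ANYWHERE).size());
    }
}
```

      With the sample page, the head-only expression matches nothing and the whole-document expression matches the single script in the body. Filtering on the type attribute keeps the broader search from picking up ordinary JavaScript blocks.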

      Attachments

        1. example-embedded-jsonld.html
          2 kB
          Thomas Francart

            People

              Assignee: Hans Brende (hansbrende)
              Reporter: Thomas Francart (thomas.francart)
              Votes: 2
              Watchers: 5