  1. Apache Any23 (Retired)
  2. ANY23-240

Option to process html tags as spaces in Microdata

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3
    • Fix Version/s: 2.3
    • Component/s: extractors, microdata
    • Labels: None

    Description

      When extracting Microdata from HTML pages, Any23 silently drops all HTML tags inside predicate values. See, for example, the http://schema.org/Recipe/ingredients values at http://kuking.net/3_2070.htm.
      The problem is that on this page (and many others) the ingredients are separated from each other only by '<br>' tags. Once Any23 drops those tags, the ingredients run together and the value becomes unintelligible. Google's Structured Data Testing Tool, by contrast, separates them properly with spaces.

      Would it be possible to implement this behavior (replacing <br> tags with spaces) in Any23 as an option?
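
      To illustrate the requested behavior, here is a minimal sketch in Python using only the standard library's html.parser. It is not Any23's implementation (Any23 is a Java project), and the names TagsAsSpacesExtractor, property_text, and tags_as_spaces are hypothetical, chosen only for this example. It assumes the option simply emits a space wherever a tag appears inside a value and then collapses runs of whitespace.

```python
from html.parser import HTMLParser


class TagsAsSpacesExtractor(HTMLParser):
    """Collect text content, optionally emitting a space wherever a tag occurs.

    Hypothetical helper for illustration; not part of Any23.
    """

    def __init__(self, tags_as_spaces: bool = True):
        super().__init__(convert_charrefs=True)
        self.tags_as_spaces = tags_as_spaces
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Covers <br> as well as <br/> (the default handle_startendtag
        # calls handle_starttag and then handle_endtag).
        if self.tags_as_spaces:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if self.tags_as_spaces:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)

    def text(self) -> str:
        # Collapse whitespace runs so adjacent tags do not produce
        # several consecutive spaces.
        return " ".join("".join(self.parts).split())


def property_text(html_fragment: str, tags_as_spaces: bool = True) -> str:
    """Return the text of a property value, with or without the option."""
    parser = TagsAsSpacesExtractor(tags_as_spaces)
    parser.feed(html_fragment)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


if __name__ == "__main__":
    fragment = "2 eggs<br>1 cup flour<br/>salt"
    print(property_text(fragment, tags_as_spaces=False))  # current behavior
    print(property_text(fragment, tags_as_spaces=True))   # requested option
```

      With the option off, the three ingredients in the sample fragment run together; with it on, they come out separated by single spaces, which is the behavior the issue asks for.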

          People

            Assignee: Hans Brende (hansbrende)
            Reporter: Andrey Kutuzov (akutuzov)
            Votes: 0
            Watchers: 3
