Apache Any23 (Retired) / ANY23-65

Update to RDFa extraction stylesheet

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 1.0
    • Component/s: core

    Description

      The RDFa 1.1 Core specification specifies that namespace prefixes in HTML5 be declared in a "prefix" attribute, like this: "ns1: http://example.org/ ns2: http://example.com/"

      My sample HTML page has this, but Sindice, which uses Any23, didn't read my namespace correctly. I narrowed the problem down to the XSLT template "tokenize2" in the rdfa.xslt stylesheet and changed it accordingly. The template expected "ns1:http://example.org/ ns2:http://example.com/" (no space between prefix and namespace URI) and did not normalize whitespace such as line breaks (although I'm not sure that broke the functionality).

      I use Any23 0.6.1 locally, but http://svn.apache.org/viewvc/incubator/any23/trunk/core/src/main/resources/org/apache/any23/extractor/rdfa/rdfa.xslt?revision=1231556&view=markup shows that the template is the same in the trunk.

      A possible problem is that the new template will not accept non-spaced namespace definitions, such as those found in the RDFa produced by Best Buy. A further improvement to my template would be to accept namespace definitions both with and without spaces.
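
      To illustrate the behaviour suggested above, here is a minimal sketch in Python (not XSLT, and not the code in the attached stylesheet). The function name parse_prefix_attribute is hypothetical. It assumes only that each mapping is a prefix, a colon, optional whitespace, and a URI without internal whitespace, so it accepts both the spaced and the non-spaced form and tolerates line breaks.

```python
import re

# One mapping: a prefix (no whitespace or colon), a colon,
# optional whitespace, then a URI that runs to the next whitespace.
_MAPPING = re.compile(r"([^\s:]+):\s*(\S+)")


def parse_prefix_attribute(value):
    """Return a dict of prefix -> namespace URI from a "prefix" attribute value.

    Hypothetical helper for illustration only; it is not part of Any23.
    """
    # Collapse line breaks, tabs and runs of spaces into single spaces,
    # similar in spirit to XPath's normalize-space().
    normalized = " ".join(value.split())
    return {prefix: uri for prefix, uri in _MAPPING.findall(normalized)}


if __name__ == "__main__":
    spaced = "ns1: http://example.org/ ns2: http://example.com/"
    non_spaced = "ns1:http://example.org/ ns2:http://example.com/"
    print(parse_prefix_attribute(spaced))
    print(parse_prefix_attribute(non_spaced))
```

      Both inputs give the same mapping here, which is the behaviour a stylesheet accepting both forms would need.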

      Attachments

        1. rdfa.xslt
          37 kB
          Ben Companjen
        2. rdfa-11-curies-a.html
          0.8 kB
          Ben Companjen
        3. stylesheet.patch
          3 kB
          Ben Companjen
        4. stylesheet3.patch
          5 kB
          Ben Companjen
        5. test.patch
          2 kB
          Ben Companjen

            People

              Assignee: Unassigned
              Reporter: Ben Companjen (bencomp)

                Time Tracking

                  Original Estimate: 3h
                  Remaining Estimate: 3h
                  Time Spent: Not Specified