  1. Apache Any23 (Retired)
  2. ANY23-18

Add a new extractor for RDFa using java-rdfa

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.7.0
    • Component/s: core

    Description

      Would it be possible to add a new RDFa extractor that uses java-rdfa [1]?

      java-rdfa is, according to its creator Damian Steer, "the cruftiest RDFa parser in the world" (and he is probably right!). It currently passes all conformance tests for XHTML, and all but one of the HTML 4 and HTML 5 tests [2]. An online demo service [3] is also available. As far as I understand, java-rdfa is currently licensed under a BSD license. The Maven artifacts are available in the Maven Central repository [4].

      From my limited understanding of Any23, doing this requires implementing BlindExtractor (which extends Extractor<URI>) and ContentExtractor (which extends Extractor<InputStream>).
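
      To make the idea concrete, here is a rough sketch of how a content-based extractor could delegate to an RDFa parser. It is only an illustration under stated assumptions: the nested interfaces are simplified stand-ins for the Any23 types named above (the real Extractor.run signature differs), and TripleSink, RdfaParser and JavaRdfaExtractor are hypothetical names, not the actual Any23 or java-rdfa API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class RdfaExtractorSketch {

    /** Simplified stand-in for Any23's Extractor<Input> (assumption: reduced signature). */
    public interface Extractor<I> {
        void run(I in, URI documentUri, TripleSink out) throws IOException;
    }

    /** Extractor that only sees the document URI (not used further in this sketch). */
    public interface BlindExtractor extends Extractor<URI> {}

    /** Extractor that reads the raw document content. */
    public interface ContentExtractor extends Extractor<InputStream> {}

    /** Hypothetical callback receiving extracted statements as plain strings. */
    public interface TripleSink {
        void triple(String subject, String predicate, String object);
    }

    /** Hypothetical facade over java-rdfa; its real API is deliberately not modelled. */
    public interface RdfaParser {
        void parse(InputStream in, String baseUri, TripleSink out) throws IOException;
    }

    /** Content extractor that hands the input stream to the wrapped RDFa parser. */
    public static final class JavaRdfaExtractor implements ContentExtractor {
        private final RdfaParser parser;

        public JavaRdfaExtractor(RdfaParser parser) {
            this.parser = parser;
        }

        @Override
        public void run(InputStream in, URI documentUri, TripleSink out) throws IOException {
            // The document URI doubles as the base URI for resolving relative references.
            parser.parse(in, documentUri.toString(), out);
        }
    }

    public static void main(String[] args) throws IOException {
        // Fake parser that emits one fixed triple, standing in for java-rdfa.
        RdfaParser fake = (in, baseUri, out) ->
                out.triple(baseUri, "http://purl.org/dc/terms/title", "Example");

        List<String> seen = new ArrayList<>();
        ContentExtractor extractor = new JavaRdfaExtractor(fake);
        InputStream html = new ByteArrayInputStream(
                "<html></html>".getBytes(StandardCharsets.UTF_8));

        // Collect every reported triple as a single space-separated line.
        extractor.run(html, URI.create("http://example.org/doc"),
                (s, p, o) -> seen.add(s + " " + p + " " + o));
        seen.forEach(System.out::println);
    }
}
```

      Keeping the parser behind a small interface would let the extractor be tested with a fake, and would confine the java-rdfa dependency to a single adapter class.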

      See also: [5].

      [1] https://github.com/shellac/java-rdfa
      [2] http://github.com/shellac/java-rdfa/issues#issue/15
      [3] http://rdf-in-html.appspot.com/
      [4] http://repo1.maven.org/maven2/net/rootdev/java-rdfa/
      [5] https://github.com/shellac/java-rdfa/issues/35

          People

            Assignee: Michele Mostarda (michele.mostarda)
            Reporter: Paolo Castagna (castagna)
            Votes: 0
            Watchers: 1

              Time Tracking

                Estimated: 8h
                Remaining: 8h
                Logged: Not Specified
