  1. Apache Any23 (Retired)
  2. ANY23-165

"Invalid content" error if TITLE precedes encoding declaration in the document

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.9.0
    • encoding
    • Environment: Linux 2.6.18-308.11.1.el5 #1 SMP Tue Jul 10 08:48:43 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux

    Description

      When any23 is asked to extract semantics from a web document that is not encoded in UTF-8 and whose TITLE element precedes the encoding declaration, it fails with the error "Invalid content ''".
      Example of such a URL:
      http://www.kinopoisk.ru/film/565993/
      A compressed dump of this page is attached.
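
The condition described above (a TITLE that appears before the charset declaration in a non-UTF-8 page) can be checked on raw bytes before any decoding takes place. The following is only a minimal sketch in Python, not Any23's actual detection logic: the helper names, the regular expressions, and the sample page (including its windows-1251 encoding and Cyrillic title) are assumptions made for illustration.

```python
import re

# Hypothetical helpers for illustration; none of this is Any23 code.
# Patterns are matched against raw bytes so that no charset has to be
# guessed before the declaration has been located.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)
TITLE_OPEN = re.compile(rb'<title\b', re.IGNORECASE)


def declared_charset(raw: bytes, default: str = "utf-8") -> str:
    """Return the charset named in the first <meta> declaration, if any."""
    match = META_CHARSET.search(raw)
    return match.group(1).decode("ascii").lower() if match else default


def title_precedes_charset(raw: bytes) -> bool:
    """True when <title> starts before the <meta> charset declaration."""
    title = TITLE_OPEN.search(raw)
    meta = META_CHARSET.search(raw)
    return bool(title and meta and title.start() < meta.start())


def decode_html(raw: bytes) -> str:
    """Decode the whole document with the declared charset."""
    return raw.decode(declared_charset(raw), errors="replace")


# Assumed sample page: TITLE comes first, the charset declaration follows.
sample = (
    '<html><head><title>Пираньи 3DD</title>'
    '<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">'
    '</head><body></body></html>'
).encode("windows-1251")

print(title_precedes_charset(sample))
print(declared_charset(sample))
print(decode_html(sample))
```

Scanning the bytes first means the title is decoded with the declared charset even though it appears earlier in the document than the declaration itself.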

      any23 http://www.kinopoisk.ru/film/565993/
      SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
      SLF4J: Defaulting to no-operation (NOP) logger implementation
      SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

      ------------------------------------------------------------------------
      Apache Any23 :: rover
      ------------------------------------------------------------------------

      @prefix dcterms: <http://purl.org/dc/terms/> .

      <http://www.kinopoisk.ru/film/565993/> dcterms:title "Ïèðàíüè 3DD" .

      ------------------------------------------------------------------------
      Apache Any23 FAILURE

      Execution terminated with errors: Invalid content ''

      Total time: 1s
      Finished at: Mon Jul 15 20:31:14 MSK 2013
      Final Memory: 67M/479M
      ------------------------------------------------------------------------
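
The garbled dcterms:title in the output above is consistent with windows-1251 bytes being decoded as ISO-8859-1. The short Python sketch below reproduces that effect. It assumes that the page is served in windows-1251 and that the original title is "Пираньи 3DD"; both are inferred by reversing the garbled string, not stated in the report.

```python
# Assumption: the original title, encoded by the server as windows-1251.
title = "Пираньи 3DD"
raw = title.encode("windows-1251")

# Decoding those bytes with the wrong single-byte charset yields mojibake.
garbled = raw.decode("iso-8859-1")
print(garbled)  # Ïèðàíüè 3DD

# The damage is reversible as long as the wrongly decoded text is intact.
recovered = garbled.encode("iso-8859-1").decode("windows-1251")
print(recovered == title)  # True
```

Because ISO-8859-1 maps every byte to a character, the wrong decoding never raises an error, which is why the extractor emits a garbled title instead of failing at that point.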

      Attachments

        1. kinopoisk.html.gz
          51 kB
          Andrey Kutuzov

            People

              Assignee: Unassigned
              Reporter: Andrey Kutuzov (akutuzov)
              Votes: 0
              Watchers: 2