Tika / TIKA-1165

Autodetect and parse Asciidoc

Details

    Description

      When extracting metadata from an AsciiDoc file, Tika currently treats it as plain text and reports the following:

      Content-Encoding: ISO-8859-1
      Content-Length: 66363
      Content-Type: text/plain; charset=ISO-8859-1
      resourceName: asciidoc.adoc
      

      Steps to reproduce:

      asciidoc.sh
      curl https://raw.github.com/asciidoctor/asciidoctor.org/master/docs/asciidoc-syntax-quick-reference.adoc -O -s
      java -jar tika-app-1.4.jar -m asciidoc-syntax-quick-reference.adoc
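
To make the request concrete, here is a minimal sketch of what name-based autodetection could look like. It is not Tika's detection code and not a proposed patch: the class `AsciidocNameSniffer`, the media type name `text/x-asciidoc`, and the `.asciidoc` extension are assumptions made for illustration (the report itself only shows a `.adoc` file being typed as `text/plain`).

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper, not part of Tika: guesses a media type from a resource name. */
final class AsciidocNameSniffer {
    // Assumed extensions; the report only shows ".adoc".
    private static final Set<String> ASCIIDOC_EXTENSIONS = Set.of("adoc", "asciidoc");

    // Assumed media type name for AsciiDoc; the fallback mirrors the current output.
    static final String ASCIIDOC_TYPE = "text/x-asciidoc";
    static final String FALLBACK_TYPE = "text/plain";

    private AsciidocNameSniffer() {}

    /** Returns the assumed AsciiDoc type for known extensions, otherwise text/plain. */
    static String guessType(String resourceName) {
        if (resourceName == null) {
            return FALLBACK_TYPE;
        }
        int dot = resourceName.lastIndexOf('.');
        if (dot < 0 || dot == resourceName.length() - 1) {
            return FALLBACK_TYPE; // no extension, or name ends with a dot
        }
        // Compare case-insensitively so "README.ADOC" is handled like "readme.adoc".
        String extension = resourceName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return ASCIIDOC_EXTENSIONS.contains(extension) ? ASCIIDOC_TYPE : FALLBACK_TYPE;
    }
}
```

With this sketch, `AsciidocNameSniffer.guessType("asciidoc-syntax-quick-reference.adoc")` would return the assumed AsciiDoc type instead of `text/plain`. Matching on the name alone is the simplest option; content-based sniffing would be needed for files without a recognizable extension.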
      

          People

            Assignee: Unassigned
            Reporter: David Pilato (dadoonet)
            Votes: 0
            Watchers: 3
