TIKA-2550

ToTextHandler includes <style/> element content

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0, 1.20
    • Component/s: None
    • Labels: None

    Description

      When ToTextHandler is used to process .java files, the content of the <style/> element is included in the extracted text, e.g.:

      testFile
      code {
      color: rgb(0,0,0); font-family: monospace; font-size: 12px; white-space: nowrap;
      }
      .java_plain {
      color: rgb(0,0,0);
      }
      .java_keyword {
      color: rgb(0,0,0); font-weight: bold;
      }
      .java_javadoc_tag {
      color: rgb(147,147,147); background-color: rgb(247,247,247); font-style: italic; font-weight: bold;
      }
      h1 {
      font-family: sans-serif; font-size: 16pt; font-weight: bold; color: rgb(0,0,0); background: rgb(210,210,210); border: solid 1px black; padding: 5px; text-align: center;
      }
      .java_type {
      color: rgb(0,44,221);
      }
      .java_literal {
      color: rgb(188,0,0);
      }
      .java_javadoc_comment {
      color: rgb(147,147,147); background-color: rgb(247,247,247); font-style: italic;
      }
      .java_operator {
      color: rgb(0,124,31);
      }
      .java_separator {
      color: rgb(0,33,255);
      }
      .java_comment {
      color: rgb(147,147,147); background-color: rgb(247,247,247);
      }
      
      testFile/*************************************************************************
       *  Compilation:  javac HelloWorld.java
       *  Execution:    java HelloWorld
       *
       *  Prints "Hello, World". By tradition, this is everyone's first program.
       *
       *************************************************************************/
      
      public class HelloWorld {
          public static void main(String[] args) {
              System.out.println("Hello, World");
          }
      
      }
      
      

      Is this what we want as the default behavior?
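
      For discussion, here is a minimal sketch of one way a text-collecting SAX handler could drop <style/> content. It uses only the JDK's SAX classes and is not Tika's ToTextHandler or the actual fix; the class name StyleSkippingTextHandler and the helper extract() are hypothetical, and the input is assumed to be well-formed XHTML.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects character data like a plain-text handler would,
// but ignores everything that appears inside a <style> element.
final class StyleSkippingTextHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private int styleDepth = 0; // > 0 while inside one or more <style> elements

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (isStyle(localName, qName)) {
            styleDepth++;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (isStyle(localName, qName)) {
            styleDepth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only keep text that is not nested inside <style>.
        if (styleDepth == 0) {
            text.append(ch, start, length);
        }
    }

    private static boolean isStyle(String localName, String qName) {
        // Fall back to qName when the parser does not report a local name.
        String name = (localName == null || localName.isEmpty()) ? qName : localName;
        return "style".equalsIgnoreCase(name);
    }

    @Override
    public String toString() {
        return text.toString();
    }

    // Hypothetical helper: parse an XHTML string and return the collected text.
    static String extract(String xhtml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            StyleSkippingTextHandler handler = new StyleSkippingTextHandler();
            factory.newSAXParser().parse(new InputSource(new StringReader(xhtml)), handler);
            return handler.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse input", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><head><title>testFile</title>"
                + "<style>code { color: rgb(0,0,0); }</style></head>"
                + "<body><p>public class HelloWorld {}</p></body></html>";
        System.out.println(extract(xhtml));
    }
}
```

      Tracking a depth counter rather than a boolean keeps the handler correct even if a <style/> element were nested; the same approach would extend to <script/> if that content should also be dropped.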

          People

            tallison Tim Allison