Tika / TIKA-760

NPE in XHTMLContentHandler characters(String) method

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10
    • Fix Version/s: 1.1
    • Component/s: parser
    • Labels: None
    • Environment: JDK 1.6, Linux

    Description

      The method:

      public void characters(String characters) throws SAXException {
          characters(characters.toCharArray(), 0, characters.length());
      }

      does not check for null values.
      Many call sites check for null before calling this method. Others, e.g. HSLFExtractor, pass values without checking them:

      xhtml.characters( comment.getAuthor() );

      which may be null.

      The simplest fix would be a null check in the handler itself: if the argument is null, either treat the call as a NOOP or insert the Unicode replacement character (U+FFFD) and let the user decide how to handle it.
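      The NOOP variant of that fix could look roughly like the sketch below. It is only an illustration: NullSafeCharactersSketch and collected() are hypothetical names, the class is a stand-in built on the JDK's DefaultHandler rather than Tika's XHTMLContentHandler, and it is not the patch that was actually committed.

```java
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for XHTMLContentHandler; not Tika's actual class.
class NullSafeCharactersSketch extends DefaultHandler {
    // Collects character data so the effect of each call is observable.
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    // Convenience overload from the report, with the proposed null guard.
    // The method in the report declares "throws SAXException"; it is omitted
    // here because this sketch's sink never throws.
    public void characters(String characters) {
        if (characters == null) {
            return; // NOOP variant; the alternative would append '\uFFFD' here
        }
        characters(characters.toCharArray(), 0, characters.length());
    }

    // Hypothetical accessor used only to inspect the result.
    public String collected() {
        return text.toString();
    }

    public static void main(String[] args) {
        NullSafeCharactersSketch xhtml = new NullSafeCharactersSketch();
        String author = null; // e.g. a comment without an author
        xhtml.characters("author: ");
        xhtml.characters(author); // unguarded, this call would throw NullPointerException
        System.out.println(xhtml.collected());
    }
}
```

      With the guard in the handler, call sites such as the one in HSLFExtractor would not each need their own null check.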

          People

            Assignee: Unassigned
            Reporter: Torsten Krah (tkrah)