Tika / TIKA-870

Allow calling parseToString with an additional maxStringLength parameter, so the limit can be changed per call

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2
    • Component/s: None
    • Labels: None

    Description

      It would be great to be able to call parseToString with an additional maxStringLength parameter, instead of having to set the limit on the Tika instance. This would allow the limit to be chosen for each parse call. Sample code:

      // Proposed overload for the Tika class; "parser" is the instance's Parser field.
      public String parseToString(InputStream stream, Metadata metadata, int maxStringLength)
              throws IOException, TikaException {
          // Collects extracted text up to the limit passed in for this call.
          WriteOutContentHandler handler =
              new WriteOutContentHandler(maxStringLength);
          try {
              ParseContext context = new ParseContext();
              context.set(Parser.class, parser);
              parser.parse(
                      stream, new BodyContentHandler(handler), metadata, context);
          } catch (SAXException e) {
              // Reaching the write limit is expected; anything else is a real failure.
              if (!handler.isWriteLimitReached(e)) {
                  // This should never happen with BodyContentHandler...
                  throw new TikaException("Unexpected SAX processing failure", e);
              }
          } finally {
              // The stream is always closed, whether or not parsing succeeded.
              stream.close();
          }
          return handler.toString();
      }
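
The difference between an instance-wide limit and a per-call limit can be illustrated without Tika on the classpath. The following is only a minimal sketch under stated assumptions: `PerCallLimitSketch` and `readToString` are hypothetical names invented for this illustration, and a plain `Reader` stands in for the parser and content handler used in the sample above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for a facade with a configurable text limit; not part of Tika.
class PerCallLimitSketch {
    // Limit configured once on the instance (the existing style).
    private final int defaultMaxLength;

    PerCallLimitSketch(int defaultMaxLength) {
        this.defaultMaxLength = defaultMaxLength;
    }

    // Existing style: every call shares the instance-wide limit.
    String readToString(Reader reader) throws IOException {
        return readToString(reader, defaultMaxLength);
    }

    // Proposed style: the caller supplies the limit for this call only.
    String readToString(Reader reader, int maxLength) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[1024];
        try {
            int n;
            // Stop as soon as the limit is reached or the input is exhausted.
            while (out.length() < maxLength && (n = reader.read(buffer)) != -1) {
                out.append(buffer, 0, Math.min(n, maxLength - out.length()));
            }
        } finally {
            // Mirror the sample above: the input is always closed.
            reader.close();
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        PerCallLimitSketch sketch = new PerCallLimitSketch(5);
        // Uses the instance-wide limit of 5 characters.
        System.out.println(sketch.readToString(new StringReader("hello world")));    // prints "hello"
        // Overrides the limit for this call only.
        System.out.println(sketch.readToString(new StringReader("hello world"), 8)); // prints "hello wo"
    }
}
```

One plausible benefit of the per-call form is that a single shared instance can serve callers with different limits without any of them changing instance state.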
      

      Attachments

        1. TIKA-870.patch
          6 kB
          Michael McCandless

          People

            Assignee: Michael McCandless (mikemccand)
            Reporter: Shay Banon (kimchy)