
Key: XMLBEANS-226
Type: Bug
Status: Open
Priority: Major
Assignee: Unassigned
Reporter: Peter lynch
Votes: 8
Watchers: 5
XMLBeans

Exception "Unexpected end of file after null"

Created: 18/Nov/05 05:26 AM   Updated: 04/Apr/08 07:02 PM
Component/s: DOM
Affects Version/s: Version 2, Version 2.1
Fix Version/s: None

Time Tracking:
Not Specified


Description
The problem is best described here:
http://www.mail-archive.com/user%40xmlbeans.apache.org/msg00850.html

Additionally, I will note that the identical problem happens with Tomcat 5.5.12 (instead of Jetty). It is always reproducible, whether an InputStream or a BufferedReader is used.

I'd prefer to use Piccolo since it is faster, but it seems the safest thing to do is to use another parser entirely until the problem is fixed.

So that searches in Jira are easier, I will paste the first part of the thread here as well:


------------ START http://www.mail-archive.com/user%40xmlbeans.apache.org/msg00850.html --------
Hi,

I am trying to upgrade my project which uses XMLBeans v1 to XMLBeans v2.

I have the following situation: a client (using commons-httpclient)
posts XML to a webserver (jetty), where the posted XML is parsed using
something like:

SomeXmlBeansGeneratedClass.Factory.parse(request.getInputStream());

After upgrading to XMLBeans v2, this gives the following exception on
every other request:

org.xml.sax.SAXParseException: Unexpected end of file after null
org.apache.xmlbeans.impl.piccolo.xml.Piccolo.reportFatalError(Piccolo.java:1038)
org.apache.xmlbeans.impl.piccolo.xml.Piccolo.parse(Piccolo.java:723)
org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3354)
org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1267)
org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1254)
org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
org.outerx.daisy.x10Publisher.PublisherRequestDocument$Factory.parse(Unknown Source)
org.outerj.daisy.publisher.serverimpl.PublisherHttpConnector$PublisherHttpHandler.handle(PublisherHttpConnector.java:115)
org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
org.mortbay.http.HttpServer.service(HttpServer.java:954)
org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

Thus the first request is OK, the second one gives this exception, the
third one is OK again, the fourth one again gives this exception etc.

After some investigation, I have tracked down the problem to Piccolo
which only closes the InputStream of the previous parse when doing a new
parse, i.e. the following in the class Piccolo:

public void parse(InputSource source) throws IOException, SAXException {
    try {
        reset();
        validateParseState();
        try {
            docEntity.reset(source);
            lexer.reset(docEntity);

whereby the docEntity.reset method does the close.

I tried to fix this by doing a docEntity.close() in the finally.
However, this then causes an NPE in PiccoloSaxLoader.postLoad where it
tries to get the encoding and version from the piccolo parser after the
parse finished. After temporarily disabling these lines, I found that
everything worked OK and I did not have the above exception anymore.

The reason I get this problem is probably specific to the Jetty
situation, as Jetty seems to reuse the same InputStream object between
different requests, and I could work around it by wrapping Jetty's input
stream in a custom input stream which ignores additional close calls,
but it would be nice if this was fixed in XMLBeans. I assume a user can
expect that XMLBeans does not keep references to the inputstream after
the parse finished.

Thanks in advance,

Bruno.

--
Bruno Dumon http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center


------------ END http://www.mail-archive.com/user%40xmlbeans.apache.org/msg00850.html --------
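
The behaviour described in the quoted mail, where the source of one parse is closed only when the next parse starts, can be illustrated with a small stand-alone sketch. `RetainingParser` and `TrackedInputStream` are hypothetical names invented for this illustration; they only model the reported behaviour and are not Piccolo or XMLBeans code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Test stream that records whether close() has been called. */
class TrackedInputStream extends ByteArrayInputStream {
    private boolean closed = false;

    TrackedInputStream(String text) {
        super(text.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

/** Hypothetical stand-in for a parser that keeps its last source until the next parse. */
class RetainingParser {
    private InputStream previous; // reference survives after parse() returns

    /** Reads the source to the end and returns the number of bytes seen. */
    int parse(InputStream source) {
        try {
            if (previous != null) {
                previous.close(); // the previous source is closed only now
            }
            previous = source;
            int count = 0;
            while (source.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class RetainedSourceDemo {
    public static void main(String[] args) {
        RetainingParser parser = new RetainingParser();
        TrackedInputStream first = new TrackedInputStream("<a/>");
        TrackedInputStream second = new TrackedInputStream("<b/>");
        parser.parse(first);
        System.out.println("first closed after first parse: " + first.isClosed());
        parser.parse(second);
        System.out.println("first closed after second parse: " + first.isClosed());
    }
}
```

In this model the first stream is still open when its parse returns and is closed only during the second parse, which is harmless for throwaway streams but not for a container that hands out the same stream object again.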

Comments
Peter lynch added a comment - 18/Nov/05 06:17 AM
One workaround, as stated in the original post, is to wrap the request's BufferedReader in a "NonClosingBufferedReader": override close() as a no-op, and in the servlet's finally block call a forceClose() method, which invokes the real BufferedReader.close() to clean things up.
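
A minimal sketch of such a wrapper, assuming only the two names mentioned in this comment (`NonClosingBufferedReader` and `forceClose()`); everything else, including the demo class, is hypothetical and may differ from what was actually deployed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical wrapper: close() is ignored, forceClose() really closes. */
class NonClosingBufferedReader extends BufferedReader {

    NonClosingBufferedReader(Reader in) {
        super(in);
    }

    /** Ignored on purpose, so a late close() issued by the parser does no harm. */
    @Override
    public void close() {
        // intentionally a no-op
    }

    /** Called by the owner of the reader, for example in a finally block. */
    public void forceClose() throws IOException {
        super.close();
    }
}

class NonClosingReaderDemo {
    /** Issues a stray close(), then shows the reader is still usable. */
    static String readAfterStrayClose(String text) {
        NonClosingBufferedReader reader =
                new NonClosingBufferedReader(new StringReader(text));
        try {
            try {
                reader.close();           // simulates the parser closing its source
                return reader.readLine(); // still readable because close() was ignored
            } finally {
                reader.forceClose();      // the real cleanup, done by the owner
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterStrayClose("<doc/>"));
    }
}
```

The owner of the reader stays responsible for calling forceClose(), so the underlying resource is still released deterministically.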


Chris Hagmann added a comment - 20/Jan/06 04:37 AM
Exactly the same issue here with JAXP 1.3, Tomcat 5.5.12.

I'm not sure about the workaround. It sounds to me as if it would result in memory leaks ...

Andy DePue added a comment - 14/Feb/06 04:15 AM
I'm running into this same issue. I have code that assumes the InputStream is not retained and ensures it is closed once parsing has finished, ultimately leading to an exception the next time something is parsed.

Radu Preotiuc-Pietro added a comment - 15/Feb/06 05:50 AM
I think this one is better left for the Piccolo guys to deal with. I guess though, that I can't see how wrapping it into an "unclosable" stream would lead to more memory leaks than you would have should Piccolo not close the stream.
Also, it feels to me that, in some sense, Tomcat could take care that a "close" does the right thing, so that the fact that the streams are in reality shared is transparent to programmers.

Would you feel better if XmlBeans had an option to wrap any streams that it uses in "unclosable" streams?

Andy DePue added a comment - 15/Feb/06 06:53 AM
I don't know much about Piccolo, but according to what I'm seeing (especially the call stack), it does appear to be their problem to fix. I should note that in my case I'm not reusing streams, but simply ensuring they are closed before returning to the user. I happen to be using a custom InputStream (connected to our back end document repository) that throws an IOException from every method once close() has been called. My code does something similar to this:
<code><pre>
try {
  XmlObject xo = factoryclass.parse(inputStream);
  ...
} finally {
  inputStream.close();
}
</pre></code>
The first time through, everything works fine. However, the second time through (which could be any amount of time later depending on user activity, in a completely separate transaction for another user altogether), Piccolo will attempt to call close() on the previous InputStream! Somewhere it is maintaining a static reference to it and then calling close() <i>way</i> after the fact. Since I didn't have time to dig through Piccolo or XMLBeans source code, I just hacked together a quick and dirty workaround:

<code><pre>
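  // Nested class; needs java.io.IOException, java.io.InputStream and
  // java.lang.ref.WeakReference imported in the enclosing file.
  public static class XmlBeansWorkaroundInputStream extends InputStream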
  {
    private final WeakReference<InputStream> in;

    public XmlBeansWorkaroundInputStream(final InputStream in)
    {
      this.in = new WeakReference<InputStream>(in);
    }
    
    protected InputStream getIn()
    {
      return this.in.get();
    }


    @Override
    public int read() throws IOException
    {
      final InputStream in = getIn();
      if(in != null) {
        return in.read();
      } else {
        return -1;
      }
    }

    @Override
    public int read(byte b[]) throws IOException
    {
      final InputStream in = getIn();
      if(in != null) {
        return in.read(b);
      } else {
        return -1;
      }
    }

    @Override
    public int read(byte b[], int off, int len) throws IOException
    {
      final InputStream in = getIn();
      if(in != null) {
        return in.read(b, off, len);
      } else {
        return -1;
      }
    }

    @Override
    public long skip(long n) throws IOException
    {
      final InputStream in = getIn();
      if(in != null) {
        return in.skip(n);
      } else {
        return 0;
      }
    }

    @Override
    public int available() throws IOException
    {
      final InputStream in = getIn();
      if(in != null) {
        return in.available();
      } else {
        return 0;
      }
    }

    /**
     * Part of the fix for this issue is to completely ignore the close()
     * call - thus, this implementation of close() does not call close() on
     * 'in'.
     * @throws IOException
     */
    @Override
    public void close() throws IOException
    {
      final InputStream in = getIn();
      if(in != null) {
        this.in.clear();
      }
    }

    @Override
    public synchronized void mark(int readlimit)
    {
      final InputStream in = getIn();
      if(in != null) {
        in.mark(readlimit);
      }
    }

    @Override
    public synchronized void reset() throws IOException
    {
      final InputStream in = getIn();
      if(in != null) {
        in.reset();
      }
    }

    @Override
    public boolean markSupported()
    {
      final InputStream in = getIn();
      if(in != null) {
        return in.markSupported();
      } else {
        return false;
      }
    }
  }
</pre></code>

Since I'm keeping a strong reference to the source input stream for as long as I'm actually using it, I decided to use a WeakReference in the class above to avoid memory leaks (since Piccolo will hold on to the reference for an indeterminate amount of time). If the source InputStream is garbage collected then the decorating InputStream gracefully pretends that it has reached the end of stream. This has solved our issue, with regression and integration tests passing.

Richard Garris added a comment - 04/Apr/08 07:02 PM
Hi everyone ---

I found an elegant solution to this issue of InputStream cleanup. Apache Commons IO released v1.4 which includes an AutoCloseInputStream wrapper which cleans up InputStreams once they have been read.

http://commons.apache.org/io/api-release/org/apache/commons/io/input/AutoCloseInputStream.html

We are using this on IBM WAS 6.1 and it appears to resolve this bug.

Does anyone see any issue with using this fix?

Thanks.

Rich
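
For readers without Commons IO at hand, the idea behind such a wrapper can be sketched with the standard library alone: close the wrapped stream as soon as end of stream is reached and make later close() calls harmless. This is an illustration under that assumption, not the Commons IO source; `EofClosingInputStream` and `EofClosingDemo` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical sketch: closes the wrapped stream once end of stream is seen. */
class EofClosingInputStream extends FilterInputStream {
    private boolean released = false;

    EofClosingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (released) {
            return -1;
        }
        int b = in.read();
        if (b == -1) {
            close();
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (released) {
            return -1;
        }
        int n = in.read(buf, off, len);
        if (n == -1) {
            close();
        }
        return n;
    }

    @Override
    public int available() throws IOException {
        return released ? 0 : in.available();
    }

    /** Idempotent: the wrapped stream is closed at most once. */
    @Override
    public void close() throws IOException {
        if (!released) {
            released = true;
            in.close();
        }
    }
}

class EofClosingDemo {
    /** Drains the wrapper, closes it twice, and returns how often the source was closed. */
    static int closeCountAfterDrain(byte[] data) {
        final int[] closes = {0};
        InputStream source = new ByteArrayInputStream(data) {
            @Override
            public void close() {
                closes[0]++;
            }
        };
        EofClosingInputStream in = new EofClosingInputStream(source);
        try {
            while (in.read() != -1) {
                // discard the data; only the close behaviour matters here
            }
            in.close(); // late close from the caller
            in.close(); // late close from a parser that kept the reference
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return closes[0];
    }
}
```

With this shape a delayed close() from the parser never reaches the source a second time. Note that a source which is not read to the end is still closed only by an explicit close().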