Uploaded image for project: 'ZooKeeper'
  1. ZooKeeper
  2. ZOOKEEPER-4246

Resource leaks in org.apache.zookeeper.server.persistence.SnapStream#getInputStream and #getOutputStream

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • None
    • server

    Description

       There are three (related) possible resource leaks in the `getInputStream` and `getOutputStream` methods in `SnapStream.java`. I noticed the first because of the use of the error-prone `GZIPOutputStream`, and the other two after looking at the surrounding code.

      Here is the offending code (copied from here):

          /**
           * Return the CheckedInputStream based on the extension of the fileName.
           *
           * @param file the file the InputStream read from
           * @return the specific InputStream
           * @throws IOException
           */
          public static CheckedInputStream getInputStream(File file) throws IOException {
              FileInputStream fis = new FileInputStream(file);
              InputStream is;
              switch (getStreamMode(file.getName())) {
              case GZIP:
                  is = new GZIPInputStream(fis);
                  break;
              case SNAPPY:
                  is = new SnappyInputStream(fis);
                  break;
              case CHECKED:
              default:
                  is = new BufferedInputStream(fis);
              }
              return new CheckedInputStream(is, new Adler32());
          }
      
          /**
           * Return the OutputStream based on predefined stream mode.
           *
           * @param file the file the OutputStream writes to
           * @param fsync sync the file immediately after write
           * @return the specific OutputStream
           * @throws IOException
           */
          public static CheckedOutputStream getOutputStream(File file, boolean fsync) throws IOException {
              OutputStream fos = fsync ? new AtomicFileOutputStream(file) : new FileOutputStream(file);
              OutputStream os;
              switch (streamMode) {
              case GZIP:
                  os = new GZIPOutputStream(fos);
                  break;
              case SNAPPY:
                  os = new SnappyOutputStream(fos);
                  break;
              case CHECKED:
              default:
                  os = new BufferedOutputStream(fos);
              }
              return new CheckedOutputStream(os, new Adler32());
          }
      

      All three possible resource leaks are caused by the constructors of the intermediate streams (i.e. `is` and `os`), some of which might throw `IOException`s:

      • in `getOutputStream`, the call to `new GZIPOutputStream` can throw an exception, because `GZIPOutputStream` writes out the header in the constructor. If it does throw, then `fos` is never closed. That it does so makes it hard to use correctly; someone raised this as an issue with the JDK folks here, but they closed it as "won't fix" because the constructor is documented to throw (hence the need to catch the exception here).
      • in `getInputStream`, the call to `new GZIPInputStream` can throw an `IOException` for a similar reason, causing the file handle held by `fis` to leak.
      • similarly, the call to `new SnappyInputStream` can throw an `IOException`, because it tries to read the file header during construction, which also causes `fis` to leak. `SnappyOutputStream` cannot throw; I checked here.

      I'll submit a PR with a (simple) fix shortly after this bug report goes up and gets assigned an issue number, and add a link to this issue.

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              kelloggm Martin Kellogg
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 2h 20m
                  2h 20m