Mahout
  1. Mahout
  2. MAHOUT-788

ClusterDumper is never flushing output stream

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Trivial Trivial
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 0.6
    • Component/s: Clustering, Integration
    • Labels:
      None

      Description

      ClusterDumper utility never calls flush on the OutputStreamWriter. As of issue https://issues.apache.org/jira/browse/MAHOUT-679, the output stream is never being closed when it defaults to System.out – while that's a good thing, it would be nice to flush the stream before exiting the program. As is, when I run cluster dumper with the -b (substring) option set to something like 50, the stream never gets big enough to overflow the default buffer on my machine, so I see no output. Even when it does get big enough to overflow the buffer, I still miss the last cluster's summary. When the output is written to a file, the close() method usually flushes the buffer by default, but it shouldn't hurt to call the flush method either way – therefore I'd suggest adding in an unconditional call to writer.flush(); in the finally block just before conditionally closing the writer. (line 199 in the org.apache.mahout.utils.clustering.ClusterDumper.run(String[] args) method)

      } finally {
      writer.flush();
      if (shouldClose)

      { Closeables.closeQuietly(writer); }

      }

        Activity

        Hide
        Jeff Hansen added a comment -

        By the way, the bug fix that initiated this behavior (https://issues.apache.org/jira/browse/MAHOUT-679) also applies to SequenceFileDumper and VectorDumper, but was never fixed there (so the output is being flushed when the stream is unconditionally closed).

        Show
        Jeff Hansen added a comment - By the way, the bug fix that initiated this behavior ( https://issues.apache.org/jira/browse/MAHOUT-679 ) also applies to SequenceFileDumper and VectorDumper, but was never fixed there (so the output is being flushed when the stream is unconditionally closed).
        Hide
        Hudson added a comment -

        Integrated in Mahout-Quality #990 (See https://builds.apache.org/job/Mahout-Quality/990/)
        MAHOUT-788 flush writer before close

        srowen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1158594
        Files :

        • /mahout/trunk/integration/src/main/java/org/apache/mahout/utils/clustering/ClusterDumper.java
        Show
        Hudson added a comment - Integrated in Mahout-Quality #990 (See https://builds.apache.org/job/Mahout-Quality/990/ ) MAHOUT-788 flush writer before close srowen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1158594 Files : /mahout/trunk/integration/src/main/java/org/apache/mahout/utils/clustering/ClusterDumper.java
        Hide
        Sean Owen added a comment -

        I agree with flush()-ing within the main block. That smells better to me and also still addresses the reported issue.

        Show
        Sean Owen added a comment - I agree with flush()-ing within the main block. That smells better to me and also still addresses the reported issue.
        Hide
        Ted Dunning added a comment -

        Lance,

        That idiom is hideous. catch Throwable is almost always a bug.

        Better by far to say

           try {
              ... stuff ...
           } finally {
             try {
               writer.flush();
             } finally {
               if (shouldClose) {
                 Closeables.closeQuietly(writer);
               }
             }
        

        Pushing the flush into the first try clause might be reasonable as well. If the first part blows out, we don't much care if we flush all data (that might be wrong anyway), but we probably do care to close files. Thus,

           try {
             ... stuff ...
             writer.flush();
           } finally {
             if (shouldClose) {
               Closeables.closeQuietly(writer);
             }
           }
        

        might be a better form all round.

        Show
        Ted Dunning added a comment - Lance, That idiom is hideous. catch Throwable is almost always a bug. Better by far to say try { ... stuff ... } finally { try { writer.flush(); } finally { if (shouldClose) { Closeables.closeQuietly(writer); } } Pushing the flush into the first try clause might be reasonable as well. If the first part blows out, we don't much care if we flush all data (that might be wrong anyway), but we probably do care to close files. Thus, try { ... stuff ... writer.flush(); } finally { if (shouldClose) { Closeables.closeQuietly(writer); } } might be a better form all round.
        Hide
        Lance Norskog added a comment -

        A finally clause can throw exceptions in the middle.

        finally {
            try { writer.flush(); } catch (Throwable e) {;}
            if (shouldClose) {
                Closeables.closeQuietly(writer);
            }
        }
        
        Show
        Lance Norskog added a comment - A finally clause can throw exceptions in the middle. finally { try { writer.flush(); } catch (Throwable e) {;} if (shouldClose) { Closeables.closeQuietly(writer); } }

          People

          • Assignee:
            Sean Owen
            Reporter:
            Jeff Hansen
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              Original Estimate - 0.5h
              0.5h
              Remaining:
              Remaining Estimate - 0.5h
              0.5h
              Logged:
              Time Spent - Not Specified
              Not Specified

                Development