AXIOM-185

Temporary copies of MTOM attachments are not deleted from the file system in a timely manner

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      When customers send MTOM attachments above a certain size, the Axis2 runtime uses Axiom to write copies of these attachments to disk, named with the pattern AxisXXXXXX.att, where XXXXXX is an arbitrary sequence of digits. These copies may not be deleted in a timely manner and may be removed only when the JVM exits. Because some of these files are quite large, they can accumulate on the customer's file system and consume a significant amount of disk space.

      Note that the internal sizeThreshold property controls whether an attachment is buffered in memory or written to a file on disk.
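
As a rough illustration of the threshold behaviour described above, the following sketch buffers a stream in memory and spills it to a temporary file once it grows past a limit. The class name ThresholdBuffer and its methods are hypothetical and are not Axiom's API; the rule that a part strictly larger than the threshold goes to disk is also an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of threshold-based buffering; not Axiom's implementation. */
final class ThresholdBuffer {
    private final byte[] inMemory; // non-null when the data stayed within the threshold
    private final Path onDisk;     // non-null when the data was spilled to a temp file

    private ThresholdBuffer(byte[] inMemory, Path onDisk) {
        this.inMemory = inMemory;
        this.onDisk = onDisk;
    }

    /** Reads the whole stream, keeping it in memory unless it exceeds sizeThreshold bytes. */
    static ThresholdBuffer read(InputStream in, int sizeThreshold) throws IOException {
        ByteArrayOutputStream head = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            head.write(chunk, 0, n);
            if (head.size() > sizeThreshold) {
                // Too large for memory: copy what was read so far, then the rest, to a file.
                Path file = Files.createTempFile("Axis", ".att");
                try (OutputStream out = Files.newOutputStream(file)) {
                    head.writeTo(out);
                    while ((n = in.read(chunk)) != -1) {
                        out.write(chunk, 0, n);
                    }
                }
                return new ThresholdBuffer(null, file);
            }
        }
        return new ThresholdBuffer(head.toByteArray(), null);
    }

    boolean isOnDisk() { return onDisk != null; }

    /** The temp file, or null when the data is held in memory. */
    Path file() { return onDisk; }

    long size() throws IOException {
        return onDisk != null ? Files.size(onDisk) : inMemory.length;
    }
}
```

In this sketch nobody deletes the file returned by file(); that missing ownership is the problem this issue is about.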

          Activity

          Wendy Raschke added a comment -

          Note that I have a tentative code solution for this, so there is no need for anyone to work on this right now.

          Rich Scheuerle added a comment -

          Hi Wendy,

          I went ahead and assigned this to me. When you have a patch and test ready, please attach it to the JIRA and describe your design.

          Thanks,

          Rich Scheuerle

          Andreas Veithen added a comment -

          Please note that users have reported issues where the opposite happens, i.e. where MTOM attachments are deleted too early. In particular, see [1]. Ideally the fix should also propose a clean solution in these situations.

          [1] http://markmail.org/thread/2x4rbqcd5z4qhdv7

          Rich Scheuerle added a comment -

          Hi Andreas,

          Wendy's solution exposes a timeout property that will ensure that an attachment file is cleaned up after that timeout expires. This is necessary to prevent the file system from being filled with attachment files.

          Your suggestion is to provide what is essentially a "keep until" property. This would ensure that the temporary file is kept at least until the "keep until" time has expired. I think that is a great idea!

          Andreas, can you open a separate JIRA for the "keep until" property and assign it to me or Wendy?

          Wendy, can you work on the "keep until" solution, or perhaps you have a different idea?

          Thanks,

          Rich Scheuerle
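
For readers unfamiliar with the approach, a timeout-based cleanup of the kind described in this comment might look roughly like the sketch below. It is not Wendy's patch, which is not shown in this thread. The class AttachmentSweeper, its sweep method, and the choice of the file's last-modified time as the age measure are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch of a timeout-based sweep; not the patch attached to this issue. */
final class AttachmentSweeper {

    /**
     * Deletes files named Axis*.att in dir whose last-modified time is older
     * than the given timeout, measured back from now. Returns the number deleted.
     */
    static int sweep(Path dir, Duration timeout, Instant now) throws IOException {
        Instant cutoff = now.minus(timeout);
        int deleted = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "Axis*.att")) {
            for (Path file : files) {
                Instant lastModified = Files.getLastModifiedTime(file).toInstant();
                if (lastModified.isBefore(cutoff) && Files.deleteIfExists(file)) {
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```

A sweep like this knows nothing about whether a file is still in use, so a timeout that is too short could cause the "deleted too early" problem mentioned earlier in the thread.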

          Andreas Veithen added a comment -

          Rich,

          I'm not suggesting to provide a "keep until" property, but to clearly identify the root cause of this issue and to implement a solution that fixes it once and for all. A timeout implementation may be a workaround, but IMHO it is not a good solution because it doesn't address the root cause, which is that Axiom/Axis2 is unable to properly manage the lifecycle of the resources it allocates.

          Rich Scheuerle added a comment -

          Andreas,

          Okay, I understand your concern, and I will follow up with more comments later today.

          I agree that the current solution is a necessary fallback when Axis2/Axiom or customer code fails to properly manage the lifecycle of the resource.

          I agree that this fallback is a tactical solution. More work may be necessary in Axiom attachments to (a) understand and (b) correctly manage the lifecycle of file resources.

          Thanks for digging.

          Rich Scheuerle added a comment -

          Scenario A:

          Customer client code receives a web service response whose message contains a small attachment, which is held in memory.
          The customer client code (not the Axiom or Axis2 project) is responsible for the end of life of the received message.
          The customer may even persist or copy the message.
          Even if the customer code is ill-behaved, the garbage collector will eventually free the memory once the message is no longer referenced.

          Scenario B:
          Same scenario as A, except that the message contains a large attachment, so the attachment data is kept in a file.
          If the customer's code makes copies of the message, both copies will still reference the same file.
          If the customer's code is ill-behaved, the garbage collector will eventually free the in-core memory.
          However, there is no guarantee that the file associated with the attachment is deleted. (The message may no longer be referenced, but threads in the system may still be accessing the file contents.)
          Managing when the file can be deleted is difficult.
          Some solutions have already been applied using finalizers and temp delete code, but these approaches are not guaranteed by Java, and there are still opportunities for leaks.

          Scenario C:
          The same as scenario B, except now think of the situation from an enterprise perspective. Vendor1 provided the "ill-behaved customer code".
          Vendor2 purchased the code and is the administrator of a system that contains Vendor1 code. Over a long period of time (days or weeks) under high load,
          the file system of Vendor2 fills. Vendor2 is an administrator and cannot change either Axiom or the Vendor1 code.

          -------------------------------

          The point of this fix is to address the C scenario. Now an administrator can set a property to at least prevent a catastrophic file system failure.

          --------------------------------

          I agree that there are problems with the resource management of Attachments. From my own knowledge, there are some things that could or should be improved.

          All access to the cache file should be done in a way that gives Axiom (Attachments) full knowledge of the access. For example, there is a FileAccessor (good), but
          unfortunately the FileAccessor leaks out information about the file (bad). FileAccessor is not doing its job if it is simply a bridge to the File. It should manage
          access to the file.
          Examples:
          • FileAccessor allows a caller to call getFile() to obtain the File object.
            Ouch, we have just lost control of who is touching the file, so it is hard to know when we can delete it.
            This should be eliminated or done via a wrapper so that we can know about the accesses.
          • FileAccessor getInputStream() returns a FileInputStream (ouch). Once again we give up control to the caller. We don't know the lifetime of the
            FileInputStream, so it is hard to know when it is safe to delete the attachment.

          The solution (or partial solution) is to start digging into the places where the actual File, or access to it, is leaked outside of the Attachment. Once all of these leaks are blocked, we will have more information to know when it is or is not safe to delete the File.

          My thought is that these leaks should be addressed in a separate JIRA.
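
To make the wrapper idea in this comment concrete, here is a minimal sketch of an accessor that never hands out the file itself and counts the streams it has opened. TrackedFile and its methods are hypothetical names and do not reflect the real FileAccessor; the sketch only shows one way the owner could know when deletion is safe.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of an accessor that tracks open streams; not Axiom's FileAccessor. */
final class TrackedFile {
    private final Path file;   // never exposed to callers
    private int openStreams;   // guarded by "this"

    TrackedFile(Path file) {
        this.file = file;
    }

    /** Opens a stream whose close() reports back to this accessor. */
    synchronized InputStream openStream() throws IOException {
        InputStream raw = Files.newInputStream(file);
        openStreams++;
        return new FilterInputStream(raw) {
            private final AtomicBoolean closed = new AtomicBoolean();

            @Override
            public void close() throws IOException {
                try {
                    super.close();
                } finally {
                    // Count each stream once, even if close() is called repeatedly.
                    if (closed.compareAndSet(false, true)) {
                        released();
                    }
                }
            }
        };
    }

    private synchronized void released() {
        openStreams--;
    }

    synchronized int openStreamCount() {
        return openStreams;
    }

    /** Deletes the file only if no stream handed out by this accessor is still open. */
    synchronized boolean deleteIfUnused() throws IOException {
        return openStreams == 0 && Files.deleteIfExists(file);
    }
}
```

A caller that forgets to close its stream would still block deletion here, so a sketch like this narrows the problem rather than removing the need for a fallback.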

          Rich Scheuerle added a comment -

          Marking this issue as resolved.

          My belief is that additional changes to the management of File resources should be addressed by another JIRA.

          Andreas Veithen added a comment -

          Rich,

          There are two points in your analysis that don't seem very convincing to me:

          • If I understand correctly, you're saying that Java doesn't provide enough guarantees to use finalization to reliably clean up temporary files. If finalization is used together with File#deleteOnExit (or a shutdown hook), can you give me an example where this would cause a leak (and where a timeout based solution doesn't)?
          • Your argument about FileAccessor is only relevant if client code can get access to that object. However, I fail to see how you can get from the Attachments object to any of the FileAccessor instances. Can you point me to the code that allows this?
          Andreas Veithen added a comment -

          Answering my own questions (since IBM never responded...):

          • Of course Java guarantees that the finalizer is called when an object is garbage collected. However, one still needs a solution to delete the file when the JVM stops before the object is garbage collected. Since File#deleteOnExit is known to create a (native) memory leak, this needs to be implemented using a shutdown hook.
          • It is indeed not possible for application code to get access to the FileAccessor object. However, the DataHandler objects created by Axiom (for attachment parts buffered on disk) are backed by CachedFileDataSource objects. CachedFileDataSource extends javax.activation.FileDataSource which gives access to the java.io.File object. There is code in Axis2 that explicitly relies on this to clean up the temporary files.
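
The shutdown-hook alternative to File#deleteOnExit mentioned in the first point could be sketched as follows. TempFileRegistry and its methods are hypothetical names, not Axiom or Axis2 code. The assumed design is that a file released early is also removed from the registry, so the set of pending paths does not grow for the lifetime of the JVM.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of shutdown-hook cleanup; not Axiom or Axis2 code. */
final class TempFileRegistry {
    // Files that still need deleting; entries are removed as soon as a file is released.
    private static final Set<Path> PENDING = ConcurrentHashMap.newKeySet();

    static {
        // One hook for all files, registered when the class is first used.
        Runtime.getRuntime().addShutdownHook(
                new Thread(TempFileRegistry::deleteAll, "temp-file-cleanup"));
    }

    private TempFileRegistry() { }

    /** Creates a temp file and remembers it for deletion at JVM shutdown. */
    static Path create() throws IOException {
        Path file = Files.createTempFile("Axis", ".att");
        PENDING.add(file);
        return file;
    }

    /** Deletes the file now (for example from explicit cleanup) and forgets it. */
    static void release(Path file) throws IOException {
        if (PENDING.remove(file)) {
            Files.deleteIfExists(file);
        }
    }

    static int pendingCount() {
        return PENDING.size();
    }

    /** Deletes every file that was never released; invoked by the shutdown hook. */
    static void deleteAll() {
        for (Path file : PENDING) {
            try {
                Files.deleteIfExists(file);
            } catch (IOException ignored) {
                // Best effort: nothing useful can be done this late in shutdown.
            }
            PENDING.remove(file);
        }
    }
}
```

The hook only runs on an orderly JVM shutdown, so files can still be left behind after a crash or a forced kill.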

            People

            • Assignee: Rich Scheuerle
            • Reporter: Wendy Raschke