Uploaded image for project: 'James Server'
  1. James Server
  2. JAMES-1394

Mails stuck in queue/spool

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 3.0-beta3
    • None
    • Queue, Remote Delivery
    • None
    • James 3.0 Beta3
      linux 2.6.24-23-xen #1 SMP Mon Jan 26 03:09:12 UTC 2009 x86_64 GNU/Linux

    Description

      Issue created based on mailing list discussion here:
      http://www.mail-archive.com/server-user@james.apache.org/msg13171.html

      Citing the mail content:

      First a short background info, what our company would like to achieve using James. We have several clients that need mass-mailing capability, thus we want to create a system that allows us to track these mails and especially store errors from bounces (also by using VERP), so we can block unreachable addresses. I created a few custom mailets to handle those requirements and a custom RemoteDelivery implementation for creating VERP style sender addresses. The storage is all done in a MySQL database via JPA. Currently we use it for one client which is not sending huge amounts of mails, around 80,000 per month.

      The system works well, but it happens after a bigger batch of mails (around 4,000 mails) that the mail queueing system gets somehow "stuck". The SMTP server is able to retrieve new mails, but somehow they don't get send out, until I restart the server. Sometimes a flush() on queue/spool helps, but most of the times a restart is necessary.

      I added logging to my mailets when they start and finish to see if there is a problem, but they seem to run fine (no exceptions). It seems that the mails that arrive after James gets stuck get processed normally by the mailets, but somehow are not dequeued and sent out.

      Server info
      James 3.0 Beta3
      linux 2.6.24-23-xen #1 SMP Mon Jan 26 03:09:12 UTC 2009 x86_64 GNU/Linux

      Frankly I'm not sure what is the best way to debug this kind of problem with James. I could find a few other threads with people describing similar problems, but I couldn't get closer to the problem.

      Here some data from JMX that I took while the server was in "stuck" status.

      =======================================================
      INFO org.apache.james:type=component,name=queue,queue=outgoing
      =======================================================
      Size == 80
      =======================================================
      INFO org.apache.james:type=component,name=queue,queue=spool
      =======================================================
      Size == 0
      =======================================================
      INFO org.apache.james:type=component,component=mailetcontainer,name=mailspooler
      =======================================================
      CurrentSpoolCount == 0
      ThreadCount == 20
      =======================================================
      INFO org.apache.james:type=component,component=mailetcontainer,name=mailspooler,sub-type=threadpool,threadpool=dequeuer
      =======================================================
      TotalTasks == 0
      AverageTaskTime == 0.0
      ActiveThreads == 2
      ActiveTasks == 2
      QueuedTasks == 0
      MaximalThreads == 2
      =======================================================
      INFO org.apache.james:type=component,component=mailetcontainer,name=mailspooler,sub-type=threadpool,threadpool=spooler
      =======================================================
      TotalTasks == 4653
      AverageTaskTime == 145.0
      ActiveThreads == 20
      ActiveTasks == 0
      QueuedTasks == 0
      MaximalThreads == 20

      I also created a jstack output, as it was suggested in one of the threads I found. I can post it here as well if it could be helpful (I didn't do it now, because the file is quite large).

      This is the logging output when a new mail is sent while the queue is stuck (IgnoreBlockedMailet and SendAttemptLogMailet are my custom mailets):

      DEBUG 03:54:46,819 | james.mailspooler | ==== Begin processing mail Mail1322625286815-aa8034cb-4d50-4044-926f-10bbe8b637e5====
      DEBUG 03:54:46,819 | james.mailprocessor | Call MailProcessor root
      INFO 03:54:46,827 | james.mailetcontext | IgnoreBlockedMailet called - recipients=[t.eichmann@eggsist.com]
      INFO 03:54:46,847 | james.mailetcontext | IgnoreBlockedMailet finished - recipients=[t.eichmann@eggsist.com]
      DEBUG 03:54:46,847 | james.mailprocessor | Call MailProcessor transport
      INFO 03:54:46,863 | james.mailetcontext | SendAttemptLogMailet called - recipients=[t.eichmann@eggsist.com]
      INFO 03:54:46,927 | james.mailetcontext | SendAttemptLogMailet finished - recipients=[t.eichmann@eggsist.com]
      DEBUG 03:54:46,931 | james.mailprocessor | End of mailetprocessor for state root reached

      What is missing is this part, which usually follows when the server is functioning (at least follows after a few minutes if the server is busy).

      INFO 03:54:46,943 | james.mailetcontext | Attempting delivery of Mail1322620955282-2b903c2d-8378-4a10-848b-6ebd489182bf-to-eggsist.com to host ASPMX.L.GOOGLE.com. at 209.85.225.26 from bounce+t.eichmann=eggsist.com@???.com for addresses [t.eichmann@eggsist.com]
      INFO 03:54:47,671 | james.mailetcontext | Mail (Mail1322620955282-2b903c2d-8378-4a10-848b-6ebd489182bf-to-eggsist.com) sent successfully to ASPMX.L.GOOGLE.com. at 209.85.225.26 from bounce+t.eichmann=eggsist.com@???.com for [t.eichmann@eggsist.com]

      There are a couple of mails that get re-queued 25 times (default limit for attempts) due to DNS issues (like no MX entry for a domain etc.) until they are bounced in the system. But I assume James should be able to handle such mails and new ones in the queue without problems.

      Is there a way to examine mails that are queued and their status? Like how many times are they already being re-queued, why are they still in the queue etc.? I'm really not sure how I can debug this issue as I'm new to James. Any help is very appreciated.

      Attachments

        1. out.txt
          147 kB
          garvice
        2. jstack-output-stuck.txt
          153 kB
          Thomas Eichmann

        Activity

          People

            Unassigned Unassigned
            teichm_eggsist Thomas Eichmann
            Votes:
            2 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated: