  1. Sling
  2. SLING-2874

JcrResourceProvider.create/JobManagerImpl.writeJob can cause inconsistent behavior

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: JCR Resource 2.2.8
    • Fix Version/s: JCR Resource 2.3.0
    • Component/s: JCR
    • Labels: None

    Description

      I have been debugging an issue with Sling jobs and finally found that the current implementation of JobManagerImpl.writeJob can cause inconsistent behaviour (the root cause is JcrResourceProvider.create).

      Issue that led to this:
      I observed that sometimes not all of my event properties were written to the job node, even though the job node itself was created. Ultimately JobManager would fail with the following message:
      17.05.2013 16:41:57.578 WARN [Apache Sling Job Background Loader] org.apache.sling.event.impl.jobs.JobManagerImpl Discarding job - job topic is missing : /var/eventing/jobs/assigned/826cd21a-6a8f-48cb-b112-768b421af572/slingevent:eventadmin/2013/5/17/16/39/com.adobe.cq.collection.update.job_826cd21a-6a8f-48cb-b112-768b421af572_2
      Sometimes my job handler would be called, but the event would not have all the properties that I sent to JobManager.
      Problem:
      There was an issue in my code that added a property with an invalid key (i.e. /a/b/c/a.txt) to the event, and JcrResourceProvider cannot persist it. Hence the issue. This is fine, I can correct it.

      But the main problem is that this persistence error was never reported in the error logs, and the job got persisted even though JcrResourceProvider.create threw a PersistenceException. The job was created with fewer properties than I intended. As a result, my JobHandler was sometimes called without all the properties it needed.

      While debugging, I found that JobManagerImpl.writeJob can cause inconsistent behaviour due to the way ResourceUtil.getOrCreateResource and JcrResourceProvider.create interact.

      In this case the following happened:
      JcrResourceProvider.create threw a PersistenceException (PE) while persisting the property, but the node had already been created by this time.
      ResourceUtil.getOrCreateResource caught the PE, checked for the existence of the resource, found it, and hence ignored the exception.
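
The sequence above can be reproduced with a small in-memory model. This is only a sketch of the failure mode as described in this report, not the Sling code: NonAtomicStore, PersistenceFailure, and the "topic" property name are hypothetical stand-ins, and the exception is unchecked here for brevity.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a persistence error; unchecked for brevity.
class PersistenceFailure extends RuntimeException {
    PersistenceFailure(String message) { super(message); }
}

// Hypothetical in-memory model; none of this is the Sling or JCR API.
class NonAtomicStore {
    // path -> properties of the node stored at that path
    final Map<String, Map<String, Object>> nodes = new HashMap<>();

    // Mimics the reported behaviour: the node is added first, then the
    // properties one by one, so an invalid key leaves a partial node behind.
    void create(String path, Map<String, Object> props) {
        Map<String, Object> node = new LinkedHashMap<>();
        nodes.put(path, node);
        for (Map.Entry<String, Object> e : props.entrySet()) {
            if (e.getKey().contains("/")) {
                throw new PersistenceFailure("Invalid property name: " + e.getKey());
            }
            node.put(e.getKey(), e.getValue());
        }
    }

    // Mimics the get-or-create pattern: the failure is ignored whenever the
    // node exists afterwards, which hides the partial write from the caller.
    Map<String, Object> getOrCreate(String path, Map<String, Object> props) {
        if (nodes.containsKey(path)) {
            return nodes.get(path);
        }
        try {
            create(path, props);
        } catch (PersistenceFailure pe) {
            if (!nodes.containsKey(path)) {
                throw pe;
            }
        }
        return nodes.get(path);
    }
}
```

In this model, calling getOrCreate with the invalid key ahead of "topic" returns a node without any exception, and that node has no "topic" property, which matches the "job topic is missing" symptom.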

      The above implementation is wrong: either JcrResourceProvider should ensure that the operation is atomic, or ResourceUtil.getOrCreateResource should be changed to revert the changes in case of an exception.

      I think that JcrResourceProvider should remove the node if the addition of properties fails.
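
A minimal sketch of that idea, again on a hypothetical in-memory model rather than the real provider: AtomicStore is an invented name, an IllegalArgumentException stands in for the persistence error, and how the actual fix is implemented in JcrResourceProvider is not shown here.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model; not the Sling or JCR API.
class AtomicStore {
    // path -> properties of the node stored at that path
    final Map<String, Map<String, Object>> nodes = new HashMap<>();

    // Creates the node and its properties as one unit: if any property
    // cannot be written, the freshly added node is removed again before
    // the error is rethrown, so no partial node remains visible.
    void create(String path, Map<String, Object> props) {
        Map<String, Object> node = new LinkedHashMap<>();
        nodes.put(path, node);
        try {
            for (Map.Entry<String, Object> e : props.entrySet()) {
                if (e.getKey().contains("/")) {
                    throw new IllegalArgumentException("Invalid property name: " + e.getKey());
                }
                node.put(e.getKey(), e.getValue());
            }
        } catch (RuntimeException ex) {
            nodes.remove(path); // roll back the partially written node
            throw ex;
        }
    }
}
```

With the rollback in place, a get-or-create style caller that checks for existence after the failure no longer finds the node, so it has no reason to swallow the exception.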

      Attachments

        1. SLING-2874
          1 kB
          Amit Gupta

          People

            Assignee: Carsten Ziegeler (cziegeler)
            Reporter: Amit Gupta (amitxlnc)
            Votes: 0
            Watchers: 2