Ambari / AMBARI-25576

Primary key duplication error during flushing alerts from alerts cache

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.5
    • Fix Version/s: 2.7.6
    • Component/s: ambari-server
    • Labels: None

      Description

      On clusters with many hosts and alert caching enabled, commits occasionally fail with errors like the following:

      2020-10-09 19:53:14,444 ERROR [alert-event-bus-4] AmbariJpaLocalTxnInterceptor:180 - [DETAILED ERROR] Rollback reason: 
      Local Exception Stack: 
      Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException
      Internal Exception: java.sql.BatchUpdateException: Batch entry 1 INSERT INTO alert_history (alert_id, alert_instance, alert_label, alert_state, alert_text, alert_timestamp, cluster_id, component_name, host_name, service_name, alert_definition_id) VALUES (15363461, NULL, 'DataNode Web UI', 'OK', 'HTTP 200 response in 0.000s', 1602286496756, 2, 'DATANODE', 'host1', 'HDFS', 53) was aborted: ERROR: duplicate key value violates unique constraint "pk_alert_history"
        Detail: Key (alert_id)=(15363461) already exists.  Call getNextException to see other errors in the batch.
      Error Code: 0
      Call: INSERT INTO alert_history (alert_id, alert_instance, alert_label, alert_state, alert_text, alert_timestamp, cluster_id, component_name, host_name, service_name, alert_definition_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
      	bind => [11 parameters bound]
      

      This does not happen often, but each occurrence produces extensive logging. It can also cause other rare problems, so it should be fixed.

      The root cause is a shared cache that can be updated with a freshly merged value before that value is actually committed to the DB. Another thread (from CachedAlertFlushService or AlertEventPublisher) can then try to merge the same, already merged entity.
      For example, suppose a new AlertHistoryEntity is created and set on an existing AlertCurrentEntity. A first thread starts a transaction, merges the current entity into the persistence context, saves the merged value to the cache, and is paused. A second thread then merges the entire contents of the cache, including the just-updated current entity. Now two transactions both believe they must update the current entity and create the new history entity, and one of them fails with the duplicate key error.
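
The interleaving described above can be replayed deterministically in a single thread. The following is a minimal sketch, not Ambari's code and not the actual fix: `AlertCacheRaceSketch`, `History`, `HistoryTable`, and `Tx` are hypothetical stand-ins, and `Tx.merge` only imitates the relevant behavior of a JPA merge (queue an INSERT when the row is not yet committed). It contrasts publishing the merged value to the shared cache before the commit with publishing it after the commit, which is one possible way to avoid the problem under these assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; none of these types are Ambari or JPA classes.
final class AlertCacheRaceSketch {

    /** Stand-in for AlertHistoryEntity: only the primary key matters here. */
    static final class History {
        final long alertId;
        History(long alertId) { this.alertId = alertId; }
    }

    /** Stand-in for the alert_history table and its primary key constraint. */
    static final class HistoryTable {
        private final Set<Long> committedIds = new HashSet<>();

        boolean contains(long alertId) { return committedIds.contains(alertId); }

        void insert(long alertId) {
            if (!committedIds.add(alertId)) {
                throw new IllegalStateException("duplicate key: alert_id=" + alertId);
            }
        }
    }

    /** Stand-in for a transaction: merge() queues an INSERT for rows not yet committed. */
    static final class Tx {
        private final HistoryTable table;
        private final List<Long> pendingInserts = new ArrayList<>();

        Tx(HistoryTable table) { this.table = table; }

        void merge(History history) {
            if (!table.contains(history.alertId)) {
                pendingInserts.add(history.alertId);
            }
        }

        void commit() {
            for (long alertId : pendingInserts) {
                table.insert(alertId);
            }
        }
    }

    /** Replays the interleaving; returns true if the second commit hits a duplicate key. */
    static boolean duplicateKeyOccurs(boolean publishBeforeCommit) {
        HistoryTable table = new HistoryTable();
        Map<String, History> cache = new HashMap<>();
        History history = new History(15363461L);

        // First thread: start a transaction and merge the new history entity.
        Tx first = new Tx(table);
        first.merge(history);
        if (publishBeforeCommit) {
            cache.put("host1", history); // visible to others before the commit
        }

        // First thread is paused; second thread merges everything in the cache.
        Tx second = new Tx(table);
        for (History cached : cache.values()) {
            second.merge(cached);
        }

        // First thread resumes and commits.
        first.commit();
        if (!publishBeforeCommit) {
            cache.put("host1", history); // published only after the commit
        }

        try {
            second.commit();
            return false;
        } catch (IllegalStateException duplicate) {
            return true;
        }
    }
}
```

In this model, `AlertCacheRaceSketch.duplicateKeyOccurs(true)` returns `true` because both transactions queue an INSERT for the same key, while `duplicateKeyOccurs(false)` returns `false` because the second transaction never sees the uncommitted entity.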

              People

              • Assignee: Dmytro Grinenko (hapylestat)
              • Reporter: Dmytro Vitiuk (dvitiiuk)

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 10m