  1. Apache NiFi
  2. NIFI-1557

Controller Services and Reporting Tasks not properly ordered in fingerprint verification - makes restart/upgrades difficult

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.3.0, 0.5.0, 0.4.1
    • Fix Version/s: 0.6.0, 0.5.1
    • Component/s: Core Framework
    • Labels: None

    Description

      While upgrading a cluster, and again while restarting nodes in a cluster, we found unreliable behavior: nodes fail to rejoin the cluster. It appears we are not ordering the controller services and reporting tasks before doing fingerprint verification, so the local and cluster fingerprints can differ even though the flows match. This causes unreliable restarts/upgrades.
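
      To illustrate the suspected cause, here is a minimal sketch of an order-independent fingerprint. It is not the NiFi fingerprinting code: the class OrderedFingerprintSketch, the Component type, and the fingerprint method are hypothetical names used only for illustration. The idea is to sort the controller services and reporting tasks by ID before appending them, so two nodes holding the same components in a different order produce the same string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class OrderedFingerprintSketch {

    /** Hypothetical stand-in for a controller service or reporting task entry. */
    public static final class Component {
        final String id;
        final String type;

        public Component(String id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    /**
     * Builds a fingerprint fragment from a sorted copy of the components,
     * so the result does not depend on the order of the input list.
     */
    public static String fingerprint(List<Component> components) {
        List<Component> sorted = new ArrayList<>(components);
        sorted.sort(Comparator.comparing((Component c) -> c.id));

        StringBuilder builder = new StringBuilder();
        for (Component c : sorted) {
            builder.append(c.id).append(c.type);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Same two components, listed in a different order on each side.
        List<Component> local = Arrays.asList(
                new Component("b-2", "ServiceB"),
                new Component("a-1", "ServiceA"));
        List<Component> cluster = Arrays.asList(
                new Component("a-1", "ServiceA"),
                new Component("b-2", "ServiceB"));

        System.out.println(fingerprint(local).equals(fingerprint(cluster)));
    }
}
```

      Without the sort step, the two lists above would yield different strings, which matches the kind of mismatch reported in the traces below.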

      The NCM says:

      ===========

      2016-02-23 19:42:32,466 INFO [Handle Reconnection Failure Message from [id=61103aa8-1226-44e4-827c-b150ac7d4079, apiAddress=processing-2.demo.onyara.com, apiPort=8443, socketAddress=processing-2.demo.onyara.com, socketPort=6000, siteToSiteAddress=processing-2.demo.onyara.com, siteToSitePort=9000]] o.a.n.c.manager.impl.WebClusterManager Node Event: [id=61103aa8-1226-44e4-827c-b150ac7d4079, apiAddress=processing-2.demo.onyara.com, apiPort=8443, socketAddress=processing-2.demo.onyara.com, socketPort=6000, siteToSiteAddress=processing-2.demo.onyara.com, siteToSitePort=9000] – 'Node could not rejoin cluster. Setting node to Disconnected. Node reported the following error: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.'
      2016-02-23 19:42:32,530 INFO [Process NCM Request-5] o.a.n.c.p.impl.SocketProtocolListener Finished processing request 6d4a5f0e-e4e5-47cf-843a-7194660197c8 (type=RECONNECTION_FAILURE, length=654 bytes) in 102 millis
      2016-02-23 19:42:32,530 INFO [Handle Reconnection Failure Message from [id=e5577bb3-290c-4688-be43-78b0b4af06fb, apiAddress=processing-1.demo.onyara.com, apiPort=8443, socketAddress=processing-1.demo.onyara.com, socketPort=6000, siteToSiteAddress=processing-1.demo.onyara.com, siteToSitePort=9000]] o.a.n.c.manager.impl.WebClusterManager Node Event: [id=e5577bb3-290c-4688-be43-78b0b4af06fb, apiAddress=processing-1.demo.onyara.com, apiPort=8443, socketAddress=processing-1.demo.onyara.com, socketPort=6000, siteToSiteAddress=processing-1.demo.onyara.com, siteToSitePort=9000] – 'Node could not rejoin cluster. Setting node to Disconnected. Node reported the following error: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.'
      2016-02-23 19:42:34,875 INFO [Process NCM Request-6] o.a.n.c.p.impl.SocketProtocolListener Received request 1b681e8d-3f68-4be2-9ab2-7134050cce9d from 172.31.4.123
      2016-02-23 19:42:34,908 INFO [Process NCM Request-7] o.a.n.c.p.impl.SocketProtocolListener Received request 5008cbb3-5084-4083-9b56-84bd5ad3c55c from 172.31.4.123
      2016-02-23 19:42:34,994 INFO [Process NCM Request-6] o.a.n.c.p.impl.SocketProtocolListener Finished processing request 1b681e8d-3f68-4be2-9ab2-7134050cce9d (type=BULLETINS, length=1550 bytes) in 118 millis
      2016-02-23 19:42:35,008 INFO [Process NCM Request-7] o.a.n.c.p.impl.SocketProtocolListener Finished processing request 5008cbb3-5084-4083-9b56-84bd5ad3c55c (type=RECONNECTION_FAILURE, length=654 bytes) in 100 millis
      2016-02-23 19:42:35,009 INFO [Handle Reconnection Failure Message from [id=2240c663-2b20-48dc-936d-078a11812d49, apiAddress=processing-3.demo.onyara.com, apiPort=8443, socketAddress=processing-3.demo.onyara.com, socketPort=6000, siteToSiteAddress=processing-3.demo.onyara.com, siteToSitePort=9000]] o.a.n.c.manager.impl.WebClusterManager Node Event: [id=2240c663-2b20-48dc-936d-078a11812d49, apiAddress=processing-3.demo.onyara.com, apiPort=8443, socketAddress=processing-3.demo.onyara.com, socketPort=6000, siteToSiteAddress=processing-3.demo.onyara.com, siteToSitePort=9000] – 'Node could not rejoin cluster. Setting node to Disconnected. Node reported the following error: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.'
      ============

      processing-1 says:

      ==========
      2016-02-23 19:42:32,410 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.
      org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.
      at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:760) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:533) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.StandardFlowService.access$200(StandardFlowService.java:81) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:370) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
      Caused by: org.apache.nifi.controller.UninheritableFlowException: Proposed configuration is not inheritable by the flow controller because of flow differences: Found difference in Flows:
      Local Fingerprint: 9ca96b153807ca8876-6f68-4ba8-9bd7-96febb03cbdbPROCESSORNO_VALUE1a03fca1-fbf2-4029-8c23-1ba2e96f00c81ad4e807-4d92-3475-9e99-e7b5e8b3592aorg.apache.nifi.processors.kafka.PutKafkaNO_VALUEClient NameNiFi-
      Cluster Fingerprint: 9ca96b153807ca8876-6f68-4ba8-9bd7-96febb03cbdbPROCESSORNO_VALUE1a03fca1-fbf2-4029-8c23-1ba2e96f00c818c0ac39-0aef-3df1-8d20-f2c1460536c4org.apache.nifi.processors.kafka.PutKafkaNO_VALUEClient NameNiFi-
      at org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:216) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1285) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:72) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:629) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:737) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      ... 4 common frames omitted
      ==========

      processing-2 says:
      ===========
      Caused by: org.apache.nifi.controller.UninheritableFlowException: Proposed configuration is not inheritable by the flow controller because of flow differences: Found difference in Flows:
      Local Fingerprint: 9ca96b153807ca8876-6f68-4ba8-9bd7-96febb03cbdbPROCESSORNO_VALUE1a03fca1-fbf2-4029-8c23-1ba2e96f00c81ad4e807-4d92-3475-9e99-e7b5e8b3592aorg.apache.nifi.processors.kafka.PutKafkaNO_VALUEClient NameNiFi-
      Cluster Fingerprint: 9ca96b153807ca8876-6f68-4ba8-9bd7-96febb03cbdbPROCESSORNO_VALUE1a03fca1-fbf2-4029-8c23-1ba2e96f00c818c0ac39-0aef-3df1-8d20-f2c1460536c4org.apache.nifi.processors.kafka.PutKafkaNO_VALUEClient NameNiFi-
      at org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:216) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1285) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:72) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:629) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:737) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      ... 4 common frames omitted
      ===========
      processing-3 says:
      ===========
      2016-02-23 19:42:34,896 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.
      org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.
      at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:760) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:533) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.StandardFlowService.access$200(StandardFlowService.java:81) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:370) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at java.lang.Thread.run(Thread.java:745) [na:1.8.0_65]
      Caused by: org.apache.nifi.controller.UninheritableFlowException: Proposed configuration is not inheritable by the flow controller because of flow differences: Found difference in Flows:
      Local Fingerprint: 9ca96b153807ca8876-6f68-4ba8-9bd7-96febb03cbdbPROCESSORNO_VALUE1a03fca1-fbf2-4029-8c23-1ba2e96f00c81ad4e807-4d92-3475-9e99-e7b5e8b3592aorg.apache.nifi.processors.kafka.PutKafkaNO_VALUEClient NameNiFi-
      Cluster Fingerprint: 9ca96b153807ca8876-6f68-4ba8-9bd7-96febb03cbdbPROCESSORNO_VALUE1a03fca1-fbf2-4029-8c23-1ba2e96f00c818c0ac39-0aef-3df1-8d20-f2c1460536c4org.apache.nifi.processors.kafka.PutKafkaNO_VALUEClient NameNiFi-
      at org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:216) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1285) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:72) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:629) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:737) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
      ... 4 common frames omitted

      ===========
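
      The long fingerprint strings in the traces above are hard to compare by eye. As an illustrative aid only (FingerprintDiffSketch and firstDifference are hypothetical names, not part of NiFi), a small helper can report the offset at which two fingerprints first diverge. The strings in main are short excerpts copied from the Local and Cluster fingerprints above.

```java
public class FingerprintDiffSketch {

    /**
     * Returns the index of the first differing character,
     * or -1 if the two strings are identical.
     */
    public static int firstDifference(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        // One string is a prefix of the other, or they are equal.
        return a.length() == b.length() ? -1 : limit;
    }

    public static void main(String[] args) {
        // Excerpts taken from the fingerprints in the traces above.
        String local = "1ba2e96f00c81ad4e807-4d92-3475-9e99-e7b5e8b3592a";
        String cluster = "1ba2e96f00c818c0ac39-0aef-3df1-8d20-f2c1460536c4";

        int index = firstDifference(local, cluster);
        System.out.println("First difference at offset " + index);
        if (index >= 0) {
            System.out.println("Local:   " + local.substring(index));
            System.out.println("Cluster: " + cluster.substring(index));
        }
    }
}
```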

          People

            Assignee: Mark Payne (markap14)
            Reporter: Joe Witt (joewitt)