Cassandra / CASSANDRA-2967

Only bind JMX to the same IP address that is being used in Cassandra

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Not a Problem
    • Fix Version/s: None
    • Component/s: Tools
    • Labels:

      Description

      The setup has 5 nodes in each data center, all running on one physical test machine. Even though the repair was run against the correct IP, the wrong JMX port was used. As a result, instead of repairing all 5 nodes, I repaired the same node 5 times.

      It would be nice if Cassandra's JMX bound only to the IP address on which its thrift/RPC services are listening, instead of binding to all IPs on the box.

      1. cassandra-0.8-2967.txt
        8 kB
        Alex Araujo
      2. cassandra-1.0-2967-v2.txt
        8 kB
        Alex Araujo
      3. cassandra-1.0-2967-v3.txt
        9 kB
        Alex Araujo
      4. cassandra-1.0-2967-v4.txt
        10 kB
        Alex Araujo

        Activity

        Brandon Williams added a comment -

        The interface and port that JMX binds to can be set in cassandra-env.sh by changing JMX_PORT and the following line:

        # JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=<public name>"
        
        Nick Bailey added a comment -

        I'm pretty sure that setting doesn't control the interface JMX binds to, only the interface JMX tells a client to connect to when it does its crazy 2-connection communication dance.

        I'd say this is still a valid ticket. There seems to be an example of how to do it here:

        http://vafer.org/blog/20061010091658/
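
A minimal sketch of the general technique discussed here, assuming the JMX connector is started from application code instead of through the `com.sun.management.jmxremote.*` system properties. The names `BoundJmxServer`, `SingleAddressFactory` and `start` are hypothetical and are not taken from the linked post or from any attached patch. The idea is to hand RMI a server socket factory that binds to one address, and to use it for both the registry and the connector server.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.RMIServerSocketFactory;
import java.util.HashMap;
import java.util.Map;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;
import javax.management.remote.rmi.RMIConnectorServer;

class BoundJmxServer {
    /** Creates RMI server sockets that listen on a single address only. */
    static final class SingleAddressFactory implements RMIServerSocketFactory {
        private final InetAddress bindAddress;

        SingleAddressFactory(InetAddress bindAddress) {
            this.bindAddress = bindAddress;
        }

        @Override
        public ServerSocket createServerSocket(int port) throws IOException {
            // backlog 0 means "use the platform default"
            return new ServerSocket(port, 0, bindAddress);
        }

        // RMI compares socket factories, so equal factories should be equal objects.
        @Override
        public boolean equals(Object o) {
            return o instanceof SingleAddressFactory
                    && bindAddress.equals(((SingleAddressFactory) o).bindAddress);
        }

        @Override
        public int hashCode() {
            return bindAddress.hashCode();
        }
    }

    private static Registry registry; // kept so the registry stays referenced

    /** Starts a registry and a JMX connector server, both bound to {@code host}. */
    static JMXConnectorServer start(String host, int registryPort, int serverPort)
            throws IOException {
        // Address that RMI embeds in the stubs it hands to clients. Assumption:
        // this only takes effect if it is set before RMI is first used in the JVM.
        System.setProperty("java.rmi.server.hostname", host);

        RMIServerSocketFactory ssf = new SingleAddressFactory(InetAddress.getByName(host));
        registry = LocateRegistry.createRegistry(registryPort, null, ssf);

        Map<String, Object> env = new HashMap<>();
        env.put(RMIConnectorServer.RMI_SERVER_SOCKET_FACTORY_ATTRIBUTE, ssf);

        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi://" + host + ":" + serverPort
                + "/jndi/rmi://" + host + ":" + registryPort + "/jmxrmi");
        JMXConnectorServer server = JMXConnectorServerFactory.newJMXConnectorServer(
                url, env, ManagementFactory.getPlatformMBeanServer());
        server.start();
        return server;
    }
}
```

Calling `BoundJmxServer.start("127.0.0.2", 7200, 7100)` would use the address and ports that appear later in this thread; they are illustrative values only.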

        Brandon Williams added a comment -

        Fair enough.

        Norman Maurer added a comment -

        We have something similar in JAMES. I will write up a patch for it.

        Jackson Chung added a comment -

        I wouldn't say it is lhf; this is quite important in terms of network security.

        Not adding much value here, just a reference similar to the above blog post (this one is from Sun's official page).

        http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html

        See the "Monitoring Applications through a Firewall" section if tl;dr.

        Jonathan Ellis added a comment -

        (lhf just means it's relatively straightforward, not that it's not valuable.)

        Jonathan Ellis added a comment -

        Norman, are you still planning to tackle this?

        Alex Araujo added a comment -

        Added JMXRemoteListener that binds to listen_address with default/configurable ports

        Alex Araujo added a comment -

        Made default RMI server and registry ports 3000 and 3001, respectively. We may want to select different values if these are likely to cause conflicts.

        Jonathan Ellis added a comment -

        I'd like to not mess with 0.8 here. Does this apply to 1.0?

        Alex Araujo added a comment - edited

        It does, but v2 has better defaults.

        Radim Kolar added a comment -

        I do not like that it binds to the Cassandra storage IP. It will break our configuration. At least make it configurable so we can set it back to 0.0.0.0.

        Jonathan Ellis added a comment -

        I'm okay with breaking niche configurations to make life better for the vast majority. Is there a technical reason same-ip is unusable for you?

        Radim Kolar added a comment -

        We have a network dedicated to Cassandra.

        Alex Araujo added a comment -

        If we're going to add a jmx_listen_address and the current default is 0.0.0.0, we might as well leave it as the default. Any objections?

        Radim Kolar added a comment -

        Yes, that will work well.

        Alex Araujo added a comment -

        v3 adds jmx_listen_address with default of 0.0.0.0
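
As a rough illustration of the default discussed above, and not the patch's actual code, an optional jmx_listen_address setting could be resolved along these lines. The class `JmxBindAddress` and its `resolve` method are hypothetical names; only the option name and the 0.0.0.0 default come from this thread.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class JmxBindAddress {
    /** Wildcard address: listen on every interface, the behavior before the patch. */
    static final String DEFAULT = "0.0.0.0";

    /** Returns the configured address, or the wildcard when the option is unset. */
    static InetAddress resolve(String configured) throws UnknownHostException {
        String value = (configured == null || configured.trim().isEmpty())
                ? DEFAULT
                : configured.trim();
        // For a literal IP address this only validates the format; no DNS lookup is made.
        return InetAddress.getByName(value);
    }
}
```

With this shape, leaving the option commented out keeps the listen-on-all-interfaces behavior, while setting it to a single address restricts JMX to that interface.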

        Jonathan Ellis added a comment -

        Back up, though. I haven't heard a reason yet as to why we should add jmx_listen_address.

        Radim Kolar added a comment -

        Because if you have a network dedicated to Cassandra with private IPs, then you cannot connect to it from the internet to do nodetool/jconsole operations.

        Jonathan Ellis added a comment -

        So basically, this bypasses the security you've set up, but that's a feature?

        Radim Kolar added a comment -

        The dedicated network is there for performance reasons. The feature is being able to use jconsole from a Windows desktop instead of running Ubuntu in vmplayer for remote X display.

        It's up to every user to firewall their JMX_LISTEN_ADDRESS. It's no different from running Cassandra now, which listens on all IPs.

        Yuki Morishita added a comment -

        A few comments on the patch:

        • If you leave -Dcom.sun.management.jmxremote.port=$JMX_PORT inside cassandra-env.sh, the JVM still exposes JMX on all interfaces.
        • I applied the patch to 1.0, removed the above option from env.sh, and accessed JMX via nodetool, and I get the following error. Am I missing something?
        # Inside patched cassandra.yaml, I set the following
        jmx_listen_address: 127.0.0.2
        jmx_registry_port: 7200
        jmx_server_port: 7100
        
        $ bin/nodetool -h 127.0.0.2 -p 7100 ring
        Error connection to remote JMX agent!
        java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.CommunicationException [Root exception is java.rmi.NoSuchObjectException: no such object in table]
        	at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:340)
        	at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:248)
        	at org.apache.cassandra.tools.NodeProbe.connect(NodeProbe.java:143)
        	at org.apache.cassandra.tools.NodeProbe.<init>(NodeProbe.java:113)
        	at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:585)
        Caused by: javax.naming.CommunicationException [Root exception is java.rmi.NoSuchObjectException: no such object in table]
        	at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:101)
        	at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:185)
        	at javax.naming.InitialContext.lookup(InitialContext.java:392)
        	at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1888)
        	at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1858)
        	at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:257)
        	... 4 more
        Caused by: java.rmi.NoSuchObjectException: no such object in table
        	at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:255)
        	at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:233)
        	at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:359)
        	at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source)
        	at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:97)
        	... 9 more
        
        Alex Araujo added a comment -

        Good catch on cassandra-env.sh - I did not test with bin/cassandra. If you remove the option from env.sh and use jmx_registry_port in the nodetool command instead of jmx_server_port (i.e., -p 7200) you should see the output.
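
To illustrate why the registry port is the one to pass, here is a small sketch of the two JMX service URL shapes involved. It assumes nodetool connects with a URL of the first form (only the registry is addressed); the class `JmxUrls` and its method names are hypothetical.

```java
import java.net.MalformedURLException;
import javax.management.remote.JMXServiceURL;

class JmxUrls {
    /**
     * Short form: the client asks the RMI registry at host:registryPort for the
     * "jmxrmi" stub. The RMI server port is learned from that stub, so it never
     * appears in the URL.
     */
    static JMXServiceURL viaRegistry(String host, int registryPort)
            throws MalformedURLException {
        return new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + registryPort + "/jmxrmi");
    }

    /**
     * Long form: the first host:port names the RMI server port, the second one
     * (inside the JNDI path) names the registry.
     */
    static JMXServiceURL withServerPort(String host, int serverPort, int registryPort)
            throws MalformedURLException {
        return new JMXServiceURL(
                "service:jmx:rmi://" + host + ":" + serverPort
                + "/jndi/rmi://" + host + ":" + registryPort + "/jmxrmi");
    }
}
```

With the values from the configuration above, `viaRegistry("127.0.0.2", 7200)` points at the registry, whereas putting 7100 in that position asks the RMI server port for a registry lookup it cannot answer, which is consistent with the NoSuchObjectException in the stack trace.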

        Alex Araujo added a comment -

        v4 removes conflicting -Dcom.sun.management.jmxremote.port system property from cassandra-env.sh and preserves the current RMI registry default port value.

        Without any changes to the default cassandra.yaml file the following will work:

        nodetool ring
        nodetool -h localhost ring
        
        JConsole Remote Process
        service:jmx:rmi://localhost:7299/jndi/rmi://localhost:7199/jmxrmi
        
        Vijay added a comment -

        Please note: the approach in http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html enables the RMI server, which will in turn call System.gc every 60 min. I have tried it before and reverted because of that (see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6200091). You might want to use the DisableExplicitGC GC option, or you might want to increase -Dsun.rmi.dgc.server.gcInterval=600000, both of which might be kind of dangerous depending on how people use it.

        Vijay added a comment -

        https://github.com/Vijay2win/jmxProxy is what I use to connect jconsole to the remote VM.

        Alex Araujo added a comment -

        Are you sure that it's System.gc? I thought RMI had a reference counting based "Distributed Garbage Collector" that invokes a clean method just for remote objects.

        Also, AFAIK, starting a JMXConnectorServer from a dynamically loaded javaagent is the same as starting one from compiled code in an application (i.e., the same DGC behavior applies to the JVM running the server regardless of how it's started). If you don't notice a GC impact when running your JmxProxy, chances are the built-in JMXConnectorServer code will not have a noticeable impact as well.

        Nevertheless, I think this warrants additional testing to verify my assumptions. I'll post my results when I get a chance to test this.

        Alex Araujo added a comment -

        Actually, this seems to confirm my assumptions: http://java.sun.com/developer/onlineTraining/rmi/exercises/DistributedGarbageCollector/index.html

        Vijay added a comment -

        Yes, I have seen this happening in Java 6; not sure about Java 7. Adding an agent will not fix the System.gc issue, but it will make it configurable for those for whom it is OK (I mentioned the same in the README on GitHub).
        You can enable the GC log and look for the full GCs; DGC is the one that forces one every hour.

        Alex Araujo added a comment -

        Vijay, you are correct - glad you chimed in. I ran the patch with -Dsun.rmi.dgc.server.gcInterval=60000 and this was the output:

        2011-11-17T21:06:03.832-0600: 2.809: [Full GC 29495K->1075K(1028096K), 0.0484004 secs]
        2011-11-17T21:07:03.919-0600: 62.894: [Full GC 73330K->2885K(1028096K), 0.0668785 secs]
        2011-11-17T21:08:03.993-0600: 122.967: [Full GC 16014K->2824K(1028096K), 0.0584933 secs]
        2011-11-17T21:09:04.056-0600: 183.029: [Full GC 15953K->2823K(1028096K), 0.0548553 secs]
        2011-11-17T21:10:04.118-0600: 243.089: [Full GC 17583K->2823K(1028096K), 0.0513944 secs]
        2011-11-17T21:11:04.175-0600: 303.145: [Full GC 24671K->2858K(1028096K), 0.0547190 secs]
        2011-11-17T21:12:04.236-0600: 363.205: [Full GC 21430K->2826K(1028096K), 0.0535503 secs]
        2011-11-17T21:13:04.295-0600: 423.263: [Full GC 17574K->2823K(1028096K), 0.0539739 secs]
        2011-11-17T21:14:04.355-0600: 483.322: [Full GC 17589K->2822K(1028096K), 0.0500845 secs]
        2011-11-17T21:15:04.412-0600: 543.377: [Full GC 19214K->2822K(1028096K), 0.0578777 secs]
        

        -XX:+DisableExplicitGC does not disable full GCs (only explicit System.gc calls, as the name implies), and increasing gcInterval does seem a bit dangerous. At best, we can enable the code in the patch if the values are uncommented in cassandra.yaml, but I'm not sure that's any more useful than the current options.
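
The spacing of the Full GC entries in the log above can be checked mechanically. The sketch below is a hypothetical helper (`FullGcIntervals` is not part of Cassandra or the patch) that assumes the log format shown in this comment, where the JVM uptime in seconds precedes `[Full GC`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FullGcIntervals {
    // Captures the JVM-uptime field (seconds) that sits right before "[Full GC".
    private static final Pattern FULL_GC =
            Pattern.compile(":\\s+(\\d+\\.\\d+):\\s+\\[Full GC");

    /** Returns the gaps, in seconds, between consecutive Full GC log lines. */
    static List<Double> intervals(List<String> lines) {
        List<Double> gaps = new ArrayList<>();
        Double previous = null;
        for (String line : lines) {
            Matcher m = FULL_GC.matcher(line);
            if (!m.find()) {
                continue; // not a Full GC entry
            }
            double uptime = Double.parseDouble(m.group(1));
            if (previous != null) {
                gaps.add(uptime - previous);
            }
            previous = uptime;
        }
        return gaps;
    }
}
```

Applied to the ten lines above, every gap comes out at roughly 60 seconds, which matches the 60000 ms gcInterval used for the test run.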

        Yuki Morishita added a comment -

        The patch works fine for the basic use case (binding the agent to a specified address), but we cannot use SSL or password-based auth any more, which the out-of-the-box JMX agent supports via system properties.
        AFAIK you have to implement those in JmxRemoteListener, like the one described below.

        http://docs.oracle.com/javase/6/docs/technotes/guides/management/agent.html#gdfvv

        I don't know how many people need that functionality (SSL/auth), but since the default JMX agent supports it, we should add it too.

        I think it would be better to provide a javaagent version of the module, like the one Vijay implemented, with SSL/auth support. Or maybe just give a pointer to his module somewhere?
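
For a sense of what re-implementing password auth involves when the connector server is started from code, here is a minimal sketch using the standard `JMXAuthenticator` hook. The class `SimplePasswordAuthenticator` and the in-memory user map are assumptions for illustration only; a real implementation would more likely read a password file as the out-of-the-box agent does, and SSL would need socket factories on top of this.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.management.remote.JMXAuthenticator;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXPrincipal;
import javax.security.auth.Subject;

class SimplePasswordAuthenticator implements JMXAuthenticator {
    private final Map<String, String> users; // user name -> password (illustrative only)

    SimplePasswordAuthenticator(Map<String, String> users) {
        this.users = new HashMap<>(users);
    }

    @Override
    public Subject authenticate(Object credentials) {
        // Standard JMX clients send credentials as a String[] {user, password}.
        if (!(credentials instanceof String[]) || ((String[]) credentials).length != 2) {
            throw new SecurityException("Credentials must be a String[2]");
        }
        String[] pair = (String[]) credentials;
        String expected = users.get(pair[0]);
        if (expected == null || !expected.equals(pair[1])) {
            throw new SecurityException("Invalid user name or password");
        }
        return new Subject(true,
                Collections.singleton(new JMXPrincipal(pair[0])),
                Collections.emptySet(),
                Collections.emptySet());
    }

    /** Builds the environment map to pass when creating the connector server. */
    static Map<String, Object> connectorEnv(Map<String, String> users) {
        Map<String, Object> env = new HashMap<>();
        env.put(JMXConnectorServer.AUTHENTICATOR, new SimplePasswordAuthenticator(users));
        return env;
    }
}
```

The map returned by `connectorEnv` would be passed as the environment argument of `JMXConnectorServerFactory.newJMXConnectorServer`, which is the step the system properties normally handle for the built-in agent.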

        Jonathan Ellis added a comment -

        Alex, are you still working on this?

        Alex Araujo added a comment -

        I haven't looked at this since Nov. Perhaps I should have unassigned it. Is this still something that's needed? I can take a stab at finishing/testing the missing options if there's a preference on -javaagent vs cassandra.yaml.

        Yuki Morishita added a comment -

        In my opinion, this is better done outside of Cassandra, since there is already a way to achieve the goal using Vijay's javaagent when needed. Or, when people need both security and binding to a specific IP address, they can follow the blog post here: https://blogs.oracle.com/jmxetc/entry/jmx_connecting_through_firewalls_using

        Jonathan Ellis added a comment -

        Resolving as Not a Problem, then.


          People

          • Assignee:
            Unassigned
            Reporter:
            Joaquin Casares
            Reviewer:
            Yuki Morishita
          • Votes:
            0
            Watchers:
            5