Details

      Description

      We want to avoid a security exploit for our users. We need to make sure we ship 2.2 UDFs with good defaults so someone exposing it to the internet accidentally doesn't open themselves up to having arbitrary code run.

      1. 9402-post-disable.txt
        2 kB
        Robert Stupp
      2. 9402-warning.txt
        2 kB
        Jonathan Ellis

        Issue Links

          Activity

          Hide
          tjake T Jake Luciani added a comment -

          We should also be sandboxing the code, restricting access to things like opening files and sockets http://www.jayway.com/2014/06/13/sandboxing-plugins-in-java/

          Show
          tjake T Jake Luciani added a comment - We should also be sandboxing the code, restricting access to things like opening files and sockets http://www.jayway.com/2014/06/13/sandboxing-plugins-in-java/
          Hide
          snazy Robert Stupp added a comment -

          As discussed on IRC:

          1. 2.2 beta1 will get a flag to enable UDFs (defaults to disabled) - CASSANDRA-9404
          2. 2.2 rc1 will get a proper security manager setup
          Show
          snazy Robert Stupp added a comment - As discussed on IRC: 2.2 beta1 will get a flag to enable UDFs (defaults to disabled) - CASSANDRA-9404 2.2 rc1 will get a proper security manager setup
          Hide
          snazy Robert Stupp added a comment - - edited

          I ran a standard perf test on cstar to compare "pure C*" against "C* with a security manager w/ just AllPermission.

          Performance regression for writes is about 3% and for 1% for writes.

          Background: unfortunately it's only possible to use one "monolithic" SecurityManager in the whole VM. I found no way to use a security manager just during the execution of UDFs. The additional "critical paths" traveled for checking permissions is java.security.AccessController#checkPermission and java.security.AccessControlContext#checkPermission. (Permissions (ProtectionDomain) are "attached" to classes not to threads.)

          EDIT: It's of course possible to turn off the security manager (and probably disable UDFs) - so leaving the "enable UDFs" config options makes sense.

          Show
          snazy Robert Stupp added a comment - - edited I ran a standard perf test on cstar to compare "pure C*" against "C* with a security manager w/ just AllPermission . Performance regression for writes is about 3% and for 1% for writes. Background: unfortunately it's only possible to use one "monolithic" SecurityManager in the whole VM. I found no way to use a security manager just during the execution of UDFs. The additional "critical paths" traveled for checking permissions is java.security.AccessController#checkPermission and java.security.AccessControlContext#checkPermission . (Permissions ( ProtectionDomain ) are "attached" to classes not to threads.) EDIT: It's of course possible to turn off the security manager (and probably disable UDFs) - so leaving the "enable UDFs" config options makes sense.
          Hide
          snazy Robert Stupp added a comment -

          Current branch adds security manager for Java + script UDFs.
          It works against Java7 + Java8 (Rhino/Nashorn).

          Had to add some system properties so that these can be properly replaced in the policy files files:

          • cassandra.triggers to grant permissions to triggers
          • cassandra.home to access the various directories (e.g. build/lib/jars)
          • cassandra.lib to grant permissions to default libraries except other JSR223 implementations

          conf/cassandra.policy is for production use - test/conf/cassandra-ide.policy for unit tests and (in limited form) for use in IDEA and Eclipse. Limited use, because both IDEs add some own code that cannot be easily covered in a security policy file (just because the path of that code is "unknown").

          Permissions for UDFs is basically nothing (hardcoded). But it would be possible to raise permissions by adding some URL scheme for UDFs like udf:keyspace/function - this could make it possible to specify permissions for individual UDFs at the cost of rolling out policy files and restarting nodes or adding our own permission syntax e.g. to CREATE FUNCTION. But I’d like to leave that for another ticket.

          The only exception for no permissions is Nashorn, which requires some runtime permissions to work (pass the existing unit tests - especially UFTest.testJavascriptTupleTypeCollection): nashorn.JavaReflection + accessClassInPackage.jdk.internal.dynalink.support + accessClassInPackage.jdk.nashorn.internal.runtime.linker.
          To be clear: I’m not convinced on adding these permissions - but they seem to be neccessary - somehow related to what Nashorn does internally (but Rhino doesn’t).

          I did not check with other JSR223 implementations - but script compilation and execution is guarded with AccessController.doPrivileged().

          Links:

          Show
          snazy Robert Stupp added a comment - Current branch adds security manager for Java + script UDFs. It works against Java7 + Java8 (Rhino/Nashorn). Had to add some system properties so that these can be properly replaced in the policy files files: cassandra.triggers to grant permissions to triggers cassandra.home to access the various directories (e.g. build/lib/jars) cassandra.lib to grant permissions to default libraries except other JSR223 implementations conf/cassandra.policy is for production use - test/conf/cassandra-ide.policy for unit tests and (in limited form) for use in IDEA and Eclipse. Limited use, because both IDEs add some own code that cannot be easily covered in a security policy file (just because the path of that code is "unknown"). Permissions for UDFs is basically nothing (hardcoded). But it would be possible to raise permissions by adding some URL scheme for UDFs like udf:keyspace/function - this could make it possible to specify permissions for individual UDFs at the cost of rolling out policy files and restarting nodes or adding our own permission syntax e.g. to CREATE FUNCTION . But I’d like to leave that for another ticket. The only exception for no permissions is Nashorn, which requires some runtime permissions to work (pass the existing unit tests - especially UFTest.testJavascriptTupleTypeCollection ): nashorn.JavaReflection + accessClassInPackage.jdk.internal.dynalink.support + accessClassInPackage.jdk.nashorn.internal.runtime.linker . To be clear: I’m not convinced on adding these permissions - but they seem to be neccessary - somehow related to what Nashorn does internally (but Rhino doesn’t). I did not check with other JSR223 implementations - but script compilation and execution is guarded with AccessController.doPrivileged() . Links: Troubeshooting Security Policy Files Nashorn Java Reflection
          Hide
          jbellis Jonathan Ellis added a comment -

          Permissions for UDFs is basically nothing (hardcoded). But it would be possible to raise permissions by adding some URL scheme for UDFs like udf:keyspace/function

          IIRC the approach postgresql takes is to distinguish between "trusted" (sandboxed) and "untrusted" (anything goes) UDF languages. Rather than finer grained sandboxing, I'd suggest just providing different CQL permissions for trusted and untrusted, i.e. if you have CREATE FUNCTION permission that allows you to create sandboxed UDF, but you can only create unsandboxed if you have CREATE UNTRUSTED.

          Show
          jbellis Jonathan Ellis added a comment - Permissions for UDFs is basically nothing (hardcoded). But it would be possible to raise permissions by adding some URL scheme for UDFs like udf:keyspace/function IIRC the approach postgresql takes is to distinguish between "trusted" (sandboxed) and "untrusted" (anything goes) UDF languages. Rather than finer grained sandboxing, I'd suggest just providing different CQL permissions for trusted and untrusted, i.e. if you have CREATE FUNCTION permission that allows you to create sandboxed UDF, but you can only create unsandboxed if you have CREATE UNTRUSTED.
          Hide
          iamaleksey Aleksey Yeschenko added a comment -

          How does Posgres handle permissions in that scenario?

          I'm not a fan of this to be honest. Allowing creation/usage of anything-goes UDFs essentially gives you full access to the DB, and should require superuser status.

          Show
          iamaleksey Aleksey Yeschenko added a comment - How does Posgres handle permissions in that scenario? I'm not a fan of this to be honest. Allowing creation/usage of anything-goes UDFs essentially gives you full access to the DB, and should require superuser status.
          Hide
          snazy Robert Stupp added a comment -

          Updated the branch with working Java UDF sandboxing and class white/blacklisting.
          Java UDFs have no permissions. Additionally it's necessary to restrict the classes that a UDF can access (e.g. access internal C* code, create a Cluster instance from Java driver, etc). Both are very restrictive. It is not even possible to access a harmless class (for example in org.apache.commons.lang). Only "safe" classes in java.lang/math/text/util and the Java Driver as defined in UDFunction can be accessed.

          NB: Script stuff not checked thoroughly yet. But scripted UDFs make me sad. The test UFTest.testJavascriptTupleTypeCollection fails without the permissions getProtectionDomain, nashorn.JavaReflection, accessClassInPackage.jdk.internal.dynalink.support, accessClassInPackage.jdk.nashorn.internal.runtime.linker with AccessControlException in Nashorn. I have no idea what it does there. Rhino requires no additional permissions (I doubt Rhino really knows about permissions).
          Since JSR223 providers are loaded via our "root" class loader, they'd have full permissions - therefore the policy file "grants" no permissions to optional JSR223 providers by default.

          CREATE FUNCTION permission that allows you to create sandboxed UDF, but you can only create unsandboxed if you have CREATE UNTRUSTED

          Have opened CASSANDRA-9476 for that.

          NOTE: It would be a nice feature for OpsCenter best-practice service to check whether enable_user_defined_functions=false if no security manager is present.

          Show
          snazy Robert Stupp added a comment - Updated the branch with working Java UDF sandboxing and class white/blacklisting. Java UDFs have no permissions. Additionally it's necessary to restrict the classes that a UDF can access (e.g. access internal C* code, create a Cluster instance from Java driver, etc). Both are very restrictive. It is not even possible to access a harmless class (for example in org.apache.commons.lang ). Only "safe" classes in java.lang/math/text/util and the Java Driver as defined in UDFunction can be accessed. NB: Script stuff not checked thoroughly yet. But scripted UDFs make me sad. The test UFTest.testJavascriptTupleTypeCollection fails without the permissions getProtectionDomain , nashorn.JavaReflection , accessClassInPackage.jdk.internal.dynalink.support , accessClassInPackage.jdk.nashorn.internal.runtime.linker with AccessControlException in Nashorn. I have no idea what it does there. Rhino requires no additional permissions (I doubt Rhino really knows about permissions). Since JSR223 providers are loaded via our "root" class loader, they'd have full permissions - therefore the policy file "grants" no permissions to optional JSR223 providers by default. CREATE FUNCTION permission that allows you to create sandboxed UDF, but you can only create unsandboxed if you have CREATE UNTRUSTED Have opened CASSANDRA-9476 for that. NOTE: It would be a nice feature for OpsCenter best-practice service to check whether enable_user_defined_functions=false if no security manager is present.
          Hide
          snazy Robert Stupp added a comment -

          Although it's not fully implemented yet (scripted UDFs), I'd like to get some feedback about the actual implementation - whether I left a big hole open and such things. Any volunteer out there?

          Show
          snazy Robert Stupp added a comment - Although it's not fully implemented yet (scripted UDFs), I'd like to get some feedback about the actual implementation - whether I left a big hole open and such things. Any volunteer out there?
          Hide
          jbellis Jonathan Ellis added a comment -

          PG splits it up into two parts:

          CREATE LANGUAGE and subsequently CREATE FUNCTION.

          Creating an untrusted language always requires superuser mode. Once that is done, creating functions in it requires nothing special.

          Personally I would be fine with this approach, but I think it would be more useful to have the extra permission on creating the function, and also wouldn't require adding explicit CREATE LANGUAGE.

          Show
          jbellis Jonathan Ellis added a comment - PG splits it up into two parts: CREATE LANGUAGE and subsequently CREATE FUNCTION . Creating an untrusted language always requires superuser mode. Once that is done, creating functions in it requires nothing special. Personally I would be fine with this approach, but I think it would be more useful to have the extra permission on creating the function, and also wouldn't require adding explicit CREATE LANGUAGE .
          Hide
          iamaleksey Aleksey Yeschenko added a comment -

          I'm not sure that having whitelist patterns with blacklist exceptions is right here. New classes get added in JDK, and it's easy to miss something unless you perform constant audit. Same goes for the driver as well.

          This exposes way, way too much by default. How much can we cut?

          Show
          iamaleksey Aleksey Yeschenko added a comment - I'm not sure that having whitelist patterns with blacklist exceptions is right here. New classes get added in JDK, and it's easy to miss something unless you perform constant audit. Same goes for the driver as well. This exposes way, way too much by default. How much can we cut?
          Hide
          snazy Robert Stupp added a comment -

          white/blacklisting works like this:

          1. The whitelist is checked whether a class is allowed (either by its package name or its class name). So unknown packages are rejected.
          2. When a matching entry in the whitelist has been found, the blacklist is checked whether it’s a restricted package or class.

          I think you mean that it’s possible to miss a new sub-package of for example java.util. We could of course make the white/blacklist approach a bit more flexible by using regex instead of String.startsWith. Honestly I think that even missing a new package in the java. namespace is quite safe, if a security manager is in place. And using UDFs without a security-manager is prevented.

          We could of course mention all classes explicitly (basically only white-listing classes). But we need to be very careful then to include every (static) inner class, too.

          Unfortunately it’s not possible to setup a separate class loader just for UDFs. Otherwise that separate class loader would have to load the relevant classes on its own - which leads to duplicate classes in the VM and strange error messages like AbstractType cannot be casted AbstractType. Sure, we could use a root class loader having apache-cassandra.jar and cassandra-driver-core.jar as the base for both UDFs and ”the rest of C*”. But I’m not sure whether this really solves the problem or introduces other strange behaviors.

          Show
          snazy Robert Stupp added a comment - white/blacklisting works like this: The whitelist is checked whether a class is allowed (either by its package name or its class name). So unknown packages are rejected. When a matching entry in the whitelist has been found, the blacklist is checked whether it’s a restricted package or class. I think you mean that it’s possible to miss a new sub-package of for example java.util . We could of course make the white/blacklist approach a bit more flexible by using regex instead of String.startsWith . Honestly I think that even missing a new package in the java. namespace is quite safe, if a security manager is in place. And using UDFs without a security-manager is prevented. We could of course mention all classes explicitly (basically only white-listing classes). But we need to be very careful then to include every (static) inner class, too. Unfortunately it’s not possible to setup a separate class loader just for UDFs. Otherwise that separate class loader would have to load the relevant classes on its own - which leads to duplicate classes in the VM and strange error messages like AbstractType cannot be casted AbstractType . Sure, we could use a root class loader having apache-cassandra.jar and cassandra-driver-core.jar as the base for both UDFs and ”the rest of C*”. But I’m not sure whether this really solves the problem or introduces other strange behaviors.
          Hide
          brianmhess Brian Hess added a comment -

          There is another alternative - FENCED UDFs. DB2 and Netezza had these options. Essentially, the UDF will run in a separate process when it is registered as FENCED, and run in-process when it is registered as UNFENCED.

          This doesn't necessarily remove all the issues, but it does help mitigate them/some - especially since it would (optionally) run as another user.

          This could look like the following with Cassandra:

          • FENCED is a GRANTable privilege
          • In cassandra.yaml you can specify the user to use when launching the separate process (so that it is not the same user that is running the database - or optionally is)
          • This is good so that the UDF can't stop the database, delete database files, etc.
          • For FENCED UDFs, IPC would be used to transfer rows to the UDF and to return results. We could use CQL rows for the data. This could be shared memory or sockets (Unux or TPC - slight preference for sockets for some follow-on ideas).
          • Ideally, switching from FENCED to UNFENCED would be just a DDL change. That is, the API would work such that a simple "ALTER FUNCTION myFunction(DOUBLE, DOUBLE) UNFENCED" would change it.
          • If you wanted, because this is a separate process you could use a separate class loader.
          Show
          brianmhess Brian Hess added a comment - There is another alternative - FENCED UDFs. DB2 and Netezza had these options. Essentially, the UDF will run in a separate process when it is registered as FENCED, and run in-process when it is registered as UNFENCED. This doesn't necessarily remove all the issues, but it does help mitigate them/some - especially since it would (optionally) run as another user. This could look like the following with Cassandra: FENCED is a GRANTable privilege In cassandra.yaml you can specify the user to use when launching the separate process (so that it is not the same user that is running the database - or optionally is) This is good so that the UDF can't stop the database, delete database files, etc. For FENCED UDFs, IPC would be used to transfer rows to the UDF and to return results. We could use CQL rows for the data. This could be shared memory or sockets (Unux or TPC - slight preference for sockets for some follow-on ideas). Ideally, switching from FENCED to UNFENCED would be just a DDL change. That is, the API would work such that a simple "ALTER FUNCTION myFunction(DOUBLE, DOUBLE) UNFENCED" would change it. If you wanted, because this is a separate process you could use a separate class loader.
          Hide
          jbellis Jonathan Ellis added a comment -

          Worth thinking about for the future, but -1 on blocking 2.2 for reimplementing with FENCED.

          Show
          jbellis Jonathan Ellis added a comment - Worth thinking about for the future, but -1 on blocking 2.2 for reimplementing with FENCED.
          Hide
          brianmhess Brian Hess added a comment - - edited

          I do think that you may want to leverage FENCED UDFs to accomplish some of the objections/objectives here.
          Either way, I created CASSANDRA-9481 for FENCED UDFs

          Show
          brianmhess Brian Hess added a comment - - edited I do think that you may want to leverage FENCED UDFs to accomplish some of the objections/objectives here. Either way, I created CASSANDRA-9481 for FENCED UDFs
          Hide
          jbellis Jonathan Ellis added a comment -

          I'm also a bit concerned about the whitelist/blacklist approach being error prone.

          What prior art is out there for this? Seems like something people would have tried to solve before.

          Can we label UDF experimental in 2.2 and ship based on the 9404 approach and push this to 3.0?

          Show
          jbellis Jonathan Ellis added a comment - I'm also a bit concerned about the whitelist/blacklist approach being error prone. What prior art is out there for this? Seems like something people would have tried to solve before. Can we label UDF experimental in 2.2 and ship based on the 9404 approach and push this to 3.0?
          Hide
          iamaleksey Aleksey Yeschenko added a comment -

          Can we label UDF experimental in 2.2 and ship based on the 9404 approach and push this to 3.0?

          I will be okay with it, so longer as we communicate this very clearly. Experimentalness is also very limited here - we are pretty certain that the syntax will stay the same, it's just that some functions might become broken for security reasons once the sandbox is in place.

          Show
          iamaleksey Aleksey Yeschenko added a comment - Can we label UDF experimental in 2.2 and ship based on the 9404 approach and push this to 3.0? I will be okay with it, so longer as we communicate this very clearly. Experimentalness is also very limited here - we are pretty certain that the syntax will stay the same, it's just that some functions might become broken for security reasons once the sandbox is in place.
          Hide
          jbellis Jonathan Ellis added a comment -

          Attached

          Show
          jbellis Jonathan Ellis added a comment - Attached
          Hide
          iamaleksey Aleksey Yeschenko added a comment -

          LGTM

          Show
          iamaleksey Aleksey Yeschenko added a comment - LGTM
          Hide
          jbellis Jonathan Ellis added a comment -

          Committed and edited fixver for this ticket to 3.0b1

          Show
          jbellis Jonathan Ellis added a comment - Committed and edited fixver for this ticket to 3.0b1
          Hide
          snazy Robert Stupp added a comment -

          There are some other projects that also execute arbitrary code. Will look around how these projects implement security - and how restrictive their approaches are.
          I'm fine with leaving UDFs experimental in 2.2. Gives a nice amount of time to thoroughly think about and test the sandbox.

          Show
          snazy Robert Stupp added a comment - There are some other projects that also execute arbitrary code. Will look around how these projects implement security - and how restrictive their approaches are. I'm fine with leaving UDFs experimental in 2.2. Gives a nice amount of time to thoroughly think about and test the sandbox.
          Hide
          jbellis Jonathan Ellis added a comment -

          I had a look at postgresql's pl/java source. The SecurityManager is pretty simple: https://github.com/tada/pljava/blob/5489d14073f02c064583f1d4818c2bcea7fee951/pljava/src/main/java/org/postgresql/pljava/internal/Backend.java

          Why can't we do something like that?

          Show
          jbellis Jonathan Ellis added a comment - I had a look at postgresql's pl/java source. The SecurityManager is pretty simple: https://github.com/tada/pljava/blob/5489d14073f02c064583f1d4818c2bcea7fee951/pljava/src/main/java/org/postgresql/pljava/internal/Backend.java Why can't we do something like that?
          Hide
          snazy Robert Stupp added a comment - - edited

          The pl/java SM is too lax. It only prevents System.exit, System.setProperty and setSecurityManager - and does not restrict access to classes.
          Further, it only checks for RuntimePermission and PropertyPermission - that's not everything - a lot of permissions like ExecPermission, NetPermission, LinkPermission, SOcketPermission, SecurityPermission and ReflectPermission are missing. Sure, as of now, we have a "very binary" permission model: either all permissions or no permissions.

          The methods SecurityManager.checkPackageAccess/checkPackageDefinition() are not sufficient to prevent class access - we want people to be able to access java.lang.Long but not java.lang.Runtime.

          With static code blocks, it's easy to execute code with a simple Class.forName() or by just assuming that you've managed to get that static code block through the UDF compile process. To check access to specific classes, you really have to implement a custom class loader. That's what the UDF CL does - it does not know about any class itself - so it's able to check all class names - even those in java.lang. To prevent duplicate classes (e.g. UDFunction cannot be cast to UDFunction it delegates to the "main" class loader.

          EDIT: I would like to implement a custom SM - but mainly for performance reasons since we do not need the flexible permission concept at the moment - it's only "all perms" or "no perms". Well - except for Nashorn...

          Show
          snazy Robert Stupp added a comment - - edited The pl/java SM is too lax. It only prevents System.exit , System.setProperty and setSecurityManager - and does not restrict access to classes. Further, it only checks for RuntimePermission and PropertyPermission - that's not everything - a lot of permissions like ExecPermission , NetPermission , LinkPermission , SOcketPermission , SecurityPermission and ReflectPermission are missing. Sure, as of now, we have a "very binary" permission model: either all permissions or no permissions. The methods SecurityManager.checkPackageAccess/checkPackageDefinition() are not sufficient to prevent class access - we want people to be able to access java.lang.Long but not java.lang.Runtime . With static code blocks, it's easy to execute code with a simple Class.forName() or by just assuming that you've managed to get that static code block through the UDF compile process. To check access to specific classes, you really have to implement a custom class loader. That's what the UDF CL does - it does not know about any class itself - so it's able to check all class names - even those in java.lang . To prevent duplicate classes (e.g. UDFunction cannot be cast to UDFunction it delegates to the "main" class loader. EDIT: I would like to implement a custom SM - but mainly for performance reasons since we do not need the flexible permission concept at the moment - it's only "all perms" or "no perms". Well - except for Nashorn...
          Hide
          tjake T Jake Luciani added a comment -

          Can we label UDF experimental in 2.2 and ship based on the 9404 approach and push this to 3.0?

          Yes so long as we don't remove the .yaml flag disabling the udf feature by default

          Show
          tjake T Jake Luciani added a comment - Can we label UDF experimental in 2.2 and ship based on the 9404 approach and push this to 3.0? Yes so long as we don't remove the .yaml flag disabling the udf feature by default
          Hide
          jbellis Jonathan Ellis added a comment -

          Why can't we do something like [pl/java's SecurityManager]?

          The problem is that SecurityManager is global to the JVM. I don't see any way to make it ClassLoader-specific. So we can't use it to stop UDF from opening sockets and writing to files, without restricting the rest of C* too.

          Now, you can inspect the call stack in a SecurityManager permissions check, so we could do that and only ban the activity if it's coming from a UDF call site, but may not be acceptable for performance.

          I think the other main option is what Brian suggests and spin up a separate JVM for UDF execution. (In which case I would want to split it into the trusted/untrested options we discussed so that users could opt into running in the C* JVM when necessary.)

          Show
          jbellis Jonathan Ellis added a comment - Why can't we do something like [pl/java's SecurityManager] ? The problem is that SecurityManager is global to the JVM. I don't see any way to make it ClassLoader-specific. So we can't use it to stop UDF from opening sockets and writing to files, without restricting the rest of C* too. Now, you can inspect the call stack in a SecurityManager permissions check, so we could do that and only ban the activity if it's coming from a UDF call site, but may not be acceptable for performance. I think the other main option is what Brian suggests and spin up a separate JVM for UDF execution. (In which case I would want to split it into the trusted/untrested options we discussed so that users could opt into running in the C* JVM when necessary.)
          Hide
          snazy Robert Stupp added a comment -

          So we can't use it to stop UDF from opening sockets and writing to files, without restricting the rest of C* too.

          No. Class loaders associate a ProtectionDomain to each class via java.lang.ClassLoader#defineClass(..., ProtectionDomain).
          At runtime (in AccessController.checkPermission) a stack of these {{ProtectionDomain}}s is used to verify permissions.
          Assigning AllPermission to all jars in lib + build/classes and essentially no permissions to UDFs works.

          UDF classes are defined via the UDF class loader - and the UDF class loader is also used for the thread via Thread.setContextClassLoader().

          The pain-point with a SecurityManager is the performance regression of 1% to 3% just because "it's there". I want to experiment with a ThreadLocal in a custom SM to speed that up.

          Another option is probably to spin-off separate threads for UDF execution. These threads could get a complete separate class loader hierarchy. I think doing that in separate threads would cost "only" the time for a context switch. But we could monitor UDF runtime and basically kill a dead UDF.
          We already have serialization effort to convert from and to {{ByteBuffer}}s (arguments and return types).

          (Hope, I expressed it clearly)

          Show
          snazy Robert Stupp added a comment - So we can't use it to stop UDF from opening sockets and writing to files, without restricting the rest of C* too. No. Class loaders associate a ProtectionDomain to each class via java.lang.ClassLoader#defineClass(..., ProtectionDomain) . At runtime (in AccessController.checkPermission ) a stack of these {{ProtectionDomain}}s is used to verify permissions. Assigning AllPermission to all jars in lib + build/classes and essentially no permissions to UDFs works. UDF classes are defined via the UDF class loader - and the UDF class loader is also used for the thread via Thread.setContextClassLoader() . The pain-point with a SecurityManager is the performance regression of 1% to 3% just because "it's there". I want to experiment with a ThreadLocal in a custom SM to speed that up. Another option is probably to spin-off separate threads for UDF execution. These threads could get a complete separate class loader hierarchy. I think doing that in separate threads would cost "only" the time for a context switch. But we could monitor UDF runtime and basically kill a dead UDF. We already have serialization effort to convert from and to {{ByteBuffer}}s (arguments and return types). (Hope, I expressed it clearly)
          Hide
          snazy Robert Stupp added a comment -

          Pushed an update of sandboxing to my branch and wanted to provide some status of the work.

          • Uses a global, custom implementation of a SecurityManager and Policy. The security-manager only performs permission checks for user-defined functions (try-finally with a ThreadLocal) - this eliminates the previously mentioned 3% pref regression. (cstar).
          • The effective Permissions set for UDFs is empty (Java and Javascript). No access to files, sockets, processes, etc.
          • UDFs only have access to a very restricted set of classes.
            • This works very nicely for Java UDFs. Direct use and Class.forName() approaches are prevented. Access to Runtime, Thread and other ”evil” classes is prevented with a compiler error or ClassNotFoundException.
            • Scripted UDFs need to take the SecurityManager.checkPackageAccess() approach (see below), which results in an AccessControlException.
            • So Nashorn unfortunately still allows code like java.lang.Runtime.getRuntime().gc() - but Nashorn does a nice job to prevent bad access in general.
            • Thread starting required some special casing in the sec-mgr (the default permission handling in java.lang.SecurityManager is … well … could be improved). Java SecurityManager is kind of dangerous because it explicitly allows thread creation and modification of all threads that do not belong to the root thread group (source code cite: // just return).

          Notes:

          • I did not test this with other JSR223 implementations. Security heavily depends on the implementation of the scripting engine. All UDFs are executed within a doPriviledged() and the UDF class loader as the thread’s context class loader.
          • Did also not test against Rhino (Java7) due to the tentative "decision" for Java8 for 3.0

          White/blacklisting of classes (class loading):
          I tried to find something that’s better than WB, but honestly could not find anything better. It works as follows:

          1. only whitelisted pattern (startsWith) are allowed - if not found, then it’s not accessible
          2. if the individual pattern matching a whitelist entry is contained in the blacklist, then it’s not accessible
          3. so, patterns matching whitelist and not matching blacklist are allowed

          It is possible to consume a lot of CPU with an endless loop or to force a lot of GCs using Nashorn (java.lang.Runtime.getRuntime().gc()).
          Will work on finding a solution for that (separate pooled threads, adopted thread priority, maximum execution time + UDF blacklisting) - maybe with a separate class loader hierarchy.

          I’m not completely sold on moving UDF execution to a separate process (fenced UDFs).
          TBH I think it adds too much complexity - requires a new IPC/RPC protocol, state management, recovery scenarios, etc. plus child process (state) management + recovery.
          To make it really safe, each individual invocation would have to spawn its own process, that has to be monitored.

          We can probably not prevent UDFs from excessive use of heap space (by accident or bad intention).

          Other implementations:

          • JEE containers rely on a (more or less) standard SecurityManager + Policy implementation for the whole VM. These usually rely on proper class loader separation.
          • Some other projects use user-provided code - either these projects have no or effectively no security/sandboxing.
          Show
          snazy Robert Stupp added a comment - Pushed an update of sandboxing to my branch and wanted to provide some status of the work. Uses a global, custom implementation of a SecurityManager and Policy . The security-manager only performs permission checks for user-defined functions (try-finally with a ThreadLocal ) - this eliminates the previously mentioned 3% pref regression. ( cstar ). The effective Permissions set for UDFs is empty (Java and Javascript). No access to files, sockets, processes, etc. UDFs only have access to a very restricted set of classes. This works very nicely for Java UDFs. Direct use and Class.forName() approaches are prevented. Access to Runtime , Thread and other ”evil” classes is prevented with a compiler error or ClassNotFoundException . Scripted UDFs need to take the SecurityManager.checkPackageAccess() approach (see below), which results in an AccessControlException . So Nashorn unfortunately still allows code like java.lang.Runtime.getRuntime().gc() - but Nashorn does a nice job to prevent bad access in general. Thread starting required some special casing in the sec-mgr (the default permission handling in java.lang.SecurityManager is … well … could be improved). Java SecurityManager is kind of dangerous because it explicitly allows thread creation and modification of all threads that do not belong to the root thread group (source code cite: // just return ). Notes: I did not test this with other JSR223 implementations. Security heavily depends on the implementation of the scripting engine. All UDFs are executed within a doPriviledged() and the UDF class loader as the thread’s context class loader. Did also not test against Rhino (Java7) due to the tentative "decision" for Java8 for 3.0 White/blacklisting of classes (class loading): I tried to find something that’s better than WB, but honestly could not find anything better. It works as follows: only whitelisted pattern ( startsWith ) are allowed - if not found, then it’s not accessible if the individual pattern matching a whitelist entry is contained in the blacklist, then it’s not accessible so, patterns matching whitelist and not matching blacklist are allowed It is possible to consume a lot of CPU with an endless loop or to force a lot of GCs using Nashorn ( java.lang.Runtime.getRuntime().gc() ). Will work on finding a solution for that (separate pooled threads, adopted thread priority, maximum execution time + UDF blacklisting) - maybe with a separate class loader hierarchy. I’m not completely sold on moving UDF execution to a separate process (fenced UDFs). TBH I think it adds too much complexity - requires a new IPC/RPC protocol, state management, recovery scenarios, etc. plus child process (state) management + recovery. To make it really safe, each individual invocation would have to spawn its own process, that has to be monitored. We can probably not prevent UDFs from excessive use of heap space (by accident or bad intention). Other implementations: JEE containers rely on a (more or less) standard SecurityManager + Policy implementation for the whole VM. These usually rely on proper class loader separation. Some other projects use user-provided code - either these projects have no or effectively no security/sandboxing.
          Hide
          jbellis Jonathan Ellis added a comment -

          UDFs only have access to a very restricted set of classes.

          Is this actually necessary given that we're already restricting the Permissions?

          Show
          jbellis Jonathan Ellis added a comment - UDFs only have access to a very restricted set of classes. Is this actually necessary given that we're already restricting the Permissions?
          Hide
          snazy Robert Stupp added a comment -

          [restricted set of classes] necessary given that we're already restricting the Permissions?

          Unfortunately yes. One example is creating a new thread: Although there is a permission to prevent a new thread being started, the actual check only works from the root thread group (e.g. from thread main, see java.lang.SecurityManager.checkAccess(ThreadGroup) and checkAccess(Thread), which say: if (g == rootGroup) checkPermission(SecurityConstants.MODIFY_THREADGROUP_PERMISSION); else // just return) - so you can create and start a new thread from any other thread group - irrespectively how restrictive the actual permission set is (e.g. no permissions). That's a quite annoying thing in Java and was a surprising behavior (for me).

          Show
          snazy Robert Stupp added a comment - [restricted set of classes] necessary given that we're already restricting the Permissions? Unfortunately yes. One example is creating a new thread: Although there is a permission to prevent a new thread being started, the actual check only works from the root thread group (e.g. from thread main , see java.lang.SecurityManager.checkAccess(ThreadGroup) and checkAccess(Thread) , which say: if (g == rootGroup) checkPermission(SecurityConstants.MODIFY_THREADGROUP_PERMISSION); else // just return ) - so you can create and start a new thread from any other thread group - irrespectively how restrictive the actual permission set is (e.g. no permissions). That's a quite annoying thing in Java and was a surprising behavior (for me).
          Hide
          jbellis Jonathan Ellis added a comment -

          Does the same hold true for the other major categories (sockets, files)? Can we just extend SecurityManager to check on all threadgroups?

          Show
          jbellis Jonathan Ellis added a comment - Does the same hold true for the other major categories (sockets, files)? Can we just extend SecurityManager to check on all threadgroups?
          Hide
          snazy Robert Stupp added a comment -

          For threads it would work (and it's in the branch).
          I'm 99% sure, that sockets and files are have pretty good protection via SecurityManager - so g2g for java.*
          The reason why I've added such "class usage prevention" is the C* code base and the libraries.
          And since that was already in place, I also restricted access to "dangerous" java.* classes.

          I also tried to implement a completely separate class loader hierarchy (so one for C* core and one for Java UDFs just with the Java UDFs and the supporting Java Driver classes). But it just did not work due to indirect class references by the Java Driver itself. Modifying the Java Driver and possibly restricting its future development for sandboxing felt somewhat ugly.

          Show
          snazy Robert Stupp added a comment - For threads it would work (and it's in the branch). I'm 99% sure, that sockets and files are have pretty good protection via SecurityManager - so g2g for java.* The reason why I've added such "class usage prevention" is the C* code base and the libraries. And since that was already in place, I also restricted access to "dangerous" java .* classes. I also tried to implement a completely separate class loader hierarchy (so one for C* core and one for Java UDFs just with the Java UDFs and the supporting Java Driver classes). But it just did not work due to indirect class references by the Java Driver itself. Modifying the Java Driver and possibly restricting its future development for sandboxing felt somewhat ugly.
          Hide
          jbellis Jonathan Ellis added a comment -

          Sounds reasonable. Can you take review T Jake Luciani?

          Show
          jbellis Jonathan Ellis added a comment - Sounds reasonable. Can you take review T Jake Luciani ?
          Hide
          tjake T Jake Luciani added a comment - - edited

          Overall, This is an improvement. We spoke offline and addressed a potential issue with user_function_timeout_policy. Since a Stop-the-world GC could happen during execution of the UDF.

          I'd like to get a professional opinion on this work, since I'm not convinced you couldn't, for example, access "/etc/passwd" via Nashorn (since nio is whitelisted).

          Show
          tjake T Jake Luciani added a comment - - edited Overall, This is an improvement. We spoke offline and addressed a potential issue with user_function_timeout_policy. Since a Stop-the-world GC could happen during execution of the UDF. I'd like to get a professional opinion on this work, since I'm not convinced you couldn't, for example, access "/etc/passwd" via Nashorn (since nio is whitelisted).
          Hide
          iamaleksey Aleksey Yeschenko added a comment -

          Is it giving the user access to Keyspace, Schema, and related classes? Is it still possible with this sandbox to bypass our authz permissions this way?

          Show
          iamaleksey Aleksey Yeschenko added a comment - Is it giving the user access to Keyspace , Schema , and related classes? Is it still possible with this sandbox to bypass our authz permissions this way?
          Hide
          snazy Robert Stupp added a comment -

          Already pushed a commit to solve the long GC issue.

          Re-checked what happens, if java.net and java.no are not available to scripted UDFs - and it now works. Probably due to changes in the Java Driver.
          Anyway - pushed another commit that removes these packages.

          And +10 to get more opinions about this patch (for Java + JavaScript).

          Show
          snazy Robert Stupp added a comment - Already pushed a commit to solve the long GC issue. Re-checked what happens, if java.net and java.no are not available to scripted UDFs - and it now works. Probably due to changes in the Java Driver. Anyway - pushed another commit that removes these packages. And +10 to get more opinions about this patch (for Java + JavaScript).
          Hide
          snazy Robert Stupp added a comment -

          access to Keyspace, Schema, and related classes?

          no access to these classes (and thus no way to bypass authz perms)

          Show
          snazy Robert Stupp added a comment - access to Keyspace, Schema, and related classes? no access to these classes (and thus no way to bypass authz perms)
          Hide
          jbellis Jonathan Ellis added a comment -

          nio is whitelisted, but my understanding is that's only checked if the SecurityManager approves. All i/o (file, socket) is prohibited there.

          Show
          jbellis Jonathan Ellis added a comment - nio is whitelisted, but my understanding is that's only checked if the SecurityManager approves. All i/o (file, socket) is prohibited there.
          Hide
          tjake T Jake Luciani added a comment -

          One other issue is the fact that we don't seal our jars. So someone could implement a bad method in a whitelisted package name

          http://docs.oracle.com/javase/tutorial/deployment/jar/sealman.html

          Show
          tjake T Jake Luciani added a comment - One other issue is the fact that we don't seal our jars. So someone could implement a bad method in a whitelisted package name http://docs.oracle.com/javase/tutorial/deployment/jar/sealman.html
          Hide
          snazy Robert Stupp added a comment -

          All java UDFs land in org.apache.cassandra.cql3.functions.JavaBasedUDFunction#GENERATED_PACKAGE (org.apache.cassandra.cql3.udf.gen) - so it's not a big problem.
          But I'll cross check, whether Java-UDFs could access each other.

          Show
          snazy Robert Stupp added a comment - All java UDFs land in org.apache.cassandra.cql3.functions.JavaBasedUDFunction#GENERATED_PACKAGE ( org.apache.cassandra.cql3.udf.gen ) - so it's not a big problem. But I'll cross check, whether Java-UDFs could access each other.
          Hide
          snazy Robert Stupp added a comment -

          Pushed two more commits:

          • some more tests (against NIO classes)
          • separate package for each UDF with random component in the name + added random part to class name
          Show
          snazy Robert Stupp added a comment - Pushed two more commits: some more tests (against NIO classes) separate package for each UDF with random component in the name + added random part to class name
          Hide
          snazy Robert Stupp added a comment -

          Pushed another commit, that fixes the failing dtests.
          These failed because they (thankfully) only execute a JavaScript UDF (so unveiled a bug in this patch, if a JavaScript UDF is executed first).
          This commit also fixes the related issues and adds a separate utest to test that (only JavaScript UDFs).
          It also maintains restricted access to java.nio and java.net - but has to initialize the driver classes to ensure that.

          Show
          snazy Robert Stupp added a comment - Pushed another commit, that fixes the failing dtests. These failed because they (thankfully) only execute a JavaScript UDF (so unveiled a bug in this patch, if a JavaScript UDF is executed first). This commit also fixes the related issues and adds a separate utest to test that (only JavaScript UDFs). It also maintains restricted access to java.nio and java.net - but has to initialize the driver classes to ensure that.
          Hide
          tjake T Jake Luciani added a comment -

          +1

          Show
          tjake T Jake Luciani added a comment - +1
          Hide
          snazy Robert Stupp added a comment -

          Thanks!
          Committed as 5790b4a44ba85e7e8ece64613d9e6a1b737a6cde

          Show
          snazy Robert Stupp added a comment - Thanks! Committed as 5790b4a44ba85e7e8ece64613d9e6a1b737a6cde
          Hide
          iamaleksey Aleksey Yeschenko added a comment -

          The patch enables UDFs by default. I thought the plan was to still leave them off by default (but when on, have better protection).

          Show
          iamaleksey Aleksey Yeschenko added a comment - The patch enables UDFs by default. I thought the plan was to still leave them off by default (but when on, have better protection).
          Hide
          snazy Robert Stupp added a comment -

          Hm - thought we disable them in 2.2 (since experimental) and enable in 3.0 (since we have a sandbox)

          Show
          snazy Robert Stupp added a comment - Hm - thought we disable them in 2.2 (since experimental) and enable in 3.0 (since we have a sandbox)
          Hide
          iamaleksey Aleksey Yeschenko added a comment -

          Hm - thought we disable them in 2.2 (since experimental) and enable in 3.0 (since we have a sandbox)

          The sandbox will need to pass the test of time first, before we can just enable UDFs by default. FWIW, I've looked at the code - multiple times - and it seems fine. But I wouldn't trust myself (obviously), Jake, you, or even all of us combined, to get it 100% right on the first try.

          The way I see it, for now, is that the sandbox makes enabling UDFs an easier choice, by making it safer. But I would still strongly prefer them to be off by default, at least until 4.0.

          Show
          iamaleksey Aleksey Yeschenko added a comment - Hm - thought we disable them in 2.2 (since experimental) and enable in 3.0 (since we have a sandbox) The sandbox will need to pass the test of time first, before we can just enable UDFs by default. FWIW, I've looked at the code - multiple times - and it seems fine. But I wouldn't trust myself (obviously), Jake, you, or even all of us combined, to get it 100% right on the first try. The way I see it, for now, is that the sandbox makes enabling UDFs an easier choice, by making it safer. But I would still strongly prefer them to be off by default, at least until 4.0.
          Hide
          snazy Robert Stupp added a comment -

          Well, I'm pretty sure, nobody can make it 100% - there's always a way to break things. But we can make it good enough to prevent probably 99.9% of user mistakes.

          Understand your cautiousness. Enabling UDFs in 4.0 sounds like a nice plan.
          Attached patch to disable UDFs by default.

          Show
          snazy Robert Stupp added a comment - Well, I'm pretty sure, nobody can make it 100% - there's always a way to break things. But we can make it good enough to prevent probably 99.9% of user mistakes. Understand your cautiousness. Enabling UDFs in 4.0 sounds like a nice plan. Attached patch to disable UDFs by default.
          Hide
          iamaleksey Aleksey Yeschenko added a comment -

          We could relax the constraint on Java UDFs sooner than that. Script-based sandboxing that we have is not exactly secure, though (and there are no easy ways to fix it that I see), so that will probably have to stay forever gated.

          Show
          iamaleksey Aleksey Yeschenko added a comment - We could relax the constraint on Java UDFs sooner than that. Script-based sandboxing that we have is not exactly secure, though (and there are no easy ways to fix it that I see), so that will probably have to stay forever gated.
          Hide
          slebresne Sylvain Lebresne added a comment -

          But I would still strongly prefer them to be off by default, at least until 4.0.

          I don't have strong feeling and I'm fine with not defaulting them on for 3.0, though if we get more confident about this before 4.0, I don't think we should feel obliged to wait for a major version to make the on by default.

          Anyway, just wanted to make sure we at least don't call them experimental anymore for 3.0. It's ok to say we prefer having it opt-in for now because we're not entirely confident on our protections and you should assert the risk for yourself, but "expiremental" sounds like "don't use it in production it probably doesn't work".

          Show
          slebresne Sylvain Lebresne added a comment - But I would still strongly prefer them to be off by default, at least until 4.0. I don't have strong feeling and I'm fine with not defaulting them on for 3.0, though if we get more confident about this before 4.0, I don't think we should feel obliged to wait for a major version to make the on by default. Anyway, just wanted to make sure we at least don't call them experimental anymore for 3.0. It's ok to say we prefer having it opt-in for now because we're not entirely confident on our protections and you should assert the risk for yourself, but "expiremental" sounds like "don't use it in production it probably doesn't work".
          Hide
          iamaleksey Aleksey Yeschenko added a comment -

          Anyway, just wanted to make sure we at least don't call them experimental anymore for 3.0

          We don't, and I'm not suggesting that we do. experimental is a done deal. Maybe we should label script-based UDFs as experimental, but not Java ones.

          Show
          iamaleksey Aleksey Yeschenko added a comment - Anyway, just wanted to make sure we at least don't call them experimental anymore for 3.0 We don't, and I'm not suggesting that we do. experimental is a done deal. Maybe we should label script-based UDFs as experimental, but not Java ones.
          Hide
          iamaleksey Aleksey Yeschenko added a comment -

          +1 to the follow up patch (though would prefer an explicit = false).

          Show
          iamaleksey Aleksey Yeschenko added a comment - +1 to the follow up patch (though would prefer an explicit = false ).
          Hide
          snazy Robert Stupp added a comment -

          Thanks, committed follow-up as 7a6c3272e3bb6b69ec55bb362ddad5978b4fb35f

          Show
          snazy Robert Stupp added a comment - Thanks, committed follow-up as 7a6c3272e3bb6b69ec55bb362ddad5978b4fb35f

            People

            • Assignee:
              snazy Robert Stupp
              Reporter:
              tjake T Jake Luciani
              Reviewer:
              T Jake Luciani
            • Votes:
              0 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development