How the Secure Scripting in Activiti works
One of the prominent features of the recent Activiti 5.21.0 release is ‘secure scripting’. The way to enable and use this feature is documented in detail in the Activiti user guide. In this post, I’ll show you how we arrived at its final implementation and what it does under the hood. And of course, as is my usual style, we’ll also take a quick look at the performance.
The Problem
The Activiti engine has supported scripting for script tasks (and task/execution listeners) for a long time. The scripts are defined in the process definition and can be executed directly after deploying it, which is something many people like. This is a big difference from Java delegate classes or delegate expressions, which generally require putting the actual logic on the classpath. That in itself already introduces some sort of ‘protection’, as generally only a power user can do this.
However, with scripts, no such ‘extra step’ is needed. If you give the power of script tasks to end users (and we know from our users that some companies do have this use case), all bets are pretty much off. You can shut down the JVM or do malicious things simply by executing a process instance.
A second problem is that it’s quite easy to write a script that does an infinite loop and never ends. A third problem is that a script can easily use a lot of memory when executed and hog a lot of system resources.
Let’s look at the first problem for starters. First of all, let’s add the latest and greatest Activiti engine dependency and the H2 in-memory database library:
<dependencies>
  <dependency>
    <groupId>org.activiti</groupId>
    <artifactId>activiti-engine</artifactId>
    <version>5.21.0</version>
  </dependency>
  <dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <version>1.3.176</version>
  </dependency>
</dependencies>
The process we’ll use here is trivially simple: just a start event, script task and end. The process is not really the point here, the script execution is.
The first script we’ll try does two things: it gets and displays my machine’s current network configuration (there are obviously more dangerous applications of this idea) and then shuts down the whole JVM. Of course, in a proper setup, some of this is mitigated by making sure the user running the logic has no rights that matter on the machine (although that doesn’t solve the resource-hogging issue). But I think it demonstrates pretty well why giving the power of scripts to just about anyone is really bad security-wise.
<scriptTask id="myScriptTask" scriptFormat="javascript">
  <script>
    var s = new java.util.Scanner(java.lang.Runtime.getRuntime().exec("ifconfig").getInputStream()).useDelimiter("\\A");
    var output = s.hasNext() ? s.next() : "";
    java.lang.System.out.println("--- output = " + output);
    java.lang.System.exit(1);
  </script>
</scriptTask>
Let’s deploy the process definition and execute a process instance:
import org.activiti.engine.ProcessEngine;
import org.activiti.engine.RepositoryService;
import org.activiti.engine.RuntimeService;
import org.activiti.engine.impl.cfg.StandaloneInMemProcessEngineConfiguration;

public class Demo1 {

  public static void main(String[] args) {

    // Build engine and deploy
    ProcessEngine processEngine = new StandaloneInMemProcessEngineConfiguration().buildProcessEngine();
    RepositoryService repositoryService = processEngine.getRepositoryService();
    repositoryService.createDeployment().addClasspathResource("process.bpmn20.xml").deploy();

    // Start process instance
    RuntimeService runtimeService = processEngine.getRuntimeService();
    runtimeService.startProcessInstanceByKey("myProcess");
  }

}
This gives the following output (shortened here):
--- output = eth0 Link encap:Ethernet
inet addr:192.168.0.114 Bcast:192.168.0.255 Mask:255.255.255.0
…
Process finished with exit code 1
It outputs information about all my network interfaces and then shuts down the whole JVM. Yipes. That’s scary.
Trying Nashorn
The solution to our first problem is that we need to whitelist what we want to expose in a script, and have everything blacklisted by default. This way, users won’t be able to run any class or method that can do something malicious.
In Activiti, when a javascript script task is part of a process definition, we give this script to the javascript engine that is embedded in the JDK, using the ScriptEngine class in the JDK. In JDK 6/7 this was Rhino, in JDK 8 this is Nashorn. I first did some serious googling to find a solution for Nashorn (as this would be more future-proof). Nashorn does have a ‘class filter’ concept to effectively implement white-listing. However, the ScriptEngine abstraction does not have any facilities to actually tweak or configure the Nashorn engine. We’ll have to do some low-level magic to get it working.
Instead of using the default Nashorn scripting engine, we instantiate the Nashorn scripting engine ourselves in a ‘SecureScriptTask’ (which is a regular JavaDelegate). Note the use of the jdk.nashorn.* package, which is not really nice. We follow the docs from https://docs.oracle.com/javase/8/docs/technotes/guides/scripting/nashorn/api.html to make the script execution more secure by adding a ‘ClassFilter’ to the Nashorn engine. This effectively acts as a white-list of approved classes that can be used in the script.
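import javax.script.Bindings;
import javax.script.ScriptEngine;

import jdk.nashorn.api.scripting.ClassFilter;
import jdk.nashorn.api.scripting.NashornScriptEngineFactory;

import org.activiti.engine.delegate.DelegateExecution;
import org.activiti.engine.delegate.Expression;
import org.activiti.engine.delegate.JavaDelegate;
import org.activiti.engine.impl.context.Context;
import org.activiti.engine.impl.scripting.ScriptingEngines;

public class SafeScriptTaskDemo2 implements JavaDelegate {

  // Holds the script text, set from the process definition
  private Expression script;

  public void execute(DelegateExecution execution) throws Exception {
    // Create the Nashorn engine ourselves so we can pass in a ClassFilter
    NashornScriptEngineFactory factory = new NashornScriptEngineFactory();
    ScriptEngine scriptEngine = factory.getScriptEngine(new SafeClassFilter());

    ScriptingEngines scriptingEngines = Context
        .getProcessEngineConfiguration()
        .getScriptingEngines();

    Bindings bindings = scriptingEngines.getScriptBindingsFactory().createBindings(execution, false);
    scriptEngine.eval((String) script.getValue(execution), bindings);

    System.out.println("Java delegate done");
  }

  // Exposes no Java class at all to the script
  public static class SafeClassFilter implements ClassFilter {

    public boolean exposeToScripts(String s) {
      return false;
    }

  }

}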
When the process is run, the script above is not executed. Instead, an exception is thrown stating ‘Exception in thread "main" java.lang.RuntimeException: java.lang.ClassNotFoundException: java.lang.System.out.println’.
Note that the ClassFilter is only available from JDK 1.8.0_40 (quite recent!).
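The filter in the demo above rejects every class. To make the white-list idea concrete, here is a minimal sketch of a filter that only lets through explicitly approved class names. WhitelistClassFilter and its demo class are hypothetical names, not part of Activiti or Nashorn. The class deliberately does not implement jdk.nashorn.api.scripting.ClassFilter so that it compiles on any JDK, but it mirrors the exposeToScripts(String) method, so the logic could be moved into a real ClassFilter.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (assumption, not Activiti or Nashorn API).
// It has the same single-method shape as Nashorn's ClassFilter.
class WhitelistClassFilter {

    private final Set<String> allowedClassNames;

    WhitelistClassFilter(String... allowedClassNames) {
        this.allowedClassNames = Collections.unmodifiableSet(
                new HashSet<String>(Arrays.asList(allowedClassNames)));
    }

    // Returns true only for fully qualified class names that were explicitly allowed.
    boolean exposeToScripts(String className) {
        return allowedClassNames.contains(className);
    }
}

class WhitelistClassFilterDemo {

    public static void main(String[] args) {
        WhitelistClassFilter filter = new WhitelistClassFilter("java.util.ArrayList");

        System.out.println("ArrayList exposed: " + filter.exposeToScripts("java.util.ArrayList"));
        System.out.println("Runtime exposed: " + filter.exposeToScripts("java.lang.Runtime"));
    }
}
```

Everything that is not named in the constructor stays hidden, which is the "blacklisted by default" behaviour described earlier.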
However, this doesn’t solve our second problem with infinite loops. Let’s execute a simple script:
while (true) { print("Hello"); }
You can guess what this will do: it runs forever. If you’re lucky, a transaction timeout will happen, as the script task is executed in a transaction. But that’s far from a decent solution, as it hogs CPU resources for a while doing nothing useful.
The third problem, using a lot of memory, is also easy to demonstrate:
var array = [];
for (var i = 0; i < 2147483647; ++i) {
  array.push(i);
  java.lang.System.out.println(array.length);
}
When starting the process instance, the memory will quickly fill up (starting with only a couple of MB) and eventually end with an OutOfMemoryError:
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
Switching to Rhino
Between the following example and the previous one, a lot of time was spent trying to make Nashorn somehow intercept or cope with the infinite loop and memory usage problems. However, after extensive searching and experimenting, it seems the features simply are not (yet?) in Nashorn. A quick search will teach you that we’re not the only ones looking for a solution to this. Often, it is mentioned that Rhino did have features on board to solve this.
For example, in JDK < 8, the Rhino JavaScript engine had the ‘instructionCount’ callback mechanism, which is not present in Nashorn. It basically gives you a way to execute logic in a callback that is automatically called every x instructions (bytecode instructions!). I first tried (and lost a lot of time) to mimic the instructionCount idea with Nashorn, for example by prettifying the script first (because people could write the whole script on one line) and then injecting a line of code in the script that triggers a callback. However, 1) that was not very straightforward to do and 2) one would still be able to write a single-line instruction that runs infinitely or uses a lot of memory.
Being stuck there, the search led us to the Rhino engine from Mozilla. Since its inclusion in the JDK a long time ago it has actually evolved further on its own, while the version in the JDK wasn’t updated with those changes! After reading up on the (quite sparse) Rhino docs, it became clear that Rhino had a far richer feature set with regard to our use case.
The ClassFilter from Nashorn matched the ‘ClassShutter’ concept in Rhino. The CPU and memory problems were solved using the callback mechanism of Rhino: you can define a callback that is called every x instructions. This means that one line could be hundreds of bytecode instructions and we still get a callback every x instructions, which makes it an excellent candidate for monitoring our CPU and memory usage when executing the script.
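To illustrate the callback idea, the sketch below simulates it in plain Java. It is not Rhino or Activiti code: ScriptWatchdog, onInstruction and the demo loops are made-up names, and memory use is passed in as a number instead of being measured. In Rhino itself the corresponding hooks are, to my knowledge, Context.setInstructionObserverThreshold and ContextFactory.observeInstructionCount.

```java
// Plain-Java simulation of "check limits every x instructions".
// All names here are hypothetical; nothing in this block is Rhino or Activiti API.
class ScriptWatchdog {

    private final int instructionsPerCheck;
    private final long maxExecutionTimeMs;
    private final long maxMemoryBytes;
    private final long startTimeMs = System.currentTimeMillis();
    private int instructionsSinceLastCheck = 0;

    ScriptWatchdog(int instructionsPerCheck, long maxExecutionTimeMs, long maxMemoryBytes) {
        this.instructionsPerCheck = instructionsPerCheck;
        this.maxExecutionTimeMs = maxExecutionTimeMs;
        this.maxMemoryBytes = maxMemoryBytes;
    }

    // Called by the simulated interpreter after every instruction.
    // The limits are only evaluated once per instructionsPerCheck calls.
    void onInstruction(long memoryUsedBytes) {
        instructionsSinceLastCheck++;
        if (instructionsSinceLastCheck < instructionsPerCheck) {
            return;
        }
        instructionsSinceLastCheck = 0;

        long elapsedMs = System.currentTimeMillis() - startTimeMs;
        if (elapsedMs > maxExecutionTimeMs) {
            throw new IllegalStateException(
                    "Maximum execution time of " + maxExecutionTimeMs + " ms exceeded");
        }
        if (memoryUsedBytes > maxMemoryBytes) {
            throw new IllegalStateException(
                    "Memory limit of " + maxMemoryBytes + " bytes reached");
        }
    }
}

class ScriptWatchdogDemo {

    // Stands in for `while (true) { ... }`; returns the message that stopped it.
    static String runInfiniteLoop(ScriptWatchdog watchdog) {
        try {
            while (true) {
                watchdog.onInstruction(0L);
            }
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    // Stands in for the array-filling script; each step "allocates" 1 KB.
    static String runMemoryHog(ScriptWatchdog watchdog) {
        long simulatedBytes = 0L;
        try {
            while (true) {
                simulatedBytes += 1024L;
                watchdog.onInstruction(simulatedBytes);
            }
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runInfiniteLoop(new ScriptWatchdog(10, 50L, 1048576L)));
        System.out.println(runMemoryHog(new ScriptWatchdog(10, 60000L, 1048576L)));
    }
}
```

Because the check runs on an instruction count and not on source lines, a script that squeezes an endless loop onto a single line is still interrupted.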
If you are interested in our implementation of these ideas in the code, have a look here.
This does mean that whatever JDK version you are using, you will not be using the embedded javascript engine, but always Rhino.
Trying it out
To use the new secure scripting feature, add the following dependency:
<dependency>
  <groupId>org.activiti</groupId>
  <artifactId>activiti-secure-javascript</artifactId>
  <version>5.21.0</version>
</dependency>
This will transitively include the Rhino engine. It also makes the SecureJavascriptConfigurator available, which needs to be configured before creating the process engine:
SecureJavascriptConfigurator configurator = new SecureJavascriptConfigurator()
  .setWhiteListedClasses(new HashSet<String>(Arrays.asList("java.util.ArrayList")))
  .setMaxStackDepth(10)
  .setMaxScriptExecutionTime(3000L)
  .setMaxMemoryUsed(3145728L)
  .setNrOfInstructionsBeforeStateCheckCallback(10);

ProcessEngine processEngine = new StandaloneInMemProcessEngineConfiguration()
  .addConfigurator(configurator)
  .buildProcessEngine();
This will configure secure scripting as follows:
- Every 10 instructions, check the CPU execution time and memory usage
- Give the script 3 seconds and 3MB to execute
- Limit stack depth to 10 (to avoid recursing)
- Expose the array list as a class that is safe to use in the scripts
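For reference, the raw numbers in the configuration above are just unit conversions. The small sketch below (ScriptLimits is a hypothetical helper, not part of Activiti) shows how they could be derived instead of hard-coded:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical constants holder (assumption); only the resulting
// numbers 3145728 and 3000 come from the configuration in this post.
class ScriptLimits {

    // 3 MB expressed in bytes: 3 * 1024 * 1024
    static final long MAX_MEMORY_BYTES = 3L * 1024L * 1024L;

    // 3 seconds expressed in milliseconds
    static final long MAX_EXECUTION_TIME_MS = TimeUnit.SECONDS.toMillis(3L);

    public static void main(String[] args) {
        System.out.println("maxMemoryUsed = " + MAX_MEMORY_BYTES);
        System.out.println("maxScriptExecutionTime = " + MAX_EXECUTION_TIME_MS);
    }
}
```

The two constants could then be passed to setMaxMemoryUsed and setMaxScriptExecutionTime in place of the literals.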
Running the script from above that tries to read the ifconfig output and shut down the JVM leads to:
TypeError: Cannot call property getRuntime in object [JavaPackage java.lang.Runtime]. It is not a function, it is "object".
Running the infinite loop script from above gives:
Exception in thread "main" java.lang.Error: Maximum variableScope time of 3000 ms exceeded
And running the memory usage script from above gives:
Exception in thread "main" java.lang.Error: Memory limit of 3145728 bytes reached
And hurray! The problems defined above are solved.
Performance
I did a very unscientific quick check ... and I almost didn’t dare to share it, as the results go against what I assumed would happen.
I created a quick main that runs a process instance with a script task 10000 times:
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

import org.activiti.engine.ProcessEngine;
import org.activiti.engine.RepositoryService;
import org.activiti.engine.RuntimeService;
import org.activiti.engine.impl.cfg.StandaloneInMemProcessEngineConfiguration;

public class PerformanceUnsecure {

  public static void main(String[] args) {

    ProcessEngine processEngine = new StandaloneInMemProcessEngineConfiguration().buildProcessEngine();

    RepositoryService repositoryService = processEngine.getRepositoryService();
    repositoryService.createDeployment().addClasspathResource("performance.bpmn20.xml").deploy();

    Random random = new Random();
    RuntimeService runtimeService = processEngine.getRuntimeService();

    int nrOfRuns = 10000;
    long total = 0;

    for (int i = 0; i < nrOfRuns; i++) {
      Map<String, Object> variables = new HashMap<String, Object>();
      variables.put("a", random.nextInt());
      variables.put("b", random.nextInt());

      // Only the process instance start (including the script task) is timed
      long start = System.currentTimeMillis();
      runtimeService.startProcessInstanceByKey("myProcess", variables);
      long end = System.currentTimeMillis();
      total += (end - start);
    }

    System.out.println("Finished process instances : " + processEngine.getHistoryService().createHistoricProcessInstanceQuery().count());
    System.out.println("Total time = " + total + " ms");
    System.out.println("Avg time/process instance = " + ((double) total / (double) nrOfRuns) + " ms");
  }

}
The process definition is just a start -> script task -> end. The script task simply adds two variables and saves the result in a third variable.
<scriptTask id="myScriptTask" scriptFormat="javascript">
  <script>
    var c = a + b;
    execution.setVariable('c', c);
  </script>
</scriptTask>
I ran this five times, and got an average of 2.57 ms / process instance. This is on a recent JDK 8 (so Nashorn).
Then I switched the first couple of lines above to use the new secure scripting, thus switching to Rhino plus the security features enabled:
SecureJavascriptConfigurator configurator = new SecureJavascriptConfigurator()
  .addWhiteListedClass("org.activiti.engine.impl.persistence.entity.ExecutionEntity")
  .setMaxStackDepth(10)
  .setMaxScriptExecutionTime(3000L)
  .setMaxMemoryUsed(3145728L)
  .setNrOfInstructionsBeforeStateCheckCallback(1);

ProcessEngine processEngine = new StandaloneInMemProcessEngineConfiguration()
  .addConfigurator(configurator)
  .buildProcessEngine();
I again did five runs ... and got 1.07 ms / process instance, which is more than twice as fast for the same thing.
Of course, this is not a real test. I assumed the Rhino execution would be slower, with the class whitelisting checking and the callbacks … but no such thing. Maybe this particular case is one that is simply better suited for Rhino … If anyone can explain it, please leave a comment. But it is an interesting result nonetheless.
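If someone wants to dig deeper, one variable worth ruling out is JIT warm-up, since the loop above times every run from the very first one. The helper below is a generic sketch (WarmupTimer is a hypothetical name, and the Runnable stands in for the startProcessInstanceByKey call) that discards a number of warm-up runs before averaging. It is only a sketch; the numbers above were not produced with it.

```java
// Hypothetical timing helper (assumption, not part of Activiti).
class WarmupTimer {

    // Runs the task warmupRuns times without measuring, then returns the
    // average duration in milliseconds over measuredRuns timed executions.
    static double averageMillis(Runnable task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long totalNanos = 0L;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return (totalNanos / 1000000.0) / measuredRuns;
    }
}

class WarmupTimerDemo {

    public static void main(String[] args) {
        // Placeholder workload; in the real test this would start a process instance.
        Runnable task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1000; i++) {
                sb.append(i);
            }
        };

        double avg = WarmupTimer.averageMillis(task, 1000, 10000);
        System.out.println("Avg time/run = " + avg + " ms");
    }
}
```

Running both the Nashorn and the Rhino setup through the same helper would at least show whether the gap survives once start-up cost is excluded.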
Conclusion
If you are using scripts in your process definition, do read up on this new secure scripting feature in the engine. As this is a new feature, feedback and improvements are more than welcome!
Reference: How the Secure Scripting in Activiti works from our JCG partner Joram Barrez at the Small steps with big feet blog.