Introduction into GraalVM (Community Edition): GraalVM as a Polyglot Platform
1. Introduction
Until now we have discussed GraalVM exclusively in the context of the JVM platform. That is not surprising, given all the advantages that JVM applications and services can get out of the GraalVM compiler and native image builder.
But GraalVM aims for much more ambitious goals: to become a truly polyglot platform where components written in different programming languages cooperate seamlessly inside a single, high-performance runtime. To understand how these goals come to life, we have to talk about yet another piece of groundbreaking technology, the Truffle Framework.
2. The Truffle Framework
So what is Truffle Framework, or just Truffle?
Truffle is a Java library for building programming language implementations as interpreters for self-modifying Abstract Syntax Trees. When writing a language interpreter with Truffle, it will automatically use the GraalVM compiler as a just-in-time compiler for the language. By having access to this framework, a Ruby application, for example, can run on the same JVM as a Java application. Also, a host JVM-based language and a guest language can directly interoperate with each other and pass data back and forth in the same memory space.
https://www.graalvm.org/reference-manual/polyglot-programming/
Truffle is the foundation for enabling GraalVM polyglot capabilities. Being a Java library means that you could include it as a dependency in your own projects (for example, using Apache Maven’s pom.xml) and just start building things.
<dependency>
    <groupId>org.graalvm.truffle</groupId>
    <artifactId>truffle-api</artifactId>
    <version>20.3.0</version>
</dependency>
<dependency>
    <groupId>org.graalvm.truffle</groupId>
    <artifactId>truffle-dsl-processor</artifactId>
    <version>20.3.0</version>
    <scope>provided</scope>
</dependency>
Essentially, Truffle serves two purposes:
- Implementing new languages on top of GraalVM
- Implementing new language-agnostic tooling on top of GraalVM
We are not going to implement our own language or tool, but we will nonetheless glance over the basic concepts Truffle comes with.
3. Truffle for Language Implementors
For language implementors, Truffle provides the Language API. Not only does this API simplify the development of language interpreters, one of its key advantages is that high-performance code is derived automatically. It also does not matter whether your language is statically or dynamically typed: the Language API contains the necessary primitives for both. A good starting point is the TruffleLanguage class, which you subclass to start developing a language.
Truffle is at the heart of GraalVM polyglot capabilities. It offers a way to interoperate with other languages which are also based on Truffle. For that very reason, a special polyglot interoperability protocol has been developed. This protocol allows GraalVM to support interoperability between any mixture of languages without requiring them to know of each other’s existence. The Polyglot API-based Test Compatibility Kit (Polyglot TCK for short), which comes along with GraalVM, helps to verify your language against the polyglot interoperability requirements.
The usage of the Polyglot API allows the languages implemented on top of Truffle to be embedded into other applications. In this context, you may often encounter the terms host and guest languages. The host language is the one where the polyglot context is initialized (for example, your Java application or service), whereas the guest language refers to the one(s) called from the host language (and in turn, the guest language may itself be a host language and delegate to other guest languages). The embedding opens the door for rich and efficient scripting capabilities since the host and the guest languages can directly interoperate with each other and pass data back and forth in the same memory space.
The official documentation is a great starting point to learn how to implement your own language using Truffle. Additionally, I found the Graal Truffle tutorial to be a well written and easy to follow series of blog posts with in depth coverage of the advanced topics.
3.1. The Power of Graphs
Although we are not implementing a language, it is worth mentioning a number of essential tools in case you are going to create your own. The first one is the Ideal Graph Visualizer (IGV), a tool to understand Truffle’s ASTs and the GraalVM compiler graphs.
The second one is Seafoam from Shopify, a tool for working with compiler graphs. It’s designed primarily for working with the GraalVM compiler graphs.
3.2. Supported Languages
The GraalVM distribution ships with a number of Truffle-based language implementations. For the 20.3.x release line, the baseline of the tutorial, those include:
- JavaScript and Node.js
- LLVM Languages
- Python (experimental)
- R (experimental)
- Ruby (experimental)
- WebAssembly (experimental)
The most recent 21.0.x release brings yet another one:
- Java on Truffle (experimental, codenamed “Espresso”)
Let us stop for a moment. Truffle is a Java library. And now there is a Java implementation on Truffle. Does it make any sense? In fact, it probably does, and I encourage you to read the official announcement to understand the reasoning behind it.
A note of caution: not all languages are available on every platform. In particular, Windows support is still somewhat experimental and is lagging behind.
And to wrap it up, there are many more language implementations and experiments available. The up-to-date list is published under the language implementations section (please do not forget to add your own language).
4. Truffle for Tool Implementors
Besides the Language API, Truffle comes with the Instrument API. With this API you can create language-agnostic tools like debuggers, profilers, inspectors, code coverage tools or other instrumentations. A good starting point is the TruffleInstrument class, which you subclass to start developing a tool.
The Truffle-based tools instrument the language using the same AST-based approach. As such, most of the techniques available to language developers are at the disposal of tool developers as well. Consequently, it is better to understand how Truffle works from the language perspective before embarking on the development of your own tools.
4.1. Supported Tools
The GraalVM distribution bundles a number of such tools, specifically targeting polyglot applications and services. Here are a few examples.
VisualVM, which we talked about previously, includes visualizations for the supported guest languages:
- Java: Heap Summary, Objects View, Threads View, OQL Console
- JavaScript: Heap Summary, Objects View, Thread View
- Python: Heap Summary, Objects View
- Ruby: Heap Summary, Objects View, Threads View
- R: Heap Summary, Objects View
GraalVM Insight: a multipurpose, flexible tool for writing reliable microservices solutions that traces program runtime behavior and gathers insights (offered as a technology preview). One of the coolest features of this tool is polyglot tracing: you can take the same instrumentation and apply it to any supported language.
Chrome Debugger: supports debugging of guest language applications and provides a built-in implementation of the Chrome DevTools Protocol.
Later on in this part of the tutorial we are going to see some of these tools in action while playing with a simple polyglot application we are about to build. Be aware that the same limitations as with languages may apply: not all tools may be available on every platform.
5. Truffle and Compatibility
New GraalVM releases ship regularly, and preserving compatibility between them is quite important. As a language or tool developer, you want to be sure that your creation works with older and newer GraalVM releases.
At the moment, the Truffle APIs evolve in a backwards-compatible manner. When an API is deprecated, it stays deprecated for at least two GraalVM releases, and a minimum of one month, before it is removed.
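Read literally, the deprecation rule combines two conditions that both have to hold before an API may disappear. The sketch below spells out that reading in plain Java; the class and method names are hypothetical and are not part of Truffle, and the interpretation of "one month" as a calendar month is an assumption.

```java
import java.time.LocalDate;

public class DeprecationWindow {

    // Hypothetical helper: true once BOTH conditions of the policy are met,
    // i.e. at least two releases have shipped since the deprecation AND
    // at least one calendar month has passed (assumed interpretation).
    static boolean mayBeRemoved(int releasesSinceDeprecation,
                                LocalDate deprecatedOn,
                                LocalDate today) {
        boolean enoughReleases = releasesSinceDeprecation >= 2;
        boolean enoughTime = !today.isBefore(deprecatedOn.plusMonths(1));
        return enoughReleases && enoughTime;
    }

    public static void main(String[] args) {
        LocalDate deprecatedOn = LocalDate.of(2021, 1, 19);
        // Two releases, but less than a month later: still protected.
        System.out.println(mayBeRemoved(2, deprecatedOn, LocalDate.of(2021, 2, 1)));
        // Two releases and more than a month later: removal is allowed.
        System.out.println(mayBeRemoved(2, deprecatedOn, LocalDate.of(2021, 4, 20)));
    }
}
```

The dates used in `main` are made up for illustration only.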
6. Polyglot on GraalVM
The best way to illustrate the powerful polyglot potential of GraalVM is by developing a sample application which uses multiple languages. Obviously, our host application is going to be written in Java, but some pieces of work are going to be done in Python and Ruby.
First off, Python and Ruby are not installed by default, so we need to bring them in using the GraalVM Updater tool, gu for short.
$ bin/gu install python
Downloading: Component catalog from www.graalvm.org
Processing Component: Graal.Python
Downloading: Component python: Graal.Python from github.com
Installing new component: Graal.Python (org.graalvm.python, version 20.3.0)
...

$ bin/gu install ruby
Downloading: Component catalog from www.graalvm.org
Processing Component: TruffleRuby
Downloading: Component ruby: TruffleRuby from github.com
Installing new component: TruffleRuby (org.graalvm.ruby, version 20.3.0)
...
Optionally, if the Native Image builder is already installed, you may need to rebuild the native images as well, for example, for Python tooling:
$ bin/gu rebuild-images python
And for Ruby respectively:
$ bin/gu rebuild-images ruby
6.1. Embedding
The first scenario we are going to play with is embedding the Python and Ruby languages inside a Java host application. The application itself will do just two things:
- Get the current system’s timezone using Python script
- Fetch the current timezone details from World Time API services over HTTP using Ruby script
Not very complicated, but a few things will stand out soon enough. So let us start with the Python script, named get_timezone.py.
from time import gmtime, strftime
import polyglot

@polyglot.export_value
def get_timezone():
    return strftime("%Z", gmtime())
The first unusual thing you will notice is the presence of @polyglot.export_value, a necessary element of GraalVM polyglot interoperability. We can use such exported objects from other languages, as the Java code snippet below illustrates.
private static final String PYTHON = "python";

public static String getTimezone(Engine engine) throws IOException {
    final Context context = Context
        .newBuilder(PYTHON)
        .allowHostAccess(HostAccess.NONE)
        .allowPolyglotAccess(PolyglotAccess
            .newBuilder()
            .allowBindingsAccess(PYTHON)
            .build())
        .engine(engine)
        .build();

    var script = IOUtils.resourceToString("/get_timezone.py", StandardCharsets.UTF_8);
    context.eval(PYTHON, script);

    final Value result = context
        .getPolyglotBindings()
        .getMember("get_timezone")
        .execute();

    // Returns value in single quotes: 'EST', 'PST', ...
    return result.asString().replaceAll("'", "");
}
GraalVM allows fine-grained control over what guest languages can or cannot do through the Context instance. In the case of Python, we explicitly prohibit access to the host application and only allow bindings (so we can import the value from the script). Once the Context instance is built, we can evaluate the code written in the guest language (Python).
context.eval(PYTHON, script);
Upon completion, we invoke the function get_timezone using polyglot bindings and store its result on the host side.
final Value result = context
    .getPolyglotBindings()
    .getMember("get_timezone")
    .execute();
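The getTimezone listing cleans up the returned string with replaceAll("'", ""), which removes every single quote in the value. If only the enclosing pair should go, a slightly stricter host-side helper could look like the sketch below. It uses plain Java only; the class and method names are hypothetical and are not part of the tutorial's project.

```java
public class QuoteStripper {

    // Hypothetical helper: removes one pair of enclosing single quotes,
    // leaving any quotes inside the value (and unquoted values) untouched.
    static String stripEnclosingQuotes(String raw) {
        if (raw != null && raw.length() >= 2
                && raw.startsWith("'") && raw.endsWith("'")) {
            return raw.substring(1, raw.length() - 1);
        }
        return raw;
    }

    public static void main(String[] args) {
        System.out.println(stripEnclosingQuotes("'EST'")); // prints EST
        System.out.println(stripEnclosingQuotes("EST"));   // prints EST
    }
}
```

For the short timezone abbreviations used here both approaches give the same result; the difference only shows for values that contain quotes in the middle.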
At this point we have our timezone and are ready to move on to the next step: calling the World Time API over HTTP from a Ruby script, stored as fetch_timezone.rb.
require 'net/http'
require 'json'

def fetch_timezone(tz)
  uri = URI('http://worldtimeapi.org/api/timezone/' + tz)
  response = Net::HTTP.get(uri)
  JSON.parse(response)
end

Polyglot.export_method("fetch_timezone")
Explicit exports are still required, this time by calling the Polyglot.export_method method provided by the runtime. On the host side, the code gets a little bit more complicated.
// Language identifier, mirroring the PYTHON constant used earlier
private static final String RUBY = "ruby";

public static String fetchTimezone(Engine engine, String timezone) throws IOException {
    final Context context = Context
        .newBuilder(RUBY)
        .allowHostAccess(HostAccess.NONE)
        .allowNativeAccess(true)
        .allowIO(true)
        .allowPolyglotAccess(PolyglotAccess
            .newBuilder()
            .allowBindingsAccess(RUBY)
            .build())
        .engine(engine)
        .build();

    var script = IOUtils.resourceToString("/fetch_timezone.rb", StandardCharsets.UTF_8);
    context.eval(RUBY, script);

    /**
     * Sample Ruby hash object:
     *
     * {
     *   "abbreviation"=>"EST",
     *   "datetime"=>"2021-02-28T12:46:09.097360-05:00",
     *   "day_of_week"=>0,
     *   "day_of_year"=>59,
     *   "dst"=>false,
     *   "dst_from"=>nil,
     *   "dst_offset"=>0,
     *   "dst_until"=>nil,
     *   "raw_offset"=>-18000,
     *   "timezone"=>"EST",
     *   "unixtime"=>1614534369,
     *   "utc_datetime"=>"2021-02-28T17:46:09.097360+00:00",
     *   "utc_offset"=>"-05:00",
     *   "week_number"=>8
     * }
     */
    final Value result = context
        .getPolyglotBindings()
        .getMember("fetch_timezone")
        .execute(timezone);

    // Call Hash#fetch on the returned Ruby hash to read the "datetime" entry
    return result
        .getMember("fetch")
        .execute("datetime")
        .asString();
}
Most of the same elements are present, but since the Ruby script needs access to HTTP APIs, some restrictions have to be lifted.
final Context context = Context
    .newBuilder(RUBY)
    .allowHostAccess(HostAccess.NONE)
    .allowNativeAccess(true)
    .allowIO(true)
    ...
Access to the host application is still not allowed. On the evaluation side, the result extraction is more verbose because we have to deal with Ruby’s Hash instance.
return result
    .getMember("fetch")
    .execute("datetime")
    .asString();
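The datetime entry of the sample hash is an ISO-8601 timestamp with a UTC offset, so the host side could turn it into a typed value instead of keeping it as a string. The sketch below is one way to do that with java.time; the class and method names are hypothetical, and it assumes the service keeps returning the format shown in the sample hash.

```java
import java.time.OffsetDateTime;

public class DatetimeParsing {

    // Hypothetical helper: parses the "datetime" string taken from the Ruby hash.
    // OffsetDateTime.parse expects the ISO-8601 offset format used in the sample.
    static OffsetDateTime parseDatetime(String datetime) {
        return OffsetDateTime.parse(datetime);
    }

    public static void main(String[] args) {
        // Value copied from the sample Ruby hash in the listing above
        OffsetDateTime local = parseDatetime("2021-02-28T12:46:09.097360-05:00");

        System.out.println(local.getOffset());     // prints -05:00
        System.out.println(local.toEpochSecond()); // prints 1614534369
    }
}
```

The epoch seconds computed on the host match the unixtime entry of the same sample hash, which is a quick sanity check that the string was interpreted correctly.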
With that, we extract the current time for the timezone in question. To be fair, this is a rather inefficient way to get the current date and time, but it serves well as an example of polyglot interoperability. Without further ado, let us build an executable JAR and run it, using the JVM from the GraalVM distribution.
$ mvn clean package
...
[INFO] ----------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ----------------------------------------------------------------------
[INFO] Total time: 1.592 s
[INFO] Finished at: 2021-02-28T14:36:24-05:00
[INFO] ----------------------------------------------------------------------
...

$ java -jar target/polyglot-0.0.1-SNAPSHOT-jar-with-dependencies.jar
2021-02-28T14:57:53.939714-05:00
The current date and time should be printed out on the console. Now, the million-dollar question: can we use GraalVM polyglot capabilities along with the native image builder? The short answer is yes, we can! The native image builder has a dedicated command line argument, --language:<lang>, to bundle support for embedding the guest language(s) of your choice:
$ native-image --language:ruby --language:python ...
But the duration and memory requirements during the native executable build phase may surprise you, depending on which language(s) you need. Anyway, we learnt how to create native executables so let us build one for our sample application.
<plugin>
    <groupId>org.graalvm.nativeimage</groupId>
    <artifactId>native-image-maven-plugin</artifactId>
    <version>20.3.0</version>
    <configuration>
        <mainClass>com.javacodegeeks.graalvm.polyglot.PolyglotRunner</mainClass>
        <buildArgs>--language:ruby --language:python</buildArgs>
        <imageName>${project.artifactId}</imageName>
    </configuration>
    <executions>
        <execution>
            <goals>
                <goal>native-image</goal>
            </goals>
            <phase>package</phase>
        </execution>
    </executions>
</plugin>
To preserve the traditional packaging, the plugin configuration is part of the native-image profile, conveniently supported by Apache Maven.
$ mvn clean package -Pnative-image
...
[polyglot:19463]    classlist:   2,372.98 ms,  1.18 GB
[polyglot:19463]        (cap):     831.98 ms,  1.18 GB
[polyglot:19463]        setup:   2,206.91 ms,  1.18 GB
[polyglot:19463]     (clinit):   2,464.66 ms, 12.44 GB
[polyglot:19463]   (typeflow): 105,500.88 ms, 12.44 GB
[polyglot:19463]    (objects): 128,162.67 ms, 12.44 GB
[polyglot:19463]   (features):  21,266.20 ms, 12.44 GB
[polyglot:19463]     analysis: 265,383.42 ms, 12.44 GB
[polyglot:19463]     universe:   4,735.95 ms, 12.44 GB
31037 method(s) included for runtime compilation
[polyglot:19463]      (parse):   9,724.75 ms, 11.61 GB
[polyglot:19463]     (inline):   7,568.92 ms, 10.50 GB
[polyglot:19463]    (compile):  44,265.31 ms, 12.23 GB
[polyglot:19463]      compile:  69,129.66 ms, 12.24 GB
[polyglot:19463]        image:  38,436.01 ms, 11.61 GB
[polyglot:19463]        write:   2,278.55 ms, 11.61 GB
[polyglot:19463]      [total]: 387,343.12 ms, 11.61 GB
[INFO] ----------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ----------------------------------------------------------------------
[INFO] Total time: 06:29 min
[INFO] Finished at: 2021-03-06T17:48:34-05:00
[INFO] ----------------------------------------------------------------------
Awesome, the native executable is there, but before we can run it, we need to set an environment variable, assuming you already have GRAALVM_HOME pointing to your GraalVM distribution. For Java 11 based distributions, the additional environment variable is:
export GRAAL_PYTHONHOME=$GRAALVM_HOME/languages/python
In case of Java 8 based distributions, the path is slightly different:
export GRAAL_PYTHONHOME=$GRAALVM_HOME/jre/languages/python
But even that is not enough. In addition, we have to pass a few system properties to our native executable, org.graalvm.language.ruby.home and llvm.home, to point to the Ruby and LLVM language homes. With that sorted out, we are ready to roll.
$ target/polyglot -Dorg.graalvm.language.ruby.home=$GRAALVM_HOME/languages/ruby -Dllvm.home=$GRAALVM_HOME/languages/llvm
2021-03-06T19:24:39.914788-05:00
What about the execution time? It is just around two seconds. Similarly, for Java 8 based distributions, please adjust these system properties to point to $GRAALVM_HOME/jre/languages/ruby and $GRAALVM_HOME/jre/languages/llvm respectively.
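Since the only difference between the Java 8 and Java 11 based distributions is the extra jre path segment, the system properties for the native executable can be derived from GRAALVM_HOME. The sketch below shows one way to compute them; the class, the method names and the /opt/graalvm fallback are hypothetical, and only the directory layout and the two property names come from the text above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class LanguageHomes {

    // Hypothetical helper: resolves <home>/languages/<lang> for Java 11 based
    // distributions and <home>/jre/languages/<lang> for Java 8 based ones.
    static Path languageHome(String graalvmHome, boolean java8Layout, String language) {
        Path base = Paths.get(graalvmHome);
        if (java8Layout) {
            base = base.resolve("jre");
        }
        return base.resolve("languages").resolve(language);
    }

    // Builds the two -D arguments passed to the native executable.
    static List<String> nativeLauncherArgs(String graalvmHome, boolean java8Layout) {
        return Arrays.asList(
            "-Dorg.graalvm.language.ruby.home=" + languageHome(graalvmHome, java8Layout, "ruby"),
            "-Dllvm.home=" + languageHome(graalvmHome, java8Layout, "llvm"));
    }

    public static void main(String[] args) {
        // Falls back to a made-up location when GRAALVM_HOME is not set
        String home = System.getenv().getOrDefault("GRAALVM_HOME", "/opt/graalvm");
        nativeLauncherArgs(home, false).forEach(System.out::println);
    }
}
```

The same languageHome helper could also produce the value for GRAAL_PYTHONHOME by passing "python" as the language.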
Last but not least, let us take a look at VisualVM, specifically at the polyglot capabilities it has been enhanced with, for example the Polyglot Sampler.
Since our application uses Python and Ruby guest languages, the respective samples from both are present in the resulting view.
6.2. Polyglot Shell
Embedding is just one option. The GraalVM distribution comes with a new experimental launcher, called polyglot. The polyglot launcher allows running code for JavaScript, Ruby, R and Python without requiring the selection of a primary (host) language in advance.
$ bin/polyglot --polyglot --jvm fetch_timezone.rb
{
  "abbreviation"=>"EST",
  "datetime"=>"2021-03-03T20:18:01.595142-05:00",
  "day_of_week"=>3,
  "day_of_year"=>62,
  "dst"=>false,
  "dst_from"=>nil,
  "dst_offset"=>0,
  "dst_until"=>nil,
  "raw_offset"=>-18000,
  "timezone"=>"EST",
  "unixtime"=>1614820681,
  "utc_datetime"=>"2021-03-04T01:18:01.595142+00:00",
  "utc_offset"=>"-05:00",
  "week_number"=>9
}
The launcher can also be used in REPL mode, in which case it is referred to as the Polyglot Shell. It is an experimental feature too, and it allows you to play with the Truffle-based languages interactively.
$ bin/polyglot --jvm --shell
GraalVM MultiLanguage Shell 20.3.0
Copyright (c) 2013-2020, Oracle and/or its affiliates
  JavaScript version 20.3.0
  Python version 3.8.5
  Ruby version 2.6.6
Usage:
  Use Alt+L to switch language and Ctrl+D to exit.
  Enter -usage to get a list of available commands.
js>
The usefulness of REPLs for quick prototyping and exploration has been proven for years and it is great to see such tooling was not left out in GraalVM.
6.3. Native Launchers
Yet another way to exploit the polyglot capabilities of GraalVM is to use the native language launchers (js, python, ruby, …), available as standalone executables (you can always rebuild them in case some are missing).
$ gu rebuild-images polyglot|libpolyglot|js|llvm|python|ruby [custom native-image args]
Every language launcher has been enhanced to be polyglot-aware and to have access to the options of other Truffle-based languages.
$ bin/ruby --polyglot --jvm fetch_timezone.rb
{
  "abbreviation"=>"EST",
  "datetime"=>"2021-03-03T20:18:01.595142-05:00",
  "day_of_week"=>3,
  "day_of_year"=>62,
  "dst"=>false,
  "dst_from"=>nil,
  "dst_offset"=>0,
  "dst_until"=>nil,
  "raw_offset"=>-18000,
  "timezone"=>"EST",
  "unixtime"=>1614820681,
  "utc_datetime"=>"2021-03-04T01:18:01.595142+00:00",
  "utc_offset"=>"-05:00",
  "week_number"=>9
}
7. Polyglot Performance
The performance of the language implementations running on GraalVM (using Truffle) may differ from the native language runtimes, sometimes quite noticeably. The official documentation assembles a number of hints related to analysing and troubleshooting the performance issues.
8. What’s Next
In the next section of the tutorial we are going to talk about what GraalVM means for regular Java developers out there. Should you care? If so, why, and how could it be helpful?