Can/Should I use parallel streams in a transaction context?
Introduction
To make a long story short, you should not use transactions within a parallel stream. This is because each thread in the parallel stream has its own name thus it does participate in the transaction.
The Streams API is designed to work correctly under certain guidelines. In practice, to benefit from parallelism, each operation is not allowed to change the state of shared objects (such operations are called side-effect-free). Provided you follow this guideline, the internal implementation of parallel streams cleverly splits the data, assigns different parts to independent threads, and merges the final result.
This is primary originated because of the way Transactions are implemented. In sort, a ThreadLocal variable is used to mark each method participating in the transaction. ThreadLocal variables are not able to keep their vale within a parallel stream. To demonstrate that I have created the following test
01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | import org.junit.Assert; import org.junit.Test; import java.util.concurrent.atomic.AtomicBoolean; import java.util.stream.IntStream; public class ThreadNameTest { @Test public void threadLocalTest(){ ThreadContext.set( "MAIN" ); AtomicBoolean hasNamechanged = new AtomicBoolean( false ); IntStream.range( 0 , 10000000 ).boxed().parallel().forEach(n->{ if (! "MAIN" .equals(ThreadContext.get())){ hasNamechanged.set( true ); } }); Assert.assertTrue(hasNamechanged.get()); } private static class ThreadContext { private static ThreadLocal<String> val = ThreadLocal.withInitial(() -> "empty" ); public ThreadContext() { } public static String get() { return val.get(); } public static void set(String x) { ThreadContext.val.set(x); } } } |
The higher the IntStream.range value the more certain the test will succeed.
Now take a look at this github project https://github.com/diakogiannis/transactionplayground/
The TransactionPlayground project
I created a services that loads cats in 4 different ways
- sequentially
curl -I -X GET http://localhost:8080/api/cats/all
- sequentially but throwing an exception in order to create a rollback mark
curl -I -X GET http://localhost:8080/api/cats/all-exception
- in parallel
curl -I -X GET http://localhost:8080/api/cats/all-parallel
- in parallel but throwing an exception in order to create a rollback mark
curl -I -X GET http://localhost:8080/api/cats/all-parallel-exception
There are also 2 helper calls
- cleanup
curl -I -X DELETE http://localhost:8080/api/cats/
- and one to actually view the cats
curl -X GET http://localhost:8080/api/cats/
Start the project
please execute mvn clean package wildfly-swarm:run
Normal Ordered
Call curl -I -X GET http://localhost:8080/api/cats/all
and then curl -X GET http://localhost:8080/api/cats/
Normal Without Order aka Parallel
Call for cleanup curl -I -X DELETE http://localhost:8080/api/cats/
Call curl -I -X GET http://localhost:8080/api/cats/all-parallel
and then curl -X GET http://localhost:8080/api/cats/
The expected result is to see a list of cats. Without ordering. This is why parallel stream is first come-first served and reads randomly from the list.
Normal With exception
Call for cleanup curl -I -X DELETE http://localhost:8080/api/cats/
Call curl -I -X GET http://localhost:8080/api/cats/all-exception
and then curl -X GET http://localhost:8080/api/cats/
The expected result is an empty list. This is because the transaction was marked as rollback, so the jdbc transaction was rolledback thus all entries were not persisted to the database following the ACID model.
Parallel With exception
Call for cleanup curl -I -X DELETE http://localhost:8080/api/cats/
Call curl -I -X GET http://localhost:8080/api/cats/all-parallel-exception
and then curl -X GET http://localhost:8080/api/cats/
The expected result is NOT an empty list. This is because each thread in the parallel stream opens its own jdbc transaction and commits when done. So each time you do this, you get some cats displayed up until the point you get an Exception and the execution stops. Rollback is made only in one thread.
Published on Java Code Geeks with permission by Alexius Diakogiannis, partner at our JCG program. See the original article here: Can/Should I use parallel streams in a transaction context? Opinions expressed by Java Code Geeks contributors are their own. |