How we switched to Java 21 virtual threads and got a deadlock in TPC-C for PostgreSQL | by Evgeniy Ivanov | Jan, 2024

2024-01-15 18:46:17

Dining Java 21 philosophers have a problem

In our previous post about TPC-C, we discussed some drawbacks of the original TPC-C implementation from the Benchbase project (which is great nevertheless). One of the drawbacks was the concurrency limit caused by spawning too many physical threads, and we solved it by switching to Java 21 virtual threads. Later we discovered that, as usual, there is no free lunch. In this post, we present a case study of how we ran into a deadlock with virtual threads in TPC-C for PostgreSQL, even without the dining philosophers problem.

This post might be useful for Java developers who are considering switching to virtual threads. We review some fundamental background and then highlight an important issue with virtual threads: deadlocks can be unpredictable because they may happen deep inside the libraries you use. Fortunately, debugging is straightforward, and we explain how to find these deadlocks when they happen.

PostgreSQL is an open-source database management system renowned for its high performance, rich feature set, advanced level of SQL compliance, and vibrant and supportive community. It’s great until you take horizontal scalability and fault tolerance into account. Then you end up with PostgreSQL-based third-party solutions like Citus, which implement sharded PostgreSQL. Having a single elephant might be fun. Being a mahout of a herd of elephants is a challenge, especially if you want these elephants to maintain multiple consistent replicas and perform distributed transactions with serializable isolation.

In contrast, YDB is a distributed database management system by its original design. YDB’s distributed transactions are first-class citizens and run at the serializable isolation level by default. Now we are actively moving towards PostgreSQL compatibility because we see strong demand among PostgreSQL users to make their existing applications automatically scalable and fault-tolerant. That’s why we maintain TPC-C for PostgreSQL (we hope to get it merged into upstream Benchbase soon).

Let’s recap some fundamental concepts: concurrency, parallel execution, and asynchronous vs. synchronous requests.

Concurrency means that several tasks are in progress over the same period of time, executed either in parallel or in an interleaved fashion. For example, you might have two activities: writing code in an editor and chatting with your colleagues in Slack. You perform these tasks concurrently, but not in parallel. Or you might walk your dog while talking to a friend on the phone. Again, you perform these two tasks concurrently, but this time in parallel.

Now, consider the case when your application wants to make a request to the database. The request is sent over the network, serviced by the database, and the reply is sent back to your application. Note that the network round trip might be the most expensive part of the request and can take several milliseconds. What can you do on the application side while waiting for the reply?

1. The request might be synchronous, i.e., it blocks the calling thread. This approach is very easy to write code for: on line 1, you make the request; on line 2, you process the response:

String userName = get_username_from_db(userId);
System.out.printf("Hello, %s!", userName);

2. The request might be asynchronous. Your thread is not blocked and continues execution while the request is processed in parallel:

CompletableFuture<String> userNameFuture = get_username_from_db(userId);

// Note that this is a kind of callback: it is not executed "right here".
// Moreover, at some point it will be executed in parallel with your thread.
// In real-life scenarios, you will have to use mutual exclusion.
userNameFuture.thenAccept(userName -> {
    System.out.printf("Hello, %s!", userName);
});
execute_something_else();
userNameFuture.get(); // wait for the completion of your request
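The two styles above can be put side by side in a small runnable sketch. The class name `SyncVsAsync` and the helper `getUsernameFromDb` are hypothetical: the helper only sleeps to mimic a network round trip and is not a real database client.

```java
import java.util.concurrent.CompletableFuture;

class SyncVsAsync {
    // Hypothetical stand-in for a database call: sleeps to mimic a network round trip.
    static String getUsernameFromDb(int userId) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "user" + userId;
    }

    // Synchronous variant: the calling thread blocks until the reply arrives.
    static String greetSync(int userId) {
        String userName = getUsernameFromDb(userId);
        return String.format("Hello, %s!", userName);
    }

    // Asynchronous variant: the request runs on another thread and the
    // greeting is built by a callback once the reply is available.
    static CompletableFuture<String> greetAsync(int userId) {
        return CompletableFuture
                .supplyAsync(() -> getUsernameFromDb(userId))
                .thenApply(userName -> String.format("Hello, %s!", userName));
    }

    public static void main(String[] args) {
        System.out.println(greetSync(1));
        CompletableFuture<String> greeting = greetAsync(2);
        // The caller is free to do other work here.
        System.out.println(greeting.join()); // wait for the completion of the request
    }
}
```

The synchronous method reads top to bottom, while the asynchronous one has to express "what happens next" as a chained callback.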

In either case, there are two concurrent tasks: your thread is waiting for the reply from the database, and the database is handling the request. Synchronous code is very easy to write and read. But what if you need to make thousands of requests to the database simultaneously? You will have to spawn a thread per request. Spawning a thread in Linux is cheap, but there are serious concerns about spawning too many threads:

  1. Each thread requires a stack. You can’t allocate less memory than the page size of your system, which is usually 4 KiB unless you use hugepages, where the default page size is 2 MiB.
  2. There is the Linux scheduler to consider. Try spawning 100,000 threads that are ready to execute only if you have a reset button.

This is why, until Java 21, there was no way to write synchronous code with a high level of concurrency: you can’t spawn that many threads. Concurrently (pun intended), the Go language revolutionized this: goroutines provide very lightweight concurrency so that you can write synchronous code efficiently. We recommend this talk about the Go scheduler by Dmitry Vyukov. Java 21 introduced virtual threads, which are in many respects similar to goroutines. Keep in mind that goroutines and virtual threads are not new inventions, but rather a reincarnation of the good old concept of user-level threads.

Now you can understand the problem with synchronous database requests in the original Benchbase TPC-C implementation. If your database can handle a high load, you have to run many TPC-C warehouses, which means spawning many threads. With physical threads, we could not run more than 30,000 terminal threads, while with virtual threads we can easily have hundreds of thousands of terminal vthreads.
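To give a feeling for the scale, here is a minimal sketch, assuming Java 21 or later, that starts 100,000 virtual threads, each of which blocks briefly. The class and method names are hypothetical, and `Thread.sleep` merely stands in for a blocking database request; this is not the Benchbase terminal code.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class ManyVirtualThreads {
    // Starts `count` virtual threads that each block briefly, then reports
    // how many of them ran to completion.
    static int runBlockingTasks(int count) {
        AtomicInteger completed = new AtomicInteger();
        // Leaving the try block calls close(), which waits for all submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < count; i++) {
                executor.submit(() -> {
                    try {
                        // Stands in for a blocking database request.
                        Thread.sleep(Duration.ofMillis(100));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    completed.incrementAndGet();
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runBlockingTasks(100_000));
    }
}
```

Each task is plain blocking code, yet the number of operating-system threads stays small because sleeping virtual threads are unmounted from their carriers.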

Imagine that you already have multithreaded Java code. Adding an option to use virtual threads is surprisingly easy and can be highly beneficial. By simply replacing your standard thread creation with the new virtual thread builders, your application can handle thousands of concurrent tasks without the overhead associated with physical threads. Here is an example from our TPC-C implementation:

if (useRealThreads) {
    thread = new Thread(worker);
} else {
    thread = Thread.ofVirtual().unstarted(worker);
}

That’s all it takes; now you’re using virtual threads. Under the hood, the Java Virtual Machine creates a pool of carrier threads, which execute your virtual threads. This transition seems seamless until, unexpectedly, your application freezes.

Our PostgreSQL TPC-C implementation uses c3p0 for connection pooling. The TPC-C standard dictates that each terminal must have its own connection. However, in many real-world scenarios this is not practical, so we have included an option to limit the number of database connections.

The number of terminals is much greater than the number of available connections. Consequently, some terminals must wait for a session to become available, i.e., to be released by another terminal.

When we started the TPC-C run, the application froze. Fortunately, debugging such cases is straightforward:

  1. Capture the thread stacks using jstack <PID>.
  2. Create a more detailed dump of the current state, which includes information about carrier threads and virtual threads, using jcmd <PID> Thread.dump_to_file -format=text jcmd.dump.1.

Upon investigation, we discovered that some virtual threads waiting for a session had pinned their carrier threads. Here is the stack of one such virtual thread:

#7284 "TPCCWorker<7185>" virtual
java.base/java.lang.Object.wait0(Native Method)
java.base/java.lang.Object.wait(Object.java:366)
com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1503)
com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:644)
com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:554)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse(C3P0PooledConnectionPool.java:758)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:685)
com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:140)
com.oltpbenchmark.api.BenchmarkModule.makeConnection(BenchmarkModule.java:108)
com.oltpbenchmark.api.Worker.doWork(Worker.java:428)
com.oltpbenchmark.api.Worker.run(Worker.java:304)
java.base/java.lang.VirtualThread.run(VirtualThread.java:309)

and the stack of its carrier thread:

"ForkJoinPool-1-worker-254" #50326 [32859] daemon prio=5 os_prio=0 cpu=12.39ms elapsed=489.99s tid=0x00007f3810003140  [0x00007f37abafe000]
Carrying virtual thread #7284
at jdk.internal.vm.Continuation.run(java.base@21.0.1/Continuation.java:251)
at java.lang.VirtualThread.runContinuation(java.base@21.0.1/VirtualThread.java:221)
at java.lang.VirtualThread$$Lambda/0x00007f3c2424e410.run(java.base@21.0.1/Unknown Source)
at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(java.base@21.0.1/ForkJoinTask.java:1423)
at java.util.concurrent.ForkJoinTask.doExec(java.base@21.0.1/ForkJoinTask.java:387)
at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(java.base@21.0.1/ForkJoinPool.java:1312)
at java.util.concurrent.ForkJoinPool.scan(java.base@21.0.1/ForkJoinPool.java:1843)
at java.util.concurrent.ForkJoinPool.runWorker(java.base@21.0.1/ForkJoinPool.java:1808)
at java.util.concurrent.ForkJoinWorkerThread.run(java.base@21.0.1/ForkJoinWorkerThread.java:188)

As you can see, the thread is blocked in Object.wait(), a method used in conjunction with synchronized. This causes the carrier thread to become pinned, meaning it is not released to execute any other virtual thread. Meanwhile, the session holders have released their carrier threads while waiting for I/O operations:

java.base/java.lang.VirtualThread.park(VirtualThread.java:582)
java.base/java.lang.System$2.parkVirtualThread(System.java:2639)
java.base/jdk.internal.misc.VirtualThreads.park(VirtualThreads.java:54)
java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:369)
java.base/sun.nio.ch.Poller.pollIndirect(Poller.java:139)
java.base/sun.nio.ch.Poller.poll(Poller.java:102)
java.base/sun.nio.ch.Poller.poll(Poller.java:87)
java.base/sun.nio.ch.NioSocketImpl.park(NioSocketImpl.java:175)
java.base/sun.nio.ch.NioSocketImpl.park(NioSocketImpl.java:201)
java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:309)
java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)
java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)
java.base/java.net.Socket$SocketInputStream.read(Socket.java:1099)
java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:489)
java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:483)
java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70)
java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1461)
java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1066)
org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:161)
org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:128)
org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:113)
org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
org.postgresql.core.PGStream.receiveChar(PGStream.java:465)
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2155)
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:574)
org.postgresql.jdbc.PgStatement.internalExecuteBatch(PgStatement.java:896)
org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:919)
org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1685)
com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeBatch(NewProxyPreparedStatement.java:2544)
com.oltpbenchmark.benchmarks.tpcc.procedures.NewOrder.newOrderTransaction(NewOrder.java:214)
com.oltpbenchmark.benchmarks.tpcc.procedures.NewOrder.run(NewOrder.java:147)
com.oltpbenchmark.benchmarks.tpcc.TPCCWorker.executeWork(TPCCWorker.java:66)
com.oltpbenchmark.api.Worker.doWork(Worker.java:442)
com.oltpbenchmark.api.Worker.run(Worker.java:304)
java.base/java.lang.VirtualThread.run(VirtualThread.java:309)

Thus, we ended up in the following situation:

  1. All carrier threads are pinned by session waiters, meaning there are no carrier threads available.
  2. Virtual threads holding the sessions can’t finish their tasks to release the sessions.

Deadlock made easy!

JEP 444 states that:

There are two scenarios in which a virtual thread cannot be unmounted during blocking operations because it is pinned to its carrier:

  1. When it executes code inside a synchronized block or method, or
  2. When it executes a native method or a foreign function.
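For code you control, JEP 444 suggests replacing such synchronized blocks with java.util.concurrent.locks.ReentrantLock. The following is a minimal sketch of a resource pool that waits on a Condition instead of Object.wait(). The class `LockBasedPool` is hypothetical and is not how c3p0 is implemented; it assumes Java 21 or later for the virtual thread in `main`.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical minimal pool for illustration only.
class LockBasedPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition available = lock.newCondition();

    LockBasedPool(Iterable<T> resources) {
        for (T resource : resources) {
            idle.add(resource);
        }
    }

    // Blocks until a resource is free. Waiting on the Condition parks the
    // virtual thread, so its carrier can run other virtual threads meanwhile.
    // The uninterruptible wait keeps the sketch free of checked exceptions.
    T checkout() {
        lock.lock();
        try {
            while (idle.isEmpty()) {
                available.awaitUninterruptibly();
            }
            return idle.pop();
        } finally {
            lock.unlock();
        }
    }

    // Returns a resource to the pool and wakes up one waiter, if any.
    void checkin(T resource) {
        lock.lock();
        try {
            idle.push(resource);
            available.signal();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockBasedPool<String> pool = new LockBasedPool<>(List.of("session-1"));
        String session = pool.checkout();
        Thread waiter = Thread.ofVirtual().start(() -> {
            String s = pool.checkout(); // waits until the main thread checks in
            System.out.println("got " + s);
            pool.checkin(s);
        });
        pool.checkin(session);
        waiter.join();
    }
}
```

This only helps when the blocking code is your own; it does not change how a third-party library waits internally.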

The problem is that this synchronized code might be deeply embedded within the libraries you use. In our case, it was within the c3p0 library. So, the fix is simple: we wrapped connection acquisition in a java.util.concurrent.Semaphore. With this change, virtual threads block on the semaphore and, crucially, release the carrier thread instead of diving into c3p0. Thus, we never block inside c3p0, because we enter c3p0 code only when there is a free session available.
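The idea can be sketched as follows. This is not the actual patch from the TPC-C implementation: `GuardedPool` is a hypothetical class, and a ConcurrentLinkedQueue of strings stands in for the real connection pool so that the example stays self-contained. It assumes Java 21 or later.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class GuardedPool<C> {
    private final Queue<C> idle = new ConcurrentLinkedQueue<>(); // stands in for the real pool
    private final Semaphore permits;                             // one permit per connection

    GuardedPool(Collection<C> connections) {
        idle.addAll(connections);
        permits = new Semaphore(connections.size());
    }

    <R> R withConnection(Function<C, R> work) {
        // Virtual threads wait here; parking on a Semaphore frees the carrier.
        permits.acquireUninterruptibly();
        try {
            // Holding a permit means a connection is free, so the pool never has to wait.
            C connection = idle.poll();
            if (connection == null) {
                throw new IllegalStateException("no free connection despite holding a permit");
            }
            try {
                return work.apply(connection);
            } finally {
                idle.add(connection); // return the connection before releasing the permit
            }
        } finally {
            permits.release();
        }
    }

    // Runs `tasks` virtual threads against `poolSize` connections and
    // returns how many tasks completed their work.
    static int demo(int poolSize, int tasks) {
        List<String> connections = new ArrayList<>();
        for (int i = 0; i < poolSize; i++) {
            connections.add("connection-" + i);
        }
        GuardedPool<String> pool = new GuardedPool<>(connections);
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    pool.withConnection(connection -> {
                        try {
                            Thread.sleep(1); // stands in for a query
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                        return completed.incrementAndGet();
                    });
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + demo(4, 1_000));
    }
}
```

The number of permits must match the pool size; with fewer permits, connections sit idle, and with more, waiters can again end up blocking inside the pool.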

This is the front cover art for the book The Mythical Man-Month written by Fred Brooks. The book cover art copyright is believed to belong to the publisher, Addison-Wesley, or the cover artist.

It seems that despite decades of progress in software development, there is still no silver bullet. Yet Java 21 virtual threads are a remarkable feature, offering significant benefits if used carefully: it’s very easy to write efficient async code even when concurrency is high.
