Chris Wong's Development Blog: Making concurrent HTTP requests in Java

There may be occasions when you need to make multiple HTTP requests. The easy way is to simply make one request after another in sequence. If that's all you need, there is no need to read further. This will only complicate your life. Go away. Shoo.

The hard way to do this is to make multiple requests concurrently. Generally at scale (otherwise, why bother?). That's what I will play with in this post. Most of the time spent in an HTTP request is waiting for the response, so we want to get those requests all out at once, then wait for their responses. The typical pattern is to use async requests.

Test scenario:

Make 1000 concurrent HTTP calls to a REST endpoint
For each request, the server will wait 3 seconds before responding, simulating "work".
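If you want to reproduce this at home, here is a minimal sketch of such an endpoint using the JDK's built-in com.sun.net.httpserver.HttpServer. It is a stand-in, not the server behind the timings in this post. The class name SlowServer, the /work path, port 8080 and the JSON body are all assumptions made up for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the slow REST endpoint described above.
class SlowServer {

    /** Starts a loopback server (port 0 = any free port) whose /work endpoint replies after delayMillis. */
    static HttpServer start(int port, long delayMillis) {
        try {
            // A backlog of 0 lets the system pick its default.
            HttpServer server = HttpServer.create(
                    new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 0);
            server.createContext("/work", exchange -> {
                try {
                    Thread.sleep(delayMillis); // simulate "work"
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                }
                byte[] body = "{\"status\":\"done\"}".getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                try (var out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            // One (daemon) thread per in-flight request, so the server is not the bottleneck.
            server.setExecutor(Executors.newCachedThreadPool(runnable -> {
                Thread worker = new Thread(runnable);
                worker.setDaemon(true);
                return worker;
            }));
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpServer server = start(8080, 3000);
        System.out.println("Listening on http://localhost:" + server.getAddress().getPort() + "/work");
    }
}
```

The server deliberately burns one thread per request while it sleeps. That is fine for a test fixture, since the point of the experiment is the client side.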

The results of my experimentation follow.

Method 1: wrap your call in CompletableFuture

You could just wrap your existing old school synchronous request in a CompletableFuture:

List<CompletableFuture<Response>> futures = new ArrayList<>();
for (int i = 0; i < numRequests; i++) {
    String url = "...";
    // the blocking call runs on a thread from the supplied executor
    futures.add(CompletableFuture.supplyAsync(() ->
        restTemplate.getForObject(url, Response.class), executor));
}
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
// read and process results

How long this takes depends on at least a couple of things:

What kind of Executor are those futures running on?
When does the request actually get sent?

If you did not specify the executor instance as I did above, the CompletableFuture will just run in the default ForkJoinPool.commonPool(). This typically is configured with a thread pool roughly equal to the number of CPU cores. That configuration is fine for CPU intensive processing, but is not appropriate for I/O intensive use cases like this, where the thread spends most of its time waiting for the network. In this experiment, I configured an Executor with a thread pool of size 50. How does this perform?

1000 calls using RestTemplate took 60 seconds

This performance is consistent with processing 1000 requests 50 at a time, and taking 3 seconds to process each of them. What holds back this fake-async code as configured is that you still need a thread per request, so it's limited by the size of the thread pool. Moreover, the actual request does not get sent until the worker thread starts processing.
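The arithmetic is easy to check without any HTTP at all. Below is a rough sketch that swaps the HTTP call for Thread.sleep, so it only models "a worker thread is held for the whole request". The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolThrottleDemo {

    /** Lower bound on wall time when every request holds a pool thread for its full duration. */
    static long estimateSeconds(int requests, int poolSize, int secondsPerRequest) {
        long batches = (requests + poolSize - 1) / poolSize; // ceiling division
        return batches * secondsPerRequest;
    }

    /** Runs blocking tasks (sleep stands in for the HTTP call) on a fixed pool; returns elapsed millis. */
    static long runBlockingTasks(int tasks, int poolSize, long taskMillis) {
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        try {
            long start = System.nanoTime();
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(CompletableFuture.runAsync(() -> sleep(taskMillis), executor));
            }
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            executor.shutdown();
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // 1000 requests, 50 threads, 3 seconds each: 20 batches
        System.out.println(estimateSeconds(1000, 50, 3)); // prints 60
        // Scaled-down run: 20 tasks of 100 ms on 5 threads needs at least 4 batches
        System.out.println(runBlockingTasks(20, 5, 100) + " ms");
    }
}
```

Growing the pool shrinks the number of batches, which is the only knob this model has.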

Method 2: use native HttpClient's async API

Starting with Java 11, we have an HttpClient class with built-in support for async requests, speaking CompletableFutures natively.

List<CompletableFuture<Response>> futures = new ArrayList<>();
for (int i = 0; i < numRequests; i++) {
    var request = HttpRequest.newBuilder()
        .uri(URI.create("..."))
        .build();
    // sendAsync returns immediately; keep the future so we can wait on it below
    futures.add(httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofString())
        .thenApply(response -> /* parse JSON response */ null));
}
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
// read and process results

1000 calls using HttpClient took 3 seconds

This is the performance we're looking for. All 1000 requests are processed at once. The key is that the request itself is sent immediately without waiting for a thread. Thread pool size doesn't really matter because we only need a worker thread when there is a response to handle.
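For something you can actually run, here is a self-contained sketch of the same fan-out that returns the raw response bodies instead of parsing JSON. The class name AsyncFanOut, the fetchAll helper and the default URL http://localhost:8080/work are assumptions for illustration; point it at whatever slow endpoint you have.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class AsyncFanOut {

    /** Sends count GET requests back to back, then waits for all of the response bodies. */
    static List<String> fetchAll(HttpClient client, URI uri, int count) {
        HttpRequest request = HttpRequest.newBuilder(uri).build(); // immutable, safe to reuse
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // sendAsync does not block, so the loop is not held up by slow responses
            futures.add(client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                    .thenApply(HttpResponse::body));
        }
        // join() rethrows the first failure as a CompletionException
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        List<String> bodies = new ArrayList<>();
        for (CompletableFuture<String> future : futures) {
            bodies.add(future.join()); // already complete at this point
        }
        return bodies;
    }

    public static void main(String[] args) {
        // Hypothetical default endpoint; pass your own URL as the first argument.
        URI uri = URI.create(args.length > 0 ? args[0] : "http://localhost:8080/work");
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();
        long start = System.nanoTime();
        List<String> bodies = fetchAll(client, uri, 1000);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(bodies.size() + " calls took " + millis + " ms");
    }
}
```

Error handling is deliberately crude here: a single failed request makes the whole join() throw.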

Method 3: use Spring's WebClient

Spring's WebClient is part of their Reactive framework and has a Reactive -- therefore async -- API. For consistency with other code here we'll convert it into a CompletableFuture:

List<CompletableFuture<Response>> futures = new ArrayList<>();
for (int i = 0; i < numRequests; i++) {
    var future = webClient.get()
        .uri("...")
        .retrieve()
        .bodyToMono(Response.class)
        .toFuture();
    futures.add(future);
}
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
// read and process results

1000 calls using WebClient took 6 seconds

WebClient is an API wrapper around an implementing HTTP client class. It defaults to a Netty-based client, but you can also make it use HttpClient. Unless you're writing Reactive code, it might seem clunky to use this client and work with the Reactive vocabulary (Mono, Flux). On the other hand it has convenient functionality like built-in deserialization of JSON into an object, as in the code above. It's not clear to me why this particular demo code took 6 seconds (which means at least one request had to wait for others to finish first), but obviously performance is in the same ballpark.

Method 4: use OkHttp

Not all async implementations are equivalent. Consider my results with OkHttp. It seems to be a popular library. It has a built-in async API using enqueue, but the callback style makes the code somewhat clunky. The following does show some error handling, which I skipped with previous examples. Here's what I wrote to set up 1 future:

var request = new Request.Builder()
    .url("...")
    .get()
    .build();
var future = new CompletableFuture<ReflectResponse>();
okHttpClient.newCall(request).enqueue(new Callback() {
    @Override
    public void onFailure(Call call, IOException e) {
        future.completeExceptionally(e);
    }
    @Override
    public void onResponse(Call call, Response response) {
        // try-with-resources closes the response (and its body) when we are done
        try (response) {
            if (!response.isSuccessful()) {
                future.completeExceptionally(new IOException("Unexpected response code: " + response.code()));
            } else if (response.body() == null) {
                future.completeExceptionally(new IOException("response is missing body"));
            } else {
                var result = objectMapper.readValue(response.body().string(), ReflectResponse.class);
                future.complete(result);
            }
        } catch (IOException e) {
            // an exception thrown out of the callback would leave the future incomplete forever
            future.completeExceptionally(e);
        }
    }
});

1000 calls using OkHttp took 601 seconds

Uh, what?

It turns out OkHttp uses the one-thread-per-request model. While you can create a bunch of requests to be serviced asynchronously, nothing gets sent to the server until a worker thread starts work on it. It's the same issue as the "fake" async setup we used for the synchronous RestTemplate call above. But it performs much worse because only about 5 requests were in flight at a time on my machine. That matches the defaults of OkHttp's Dispatcher, which allows at most 5 concurrent async calls per host (maxRequestsPerHost) and 64 overall (maxRequests), and every request in this test goes to the same host. So it's dribbling out 5 requests at a time: 200 rounds of 3 seconds each is 600 seconds. You could make it as fast as HttpClient by raising those limits and accepting a humongous number of threads, but then again you can do the same with the wrapped RestTemplate code too.

Other limits

If you have a Unix-ish OS, have you checked the output of "ulimit -a" on your target machine?

processes limits the number of threads you can have in your JVM
file descriptors limits the number of network connections in your JVM
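You can also peek at the file descriptor limit from inside the JVM. This is a small sketch that assumes a JDK exposing com.sun.management.UnixOperatingSystemMXBean (true of the usual OpenJDK builds on Linux and macOS); the class and method names are made up. As far as I know there is no equivalent standard API for the process/thread limit, so ulimit remains the place to check that one.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

class FdLimits {

    /** Returns {open, max} file descriptor counts, or {-1, -1} if this JVM does not expose them. */
    static long[] fileDescriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] { unix.getOpenFileDescriptorCount(), unix.getMaxFileDescriptorCount() };
        }
        return new long[] { -1, -1 }; // e.g. on Windows
    }

    public static void main(String[] args) {
        long[] counts = fileDescriptorCounts();
        System.out.println("open file descriptors: " + counts[0] + ", limit: " + counts[1]);
    }
}
```

Each open socket costs a file descriptor, so the limit reported here needs comfortable headroom above the number of concurrent connections you plan to open.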

What about virtual threads, a.k.a. Project Loom?

Java 21 is the first LTS version to officially support virtual threads. Virtual threads are much cheaper than platform threads. A JVM can handle "millions of threads", so threads are no longer "evil". And virtual threads are meant for exactly this kind of blocking I/O, where a waiting thread gives up the CPU. So can we use virtual threads? You betcha. In the first method above, we wrapped a synchronous HTTP call in a CompletableFuture with a thread pool executor. The thread pool was what throttled the total time to make 1000 HTTP requests. What if we had unlimited threads?

var executor = Executors.newVirtualThreadPerTaskExecutor();

The resulting Executor from the above starts a new virtual thread for each task. There is no limit. What is the run time consequence of unlimited threads?

1000 calls using RestTemplate and virtual thread executor took 3 seconds
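You can see the same effect without a server. The sketch below needs Java 21 and again uses Thread.sleep as a stand-in for the blocking RestTemplate call, so it shows only the scheduling behaviour, not real network I/O. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class VirtualThreadDemo {

    /** Runs blocking tasks, one virtual thread each; returns elapsed milliseconds. */
    static long runBlockingTasks(int tasks, long taskMillis) {
        long start = System.nanoTime();
        // ExecutorService is AutoCloseable since Java 19; close() waits for submitted tasks
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(CompletableFuture.runAsync(() -> sleep(taskMillis), executor));
            }
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis); // a blocked virtual thread releases its carrier thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // 1000 tasks of 3 seconds each, all waiting at the same time
        System.out.println(runBlockingTasks(1000, 3000) + " ms");
    }
}
```

The only change from a pooled version of this code is the executor; the calling code with CompletableFuture stays the same.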

Conclusions

We have essentially come full circle here. You get similar parallelism and run time with either a synchronous HTTP client with unlimited threads, or asynchronous client. My thoughts are:

I don't see a significant advantage either way. The usual complaint about reactive/async code is the difficulty in debugging because you lose context. But a virtual thread's stack trace is similarly context-free, starting where the thread was created.
Your choice might be a matter of convenience. The newer fluent APIs may be more pleasant to work with, or may be more convenient wrappers around lower level HTTP clients. Spring's RestClient, for example, supersedes RestTemplate. Higher level wrappers take care of some basic plumbing like error handling, retries, metrics and JSON serialization/deserialization.
Your choice might be limited by what version of Java you're on. Virtual threads have only had official status since Java 21.
But if you do have access to virtual threads, the conventional wisdom on thread conservation and pooling may need to go out the window.
Not all async HTTP clients are created equal. You might not like the Reactive vocabulary, and (as with OkHttp) an async API does not guarantee full concurrency.

Tuesday, February 6, 2024
