The problem

Recently one of my teams was working on a microservice that would allow us to show some eBay ads embedded in our list of results. The idea was simple:

Can we provide extra value to our customers by surfacing good eBay inventory merged with Gumtree's?

I think the resulting product was really good:

[Figure: eBay results mingled with Gumtree's]

The system was challenging because:

  • We agreed with the mobile team that we would give them the full list of results in one go, instead of them having to make two requests: one for the list of Gumtree ads and one for the list of eBay ads. This meant that the microservice is called synchronously, and is therefore in the critical path.
  • Unfortunately, the eBay Public API that we were using performed terribly. The response times we were getting in our clusters were in the order of 2.3 seconds per call.
[Figure: eBay API Response Time]

The solution

The solution was rather nice (if I may say so myself). We used the capabilities of the great caching library Caffeine, a high-performance caching library for Java 8 that offers an AsyncLoadingCache. This is useful because it keeps a cache of CompletableFutures, so you can decide whether or not to wait on the loading of an element.
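To give an idea of the shape of this, here is a minimal sketch: EbayAdsCache, fetchEbayAds and the use of plain strings as ads are made-up placeholders, while the Caffeine and CompletableFuture calls are the library's real Java 8 API.

import com.github.benmanes.caffeine.cache.AsyncLoadingCache;
import com.github.benmanes.caffeine.cache.Caffeine;

import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class EbayAdsCache {

    // A cache of CompletableFutures: a miss triggers an asynchronous load,
    // and concurrent callers for the same key share the same future.
    private final AsyncLoadingCache<String, List<String>> cache = Caffeine.newBuilder()
            .maximumSize(10_000)
            .expireAfterWrite(5, TimeUnit.MINUTES)
            .buildAsync(query -> fetchEbayAds(query)); // runs on a background executor

    public List<String> adsFor(String query) {
        CompletableFuture<List<String>> future = cache.get(query);
        // Here we can decide whether to block on the load or degrade gracefully;
        // this sketch returns whatever is already available, falling back to no eBay ads.
        return future.getNow(Collections.emptyList());
    }

    // Hypothetical wrapper around the slow (~2.3s) eBay Public API call.
    private List<String> fetchEbayAds(String query) {
        return Collections.emptyList(); // placeholder
    }
}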

It worked well, but we saw some performance issues. They were not related to any I/O, but to a part of the code that should be very deterministic. Certainly strange behaviour.

[Figure: Performance issue]

The piece of code in question is the one measured by the blue line (labelled "Rules Build Time"), and it had an impact on the overall response time of the microservice, as measured by the green line (on the embedded Tomcat that answers requests).

The clue to what was happening came from an error message shown after the instance went down on one occasion:

I0421 09:23:07.656251 8603 exec.cpp:132] Version: 0.21.1
I0421 09:23:07.658957 8611 exec.cpp:206] Executor registered on slave 20150106-144508-3230474250-5050-46182-S0
Java HotSpot(TM) 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled.
Java HotSpot(TM) 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=

After some Googling, I found a couple of good articles that describe what this is related to: The Just-In-Time Compiler and the Code Cache.

Just In Time (JIT) Compilation (*)
Java byte code is interpreted; however, this is not as fast as directly executing native code on the JVM's host CPU. To improve performance the Oracle Hotspot VM looks for "hot" areas of byte code that are executed regularly and compiles these to native code. The native code is then stored in the code cache in non-heap memory. In this way the Hotspot VM tries to choose the most appropriate way to trade off the extra time it takes to compile code versus the extra time it takes to execute interpreted code.

This is particularly true when the JVM is running in “client” mode:

Client vs Server modes on the JVM (*)
The JDK includes two flavors of the VM -- a client-side offering, and a VM tuned for server applications. These two solutions share the Java HotSpot runtime environment code base, but use different compilers that are suited to the distinctly unique performance characteristics of clients and servers. These differences include the compilation inlining policy and heap defaults.

...

The Client VM compiler does not try to execute many of the more complex optimizations performed by the compiler in the Server VM, but in exchange, it requires less time to analyze and compile a piece of code. This means the Client VM can start up faster and requires a smaller memory footprint.

The Server VM contains an advanced adaptive compiler that supports many of the same types of optimizations performed by optimizing C++ compilers, as well as some optimizations that cannot be done by traditional compilers, such as aggressive inlining across virtual method invocations. This is a competitive and performance advantage over static compilers. Adaptive optimization technology is very flexible in its approach, and typically outperforms even advanced static analysis and compilation techniques.
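If you want to see the JIT in action, here is a generic, self-contained illustration (not our service code). Run it with the standard -XX:+PrintCompilation flag and HotSpot logs each method it compiles to native code.

// Run with: java -XX:+PrintCompilation HotLoop
// HotSpot prints a line when sum() becomes "hot" and gets compiled to native code.
public class HotLoop {

    static long sum(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        long result = 0;
        for (int i = 0; i < 100_000; i++) {
            result += sum(10_000); // repeated calls make sum() a compilation candidate
        }
        System.out.println(result); // keep the result alive so the loop isn't eliminated
    }
}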

So, with this info, we looked into JMX and found that the CodeCache memory pool was constantly under high utilization and that the JIT compiler was constantly working.
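For reference, the same check can be scripted against the standard platform MXBeans. A minimal sketch (on Java 8 the relevant pool is simply named "Code Cache"):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class CodeCacheCheck {

    public static void main(String[] args) {
        // The JIT's output lives in the "Code Cache" non-heap memory pool.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getName().contains("Code Cache")) {
                long used = pool.getUsage().getUsed();
                long max = pool.getUsage().getMax(); // bounded by -XX:ReservedCodeCacheSize
                System.out.printf("%s: %d of %d bytes used (%.1f%%)%n",
                        pool.getName(), used, max, 100.0 * used / max);
            }
        }
        // Cumulative time spent JIT-compiling, if the JVM reports it.
        System.out.println("Total JIT time (ms): "
                + ManagementFactory.getCompilationMXBean().getTotalCompilationTime());
    }
}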

The Fix

In the end the fix was simple:

  • Added the -server flag to the JVM.
  • Increased the code cache size (from 16 MB to 32 MB).
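In launch-command terms that amounts to something like the following (the jar name is a placeholder; both flags are standard HotSpot options):

java -server -XX:ReservedCodeCacheSize=32m -jar ads-service.jar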

[Figure: Performance issue, after the fix]

Now the response time is close to 7ms total :)
