A friend of mine asked me a really good question tonight:
It would be great if you could please give me a sense for how many development teams get hit by a database bottleneck in JEE / Java / 3-tier / ORM / JPA land? And, how they go about addressing it? What exactly causes their bottleneck?
I think most successful apps built with Hibernate/JPA – scaling problems are hopefully a sign that people are actually using the stuff, right? – hit database contention pretty early on. From what I’ve seen this is usually caused by making excessive round-trips over the wire or by returning overly large result sets.
And then we spend time fixing all the obvious broken data access patterns: first by using HQL instead of the default eager/lazy fetching, then by tuning the existing HQL, and finally by dropping down to direct SQL if needed.
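To make the round-trip problem concrete, here is a toy, self-contained sketch (no real Hibernate involved; all class and method names are hypothetical) of why default lazy loading multiplies round-trips – the classic N+1 problem – and what an HQL fetch join buys you:

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch: counts simulated round-trips to show why default lazy
// loading causes the N+1 problem that an HQL "join fetch" avoids.
public class FetchPatterns {
    static int roundTrips = 0;

    record Order(long id, long customerId) {}

    // Simulated lazy loading: one query for the orders, then one more
    // per order when each customer proxy is first touched.
    static List<String> loadLazily(List<Order> orders) {
        roundTrips++; // "from Order" — fetches the orders themselves
        List<String> customers = new ArrayList<>();
        for (Order o : orders) {
            roundTrips++; // lazy proxy hit: "select ... from Customer where id = ?"
            customers.add("customer-" + o.customerId());
        }
        return customers;
    }

    // Simulated fetch join: the HQL equivalent would be
    // "from Order o join fetch o.customer" — a single round-trip.
    static List<String> loadWithJoinFetch(List<Order> orders) {
        roundTrips++; // one joined query returns orders and customers together
        List<String> customers = new ArrayList<>();
        for (Order o : orders) {
            customers.add("customer-" + o.customerId());
        }
        return customers;
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order(1, 10), new Order(2, 20), new Order(3, 30));

        roundTrips = 0;
        loadLazily(orders);
        System.out.println("lazy: " + roundTrips + " round-trips");      // 1 + N = 4

        roundTrips = 0;
        loadWithJoinFetch(orders);
        System.out.println("join fetch: " + roundTrips + " round-trip"); // 1
    }
}
```

With N orders the lazy version costs 1 + N queries while the fetch join costs one, which is why rewriting these access paths in HQL is usually the first and cheapest win.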
I believe the next step after this is typically to try to scale vertically, in both the db and app tiers. Throwing more hardware at the problem may get us quite a bit further at this point.
Then we might get to the point where the app gets fixed so that it actually makes sense to scale horizontally in the app tier. By now we will probably have to add a load balancer to the mix and use sticky sessions.
And then we will perhaps find out that we can’t do that very well without a distributed 2nd level cache, and that all our direct SQL code writing to the DB (which bypasses the 2nd level cache) won’t allow us to use the 2nd level cache for reads either…
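For reference, a minimal sketch of what opting an entity into Hibernate’s 2nd level cache looks like (the annotations and property names are standard Hibernate/JPA; the entity, region name, and field names are illustrative):

```java
import jakarta.persistence.Cacheable;
import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

// Marking an entity as cacheable so reads can be served from the
// 2nd level cache instead of hitting the database on every session.
@Entity
@Cacheable
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE, region = "customers")
public class Customer {
    @Id
    private Long id;

    private String name;

    // getters/setters omitted for brevity
}

// Plus, in the Hibernate configuration (region factory depends on the
// cache provider you pick, e.g. a JCache-backed distributed cache):
//   hibernate.cache.use_second_level_cache = true
//   hibernate.cache.region.factory_class   = <your provider's region factory>
```

Note that writes issued as direct SQL bypass this cache entirely, which is exactly the trap above: Hibernate can’t invalidate cache regions for writes it never saw, so stale reads follow.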
Here is where I think there are many options, and I’m not sure how people tend to go from here. At this point we might see some people abandoning ORM, while others try to get the 2nd level cache to work.
Are these the typical steps for scaling up a Java Hibernate/JPA app? What’s your experience?