Have you walked down the ORM road of death?

A friend of mine asked me a really good question tonight:

Hey Stefan,
It would be great if you could please give me a sense for how many development teams get hit by a database bottleneck in JEE / Java / 3-tier / ORM / JPA land? And, how they go about addressing it? What exactly causes their bottleneck?

I think most successful apps built with Hibernate/JPA (scaling problems are hopefully a sign that people are actually using the stuff, right?) hit db contention pretty early on. From what I’ve seen this is usually caused by making excessive round-trips over the wire or by returning result sets that are too large.

And then we spend time fixing all the obviously broken data access patterns, first by using HQL instead of the standard eager/lazy fetching or by tuning existing HQL, and then by dropping down to direct SQL if needed.
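To make the round-trip problem concrete, here is a minimal sketch in plain Java. It does not use Hibernate at all: `FakeDb`, `lazyStyle` and `joinFetchStyle` are hypothetical names invented for this illustration, and every call on `FakeDb` simply counts as one trip over the wire. The first method mimics what lazy loading of a collection tends to do (one query for the parents, then one more per parent), the second mimics a single query that fetches parents and children together, which is roughly what an HQL `join fetch` is meant to achieve.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a remote database: each query method counts as one round-trip.
class RoundTripDemo {
    static class FakeDb {
        int roundTrips = 0;
        final Map<Integer, List<String>> itemsByOrder = new HashMap<>();

        FakeDb(int orders, int itemsPerOrder) {
            for (int o = 1; o <= orders; o++) {
                List<String> items = new ArrayList<>();
                for (int i = 1; i <= itemsPerOrder; i++) {
                    items.add("order" + o + "-item" + i);
                }
                itemsByOrder.put(o, items);
            }
        }

        // One round-trip: load only the parent ids.
        List<Integer> findOrderIds() {
            roundTrips++;
            return new ArrayList<>(itemsByOrder.keySet());
        }

        // One round-trip per call: load the children of a single parent.
        List<String> findItems(int orderId) {
            roundTrips++;
            return itemsByOrder.get(orderId);
        }

        // One round-trip: load all parents together with their children.
        Map<Integer, List<String>> findOrdersWithItems() {
            roundTrips++;
            return new HashMap<>(itemsByOrder);
        }
    }

    // Lazy-loading style: 1 query for the orders + 1 query per order.
    static int lazyStyle(FakeDb db) {
        int count = 0;
        for (int id : db.findOrderIds()) {
            count += db.findItems(id).size();
        }
        return count;
    }

    // Join-fetch style: everything arrives in a single query.
    static int joinFetchStyle(FakeDb db) {
        int count = 0;
        for (List<String> items : db.findOrdersWithItems().values()) {
            count += items.size();
        }
        return count;
    }

    public static void main(String[] args) {
        FakeDb lazy = new FakeDb(100, 3);
        lazyStyle(lazy);
        FakeDb joined = new FakeDb(100, 3);
        joinFetchStyle(joined);
        System.out.println("lazy round-trips: " + lazy.roundTrips);         // 101
        System.out.println("join fetch round-trips: " + joined.roundTrips); // 1
    }
}
```

Both methods see the same 300 items; only the number of trips differs. The single query does return a larger result set, so in a real app this is a trade-off between the two causes of contention mentioned above rather than a free win.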

I believe the next step after this is typically to try to scale vertically, both in the db and app tier. Throwing more hardware at the problem may get us quite a bit further at this point.

Then we might get to the point where the app gets fixed so that it actually makes sense to scale horizontally in the app tier. We will probably have to add a load balancer to the mix and use sticky sessions by now.

And then we will perhaps find out that we will not do that very well without a distributed 2nd level cache, and that all our direct SQL code writing to the DB (which bypasses the 2nd level cache) won’t allow us to use a 2nd level cache for reads either…
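Why direct SQL writes and a 2nd level cache don't mix can be shown with a toy model. This is a sketch under stated assumptions, not Hibernate's actual cache: `StaleCacheDemo` and its methods are hypothetical, the "database" and the "cache" are just two maps, and reads always prefer the cache.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (not Hibernate): a table and a cache that sits in front of it.
class StaleCacheDemo {
    final Map<Integer, String> database = new HashMap<>();
    final Map<Integer, String> secondLevelCache = new HashMap<>();

    // Reads are served from the cache; the database is only hit on a miss.
    String read(int id) {
        return secondLevelCache.computeIfAbsent(id, database::get);
    }

    // A write that goes through the ORM updates both the table and the cache.
    void writeThroughOrm(int id, String value) {
        database.put(id, value);
        secondLevelCache.put(id, value);
    }

    // A direct SQL write updates the table only; the cache never hears about it.
    void writeWithDirectSql(int id, String value) {
        database.put(id, value);
    }

    // Explicit eviction forces the next read to go back to the table.
    void evict(int id) {
        secondLevelCache.remove(id);
    }

    public static void main(String[] args) {
        StaleCacheDemo app = new StaleCacheDemo();
        app.writeThroughOrm(1, "alice");
        System.out.println(app.read(1)); // alice
        app.writeWithDirectSql(1, "bob");
        System.out.println(app.read(1)); // alice (stale: the cache was bypassed)
        app.evict(1);
        System.out.println(app.read(1)); // bob
    }
}
```

In this model the only ways out are to route every write through the layer that owns the cache or to evict by hand after each direct write. With many app servers sharing a distributed cache, both get harder, which is the dilemma described above.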

Here is where I think there are many options, and I’m not sure how people tend to go from here. Some people might abandon ORM at this point, while others may try to get the 2nd level cache to work.

Are these the typical steps for scaling up a Java Hibernate/JPA app? What’s your experience?

One thought on “Have you walked down the ORM road of death?”

  1. Hi Stefan,

    I have been using Hibernate for data persistence, and I have had a very bad experience. It seemed a good fit where you don’t have to scale a lot, but for applications where you need scalability it’s not the right choice, at least in my opinion. Since we had to scale up, we tried custom caching of the less frequently updated data, where the read-to-write ratio is high. But because of Hibernate’s proxies we had to change a few settings, such as lazy="false". Then we tried a second level cache (a distributed memory cache using Terracotta). It solved a lot of our problems, but to me it felt like a remedy for a mistake. Seen from a distance, our 2nd level cache is a distributed cache just like any other key-value database or Bigtable. So I have started exploring Cassandra.

    Could you please share your view on such databases/data stores?

    Thanks & Regards
    Suresh Reddy
