Cache-aside, write-behind, magic and why it sucks being an Oracle customer

I’ve been looking at a few different technologies to improve the scalability of one of our applications. We’re scaling pretty OK to be honest, considering that we currently have a traditional database-centric solution. The cost of scaling a database-bound application running Oracle is crazy to say the least, considering they charge $47 500 per two x86 cores for Oracle Enterprise Edition. On top of this it’s 22% per year for software updates and support. As if this wasn’t enough, they also increase the support cost by 4% per year.

You might think that the price above is for production environments, but in fact you have to pay for every single installation throughout the organization. There are no discounts for staging, DR, test or development environments.
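
To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Java. It assumes that the 22% support fee is calculated on the list license price and that the 4% increase compounds from the second year onward; real contracts will differ, and the class and method names are made up for this illustration.

```java
// Back-of-the-envelope cost model. The three constants come from the post;
// how they combine is an assumption, not Oracle's official formula.
class OracleCostSketch {
    static final double LICENSE_PER_CORE_PAIR = 47_500.0; // USD per two x86 cores
    static final double SUPPORT_RATE = 0.22;              // yearly updates and support
    static final double SUPPORT_INCREASE = 0.04;          // yearly increase of the support fee

    // Total spend for one installation after the given number of years.
    static double totalCost(int corePairs, int years) {
        double license = corePairs * LICENSE_PER_CORE_PAIR;
        double support = license * SUPPORT_RATE; // assumption: based on list price
        double total = license;
        for (int year = 1; year <= years; year++) {
            total += support;
            support *= 1.0 + SUPPORT_INCREASE;   // next year's support is 4% higher
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("1 core pair, 5 years: " + Math.round(totalCost(1, 5)));
        System.out.println("4 core pairs, 5 years: " + Math.round(totalCost(4, 5)));
    }
}
```

Under those assumptions a single core pair lands at roughly $104,100 over five years, and that is before you multiply by the staging, DR, test and development installations.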

I have a piece of advice for all you kids out there considering running Oracle – just don’t do it.

This advice goes for all Oracle products really, as they all have the same pricing model.

Databases are overrated

My strong recommendation is to build an application that doesn’t rely on an underlying RDBMS. Relational databases are an overrated, overly complex form of persistent store. They are slow, and they are usually a single point of failure too. Does this mean that databases are dead and a thing of the past? No, but the role of the database will probably change going forward. In my opinion we should use the RDBMS as a System of Record that is mostly up to date.

If you ask me, databases are great at mainly two things:

  1. They make the data accessible for other systems in a standard way and
  2. They have a strong query language that many people know

So, write to databases asynchronously and use them for reporting and extracting data. Store the data in a data grid in the application tier (where it’s used).
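
A minimal sketch of that idea, usually called write-behind, could look like the following. `SystemOfRecord` and `WriteBehindStore` are hypothetical names of my own and not the API of any data grid product. A real grid would also replicate the in-memory data across nodes and call `flush()` from a background thread; here it is called by hand to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the RDBMS; a real one would issue JDBC statements.
interface SystemOfRecord {
    void save(String key, String value);
}

// Reads and writes are served from memory; the database is updated later.
class WriteBehindStore {
    private final Map<String, String> memory = new ConcurrentHashMap<>();
    private final BlockingQueue<String> dirtyKeys = new LinkedBlockingQueue<>();
    private final SystemOfRecord database;

    WriteBehindStore(SystemOfRecord database) {
        this.database = database;
    }

    void put(String key, String value) {
        memory.put(key, value); // the caller only waits for the in-memory write
        dirtyKeys.add(key);     // remember that the database is now behind
    }

    String get(String key) {
        return memory.get(key);
    }

    // Pushes pending changes to the database and returns the number of rows written.
    int flush() {
        List<String> batch = new ArrayList<>();
        dirtyKeys.drainTo(batch);
        int written = 0;
        // Several writes to the same key collapse into one database write.
        for (String key : new LinkedHashSet<>(batch)) {
            database.save(key, memory.get(key));
            written++;
        }
        return written;
    }
}

class WriteBehindDemo {
    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>(); // plays the role of a database table
        WriteBehindStore store = new WriteBehindStore(table::put);
        store.put("order-1", "NEW");
        store.put("order-1", "PAID");
        System.out.println("in memory: " + store.get("order-1"));
        System.out.println("in database before flush: " + table.get("order-1"));
        store.flush();
        System.out.println("in database after flush: " + table.get("order-1"));
    }
}
```

The price of this design is that the database lags behind the grid, which is what I mean by a System of Record that is mostly up to date.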

What is a Data Grid?

A Data Grid is a horizontally scalable in-memory data management solution. Data grids try to eliminate data source contention by scaling out data management with commodity hardware.

Some underlying philosophies of data grids – according to Oracle (sic!):

  • Keep data in the application tier (where it’s used)
  • Disks are slow and databases are evil
  • Data Grids will solve your application scalability and performance problems

I have been looking at three different data grid products: Oracle Coherence, Gigaspaces EDG/XAP and Terracotta DSO.

Oracle Coherence

I really like this product. It focuses solely on being a potent data grid, with the ability to act as a compute grid as well. Although I haven’t used Coherence for any large projects, its design and concepts are easy to relate to. It supports JTA transactions and consists of a single jar that you drop into your class path. The Coherence configuration doesn’t contain any infrastructural descriptions, which means that you can use the same configuration on your development laptop as in the production environment with multiple servers. The main issue with Coherence is the fact that Oracle has owned it for a few years now.

Gigaspaces XAP

Gigaspaces’ mission seems to be to provide a very scalable application server with XAP – “The Scale-Out Application Server”. The EDG – enterprise data grid – packaging seems to provide about the same feature set as Coherence. The main difference to me is that both Gigaspaces offerings are application server infrastructure that needs configuration, deployment and all of that. As I see things, the main drawback is the application server approach – it feels overwhelming. On the other hand, Gigaspaces is still a smaller company that is eager to do business and to provide great implementation support, and the product seems to be a really good application server.

Terracotta DSO

Terracotta has a different approach. They provide Network Attached Memory for the Java heap. If you can write a thread-safe program, you can scale out using Terracotta with no or minor changes to your application. From a technical point of view it’s a beautiful solution: You declare what objects you want to make available using Terracotta, and then Terracotta will make your data persistent (if you want) and available on all clustered nodes. When you invoke new() on a clustered object, you will get a reference to the clustered object (if one exists). Another important difference between Terracotta and the others is that they only send the part of an object that’s been changed rather than the full serialized object graph.

I’m in love with this product. It’s free and open source too, and Terracotta Inc provides commercial support. The main concern I have with Terracotta is that it’s really a paradigm shift for the average Java enterprise developer to start writing multi-threaded programs without having JTA transactions. Another concern is the magic – the low-level hooks they do in the JVMs. At the time of writing, only Sun and IBM JVMs are supported. It runs fine on OS X though.
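
To give a feel for what “a thread-safe program” means here, below is a small sketch in plain Java. Nothing in it is Terracotta API: `SharedInventory` is a made-up class, and the declaration that would make its `stock` map a clustered object lives in Terracotta’s own configuration, which is not shown. The point is only that correctness rests on ordinary locking rather than on JTA transactions.

```java
import java.util.HashMap;
import java.util.Map;

// Plain thread-safe Java. Every access to the shared map happens under one lock.
class SharedInventory {
    private final Map<String, Integer> stock = new HashMap<>();

    void add(String item, int quantity) {
        synchronized (stock) {
            stock.merge(item, quantity, Integer::sum);
        }
    }

    // Check-then-act must happen under a single lock, or two threads could
    // both reserve the last unit.
    boolean reserve(String item, int quantity) {
        synchronized (stock) {
            int available = stock.getOrDefault(item, 0);
            if (available < quantity) {
                return false;
            }
            stock.put(item, available - quantity);
            return true;
        }
    }

    int available(String item) {
        synchronized (stock) {
            return stock.getOrDefault(item, 0);
        }
    }
}

class SharedInventoryDemo {
    // Starts the given number of threads, each reserving one unit perThread times,
    // waits for all of them and returns what is left in stock.
    static int reserveConcurrently(SharedInventory inventory, String item, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    inventory.reserve(item, 1);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return inventory.available(item);
    }

    public static void main(String[] args) {
        SharedInventory inventory = new SharedInventory();
        inventory.add("widget", 1000);
        // 4 threads x 250 reservations use up exactly the 1000 units.
        System.out.println(reserveConcurrently(inventory, "widget", 4, 250)); // prints 0
    }
}
```

In a single JVM the threads share the heap; as I understand the Terracotta model, the same locking discipline is what lets the threads live on different nodes.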

The bottom line

So which one is better? Well, that depends on a lot of things as always. If you decide to move to the grid it’s going to require retraining of your developers regardless of what solution you go for.

Please do keep in mind that products don’t usually solve your problems, and that you can go a long way using a less expensive RDBMS by partitioning the data across multiple servers – sharding. This is what a lot of large sites out there do.
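
In its simplest form, sharding is just a function from a key to a database server. The sketch below shows one naive way to do it, hashing the key modulo the number of shards. `ShardRouter` and the shard names are hypothetical, and this is not a description of how any particular site does it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Routes each key to one of a fixed list of database servers.
class ShardRouter {
    private final List<String> shards;

    ShardRouter(List<String> shards) {
        if (shards.isEmpty()) {
            throw new IllegalArgumentException("need at least one shard");
        }
        this.shards = new ArrayList<>(shards);
    }

    // The same key always maps to the same shard as long as the list is unchanged.
    String shardFor(String key) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(key.hashCode(), shards.size());
        return shards.get(index);
    }
}

class ShardRouterDemo {
    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(Arrays.asList("db-0", "db-1", "db-2"));
        for (String customer : Arrays.asList("customer-17", "customer-42", "customer-99")) {
            System.out.println(customer + " -> " + router.shardFor(customer));
        }
    }
}
```

The catch with plain modulo hashing is that adding or removing a server moves most keys to a different shard, so larger setups tend to use a lookup directory or consistent hashing instead.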

Further reading:
The Coming of the Shard
eBay’s Architectural Principles
Oracle Coherence
Gigaspaces EDG
Terracotta DSO

12 thoughts on “Cache-aside, write-behind, magic and why it sucks being an Oracle customer”

  1. Hi Stephan

    Interesting and balanced overview.
    Just wanted to clarify GigaSpaces’ position in that regard.
    It seems that we all agree that the role of the database is changing, and that with the speed of the network and the availability of memory resources the right approach is to decouple the application from the database and the entire persistence layer, as I noted in a few of my blog posts:
    http://natishalom.typepad.com/nati_shaloms_blog/2008/03/scaling-out-mys.html. One of the core aspects that is unique to our approach is the level of transparency with which one can integrate an existing database with a data grid. A good example of how this works with GigaSpaces is covered here:
    http://blog.gigaspaces.com//2008/04/23/integrating-gigaspaces-persistency-service-into-an-exiting-tier-based-system/
    I believe that you’ll see quite a difference in that regard between the various solutions you outlined above. For example, with GigaSpaces there is a complete decoupling (code and runtime) of any database calls from the data grid and application processes. This is important to truly isolate the application from database hiccups, bottlenecks and failures, as well as from configuration changes (schema, maintenance) between the database and the data grid. We’re also able to delegate all updates to a central location through a mirror service, which simplifies maintenance and provides additional performance optimization because the calls to the database are collocated on the same machine as the database.

    Having said all that, we do see caching as a component of a broader distributed architecture, as you rightly noted. For example, we argue that to achieve true linear scalability you need to address the scalability bottleneck across all tiers, i.e. messaging, data and business logic. Without it you’re only as strong as your weakest link.
    We also argue that without that, the complexity of achieving end-to-end scalability and high availability is going to remain high, since you’ll need to synchronize the application lifecycle with that of the data grid, keep the way messages are routed through the messaging system consistent with the way data is distributed, and synchronize and maintain different clusters and high availability models across the different tiers. This is going to have a direct impact on ROI, as noted in my recent post on the subject:
    http://natishalom.typepad.com/nati_shaloms_blog/2008/06/economies-of-no.html

  2. Coherence is awesome, but simply too expensive. It only makes sense at financial institutions or other places that need the power and a large corporate owner. But for mere mortals, it’s not an option.

    Terracotta... cool idea with nice code, but I’d never use it. I hate the notion of a single point of failure (you can only have one server) and that a bad lock will deadlock the whole network. It’s too easy to blow up the world, it changes the language semantics, and it is the opposite of the current approaches to scaling. At a tech talk, they indicated that the main usage was simply to automatically distribute ehcache, which shows few trust it.

    Personally, I like the layered write-beside approach with local caches, remote caches, near caches, db per service, shards, etc. If you design the architecture well and use the right tools in the right places, it’s beautiful. These guys give you a lot of it for free, but distract you from getting it perfect…

  3. I know I’m a little late to the game here but I’ve recently looked at all three of these products to do exactly what you’re talking about here. Your write-up is dead on but I’ll add a few things.

    Coherence is fairly easy to implement, high performance, and just plain rocks, however it has the singular (and in my opinion fatal) misfortune of now being owned by Oracle. It’s very expensive and dealing with Oracle is a nightmare. I’ve had to deal with Oracle on behalf of clients on four separate occasions and after each, I swore to myself, “never again!”

    Terracotta is a very interesting “technology” but that’s also its downside. In most respects, it’s a technology and not a solution. There are no best practices or already developed solutions that allow it to act as an out-of-the-box IMDG that asynchronously persists to a database. I’m not saying it couldn’t be done, and frankly, I’m almost to the point where I’m ready to give it a go, but it’s still a lot of work to get to a real solution.

    GigaSpaces is where I’ve been focusing my efforts lately. As a company they are incredibly friendly to startups and small businesses in stark contrast to Oracle. As a product, in concept they seem to be right on target. Nati posted a link to his blog where he discussed a Persistence as a Service solution using MySQL. If you read that blog post then you will get incredibly excited about the possibilities. Unfortunately, the devil, as usual, is in the details.

    Above you mentioned your reservations about the implementation complexity due to the solution being an application server that needs configuration and deployment. That’s just the tip of the iceberg. I found getting a GigaSpaces/OpenSpaces solution set up to be much more complex than expected. Naturally, none of the GigaSpaces people feel this way because the concepts and technologies involved are all second nature to them at this point, but I don’t consider myself a dull cookie and I struggled with it mightily. There are a million tiny configuration details that have to be ‘just’ right or things simply won’t work and you’ll have no idea why. Even when they do work, you’ll run across things out of the blue that have a huge impact on your application but were totally unexpected. Just yesterday, I discovered that because the GigaSpaces solution is intended to be interoperable between multiple technologies (C, .NET, Java…), local caches don’t actually store a reference to an object. They create that object when requested. This has the effect of creating a new copy of the object every time it’s requested. This means that in a read-heavy app, you might have tens of thousands of copies of the same object in memory, whereas in a typical local cache solution those would all be references to the same object, saving tons of memory and alleviating a lot of the garbage collection woes.

    Don’t get me wrong, even with these concerns, we’re still leaning towards GigaSpaces on my current project, but I long for the ease of use that Coherence brought to the table in my last project.

  4. Hi Matt,
    Thanks for sharing your experiences with these products.
    We’re doing a PoC with Terracotta now, and what you say is true. There are few or no high-level solutions out of the box.
    I think Terracotta Inc has realized that this is a potential showstopper at many customers and is therefore planning on adding a lot of “recipes” on the site now, see http://www.terracotta.org/confluence/display/howto/Recipe?recipe=dirty-read for example.

    However, they could benefit more from bundling these concepts as part of the product.

  5. Also, Oracle’s new pricing as of 2008-07-01 is:

    Oracle RDBMS Enterprise Edition $47 500 per 2x cores (up from $40 000)
    Oracle Coherence Grid Edition $25 000 per 2x cores (up from $20 000)
    :)

  6. Pingback: Matt's Blog

  7. I have been involved in evaluating both Gigaspaces and Coherence (this was before Oracle got into the picture) and found that I preferred Coherence for the following reasons:
    1. A lot easier to use and in particular to configure
    2. I do not believe JINI will ever take off (Gigaspaces is built on it)
    3. Coherence was more flexible for the use cases at hand and the performance was slightly better
    4. The people I dealt with from Tangosol were more friendly and easy-going than the Gigaspaces people

    Sadly I must agree with what has been said about Oracle – their licensing policies are out of this world and it is impossible to get even very reasonable improvement requests taken into consideration unless you happen to be a big Oracle customer :-( so I am not sure what I would recommend using today…

  8. Hi Matt

    I’ll be more than happy to get more direct feedback from you on your experience with GigaSpaces. Please feel free to contact me directly in that regard.

    I’ll start with a small correction:

    “I discovered that the because the GigaSpaces solution is intended to be interoperable between multiple technologies (C, .NET, Java…), local caches don’t actually store a reference to an object. They create that object when requested. This has the effect of creating a new copy of object every time it’s requested”

    This is actually not the case. When you write an object, only the stateful attributes are written to the space, without the containing object. Those attributes are stored by reference, and when you read the object from the local cache the attributes are read by reference as well. The only thing that gets re-created per read is the containing class. This might lead you to the assumption that the entire class is new with its values, but as I mentioned this is not really the case. BTW this behavior didn’t change and was not affected by our interoperability layer.

    Now with regard to the complexity, I believe that we have to compare apples to apples. I doubt that if you try to implement a full-blown in-memory data grid with reliable synchronization with an existing database, and with built-in mapping to a relational database, you’ll find anything simpler than or even close to what we’re providing today, not just in terms of reliability and performance but even in terms of simplicity.
    I do agree that if what you’re looking for is a simple cache, GigaSpaces may be more complex than many of the alternatives in the market, but as I said this is not really the right comparison, as for those purposes I would assume that you probably wouldn’t use something like JBoss Cache or something on that level.

    Anyway, since simplicity is very important to me, I’d be happy to hear your feedback as to whether there are better things that we can do to make things simpler.

    Nati S.

  9. Magnus

    I appreciate your candid response.
    I believe that you’re referring to an evaluation that happened two or three years ago.
    As you know, lots of things can change within such a long period of time, and indeed that was the case. If you have the time I would encourage you to look into our OpenSpaces framework – which is essentially a Spring/POJO based framework that makes the entire development experience and configuration completely different than what you probably experienced with our earlier releases.

    See our new tutorial in that regard: http://www.gigaspaces.com/wiki/display/OLH/Enterprise+Data+Grid+Tutorial+A+-+Basic+Topologies

    Nati S.

  10. Terracotta now provides Ehcache as its distributed caching product. This has a few advantages, namely:

    1. A very widely used, time-tested caching API based on JSR 107.

    2. More simplicity in the data management layer. All the Terracotta specific configuration is now abstracted and distributing Ehcache is a matter of a few lines of configuration changes to the ehcache.xml.

    3. Direct integration with Hibernate and anything else that uses Ehcache.

    4. All the advantages (better serialization, better scale out, simplicity) of the underlying Terracotta platform.

    5. Caching topologies including write-behind, cache-aside, self-populating, blocking cache and features typically expected from a cache like low overhead eviction strategies, extensibility, Transaction support with JTA, eventing and so on.

    6. Great visibility through the famous and very useful developer and monitoring tools provided by Terracotta.

    7. The Terracotta server array – add as many server stripes (shards) as needed (not just one).

    8. And if you are so inclined, you can use Terracotta beyond the IMDG (Grid) or IMDC (Cache) use cases and utilize Quartz (Job Scheduling), Web Sessions (for any container) or any other Java state.

    9. Powerful cloud and virtualization tools for ease of operations and elasticity in the data center.

    So a lot of exciting leaps forward. Check it out on terracotta.org or ehcache.org.

    Cheers!
