Have you walked down the ORM road of death?

A friend of mine asked me a really good question tonight:

Hey Stefan,
It would be great if you could please give me a sense for how many development teams get hit by a database bottleneck in JEE / Java / 3-tier / ORM / JPA land? And, how they go about addressing it? What exactly causes their bottleneck?

I think most successful apps built with Hibernate/JPA hit database contention pretty early on – scaling problems are hopefully a sign that people are actually using the stuff, right? From what I’ve seen this is usually caused by doing excessive round-trips over the wire or by returning overly large result sets.

And then we spend time fixing all the obvious broken data access patterns: first by using HQL instead of relying on the default eager/lazy fetching, then by tuning the existing HQL, and finally by dropping down to direct SQL if needed.
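
To make that first step concrete, here is a minimal sketch (the Customer entity and its orders collection are made up) of moving from the default lazy fetching, which fires one extra query per customer (the N+1 selects problem), to an HQL fetch join that brings everything back in a single round-trip:

    import java.util.List;
    import org.hibernate.Session;

    public class CustomerQueries {

        // Naive version: one query for the customers, then one additional
        // query per customer as soon as the lazy orders collection is touched.
        static List<?> loadCustomers(Session session) {
            return session.createQuery("from Customer").list();
        }

        // Tuned version: customers and their orders in a single round-trip.
        static List<?> loadCustomersWithOrders(Session session) {
            return session
                .createQuery("select distinct c from Customer c left join fetch c.orders")
                .list();
        }
    }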

I believe the next step after this is typically to try to scale vertically, both in the db and app tier. Throwing more hardware at the problem may get us quite a bit further at this point.

Then we might get to the point where the app gets fixed so that it actually makes sense to scale horizontally in the app tier. We will probably have to add a load balancer to the mix and use sticky sessions by now.

And then we will perhaps find out that we can’t do that very well without a distributed 2nd level cache, and that all our direct SQL writes to the DB (which bypass the 2nd level cache) won’t allow us to use the 2nd level cache for reads either…
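
For reference, enabling the 2nd level cache for an entity with Hibernate annotations looks roughly like this (a sketch with a hypothetical Customer entity, assuming a cache provider such as EhCache is configured); any direct SQL that writes to the same table will silently go around this cache:

    import javax.persistence.Entity;
    import javax.persistence.Id;
    import org.hibernate.annotations.Cache;
    import org.hibernate.annotations.CacheConcurrencyStrategy;

    // Hypothetical entity marked as cacheable in the 2nd level cache.
    @Entity
    @Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
    public class Customer {

        @Id
        private Long id;

        private String name;

        public Long getId() { return id; }
        public String getName() { return name; }
    }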

Here is where I think there are many options, and I’m not sure how people tend to go from here. Some might abandon ORM at this point, while others may try to get the 2nd level cache to work?

Are these the typical steps for scaling up a Java Hibernate/JPA app? What’s your experience?

Web pages are disappearing?

I believe the page (url) is becoming more of a task-oriented landing area where the web site adapts the content to the requesting user’s needs. I believe the divorce between content and pages is inevitable. It will be interesting to see how this will affect the KPIs, the analytics tools we currently use and search engine optimization practices going forward.

I recently attended a breakfast round-table discussion hosted by Imad Mouline. Imad is the Chief Technology Officer of Gomez. For those who aren’t familiar with Gomez, they specialize in web performance monitoring. It was an interesting discussion with participants from a few different industries. Participants were either CTOs or direct reports to a CTO.

Imad shared a few additional trends regarding web pages (aggregated from the Gomez data warehouse):

  • Page weight is increasing (kB per page)
  • The number of page objects is plateauing
  • The number of origin domains per page is increasing

We covered a few different topics, but the most interesting discussion (to me) was related to how web pages are being constructed in modern web sites and what impact this has on measuring service level key performance indicators (KPIs).

In order to sell effectively you need to create a web site that really stands out. One of the more effective ways of doing this is to use what we know about the user to personalize the experience.

In general we tend to know a few things about each site visitor:

  • What browsing device the user is using (User-Agent HTTP header)
  • Where the user is (geo-IP lookup)
  • What the user’s preferred language is (browser setting or region)
  • Whether the user is a returning customer or not (cookie)
  • The identity of the customer (cookie) and hence possibly age, gender, address etc 🙂
  • What time of day it is

So we basically know the how, who, when, where and what. In addition to this we can use data from previous visits to our site – such as click-stream analysis, order history or segmentation from data warehouse analysis – fed back into the content delivery system to improve the customer experience.
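
As a rough sketch of how little code it takes to pick up the “how, who and when” on the server side (the cookie name is made up, and the geo-IP lookup is left out):

    import java.io.IOException;
    import java.util.Calendar;
    import javax.servlet.ServletException;
    import javax.servlet.http.Cookie;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class VisitorInfoServlet extends HttpServlet {

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {

            String device = req.getHeader("User-Agent");                       // how
            String language = req.getLocale().getLanguage();                   // preferred language
            int hourOfDay = Calendar.getInstance().get(Calendar.HOUR_OF_DAY);  // when

            boolean returningCustomer = false;                                  // who
            if (req.getCookies() != null) {
                for (Cookie cookie : req.getCookies()) {
                    if ("customerId".equals(cookie.getName())) {                // hypothetical cookie name
                        returningCustomer = true;
                    }
                }
            }
            // A geo-IP lookup on req.getRemoteAddr() would give us the "where".

            resp.getWriter().printf("device=%s, lang=%s, hour=%d, returning=%s%n",
                    device, language, hourOfDay, returningCustomer);
        }
    }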

For example, when a user visits our commerce site we can use all of the above to present the most relevant offers in a very targeted manner to that user. We can also cross-sell efficiently and offer bonuses if we think there is a risk of this being a lapsing customer. We can adapt to the user’s device and create a different experience depending on whether the user is visiting in the afternoon or late at night.

If we do a good job with our one-to-one sales experience, the components and content delivered on a particular page (url) will in other words vary depending on who’s requesting it, from where, with what device, and at what time. Depending on the application and the level of personalization, this will obviously impact both the non-functional and functional KPIs: What is the conversion rate for the page? What is the response time for the page?

Sunset

I am a long time fan of Robert X. Cringely and I was looking forward to his comments on the Oracle/Sun debacle. Here’s what he said in his blog – I couldn’t agree more:

it ends with the heart of Sun moving a few miles up 101 to where it will certainly die.

But for the most part what Oracle will do with Sun is show a quick and dirty profit by slashing and burning at a prodigious rate, cutting the plenty of fat (and a fair amount of muscle) still at Sun. If you read the Oracle press release, the company is quite confident it is going to make a lot of money on this deal starting right away. How can they be so sure?
It’s easy. First drop all the bits of Sun that don’t make money. Then drop all the bits that don’t fit in Oracle’s strategic vision. Bring the back office entirely into Redwood Shores. Then cut what overhead is left to match the restructured business. Sell SPARC to some Asian OEM. Cut R&D by 80 percent, saving $2.4 billion per year. I’m guessing sell StorageTek, maybe even to IBM. And on and on. Gut Sun and milk what remains.

Read more at http://www.cringely.com/2009/04/sunset/

Regarding my previous post – I think that the acquisition is the start of a long death process for Java open source. I do not expect Oracle to announce the death of anything, but it will nevertheless die unless fully embraced by Oracle. The sun will surely set on Glassfish and the rest of the projects that don’t make any money for Sun and aren’t of strategic interest to Oracle.

Oracle kills Open Source Java with a really big rock?

Being known as the guy that called Oracle evil in a blog post, I feel I gotta comment on today’s announcement that Oracle is buying Sun Microsystems for 7.4 billion US dollars. As you can imagine, I’m not very optimistic.

What does the deal really mean to the Open Source Java community? Isn’t this just business as usual? And wouldn’t we be worse off if Big Blue had bought Sun a couple of weeks back?

As you might have guessed, I would have preferred IBM to buy Sun for many reasons. Perhaps the main thing is that I feel IBM has been embracing open source, whereas Oracle hasn’t. It makes all the difference. Let’s hope Oracle sees the light and doesn’t screw up everything Java!

The Sun assets at stake:
Development Tools: NetBeans
Middleware: OpenSSO, Glassfish, MySQL, Java Hotspot JVM, Java Real Time System
Consumer Technology: OpenOffice, JavaFX / JavaFX Mobile

I’ll try to describe what I think is a likely outcome for the assets above by comparing them to Oracle’s current product line. Let’s see how bad this actually can get…

Sun NetBeans vs Oracle JDeveloper

This is easy – no one uses JDeveloper, and it would surprise me if Oracle didn’t bite the bullet and ditch JDeveloper for NetBeans which has become a really good (the best?) IDE recently.

Sun OpenSSO vs Oracle Access Manager

There will be no point for Oracle to invest any money in OpenSSO when they already have a good offering in their Fusion middleware suite. OpenSSO is toast.

Sun Glassfish vs Oracle Weblogic

Oracle has a stronger app server in Weblogic than Glassfish. I think that Glassfish will be put to the axe. Quickly. Oracle is not known for giving away software, and they only open source software that needs life support. Some recent examples include TopLink and ADF Faces.

Sun MySQL vs Oracle RDBMS

I think this is _the_ most obvious: MySQL is TheirSQL now and also R.I.P.

Sun Java Hotspot vs Oracle JRockit

Being the cynical person I am, I think Oracle can kill two birds with one stone here. One might think that Oracle would merge the two VM efforts into one, but the result might not be what you think. Let’s just assume that Oracle takes the good stuff from JRockit and puts it into the Hotspot JVM reference implementation. Is this a likely scenario? Hell no. Oracle is in the software business to make money, and if you want to run a production-grade Java server VM, then you will have to get it from Oracle for a fee. By doing this they also effectively kill Terracotta, the only viable contender to Oracle Coherence. See, Terracotta will not run on JRockit… The consumer JVM will be named Sun JVM. Oracle will of course keep the Java Real Time VM as it’s a profitable business.

OpenOffice, JavaFX / JavaFX Mobile

Oracle’s track-record in building consumer applications is, well, not great. Anyone who’s ever tried to install an Oracle product knows what I’m talking about. So I don’t think OpenOffice will survive either. On the other hand Larry may want to keep pushing it just to be a thorn in Microsoft’s side… As for JavaFX, it doesn’t really stand a chance versus Adobe – it’s too little, too late. Oracle knows this and will kill it. Quietly.

So, is this good for anyone at all? Yes. Oracle and Microsoft. Everyone else loses. IBM is in a really awkward situation and JavaOne this year may be the ultimate funeral service for (free) open source Java.

Keep in mind that when Oracle says open they generally mean open standards, whereas when IBM and Sun say open they generally mean free open source software.

I’ll close with a few Larry Ellison quotes from the conference call:

“Java is the foundation of the Oracle Fusion middleware and it’s the second most important software asset we’ve ever acquired.”

“We acquired BEA because they had the leading Java virtual machine”

Speed sells

This coming week (first week of February), Unibet launches its revamped website based on the Facelift project I lead. As a part of this effort, we have worked extremely hard in order to lower page loading times. We have invested a substantial amount of time and money focusing on improving performance. Is this really justified?

A 2006 study by Jupiter Research found that the consequences for an online retailer whose site underperforms include diminished goodwill, negative brand perception, and, most important, significant loss in overall sales. Online shopper loyalty is contingent upon quick page loading, especially for high-spending shoppers and those with greater tenure.

The report ranked poor site performance second only to high prices and shipping costs as the main dissatisfaction among online shoppers. Additional findings in the report show that more than one-third of shoppers with a poor experience abandoned the site entirely, while 75 percent were likely not to shop on that site again. These results demonstrate that a poorly performing website can be damaging to a company’s reputation; according to the survey, nearly 30 percent of dissatisfied customers will either develop a negative perception of the company or tell their friends and family about the experience.

+500 ms page load time led to a 20% drop in traffic at Google

Marissa Mayer ran an experiment where Google increased the number of search results from ten to thirty per page. Traffic and revenue from Google searchers in the experimental group dropped by 20%.

After a bit of looking, they found an uncontrolled variable. The page with 10 results took 400ms to generate. The page with 30 results took 900ms. Half a second delay caused a 20% drop in traffic. Half a second delay killed user satisfaction.

“It was almost proportional. If you make a product faster, you get that back in terms of increased usage”
-Marissa Mayer, VP Search Product and User Experience at Google

The same effect happened with Google Maps. When the company trimmed the 120KB page size down by about 30 percent, the company started getting about 30 percent more map requests.

+100 ms page load time led to a 1% drop in sales at Amazon

Amazon also performed some A/B testing and found that page load times directly impacted the revenue:

“In A/B tests, we tried delaying the page in increments of 100 milliseconds and found that even very small delays would result in substantial and costly drops in revenue.”
-Greg Linden, Amazon.com

There are a number of tools and best-practices available to improve web-site performance. I particularly like the work of Steve Souders. Steve was the Chief Performance Yahoo! (at Yahoo! obviously) and is now at Google doing web performance and open source initiatives.

While at Yahoo!, Steve published a benchmark tool called YSlow, which is a good indicator of how well the front-end web technology (HTML, JavaScript, images etc) of your site is implemented. The front end accounts for almost 90% of the page load time at most e-commerce sites.
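
One of the YSlow rules is to add a far-future Expires header to static content so returning visitors don’t have to re-download images, CSS and JavaScript. A minimal servlet filter sketch of the idea (the one-year lifetime and the filter mapping are assumptions, and the same thing can just as well be done in the web server configuration):

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletResponse;

    // Map this filter to static resources (images, CSS, JavaScript) in web.xml.
    public class FarFutureExpiresFilter implements Filter {

        private static final long ONE_YEAR_MILLIS = 365L * 24 * 60 * 60 * 1000;

        public void init(FilterConfig config) {}

        public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
                throws IOException, ServletException {
            HttpServletResponse resp = (HttpServletResponse) response;
            resp.setDateHeader("Expires", System.currentTimeMillis() + ONE_YEAR_MILLIS);
            resp.setHeader("Cache-Control", "public, max-age=31536000");
            chain.doFilter(request, response);
        }

        public void destroy() {}
    }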

At Unibet, our old HTML had a YSlow score of 56/100 on average. This is about average in the e-gaming industry. However, the Facelifted version just out scores 96/100. For comparison, eBay’s start page scores 97/100 and Yahoo!’s start page 95/100. This should result in reduced waiting times, and based on the research above it will help drive revenue and customer satisfaction.

We have worked extremely hard in order to lower page loading times. We have invested a substantial amount of time and money in doing so. Is this really justified? YES! I am confident that our new site will contribute to increased sales and increased customer lifetime value.

Facelift and EDA

I’m getting back in shape! Since my last post I’ve become a dad again and life is good. I’ve taken up exercise again, and run almost every day. I’ve lost 10 kg and am starting to look somewhat fit again…

Operation Facelift

Fortunately for the company – we’re getting in shape at work too! We’re currently moving to XHTML 1.1 strict and a floating layout with skinning support. We’re also moving to 100% YUI and optimizing for SEO and performance. Time to market (TTM) and total cost of ownership (TCO) will decrease significantly too. Man, I love to work with great front-end people. This is going to rock!

Event Driven Architecture

I’ve also kicked off a huge push where all the product teams will start aligning their architectures to an Event Driven Architecture (EDA) model. EDA is great for separation of concerns (the registration module of the customer system will let everyone who cares know that a new customer has registered) and also for getting a scalable architecture (async async async!).
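
A sketch of the registration example using JMS publish/subscribe, so the registration module simply announces the fact and doesn’t know or care who listens (the topic name and payload are made up):

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.jms.Topic;

    public class RegistrationEvents {

        private final ConnectionFactory connectionFactory;

        public RegistrationEvents(ConnectionFactory connectionFactory) {
            this.connectionFactory = connectionFactory;
        }

        // Fire-and-forget: the registration module publishes the fact that a
        // customer registered; bonus, CRM and fraud systems subscribe if they care.
        public void customerRegistered(String customerId) throws Exception {
            Connection connection = connectionFactory.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                Topic topic = session.createTopic("customer.registered"); // hypothetical topic name
                MessageProducer producer = session.createProducer(topic);
                TextMessage message = session.createTextMessage(customerId);
                producer.send(message);
            } finally {
                connection.close();
            }
        }
    }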

Maven2 and Hudson

We’ve gotten all the product teams up on Maven2, from the god-forsaken shell-script/Ant mess we had a year ago. Hudson is used for continuous builds. I love Hudson – highly recommended! We also migrated to Subversion (finally).

The (almost) perfect (rich) website

I am personally a fan of light-weight web pages that use W3C standards based elements and layout. However, many commercial web sites seem to want to move to a more “print-like” experience.

The cost of moving to a richer experience is usually higher maintenance cost and longer round-trip times – you need the graphics or Flash guys for many changes. SEO (Search Engine Optimization) suffers as the graphics can’t be indexed by web crawlers, and you usually take a hit on page load times too.

Wouldn’t it be great if you could make a web site that is:

  • Great looking
  • SEO friendly
  • Quick to load and render
  • and XHTML compliant

We have come a long way at unibet.com, but we made some compromises in look and feel for speed, and we still have article headers rendered as generated images. This has bothered me for some time. One of our consultants mentioned that he knew of someone who used Flash for rendering headlines, and it sounded like a good idea to me. I did some research and stumbled upon sIFR.

sIFR (or Scalable Inman Flash Replacement) is a technology that allows you to replace text elements on screen with Flash equivalents. Put simply, sIFR allows website headings, pull-quotes and other elements to be styled in whatever font the designer chooses – be that Foundry Monoline, Gill Sans, Impact, Frutiger or any other font – without the user having it installed on their machine. sIFR provides some javascript files and a Flash movie in source code format (.fla) that you can embed your fonts into. It’s really easy to set up.

To use sIFR on your website you embed the font into the Flash movie (be careful to encode all, but only, the characters you will need, to minimize its size). Typically the SWF movie is between 8 and 70 kB. This may seem like a lot more than an image, but remember that the SWF will be cached for a very long time in the browser if you’ve set up your web server correctly. Effectively, the font Flash will be downloaded at most once per site visit.

When you have made the SWFs you need, just add a few lines of sIFR code to the web page and that’s it.

The following explains the sIFR process in the browser:

  1. A web page is requested and loaded by the browser.
  2. JavaScript detects whether Flash 6 or greater is installed.
  3. If no Flash is detected, the page is drawn as normal.
  4. If Flash is detected, the HTML element of the page is immediately given the class “hasFlash”. This effectively hides all text areas to be replaced but keeps their bounds intact. The text is hidden because of a style in the style sheet which only applies to elements that are children of the html.hasFlash element.
  5. The JavaScript traverses the DOM and finds all elements to be replaced. Once found, the script measures the offsetWidth and offsetHeight of the element and replaces it with a Flash movie of the same dimensions.
  6. The Flash movie, knowing its textual content, creates a dynamic text field and renders the text at a very large size (96pt).
  7. The Flash movie reduces the point size of the text until it all fits within the overall size of the movie.

sIFR is a clever hack, but nonetheless a hack. The result is really amazing, however. It’s hardly noticeable to the end user and meets all four requirements in my “what if…” list above, so we’re moving to sIFR for the next release of unibet.com.

While sIFR gives us better typography today, it is clearly not the solution for the next 20 years.

Further reading:

The sIFR3 Wiki

Cache-aside, write-behind, magic and why it sucks being an Oracle customer

I’ve been looking at a few different technologies to improve the scalability of one of our applications. We’re scaling pretty ok to be honest, considering that we currently have a traditional database-centric solution. The cost of scaling a database-bound application running Oracle is crazy to say the least, considering they charge $47,500 per two x86 cores for Oracle Enterprise Edition. On top of this it’s 22% per year for software updates and support. As if this wasn’t enough, they also increase the support cost by 4% per year.

You might think that the price above is for production environments, but in fact you have to pay for every single installation throughout the organization. There are no discounts for staging, DR, test or development environments.

I have a piece of advice for all you kids out there considering running Oracle – just don’t do it.

This advice goes for all Oracle products really, as they all have the same pricing model.

Databases are overrated

My strong recommendation is to build an application that doesn’t rely on an underlying RDBMS. The relational database is an overrated, overly complex form of persistent store: slow, and usually a single point of failure. Does this mean that databases are dead and a thing of the past? No, but the role of the database will probably change going forward. In my opinion we should use the RDBMS as a System of Record that is mostly up to date.

If you ask me, databases are great at mainly two things:

  1. They make the data accessible for other systems in a standard way and
  2. They have a strong query language that many people know

So, write to the database asynchronously and use it for reporting and for extracting data. Store the data in a data grid in the application tier (where it’s used).
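
A minimal sketch of the write-behind idea from the post title, with no grid product involved: reads and writes go to an in-memory map in the application tier, and a background thread trickles the changes out to the RDBMS as the system of record (the DatabaseWriter interface is a made-up stand-in for the real JDBC/ORM code):

    import java.util.Map;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;

    public class WriteBehindStore<K, V> {

        private final Map<K, V> inMemory = new ConcurrentHashMap<K, V>();
        private final BlockingQueue<V> pendingWrites = new LinkedBlockingQueue<V>();
        private final ExecutorService writer = Executors.newSingleThreadExecutor();

        public WriteBehindStore(final DatabaseWriter<V> dao) {
            // Background thread drains the queue and writes to the RDBMS,
            // so the caller never waits for the database.
            writer.execute(new Runnable() {
                public void run() {
                    try {
                        while (true) {
                            dao.write(pendingWrites.take());
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
        }

        public void put(K key, V value) {
            inMemory.put(key, value);    // synchronous, in the application tier
            pendingWrites.offer(value);  // asynchronous system-of-record update
        }

        public V get(K key) {
            return inMemory.get(key);
        }

        // Hypothetical abstraction over the actual JDBC/ORM write.
        public interface DatabaseWriter<V> {
            void write(V value);
        }
    }

The obvious trade-off is that queued writes can be lost if the node dies before the flush, which is exactly the kind of problem the replicated data grids below are meant to take care of.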

What is a Data Grid?

A Data Grid is a horizontally scalable in-memory data management solution. Data grids try to eliminate data source contention by scaling out data management with commodity hardware.
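
The cache-aside pattern from the post title is the typical way an application uses such a grid: check the grid first and only fall through to the database on a miss. A minimal sketch (a plain ConcurrentHashMap stands in for the grid, and the Loader interface is a made-up stand-in for the real database lookup):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    // Cache-aside: the application checks the grid first and only goes to
    // the database on a miss, then populates the grid for the next reader.
    public class CacheAsideRepository<K, V> {

        private final ConcurrentMap<K, V> grid = new ConcurrentHashMap<K, V>();
        private final Loader<K, V> database;

        public CacheAsideRepository(Loader<K, V> database) {
            this.database = database;
        }

        public V find(K key) {
            V value = grid.get(key);
            if (value == null) {
                value = database.load(key);   // slow path: hit the RDBMS
                if (value != null) {
                    grid.putIfAbsent(key, value);
                }
            }
            return value;
        }

        // Hypothetical abstraction over the actual JDBC/ORM lookup.
        public interface Loader<K, V> {
            V load(K key);
        }
    }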

Some underlying philosophies of data grids – according to Oracle (sic!):

  • Keep data in the application tier (where it’s used)
  • Disks are slow and databases are evil
  • Data Grids will solve your application scalability and performance problems

I have been looking at three different data grid vendors: Oracle Coherence, Gigaspaces EDG/XAP and Terracotta DSO.

Oracle Coherence

I really like this product. It focuses solely on being a potent data grid, with the ability to act as a compute grid as well. Although I haven’t used Coherence for any large projects, its design and concepts are easy to relate to. It supports JTA transactions and consists of a single jar that you drop into your classpath. The Coherence configuration doesn’t contain any infrastructural descriptions, which means that you can use the same configuration on your development laptop as in the production environment with multiple servers. The main issue with Coherence is the fact that Oracle has owned it for a few years now.
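
To show why the “single jar” point is appealing, basic Coherence usage looks roughly like this (the cache name and keys are made up, and this is a sketch of the documented API rather than production code):

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;

    public class CoherenceExample {

        public static void main(String[] args) {
            // Joins (or starts) the cluster based on the Coherence
            // configuration found on the classpath.
            NamedCache customers = CacheFactory.getCache("customers");

            customers.put("customer-42", "Stefan");
            String name = (String) customers.get("customer-42");
            System.out.println(name);

            CacheFactory.shutdown();
        }
    }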

Gigaspaces XAP

Gigaspaces’ mission seems to be to provide a very scalable application server with XAP – “The Scale-Out Application Server”. The EDG – enterprise data grid – packaging seems to provide about the same feature set as Coherence. The main difference, to me, is that the Gigaspaces offerings are application server infrastructure that needs configuration, deployments and all of that. As I see things, the main drawback is the application server approach – it feels overwhelming. On the other hand, Gigaspaces is still a smaller company, eager to do business, and provides great implementation support, and the product seems to be a really good application server.

Terracotta DSO

Terracotta has a different approach. They provide Network Attached Memory for the Java heap. If you can write a thread-safe program, you can scale out using Terracotta with no or minor changes to your application. From a technical point of view it’s a beautiful solution: you declare which objects you want to share using Terracotta, and Terracotta then makes your data persistent (if you want) and available on all clustered nodes. When you invoke new() on a clustered object, you will get a reference to the clustered object (if one already exists). Another important difference between Terracotta and the others is that they only send the parts of an object that have changed rather than the full serialized object graph.

I’m in love with this product. It’s free and open source too, and Terracotta Inc. provides commercial support. The main concern I have with Terracotta is that it’s really a paradigm shift for the average Java enterprise developer to start writing multi-threaded programs without JTA transactions. Another concern is the magic – the low-level hooks they make in the JVM. At the time of writing, only Sun and IBM JVMs are supported. It runs fine on OS X though.
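
To illustrate the paradigm shift: with Terracotta the clustering lives in configuration, not in code. The class below is just plain thread-safe Java; declaring an instance of it as a root in tc-config.xml (not shown here) is what would make it cluster-wide, with the synchronized methods acting as cluster locks:

    // Plain thread-safe Java. Under Terracotta DSO, an instance of this class
    // declared as a root in tc-config.xml is shared by all nodes, the
    // synchronized methods act as cluster-wide locks, and only the changed
    // field is sent over the wire.
    public class SharedCounter {

        private long value;

        public synchronized long increment() {
            value = value + 1;
            return value;
        }

        public synchronized long current() {
            return value;
        }
    }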

The bottom line

So which one is better? Well, that depends on a lot of things, as always. If you decide to move to the grid, it’s going to require retraining of your developers regardless of which solution you go for.

Please do keep in mind that products don’t usually solve your problems. And that you can go a long way using a less expensive RDBMS by partitioning the data across multiple servers – sharding. This is what a lot of large sites out there do.
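
The simplest form of sharding is to let the customer id decide which database a row lives in. A sketch (the modulo scheme and the list of data sources are assumptions, and a real setup also needs a plan for resharding):

    import java.util.List;
    import javax.sql.DataSource;

    // All data for a given customer lives in one shard, picked by hashing
    // the customer id; queries for that customer always go to the same shard.
    public class ShardRouter {

        private final List<DataSource> shards;

        public ShardRouter(List<DataSource> shards) {
            this.shards = shards;
        }

        public DataSource shardFor(long customerId) {
            int index = (int) (Math.abs(customerId) % shards.size());
            return shards.get(index);
        }
    }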

Further reading:
The Coming of the Shard
eBay’s Architectural Principles
Oracle Coherence
Gigaspaces EDG
Terracotta DSO

On the way to and from work

I ride my bike to and from work every day between May and October. It’s a lovely 10 km ride through Stockholm, mostly along the water. The rest of the year I take the tram or the subway.
During this time I prefer listening to podcasts on my iPod touch. The iPod touch is great, yet I’m looking forward to replacing it with an iPhone 2.0 with high-speed internet.

I try to follow these podcasts:

Entrepreneurial Thought Leaders

The DFJ Entrepreneurial Thought Leaders Seminar (ETL) is a weekly seminar series on entrepreneurship, co-sponsored by BASES (a student entrepreneurship group), Stanford Technology Ventures Program, and the Department of Management Science and Engineering.

http://www.stanford.edu/group/edcorner/uploads/podcast/EducatorsCorner.xml

Some of my personal favorites are:

BusinessWeek – CEO Guide To Technology

I love this BusinessWeek site, whose mission is to educate CEOs on new trends in technology. It is very useful to listen to the way BusinessWeek explains tech to management.

http://www.businessweek.com/search/podcasts/guide_to_tech.rss

Robert X Cringely – the Pulpit

I’m a long-time fan of Cringely’s, since Wired and Accidental Empires. I try to catch his show on PBS as often as I can. It’s just under 10 minutes long and usually brings a smile to my face.

http://www.pbs.org/cringely/pulpit/

Javaposse

News, Interviews and more on the Java Language with Tor Norbye (Sun Microsystems), Carl Quinn (Google), Dick Wall (Google), Joe Nuxoll (Navigenics)

http://feeds.feedburner.com/javaposse

Swedish Public Radio

One of the very best podcasts I listen to is P3 Dokumentär (in Swedish) from Sveriges Radio, the Swedish Public Service Radio.

Switched to Mac

I admit it. I’ve been secretly in love with Apple since OS X. I’ve been very impressed with the OS – I love what Apple did with Leopard – but I never really liked the hardware. Until the MacBook Air. It must be the best laptop on the planet.

I have a corporate laptop from HP. It’s not a bad laptop, but the company security guys and the PC admins have literally killed it. It takes well over two minutes to start, and they do not seem to support sleep mode or hibernation, for security reasons. It runs Pointsec, group policies, and antivirus software that kills it. On top of this, they recently removed administrative rights on all laptops, which rendered it totally useless.

So, being an old firewall guy, I set up the VPN client on my home PC (this is what happens when the security dept screws up people’s tools), so I could still work from home, but I was still forced to use the laptop while traveling.

About a week ago, I traveled to London with a consultant from the office. He brought his MacBook Air, and I brought my HP-killed-by-the-IT-dept laptop.
We decided to do some work on a presentation during the flight down, and we fired up the laptops. He was going at it (flipped the Mac open, used Spotlight to find the presentation in his mail, and opened it) before I had a chance to enter my Pointsec password. Three minutes later I was still waiting for the drive to settle down and for Windows to give me a cursor. I must’ve looked exactly like the PC guy in the Apple commercials.

I bought my MacBook Air (the SSD version) yesterday. I run Microsoft Office (for OS X) and the recently released Checkpoint VPN client for OS X 10.5 to communicate with work. I also bought VMware Fusion to run a couple of apps from work – mainly SmartDraw 2008.

This is the list of software I installed so far:

  • Checkpoint VPN Client
  • Microsoft Office 2008 for OS X
  • TextMate
  • VMware Fusion
  • MSN Messenger 7 for OS X

So far I am extremely happy. I’m blogging from TextMate, a fantastic text editor. I’ve moved to Mac, and I am never going back.