An attempt at a developer-friendly build pipeline

Background

I’ve spent some evenings and nights over the Christmas holiday improving the deployment of Nomp.se – a site where kids can practice math for free – which we run on EC2.

The situation we had was that we deployed to the EC2 server using a locally installed Jenkins CI server, which built the artifact (a WAR) and used the Maven Tomcat plugin to deploy it to the local Tomcat server – an rpm package provided by Amazon (yum install tomcat6). The setup worked pretty OK, but it was a hack. Database changes were applied and tested manually – we had a folder “sql” that contained numbered SQL files to be applied in order.

Clearly a lot of room for improvement in this area!

Goals with the new build pipeline

I wanted to reach the following goals with the new build pipeline:

  • One build that travels from my local Jenkins build server through the test environments and into production.
  • 100% control over configuration changes of all components (Apache httpd, Apache Tomcat, MySQL database), so that changes can be tested in the normal pipeline without relying on manual hacks.
  • It should be developer friendly. A developer with a basic understanding of Linux, Maven and Tomcat should be able to make changes to and work with the build pipeline.
  • Hence, it should only rely on basic tooling (ant, maven, rpm packages) for doing the heavy lifting, and use the capabilities of other tools, e.g. Jenkins, Puppet, Capistrano, as (non-critical) value add.

After a few iterations I got it down to the following steps for deploying any configuration change onto a production server.

on the build server:
 $ mvn deploy

on the target server:
 # yum -y update nomp-web nomp-tomcat nomp-dbdeploy
 # cd /opt/nomp-dbdeploy; ant
 # /etc/init.d/nomp restart

That’s it. Four steps. There are no shell scripts involved. There is no rsync, there is no scp:ing of files. How did I do it? Hold on, I will come to that in a minute or two 🙂

System configuration and prerequisites

In order to make sure the server contains the prerequisite packages and configuration I used Puppet.

“Puppet is a declarative language for expressing system configuration, a client and server for distributing it, and a library for realizing the configuration.

Rather than approaching server management by automating current techniques, Puppet reframes the problem by providing a language to express the relationships between servers, the services they provide, and the primitive objects that compose those services. Rather than handling the detail of how to achieve a certain configuration or provide a given service, Puppet users can simply express their desired configuration using the abstractions they’re used to handling, like service and node, and Puppet is responsible for either achieving the configuration or providing the user enough information to fix any encountered problems.”

from http://projects.puppetlabs.com/projects/puppet/wiki/Big_Picture

I’m not going to go into detail on how to set up Puppet in this text, but here’s what I do with Puppet in order to support the build pipeline:

  • Ensure that the service accounts and groups exist on the target system
  • Ensure that software I rely on is installed (ant, apache httpd, mysqld)
  • Configuration management of a few configuration files such as httpd.conf, my.cnf etc.

Puppet config file example:

user { 'nomp':
  ensure     => present,
  uid        => 300,
  gid        => 300,
  shell      => '/bin/bash',
  home       => '/opt/nomp',
  managehome => true,
}

group { 'nomp':
  ensure => present,
  gid    => 300,
}

package { 'ant':
  ensure => installed,
}

The above configuration means that Puppet will ensure that the user nomp and the group nomp exist on the system and that the ant package is installed.
I will do a whole lot more work with configuration management and provisioning with Puppet going forward, but the above is what was needed to meet my project goals.

Getting started

I started by trying to package my existing WAR project as an rpm (or .deb). After Googling around for a while I found the RPM Maven Plugin (http://mojo.codehaus.org/rpm-maven-plugin/). It basically lets you build rpms using Maven. The downside is that it relies on the “rpm” command being installed in order to produce the final RPM from the spec file. In order to get a working Maven environment on all platforms, I wrapped the rpm plugin in a Maven build profile.

(Later I also found a pure-Java rpm tool, redline-rpm, but I haven’t looked into it yet.)

The trickiest part was to get a good setup for artifact versions and RPM release versions so that the Maven release plugin could still be used without any manual changes.
The rpm plugin has some funky defaults (http://mojo.codehaus.org/rpm-maven-plugin/ident-params.html#release) that weren’t going to work with “yum update”.
It took a lot of experimentation, but in the end I settled on the Build Number Maven Plugin (http://mojo.codehaus.org/buildnumber-maven-plugin/).
It’s a pretty simple plugin that checks the SCM for the revision number and exposes it as a Maven variable.

Here’s the RPM-part of my WAR POM:

<profiles>
  <profile>
    <id>rpm</id>
    <activation>
      <os>
        <name>linux</name>
      </os>
    </activation>
    <build>
      <plugins>
        <plugin>
          <groupId>org.codehaus.mojo</groupId>
          <artifactId>rpm-maven-plugin</artifactId>
          <version>2.1-alpha-1</version>
          <extensions>true</extensions>
          <executions>
            <execution>
              <goals>
                <goal>attached-rpm</goal>
              </goals>
            </execution>
          </executions>
          <configuration>
            <copyright>Copyright 2011 Selessia AB</copyright>
            <distribution>Nomp</distribution>
            <group>${project.groupId}</group>
            <packager>${user.name}</packager>
            <!-- need to use the build number plugin here in order for yum upgrade to work in snapshots -->
            <release>${buildNumber}</release>
            <defaultDirmode>555</defaultDirmode>
            <defaultFilemode>444</defaultFilemode>
            <defaultUsername>nomp</defaultUsername>
            <defaultGroupname>nomp</defaultGroupname>
            <requires>
              <require>nomp-tomcat</require>
            </requires>
            <mappings>
              <!-- webapps deployment -->
              <mapping>
                <directory>${rpm.install.webapps}/${project.artifactId}</directory>
                <sources>
                  <source>
                    <location>target/${project.artifactId}-${project.version}</location>
                  </source>
                </sources>
              </mapping>
            </mappings>
          </configuration>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>
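
If you want to sanity check the rpm before Jenkins gets involved, you can build it locally on any Linux box that has the rpm binary installed. The paths below are what the rpm plugin produced in my setup and may vary between plugin versions:

 $ mvn clean package
 $ find target/rpm -name '*.noarch.rpm'
 $ rpm -qpl target/rpm/nomp-web/RPMS/noarch/nomp-web-*.noarch.rpm

The last command lists the files the package will install, straight from the rpm file, which is a quick way to verify the mappings.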

Here’s the build number plugin configuration:

<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>buildnumber-maven-plugin</artifactId>
    <version>1.0</version>
    <executions>
      <execution>
        <phase>validate</phase>
        <goals>
          <goal>create</goal>
        </goals>
      </execution>
    </executions>
    <configuration>
      <doCheck>true</doCheck>
      <doUpdate>true</doUpdate>
    </configuration>
  </plugin>

All of the configuration above adds a secondary artifact (the rpm), which gets uploaded to the Nexus Maven repository on “mvn deploy”.

I don’t really need the WAR-file anymore, as I pack the RPM exploded. I might change the primary artifact type from WAR to RPM in the future, but I haven’t looked into that yet.

Packaging the app server as an RPM

The next thing I wanted to do was to package the app server as an RPM as well. I feel it’s more developer friendly to build a Tomcat rpm using Maven too, rather than just grabbing some arbitrary rpm and using Puppet to fix the configuration. Also, we get full control over where it is installed and where the logs end up.

One thing I really wanted to avoid was having to check the Tomcat distribution tarball into Subversion. I hate blobs in SVN, so I was pleasantly surprised to learn that Nexus handles any type of file. I simply uploaded the latest Tomcat distro tar (apache-tomcat-7.0.23.tar.gz) into my Nexus 3rd party repository.
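
Uploading can be done through the Nexus web UI, or from the command line with the deploy-file goal. Roughly like this – the repository id and URL are placeholders for your own 3rd party repo:

 $ mvn deploy:deploy-file -DgroupId=org.apache.tomcat -DartifactId=apache-tomcat \
     -Dversion=7.0.23 -Dpackaging=tar.gz -Dfile=apache-tomcat-7.0.23.tar.gz \
     -DrepositoryId=thirdparty -Durl=http://manny:8082/nexus/content/repositories/thirdparty/

The groupId/artifactId/version must match the dependency declared in the Tomcat pom below.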

I created a sibling project “tomcat” with a pom that looks like this:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <!-- avoid rpm here as classifier will differ and Nexus search will fail -->
  <packaging>pom</packaging>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <artifactId>nomp-parent</artifactId>
    <groupId>se.nomp</groupId>
    <version>2.1.0-SNAPSHOT</version>
  </parent>
  <artifactId>nomp-tomcat</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <name>Nomp Tomcat Server</name>
  <description>Tomcat server for Nomp</description>
  <properties>
    <tomcat.version>7.0.23</tomcat.version>
    <tomcat.build.dir>${project.build.directory}/tomcat/apache-tomcat-${tomcat.version}</tomcat.build.dir>
    <rpm.install.basedir>/opt/nomp</rpm.install.basedir>
    <rpm.install.logdir>/var/log/nomp</rpm.install.logdir>
  </properties>
  <profiles>
    <!-- Only run the RPM packaging on Linux as we need the rpm binary to build rpms using the rpm plugin -->
    <profile>
      <id>rpm</id>
      <activation>
        <os>
          <name>linux</name>
        </os>
      </activation>
      <build>
        <plugins>
          <plugin>
            <groupId>org.codehaus.mojo</groupId>
            <artifactId>rpm-maven-plugin</artifactId>
            <version>2.1-alpha-1</version>
            <extensions>true</extensions>
            <executions>
              <execution>
                <goals>
                  <goal>attached-rpm</goal>
                </goals>
              </execution>
            </executions>
            <configuration>
              <copyright>Copyright 2011 Selessia AB</copyright>
              <distribution>Nomp</distribution>
              <group>${project.groupId}</group>
              <packager>${user.name}</packager>
              <!-- need to use the build number plugin here in order for yum upgrade to work in snapshots -->
              <release>${buildNumber}</release>
              <defaultDirmode>755</defaultDirmode>
              <defaultFilemode>444</defaultFilemode>
              <defaultUsername>root</defaultUsername>
              <defaultGroupname>root</defaultGroupname>
              <mappings>
                <mapping>
                  <directory>${rpm.install.basedir}/logs</directory>
                  <sources>
                    <softlinkSource>
                      <location>${rpm.install.logdir}</location>
                    </softlinkSource>
                  </sources>
                </mapping>
                <mapping>
                  <directory>${rpm.install.logdir}</directory>
                  <username>nomp</username>
                  <groupname>nomp</groupname>
                </mapping>
                <mapping>
                  <directory>${rpm.install.basedir}/bin</directory>
                  <filemode>555</filemode>
                  <sources>
                    <source>
                      <location>${tomcat.build.dir}/bin</location>
                    </source>
                  </sources>
                </mapping>
                <mapping>
                  <directory>${rpm.install.basedir}/conf</directory>
                  <sources>
                    <source>
                      <location>${tomcat.build.dir}/conf</location>
                    </source>
                  </sources>
                </mapping>
                <mapping>
                  <directory>${rpm.install.basedir}/lib</directory>
                  <sources>
                    <source>
                      <location>${tomcat.build.dir}/lib</location>
                    </source>
                  </sources>
                </mapping>
                <mapping>
                  <directory>${rpm.install.basedir}/work</directory>
                  <username>nomp</username>
                  <groupname>nomp</groupname>
                </mapping>
                <mapping>
                  <directory>${rpm.install.basedir}/temp</directory>
                  <username>nomp</username>
                  <groupname>nomp</groupname>
                </mapping>
                <mapping>
                  <directory>${rpm.install.basedir}/conf/Catalina</directory>
                  <username>nomp</username>
                  <groupname>nomp</groupname>
                </mapping>
                <mapping>
                  <directory>/etc/init.d</directory>
                  <directoryIncluded>false</directoryIncluded>
                  <filemode>555</filemode>
                  <sources>
                    <source>
                      <location>src/main/etc/init.d</location>
                    </source>
                  </sources>
                </mapping>
              </mappings>
            </configuration>
          </plugin>
        </plugins>
      </build>
    </profile>
  </profiles>

  <build>
    <resources>
      <resource>
         <!-- overlay the contents in the resources src dir ontop of the unpacked tomcat -->
         <directory>src/main/resources</directory>
         <filtering>false</filtering>
       </resource>
     </resources>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>buildnumber-maven-plugin</artifactId>
        <version>1.0</version>
        <executions>
          <execution>
            <phase>validate</phase>
            <goals>
              <goal>create</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <doCheck>true</doCheck>
          <doUpdate>true</doUpdate>
        </configuration>
      </plugin>
      <plugin>
        <artifactId>maven-clean-plugin</artifactId>
        <version>2.4.1</version>
        <executions>
          <execution>
            <id>auto-clean</id>
            <phase>initialize</phase>
            <goals>
              <goal>clean</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-resources-plugin</artifactId>
        <version>2.5</version>
        <executions>
          <execution>
            <id>resources</id>
            <!-- need to specify, as this is not default for pom packaging -->
            <phase>process-resources</phase>
            <goals>
              <goal>resources</goal>
            </goals>
            <configuration>
              <encoding>UTF-8</encoding>
              <outputDirectory>${tomcat.build.dir}</outputDirectory>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-dependency-plugin</artifactId>
        <version>2.4</version>
        <executions>
          <execution>
            <id>unpack-tomcat</id>
            <phase>generate-resources</phase>
            <goals>
              <!-- unpack the tomcat dependency that's been downloaded from your local 3rd party repo -->
              <goal>unpack-dependencies</goal>
            </goals>
            <configuration>
              <outputDirectory>${project.build.directory}/tomcat</outputDirectory>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  <dependencies>
    <!-- the tomcat distro that's been uploaded to the local third party maven repo -->
    <dependency>
       <groupId>org.apache.tomcat</groupId>
       <artifactId>apache-tomcat</artifactId>
       <version>${tomcat.version}</version>
       <type>tar.gz</type>
     </dependency>
   </dependencies>
</project>

Note that the Tomcat artifact is just a normal maven dependency. I used the maven-dependency-plugin to automatically unpack the archive.
I then overlay the configuration files I want to change with the well-known maven-resources-plugin.
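
For reference, the overlay directory simply mirrors the Tomcat layout. The file names below are just examples of the kind of things you might override – your list will differ:

 src/main/resources/conf/server.xml    (ports, connectors)
 src/main/resources/conf/context.xml   (JNDI datasource etc.)
 src/main/resources/bin/setenv.sh      (JAVA_OPTS, heap settings)

Anything placed here ends up on top of the unpacked Tomcat in target/tomcat before the rpm is built.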

Okay. Now I was pretty happy. I was building two good RPMs with proper version and release numbers, deployed to my Nexus on “mvn deploy”.

Distributing the packages

The next step was then to export these files into a yum repository. Or so I thought…
I was pleasantly surprised – or more like super-excited – when I realized that some awesome folks had made a plugin for Nexus (nexus-yum-plugin, http://code.google.com/p/nexus-yum-plugin/) that exposes a Nexus Maven repo as a yum repo!

If you have yum installed, just add a repository configuration to your target server (I use Puppet to automate this).

Here’s how it looks:

root@manny:/etc/yum.repos.d# cat nexus-snapshot.repo
 [nexus-snapshots]
 name=Nomp Nexus - Snapshots
 baseurl=http://manny:8082/nexus/content/repositories/snapshots/
 enabled=1
 gpgcheck=0

You need to add one config for your snapshot repo and another for your release repo.
Test your setup with “yum list” (you need to redeploy at least one RPM artifact in each repo in order for the yum-plugin to create the RPM-repo).

root@manny:/etc/yum.repos.d# yum list
Installed Packages
 nomp-dbdeploy.noarch 0.0.2-1788 @maven-snapshots
 nomp-tomcat.noarch 0.0.1-1788 @maven-snapshots
 nomp-web.noarch 2.1.0-1788 @maven-snapshots
Available Packages
 nomp-dbdeploy.noarch 0.0.2-1793 maven-snapshots
 nomp-tomcat.noarch 0.0.1-1793 maven-snapshots
 nomp-web.noarch 2.1.0-1793 maven-snapshots

In order to transfer the RPM packages and install the software, you just type:

# yum -y install nomp-web

or if already installed:

# yum -y update nomp-web nomp-tomcat

Pretty sweet! It’s so easy for anyone to find out what is installed/deployed on a server using rpm packages!
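
For example, seeing which Nomp packages (and build numbers) are on a box, and which files a package owns, is just standard rpm querying:

 $ rpm -qa 'nomp-*'
 $ rpm -ql nomp-tomcat
 $ rpm -V nomp-tomcat

The last command verifies the installed files against the package metadata, so any local hand-edits on the server stick out immediately.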

The database is code too

In order to ensure that database changes are tested throughout the deploy pipeline, we also need to treat our database scripts as code that is run in each environment.
I like to use dbdeploy (http://code.google.com/p/dbdeploy/) for database patch script packaging. Dbdeploy is a simple database change management tool that applies SQL files in a specified order. It can be run from the command line or from ant. It has a Maven plugin as well, but I don’t want to use that since I don’t want Maven installed on the production servers.
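
The convention is just a directory of change scripts whose file names start with an increasing number; dbdeploy keeps track of which numbers have already been applied in a changelog table. Something like this (the file names are made up for illustration):

 $ ls sql/
 1_create_user_tables.sql
 2_add_result_index.sql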

I ended up making a separate rpm containing the SQL change scripts for the application, with the Maven dependencies (the jars dbdeploy needs) bundled inside the rpm. The entry point of the package is a build.xml ant script for Nomp.

The build.xml I use for the dbdeploy package looks like this:

<project name="MyProject" default="dbdeploy" basedir=".">
    <description>dbdeploy script for nomp</description>
    <record name="dbdeploy.log" loglevel="verbose" action="start" />
    <path id="dbdeploy.classpath" >
        <fileset dir="lib">
            <include name="*.jar" />
        </fileset>
    </path>

    <taskdef name="dbdeploy" classname="com.dbdeploy.AntTarget" classpathref="dbdeploy.classpath" />

    <target name="dbdeploy" depends="create-log-table">
        <dbdeploy driver="${jdbc.driverClassName}" url="${jdbc.url}" userid="${jdbc.username}" password="${jdbc.password}" dir="sql" />
    </target>

    <target name="create-log-table">
        <sql classpathref="dbdeploy.classpath" driver="${jdbc.driverClassName}" url="${jdbc.url}" userid="${jdbc.username}" password="${jdbc.password}" src="ddl/createSchemaVersionTable.mysql.sql" />
    </target>
</project>
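
The jdbc.* properties need to come from somewhere. One way – shown here just as an illustration, the connection details are placeholders – is to pass them on the ant command line; a property file next to build.xml works just as well:

 $ cd /opt/nomp-dbdeploy
 $ ant -Djdbc.driverClassName=com.mysql.jdbc.Driver -Djdbc.url=jdbc:mysql://localhost/nomp \
       -Djdbc.username=nomp -Djdbc.password=secret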

I also improved the MySQL script from the dbdeploy distribution a bit so that it won’t fail if it’s run again and again:

CREATE TABLE IF NOT EXISTS changelog (
 change_number BIGINT NOT NULL,
 complete_dt TIMESTAMP NOT NULL,
 applied_by VARCHAR(100) NOT NULL,
 description VARCHAR(500) NOT NULL,
 CONSTRAINT Pkchangelog PRIMARY KEY (change_number)
 );

When the RPM is installed, you just run “ant” to apply the needed SQL change sets.

root@manny:/opt/nomp-dbdeploy# ant
 Buildfile: /opt/nomp-dbdeploy/build.xml
create-log-table:
 [sql] Executing resource: /opt/nomp-dbdeploy/ddl/createSchemaVersionTable.mysql.sql
 [sql] 1 of 1 SQL statements executed successfully
dbdeploy:
 [dbdeploy] dbdeploy 3.0M3
 [dbdeploy] Reading change scripts from directory /opt/nomp-dbdeploy/sql...
 [dbdeploy] Changes currently applied to database:
 [dbdeploy] 1, 2
 [dbdeploy] Scripts available:
 [dbdeploy] 1, 2
 [dbdeploy] To be applied:
 [dbdeploy] (none)
BUILD SUCCESSFUL
 Total time: 0 seconds

Final step – setting up Jenkins

I will assume that the reader knows how to set up and configure Jenkins jobs. I did a vanilla Jenkins install and added the Build Pipeline Plugin (https://wiki.jenkins-ci.org/display/JENKINS/Build+Pipeline+Plugin) for a nice GUI and the manual triggers.

My pipeline

The pipeline runs automatically for each check in.

Job #1 – “Nomp build”

Builds the root pom with the “deploy” goal. (Note: add -Dusername and -Dpassword flags for the svn credentials, since the buildnumber plugin is used.)
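
In other words, the Maven invocation for the job ends up looking something like this (with whatever svn credentials your Jenkins user has):

 $ mvn deploy -Dusername=jenkins -Dpassword=*****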

Job #2 – “Nomp deploy to test”

ssh jenkins@test-server "yum -y update nomp-web nomp-tomcat nomp-dbdeploy;
cd /opt/nomp-dbdeploy; ant; /etc/init.d/nomp restart"

Note: you need to add jenkins to sudoers (using the NOPASSWD option) on the target server and use ssh key authentication, of course (Puppet does this for me).
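
The sudoers part is a one-liner. Something like the line below – assuming the remote commands are run through sudo – although you may want to lock it down to just the commands needed rather than ALL:

 jenkins ALL=(ALL) NOPASSWD: ALL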

Job #3 – “Nomp deploy to production” (manual trigger)

A manual step after smoke tests have been run (not automated for Nomp yet), to release to production. Exactly like the above, except different target server.

Next steps

For Nomp, the next step will be more Puppet config. I want to be able to build and start up a fully working web server and db server from a standard EC2 AMI without any manual steps. This isn’t hard, but I can’t find the time right now – we need to ship new features to the customers too 🙂 After that, I’d love to look at using Capistrano (https://github.com/capistrano/capistrano/wiki) for deploy automation across many hosts. Currently Nomp only has a few servers, so ssh from Jenkins works fine for now.

Thank you for reading all the way to here. I’d love feedback on whether you think this is useful and whether you agree that it is “developer friendly”. I have a pretty solid background in *nix admin, but I think most developers will understand and be able to maintain this setup, compared to a solution built more around a sysadmin’s toolbox.

Lastly, please contribute with improvements if you find any.

I’ll try to find the time and energy to clean up the poms and provide a skeleton project – a simple WAR, a Tomcat and the dbdeploy rpm config – for download in a week or so.

Added: Here’s an overview of the current continuous deployment environment at Nomp.se

Nomp Continuous Deployment architecture


RTWaaS?

A giant hurdle when buying a system or solution as software is the need to buy hardware, then install, configure and manage it. You need to train people on the product’s operational aspects and retain that skill within the company.

(Free) Open Source Software (FOSS) is a great way to spread a product and build adoption and support for it. You enable the developers and architects to play around with the stuff! The real challenge for FOSS (and other software) products is to go beyond the happy and content developer and also provide a painless path for adopters to get business value without a huge up-front investment in hardware, software, training or services.

I think the reason why something like Google Analytics or Salesforce.com is successful is that it is extremely painless to start using. You can focus on the business problem rather than the IT stuff. Obviously this is nothing new, and the examples I gave have been around for years. Software as a Service is great.

Then, you have all the talk about the real-time web and putting information on the users’ desktops quickly, as it happens – in “real time”. This is what Twitter and Facebook are about, but the real-time web is also needed for e-commerce, gaming and a lot of other areas. There are even conferences about it, so it must be happening 😉

Lastly, the final piece of the puzzle is Service Level Agreements. In order to provide “real-time web” messaging as a service, there is a clear advantage to being close to the information consumers, both in terms of scaling out and in terms of guaranteed latency. I think it is going to be hard to commit to meaningful SLAs without being at the edge.

If you remove the need to invest in infrastructure and the need to train people on the operational aspects, and then get excellent scalability and low latency guaranteed by contract, I’d buy it in a second. Who will provide me with the Real Time Web as a service?

Have you walked down the ORM road of death?

A friend of mine asked me a really good question tonight:

Hey Stefan,
It would be great if you could please give me a sense for how many development teams get hit by a database bottleneck in JEE / Java / 3-tier / ORM / JPA land? And, how they go about addressing it? What exactly causes their bottleneck?

I think most successful apps – scaling problems are hopefully a sign that people are actually using the stuff, right? – built with Hibernate/JPA hit db contention pretty early on. From what I’ve seen this is usually caused by doing excessive round-trips over the wire or by returning overly large data sets.

And then we spend time fixing all the obviously broken data access patterns: first by using HQL instead of the default eager/lazy fetching, then by tuning existing HQL, and then by going to direct SQL if needed.

I believe the next step after this is typically to try to scale vertically, both in the db and app tier. Throwing more hardware at the problem may get us quite a bit further at this point.

Then we might get to the point where the app gets fixed so that it actually makes sense to scale horizontally in the app tier. We will probably have to add a load balancer to the mix and use sticky sessions by now.

And then we will perhaps find out that we won’t do that very well without a distributed 2nd level cache, and that all our direct SQL writing to the DB (which bypasses the 2nd level cache) won’t allow us to use a 2nd level cache for reads either…

Here is where I think there are many options, and I’m not sure how people tend to go from here. At this point we might see some people abandoning ORM, while others try to get the 2nd level cache to work?

Are these the typical steps for scaling up a Java Hibernate/JPA app? What’s your experience?

Cache-aside, write-behind, magic and why it sucks being an Oracle customer

I’ve been looking at a few different technologies to improve the scalability of one of our applications. We’re scaling pretty OK to be honest, considering that we currently have a traditional database-centric solution. The cost for scaling a database-bound application running Oracle is crazy, to say the least, considering they charge $47,500 per two x86 cores for Oracle Enterprise Edition. On top of this it’s 22% per year for software updates and support. As if this wasn’t enough, they also increase the support cost by 4% per year.

You might think that the price above is for production environments only, but in fact you have to pay for every single installation throughout the organization. There are no discounts for staging, DR, test or development environments.

I have a piece of advice for all you kids out there considering running Oracle – just don’t do it.

This advice goes for all Oracle products really, as they all have the same pricing model.

Databases are overrated

My strong recommendation is to build applications that don’t rely on an underlying RDBMS. Relational databases are an overrated, overly complex form of persistent store. They are slow, and they are usually a single point of failure as well. Does this mean that databases are dead and a thing of the past? No, but the role of the database will probably change going forward. In my opinion we should use the RDBMS as a System of Record that is mostly up to date.

If you ask me, databases are great at mainly two things:

  1. They make the data accessible for other systems in a standard way and
  2. They have a strong query language that many people know

So, write to the database asynchronously and use it for reporting and extracting data. Store the data in a data grid in the application tier (where it’s used).

What is a Data Grid?

A Data Grid is a horizontally scalable in-memory data management solution. Data grids try to eliminate data source contention by scaling out data management with commodity hardware.

Some underlying philosophies of data grids – according to Oracle (sic!):

  • Keep data in the application tier (where it’s used)
  • Disks are slow and databases are evil
  • Data Grids will solve your application scalability and performance problems

I have been looking at three different data grid vendors: Oracle Coherence, Gigaspaces EDG/XAP and Terracotta DSO.

Oracle Coherence

I really like this product. It focuses solely on being a potent data grid, with the ability to act as a compute grid as well. Although I haven’t used Coherence for any large projects, its design and concepts are easy to relate to. It supports JTA transactions and consists of a single jar that you drop into your classpath. The Coherence configuration doesn’t contain any infrastructure descriptions, which means that you can use the same configuration on your development laptop as in a production environment with multiple servers. The main issue with Coherence is the fact that Oracle has owned it for a few years now.

Gigaspaces XAP

Gigaspaces’ mission seems to be to provide a very scalable application server with XAP – “The Scale-Out Application Server”. The EDG – enterprise data grid – packaging seems to provide about the same feature set as Coherence. The main difference, to me, is that the Gigaspaces offering is application server infrastructure that needs configuration, deployments and all of that. As I see things, the application server approach is the main drawback – it feels overwhelming. On the other hand, Gigaspaces is still a smaller company, eager to do business and to provide great implementation support, and the product seems to be a really good application server.

Terracotta DSO

Terracotta has a different approach. They provide Network Attached Memory for the Java heap. If you can write a thread-safe program, you can scale out using Terracotta with no or minor changes to your application. From a technical point of view it’s a beautiful solution: you declare which objects you want to make available through Terracotta, and Terracotta then makes your data persistent (if you want) and available on all clustered nodes. When you invoke new() on a clustered object, you get a reference to the clustered instance (if one already exists). Another important difference between Terracotta and the others is that they only send the part of an object that has changed, rather than the full serialized object graph.

I’m in love with this product. It’s free and open source too, and Terracotta Inc provides commercial support. The main concern I have with Terracotta is that it’s really a paradigm shift for the average Java enterprise developer to start writing multi-threaded programs without JTA transactions. Another concern is the magic – the low-level hooks they put into the JVM. At the time of writing, only the Sun and IBM JVMs are supported. It runs fine on OS X though.

The bottom line

So which one is better? Well, that depends on a lot of things, as always. If you decide to move to the grid, it’s going to require retraining of your developers regardless of which solution you go for.

Please do keep in mind that products usually don’t solve your problems. And you can go a long way with a less expensive RDBMS by partitioning the data across multiple servers – sharding. This is what a lot of large sites out there do.

Further reading:
The Coming of the Shard
eBay’s Architectural Principles
Oracle Coherence
Gigaspaces EDG
Terracotta DSO

Divide and Conquer

I listened to Randy Shoup at QCon. Randy works in the architecture team at eBay. The thing that impressed me about his presentation was the “just-the-facts-and-nothing-but-the-facts” approach and the complete lack of buzzwords and product talk. It was like listening to a really good and concise O’Reilly book. Although I didn’t learn anything new from listening to Randy, it’s always good to get a distilled and well-presented summary of what really works, regardless of technology fads.

Partition everything! Partition your system (“functional split”) and your data (“horizontal split”). It doesn’t matter what tool or technology you use. If you can’t split it, you can’t scale it. Simple as that. Regardless of whether you’re using a fancy grid solution or just multiple databases.

Use asynchronous processing everywhere! If you have synchronously coupled systems, they scale together and fail together. The least scalable system will limit scalability and the least available system will limit your uptime. If you use asynchronous, decoupled systems, then each system can scale independently of the others.

In the Limelight

One of the first things I did when I joined Unibet was to set up a Content Delivery Network. I did some research and ended up with a shortlist.

A few factors were limiting my options:

  1. The CDN provider must do business with e-gaming companies
  2. The CDN provider must have an SSL CDN service

The first point effectively rules out Akamai and a number of other companies. The second point rules out even more companies.

I ended up talking to Limelight, and despite some screwups at their London sales office, I must say their CDN service is really awesome. Highly recommended. We currently use them for web site acceleration, to host downloadable clients and for banner serving.