An attempt at a developer-friendly build pipeline

Background

I’ve spent some evenings and nights over the Christmas holiday improving the deployment of Nomp.se, a site where kids can practice math for free, which we run on EC2.

The situation we had was that we deployed to the EC2 server using a locally installed Jenkins CI server, which built the artifact (a WAR) and used the Maven Tomcat plugin to deploy it to the local Tomcat server, which was an RPM package provided by Amazon (yum install tomcat6). The setup worked pretty OK, but it was a hack. Database changes were applied and tested manually – we had a folder “sql” that contained numbered SQL files that had to be applied in order.

Clearly a lot of room for improvement in this area!

Goals with the new build pipeline

I wanted to reach the following goals with the new build pipeline:

  • One build from the build server, all the way from my local Jenkins through the test environments and into production.
  • 100% control over configuration changes for all components (Apache httpd, Apache Tomcat, MySQL database), so that changes can be tested in the normal pipeline without relying on manual hacks.
  • It should be developer friendly. A developer with a basic understanding of Linux, Maven and Tomcat should be able to make changes to and work with the build pipeline.
  • Hence, it should only rely on basic tooling (Ant, Maven, RPM packages) for the heavy lifting, and use the capabilities of other tools, e.g. Jenkins, Puppet and Capistrano, as (non-critical) value add.

After a few iterations I got it down to the following steps for deploying any configuration change onto a production server.

on the build server:
 $ mvn deploy

on the target server:
 # yum -y update nomp-web nomp-tomcat nomp-dbdeploy
 # cd /opt/nomp-dbdeploy; ant
 # /etc/init.d/nomp restart

That’s it. Four steps. There are no shell scripts involved. There is no rsync, and no scp’ing of files. How did I do it? Hold on, I will come to that in a minute or two 🙂

System configuration and prerequisites

In order to make sure the server contains the prerequisite packages and configuration I used Puppet.

“Puppet is a declarative language for expressing system configuration, a client and server for distributing it, and a library for realizing the configuration.

Rather than approaching server management by automating current techniques, Puppet reframes the problem by providing a language to express the relationships between servers, the services they provide, and the primitive objects that compose those services. Rather than handling the detail of how to achieve a certain configuration or provide a given service, Puppet users can simply express their desired configuration using the abstractions they’re used to handling, like service and node, and Puppet is responsible for either achieving the configuration or providing the user enough information to fix any encountered problems.”

from http://projects.puppetlabs.com/projects/puppet/wiki/Big_Picture

I’m not going to go into detail on how to set up Puppet in this text, but here’s what I do with Puppet in order to support the build pipeline:

  • Ensure that the service accounts and groups exist on the target system
  • Ensure that software I rely on is installed (Ant, Apache httpd, mysqld)
  • Configuration management of a few configuration files, such as httpd.conf, my.cnf, etc. (examples below)

Puppet config file example:

user { nomp:
 ensure => present,
 uid => 300,
 gid => 300,
 shell => '/bin/bash',
 home => '/opt/nomp',
 managehome => true,
 }
group { nomp:
 ensure => 'present',
 gid => 300
 }
package { "ant":
 ensure => "installed"
 }

The above configuration means that Puppet will ensure that the user nomp and group nomp will exist on the system and that the ant package will be installed.
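As an example of the configuration file management mentioned in the list above, here’s a sketch of how httpd.conf could be kept under Puppet control (the module name “nomp” and the file mode are assumptions; adjust to your own module layout):

package { "httpd":
 ensure => "installed"
 }
service { "httpd":
 ensure => running,
 enable => true,
 require => Package["httpd"]
 }
file { "/etc/httpd/conf/httpd.conf":
 owner => "root",
 group => "root",
 mode => "644",
 # the source path assumes a Puppet module named "nomp"
 source => "puppet:///modules/nomp/httpd.conf",
 require => Package["httpd"],
 notify => Service["httpd"]
 }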
I will do a whole lot more work with configuration management and provisioning with Puppet going forward, but the above is what was needed to meet my project goals.

Getting started

I started by trying to package my existing WAR project as an RPM (or .deb). After Googling around for a while I found the RPM Maven Plugin (http://mojo.codehaus.org/rpm-maven-plugin/). It basically lets you build RPMs using Maven. The downside is that it relies on the “rpm” command being installed in order to produce the final RPM from the spec file. In order to keep a working Maven environment on all platforms, I wrapped the rpm plugin in a Maven build profile.

(Later I also found a pure Java RPM tool (redline-rpm), but I haven’t looked into it yet.)

The trickiest part was to get a good setup for artifact versions and RPM release versions, so that the Maven release plugin could still be used without any manual changes.
The rpm plugin has some funky defaults (http://mojo.codehaus.org/rpm-maven-plugin/ident-params.html#release) that weren’t going to work with “yum update”.
It took a lot of experimentation, but in the end I settled on the Build Number Maven Plugin (http://mojo.codehaus.org/buildnumber-maven-plugin/).
It’s a pretty simple plugin that checks the SCM for the revision number and exposes it as a Maven variable.

Here’s the RPM part of my WAR POM:

<profiles>
  <profile>
    <id>rpm</id>
    <activation>
      <os>
        <name>linux</name>
      </os>
    </activation>
    <build>
      <plugins>
        <plugin>
          <groupId>org.codehaus.mojo</groupId>
          <artifactId>rpm-maven-plugin</artifactId>
          <version>2.1-alpha-1</version>
          <extensions>true</extensions>
          <executions>
            <execution>
              <goals>
                <goal>attached-rpm</goal>
              </goals>
            </execution>
          </executions>
          <configuration>
            <copyright>Copyright 2011 Selessia AB</copyright>
            <distribution>Nomp</distribution>
            <group>${project.groupId}</group>
            <packager>${user.name}</packager>
            <!-- need to use the build number plugin here in order for yum upgrade to work in snapshots -->
            <release>${buildNumber}</release>
            <defaultDirmode>555</defaultDirmode>
            <defaultFilemode>444</defaultFilemode>
            <defaultUsername>nomp</defaultUsername>
            <defaultGroupname>nomp</defaultGroupname>
            <requires>
              <require>nomp-tomcat</require>
            </requires>
            <mappings>
              <!-- webapps deployment -->
              <mapping>
                <directory>${rpm.install.webapps}/${project.artifactId}</directory>
                <sources>
                  <source>
                    <location>target/${project.artifactId}-${project.version}</location>
                  </source>
                </sources>
              </mapping>
            </mappings>
          </configuration>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>

Here’s the build number plugin configuration:

<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>buildnumber-maven-plugin</artifactId>
    <version>1.0</version>
    <executions>
      <execution>
        <phase>validate</phase>
        <goals>
          <goal>create</goal>
        </goals>
      </execution>
    </executions>
    <configuration>
      <doCheck>true</doCheck>
      <doUpdate>true</doUpdate>
    </configuration>
  </plugin>

All the configuration above adds a secondary artifact (the RPM), which gets uploaded to the Nexus Maven repository on “mvn deploy”.

I don’t really need the WAR file anymore, as the RPM packages the webapp exploded. I might change the primary artifact type from WAR to RPM in the future, but I haven’t looked into that yet.

Packaging the app server as an RPM

The next thing I wanted to do was package the app server as an RPM as well. I feel it’s more developer friendly to build a Tomcat RPM using Maven too, rather than just grabbing some arbitrary RPM and using Puppet to fix the configuration. Also, we get full control over where it is installed and where the logs go.

One thing I really wanted to avoid was having to check the Tomcat distribution tarball into Subversion. I hate blobs in SVN, so I was pleasantly surprised to learn that Nexus handles any type of file. I simply uploaded the latest Tomcat distribution tar (apache-tomcat-7.0.23.tar.gz) into my Nexus 3rd party repository.
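One way to get the tarball into Nexus is the deploy-file goal (the repository id and URL below are assumptions and need to match your own Nexus setup); you can also just use the artifact upload form in the Nexus web UI:

 $ mvn deploy:deploy-file -Dfile=apache-tomcat-7.0.23.tar.gz \
     -DgroupId=org.apache.tomcat -DartifactId=apache-tomcat \
     -Dversion=7.0.23 -Dpackaging=tar.gz -DgeneratePom=true \
     -DrepositoryId=thirdparty \
     -Durl=http://manny:8082/nexus/content/repositories/thirdparty/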

I created a sibling project “tomcat” with a pom that looks like this:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <!-- avoid rpm here as classifier will differ and Nexus search will fail -->
  <packaging>pom</packaging>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <artifactId>nomp-parent</artifactId>
    <groupId>se.nomp</groupId>
    <version>2.1.0-SNAPSHOT</version>
  </parent>
  <artifactId>nomp-tomcat</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <name>Nomp Tomcat Server</name>
  <description>Tomcat server for Nomp</description>
  <properties>
    <tomcat.version>7.0.23</tomcat.version>
    <tomcat.build.dir>${project.build.directory}/tomcat/apache-tomcat-${tomcat.version}</tomcat.build.dir>
    <rpm.install.basedir>/opt/nomp</rpm.install.basedir>
    <rpm.install.logdir>/var/log/nomp</rpm.install.logdir>
  </properties>
  <profiles>
    <!-- Only run the RPM packaging on Linux, as we need the rpm binary to build RPMs using the rpm plugin -->
    <profile>
      <id>rpm</id>
      <activation>
        <os>
          <name>linux</name>
        </os>
      </activation>
      <build>
        <plugins>
          <plugin>
            <groupId>org.codehaus.mojo</groupId>
            <artifactId>rpm-maven-plugin</artifactId>
            <version>2.1-alpha-1</version>
            <extensions>true</extensions>
            <executions>
              <execution>
                <goals>
                  <goal>attached-rpm</goal>
                </goals>
              </execution>
            </executions>
            <configuration>
              <copyright>Copyright 2011 Selessia AB</copyright>
              <distribution>Nomp</distribution>
              <group>${project.groupId}</group>
              <packager>${user.name}</packager>
              <!-- need to use the build number plugin here in order for yum upgrade to work in snapshots -->
              <release>${buildNumber}</release>
              <defaultDirmode>755</defaultDirmode>
              <defaultFilemode>444</defaultFilemode>
              <defaultUsername>root</defaultUsername>
              <defaultGroupname>root</defaultGroupname>
              <mappings>
                <mapping>
                  <directory>${rpm.install.basedir}/logs</directory>
                  <sources>
                    <softlinkSource>
                      <location>${rpm.install.logdir}</location>
                    </softlinkSource>
                  </sources>
                </mapping>
                <mapping>
                  <directory>${rpm.install.logdir}</directory>
                  <username>nomp</username>
                  <groupname>nomp</groupname>
                </mapping>
                <mapping>
                  <directory>${rpm.install.basedir}/bin</directory>
                  <filemode>555</filemode>
                  <sources>
                    <source>
                      <location>${tomcat.build.dir}/bin</location>
                    </source>
                  </sources>
                </mapping>
                <mapping>
                  <directory>${rpm.install.basedir}/conf</directory>
                  <sources>
                    <source>
                      <location>${tomcat.build.dir}/conf</location>
                    </source>
                  </sources>
                </mapping>
                <mapping>
                  <directory>${rpm.install.basedir}/lib</directory>
                  <sources>
                    <source>
                      <location>${tomcat.build.dir}/lib</location>
                    </source>
                  </sources>
                </mapping>
                <mapping>
                  <directory>${rpm.install.basedir}/work</directory>
                  <username>nomp</username>
                  <groupname>nomp</groupname>
                </mapping>
                <mapping>
                  <directory>${rpm.install.basedir}/temp</directory>
                  <username>nomp</username>
                  <groupname>nomp</groupname>
                </mapping>
                <mapping>
                  <directory>${rpm.install.basedir}/conf/Catalina</directory>
                  <username>nomp</username>
                  <groupname>nomp</groupname>
                </mapping>
                <mapping>
                  <directory>/etc/init.d</directory>
                  <directoryIncluded>false</directoryIncluded>
                  <filemode>555</filemode>
                  <sources>
                    <source>
                      <location>src/main/etc/init.d</location>
                    </source>
                  </sources>
                </mapping>
              </mappings>
            </configuration>
          </plugin>
        </plugins>
      </build>
    </profile>
  </profiles>

  <build>
    <resources>
      <resource>
         <!-- overlay the contents of the resources src dir on top of the unpacked tomcat -->
         <directory>src/main/resources</directory>
         <filtering>false</filtering>
       </resource>
     </resources>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>buildnumber-maven-plugin</artifactId>
        <version>1.0</version>
        <executions>
          <execution>
            <phase>validate</phase>
            <goals>
              <goal>create</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <doCheck>true</doCheck>
          <doUpdate>true</doUpdate>
        </configuration>
      </plugin>
      <plugin>
        <artifactId>maven-clean-plugin</artifactId>
        <version>2.4.1</version>
        <executions>
          <execution>
            <id>auto-clean</id>
            <phase>initialize</phase>
            <goals>
              <goal>clean</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-resources-plugin</artifactId>
        <version>2.5</version>
        <executions>
          <execution>
            <id>resources</id>
            <!-- need to specify, as this is not default for pom packaging -->
            <phase>process-resources</phase>
            <goals>
              <goal>resources</goal>
            </goals>
            <configuration>
              <encoding>UTF-8</encoding>
              <outputDirectory>${tomcat.build.dir}</outputDirectory>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-dependency-plugin</artifactId>
        <version>2.4</version>
        <executions>
          <execution>
            <id>unpack-tomcat</id>
            <phase>generate-resources</phase>
            <goals>
              <!-- unpack the tomcat dependency that's been downloaded from your local 3rd party repo -->
              <goal>unpack-dependencies</goal>
            </goals>
            <configuration>
              <outputDirectory>${project.build.directory}/tomcat</outputDirectory>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  <dependencies>
    <!-- the tomcat distro that's been uploaded to the local third party maven repo -->
    <dependency>
       <groupId>org.apache.tomcat</groupId>
       <artifactId>apache-tomcat</artifactId>
       <version>${tomcat.version}</version>
       <type>tar.gz</type>
     </dependency>
   </dependencies>
</project>

Note that the Tomcat artifact is just a normal Maven dependency. I used the maven-dependency-plugin to automatically unpack the archive.
I then overlay the configuration files I want to change using the well-known maven-resources-plugin.

Okay. Now I was pretty happy. I was building two good RPMs with proper version and release numbers, and they were deployed to my Nexus on “mvn deploy”.

Distributing the packages

The next step was then to export these files into a yum repository. Or so I thought…
I was pleasantly surprised, or more like super-excited, when I realized that some awesome folks had made a plugin for Nexus (nexus-yum-plugin, http://code.google.com/p/nexus-yum-plugin/) that exposes a Nexus Maven repo as a yum repo!

If you have yum installed, just add a repository configuration to your target server (I use Puppet to automate this).

Here’s how it looks:

root@manny:/etc/yum/repos.d# cat nexus-snapshot.repo
 [nexus-snapshots]
 name=Nomp Nexus - Snapshots
 baseurl=http://manny:8082/nexus/content/repositories/snapshots/
 enabled=1
 gpgcheck=0

You need to add one config for your snapshot repo and another for your release repo.
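For example, the release repo entry could look like this (the repo id and URL path are assumptions based on a default Nexus “releases” repository):

 [nexus-releases]
 name=Nomp Nexus - Releases
 baseurl=http://manny:8082/nexus/content/repositories/releases/
 enabled=1
 gpgcheck=0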
Test your setup with “yum list” (you need to redeploy at least one RPM artifact in each repo in order for the yum plugin to create the RPM repo).

root@manny:/etc/yum/repos.d# yum list
Installed Packages
 nomp-dbdeploy.noarch 0.0.2-1788 @maven-snapshots
 nomp-tomcat.noarch 0.0.1-1788 @maven-snapshots
 nomp-web.noarch 2.1.0-1788 @maven-snapshots
Available Packages
 nomp-dbdeploy.noarch 0.0.2-1793 maven-snapshots
 nomp-tomcat.noarch 0.0.1-1793 maven-snapshots
 nomp-web.noarch 2.1.0-1793 maven-snapshots

In order to transfer the RPM packages and install the software, you just type:

# yum -y install nomp-web

or if already installed:

# yum -y update nomp-web nomp-tomcat

Pretty sweet! It’s so easy for anyone to find out what is installed/deployed on a server when everything is an RPM package!

The database is code too

In order to ensure that database scripts are tested throughout the deploy pipeline, we also need to treat our database scripts as code that is run in each environment.
I like to use dbdeploy (http://code.google.com/p/dbdeploy/) for database patch script packaging. Dbdeploy is a simple database change management tool that applies SQL files in a specified order. It can be run from the command line or from Ant. It has a Maven plugin as well, but I don’t want to use that, as I don’t want Maven installed on the production servers.

I ended up making a separate RPM with the SQL change scripts for the application, and packaged the Maven dependencies (the jars dbdeploy needs) with the RPM. The main entry point of the package is a build.xml script for Nomp.

The build.xml I use for the dbdeploy package looks like this:

<project name="MyProject" default="dbdeploy" basedir=".">
    <description>dbdeploy script for nomp</description>
    <record name="dbdeploy.log" loglevel="verbose" action="start" />
    <path id="dbdeploy.classpath" >
        <fileset dir="lib">
            <include name="*.jar" />
        </fileset>
    </path>

    <taskdef name="dbdeploy" classname="com.dbdeploy.AntTarget" classpathref="dbdeploy.classpath" />

    <target name="dbdeploy" depends="create-log-table">
        <dbdeploy driver="${jdbc.driverClassName}" url="${jdbc.url}" userid="${jdbc.username}" password="${jdbc.password}" dir="sql" />
    </target>

    <target name="create-log-table">
        <sql classpathref="dbdeploy.classpath" driver="${jdbc.driverClassName}" url="${jdbc.url}" userid="${jdbc.username}" password="${jdbc.password}" src="ddl/createSchemaVersionTable.mysql.sql" />
    </target>
</project>
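One detail not shown above: the jdbc.* properties have to come from somewhere when ant runs. As a hypothetical example, they could live in a properties file (the file name and values below are placeholders):

 # dbdeploy.properties (placeholder values)
 jdbc.driverClassName=com.mysql.jdbc.Driver
 jdbc.url=jdbc:mysql://localhost:3306/nomp
 jdbc.username=nomp
 jdbc.password=secret

which can then be picked up with “ant -propertyfile dbdeploy.properties”, or loaded with a <property file="dbdeploy.properties"/> line at the top of build.xml.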

I also improved the dbdeploy distribution MySQL script a bit, so that it won’t fail if it’s run again and again:

CREATE TABLE IF NOT EXISTS changelog (
 change_number BIGINT NOT NULL,
 complete_dt TIMESTAMP NOT NULL,
 applied_by VARCHAR(100) NOT NULL,
 description VARCHAR(500) NOT NULL,
 CONSTRAINT Pkchangelog PRIMARY KEY (change_number)
 );

When the RPM is installed, you just run “ant” to apply the needed SQL change sets.

root@manny:/opt/nomp-dbdeploy# ant
 Buildfile: /opt/nomp-dbdeploy/build.xml
create-log-table:
 [sql] Executing resource: /opt/nomp-dbdeploy/ddl/createSchemaVersionTable.mysql.sql
 [sql] 1 of 1 SQL statements executed successfully
dbdeploy:
 [dbdeploy] dbdeploy 3.0M3
 [dbdeploy] Reading change scripts from directory /opt/nomp-dbdeploy/sql...
 [dbdeploy] Changes currently applied to database:
 [dbdeploy] 1, 2
 [dbdeploy] Scripts available:
 [dbdeploy] 1, 2
 [dbdeploy] To be applied:
 [dbdeploy] (none)
BUILD SUCCESSFUL
 Total time: 0 seconds

Final step – setting up Jenkins

I will assume that the reader knows how to set up and configure Jenkins jobs. I did a vanilla Jenkins install and added the Build Pipeline Plugin (https://wiki.jenkins-ci.org/display/JENKINS/Build+Pipeline+Plugin) for a nice GUI and the manual triggers.

My pipeline

The pipeline runs automatically for each check-in.

Job #1 – “Nomp build”

Builds the root POM with the goal “deploy”. (Note: add -Dusername and -Dpassword flags for the SVN credentials, since the buildnumber plugin is used.)
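In other words, the Maven goal line in the job ends up something like this (the credentials are placeholders):

 mvn deploy -Dusername=jenkins -Dpassword=secret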

Job #2 – “Nomp deploy to test”

ssh jenkins@test-server "yum -y update nomp-web nomp-tomcat nomp-dbdeploy;
cd /opt/nomp-dbdeploy; ant; /etc/init.d/nomp restart"

Note: you need to add jenkins to sudoers (using the NOPASSWD option) on the target, and use SSH key authentication, of course (Puppet does this for me).
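For reference, a minimal sudoers entry for this could look roughly like the following (the command list is an assumption; tighten it to exactly what the job needs), with the remote commands then run through sudo:

 jenkins ALL=(root) NOPASSWD: /usr/bin/yum, /usr/bin/ant, /etc/init.d/nomp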

Job #3 – “Nomp deploy to production” (manual trigger)

A manual step, after smoke tests have been run (not automated for Nomp yet), to release to production. Exactly like the above, except with a different target server.

Next steps

For Nomp, the next step will be more Puppet config. I want to be able to build and start up a fully working web server and db server from a standard EC2 AMI without any manual steps. This isn’t hard, but I can’t find the time right now. Need to add new features for the customers too 🙂 After that, I’d love to look at using Capistrano (https://github.com/capistrano/capistrano/wiki) for deploy automation to many hosts. Currently Nomp only has a few servers, so ssh from Jenkins works fine for now.

Thank you for reading all the way to here. I’d love feedback on whether you think this is useful and whether you agree that it is “developer friendly”. I have a pretty solid background in *nix admin, but I think most developers will understand and be able to maintain this setup, compared to a solution more focused on using a sysadmin’s toolbox.

Lastly, please contribute with improvements if you find any.

I’ll try to find the time and energy to clean up the POMs and provide a skeleton project with a simple WAR, a Tomcat and the dbdeploy RPM config for download in a week or so.

Added: Here’s an overview of the current continuous deployment environment at Nomp.se

Nomp Continuous Deployment architecture (diagram)



Freemarker, slf4j and spring

I’ve just spent three hours trying to get Freemarker to stop spitting out “DEBUG cache:81” messages in my Spring application.

FreeMarker recently hacked SLF4J support into 2.3, but I had a hard time finding out how to enable it, so I reckoned I’d share my experiences.

FreeMarker 2.3 looks for logging libraries in this order (by default) with the class loader of the FreeMarker classes: Log4J, Avalon, java.util.logging. The first one it finds in this list will be the one used for logging.

I found out that you can override this behavior in 2.3.18 by calling:

freemarker.log.Logger.
    selectLoggerLibrary(freemarker.log.Logger.LIBRARY_SLF4J);

However, this code needs to run before any FreeMarker classes are initialized.

After trying a few different tricks, such as having a load-on-startup Servlet’s init() configure the logger, I ended up with a fairly clean solution.

I extended Spring’s FreeMarkerConfigurer class like this:

package com.selessia.plux.web;

import java.io.IOException;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.web.servlet.view.freemarker.FreeMarkerConfigurer;

import freemarker.template.TemplateException;

public class PluxFreeMarkerConfigurer extends FreeMarkerConfigurer {
    private Logger logger = LoggerFactory
            .getLogger(PluxFreeMarkerConfigurer.class);

    @Override
    public void afterPropertiesSet() throws IOException, TemplateException {
        // switch FreeMarker to SLF4J before the FreeMarker config is created
        fixFreemarkerLogging();
        super.afterPropertiesSet();
    }

    private void fixFreemarkerLogging() {
        try {
            freemarker.log.Logger
                    .selectLoggerLibrary(freemarker.log.Logger.LIBRARY_SLF4J);
            logger.info("Switched broken Freemarker logging to slf4j");
        } catch (ClassNotFoundException e) {
            logger.warn("Failed to switch broken Freemarker logging to slf4j");
        }
    }
}

and changed my Spring config to use my class to initialize FreeMarker instead:

  <!-- FreeMarker engine that configures Freemarker for SLF4J-->
  <bean id="freemarkerConfig" class="com.selessia.plux.web.PluxFreeMarkerConfigurer"
 ...
 </bean>

Hope this helps someone.

RTWaaS?

A giant hurdle when buying a system or solution as software is the need to buy hardware, install it, and configure and manage it. You need to train people on the product’s operational aspects and retain that skill within the company.

(Free) Open Source Software (FOSS) is great for spreading a product and getting adoption and support for it. You enable the developers and architects to play around with the stuff! The real challenge for FOSS (and other software) products is to go beyond the happy and content developer and also provide a painless path for adopters to get business value without a huge investment hurdle in terms of hardware, software, training or services.

I think the reason why something like Google Analytics or Salesforce.com is successful is that it is extremely painless to start using it. You can focus on the business problem rather than the IT stuff. Obviously this is nothing new, and the examples I gave have been around for years. Software as a Service is great.

Then, you have all the talk about the real-time web and putting information quickly, as it happens – “real time” – on the users’ desktops. This is what Twitter and Facebook are about, but the real-time web is also needed for e-commerce, gaming and a lot of other areas. There are even conferences about it, so it must be happening 😉

Lastly, the final piece of the puzzle is Service Level Agreements. In order to provide “real-time web” messaging as a service, there is a clear advantage to being close to the information consumers, both in terms of scaling out and in terms of guaranteed latency. I think it is going to be hard to commit to meaningful SLAs without being at the edge.

If you remove the need to invest in infrastructure, the need to train people on the operational aspects and then get excellent scalability and low latency guaranteed by contract, I’d buy it in a second. Who will provide me with the Real Time Web as a service?

Web pages are disappearing?

I believe the page (URL) is becoming more of a task-oriented landing area where the web site adapts the contents to the requesting user’s needs. I believe the divorce between content and pages is inevitable. It will be interesting to see how this will affect the KPIs, the analytics tools we currently use, and search engine optimization practices going forward.

I recently attended a breakfast round-table discussion hosted by Imad Mouline. Imad is the Chief Technology Officer of Gomez. For those who aren’t familiar with Gomez, they specialize in web performance monitoring. It was an interesting discussion with participants from a few different industries. Participants were either CTOs or CTO direct reports.

Imad shared a few additional trends regarding web pages (aggregated from the Gomez data warehouse):

  • Page weight is increasing (kB/page)
  • The number of page objects is plateauing
  • The number of origin domains per page is increasing

We covered a few different topics, but the most interesting discussion (to me) was related to how web pages are being constructed in modern web sites and what impact this has on measuring service level key performance indicators (KPIs).

In order to sell effectively you need to create a web site that really stands out. One of the more effective ways of doing this is to use what we know about the user to contribute to this experience.

In general we tend to know a few things about each site visitor:

  • What browsing device the user is using (User-Agent HTTP header)
  • Where the user is (geo-IP lookup)
  • What the user’s preferred language is (browser setting or region)
  • Whether the user is a returning customer or not (cookie)
  • The identity of the customer (cookie), and hence possibly age, gender, address, etc. 🙂
  • What time of day it is

So we basically know the how, who, when, where and what. In addition to this, we can use data from previous visits to our site, such as click-stream analysis, order history or segmentation from data warehouse analysis, fed back into the content delivery system to improve the customer experience.

For example, when a user visits our commerce site we can use all of the above to present the most relevant offers in a very targeted manner to that user. We can also cross-sell efficiently and offer bonuses if we think there is a risk of this being a lapsing customer. We can adapt to the user’s device and create a different experience depending on if the user is visiting in the afternoon or late night.

If we do a good job with our one-to-one sales experience, the components and contents delivered on a particular page (URL) will in other words vary depending on who’s requesting it, from where the user is requesting it, what device is used, and what time it is. Depending on the application and the level of personalization, this will obviously impact both the non-functional and functional KPIs: What is the conversion rate for the page? What is the response time for the page?

Speed sells

This coming week (first week of February), Unibet launches its revamped website based on the Facelift project I lead. As a part of this effort, we have worked extremely hard in order to lower page loading times. We have invested a substantial amount of time and money focusing on improving performance. Is this really justified?

A 2006 study by Jupiter Research found that the consequences for an online retailer whose site underperforms include diminished goodwill, negative brand perception, and, most important, significant loss in overall sales. Online shopper loyalty is contingent upon quick page loading, especially for high-spending shoppers and those with greater tenure.

The report ranked poor site performance second only to high prices and shipping costs as the main dissatisfaction among online shoppers. Additional findings in the report show that more than one-third of shoppers with a poor experience abandoned the site entirely, while 75 percent were likely not to shop on that site again. These results demonstrate that a poorly performing website can be damaging to a company’s reputation; according to the survey, nearly 30 percent of dissatisfied customers will either develop a negative perception of the company or tell their friends and family about the experience.

+500 ms page load time led to a 20% drop in traffic at Google

Marissa Mayer ran an experiment where Google increased the number of search results from ten to thirty per page. Traffic and revenue from Google searchers in the experimental group dropped by 20%.

After a bit of looking, they found an uncontrolled variable. The page with 10 results took 400ms to generate. The page with 30 results took 900ms. Half a second delay caused a 20% drop in traffic. Half a second delay killed user satisfaction.

“It was almost proportional. If you make a product faster, you get that back in terms of increased usage”
-Marissa Mayer, VP Search Product and User Experience at Google

The same effect happened with Google Maps. When the company trimmed the 120KB page size down by about 30 percent, the company started getting about 30 percent more map requests.

+100 ms page load time led to a 1% drop in sales at Amazon

Amazon also performed some A/B testing and found that page load times directly impacted the revenue:

“In A/B tests, we tried delaying the page in increments of 100 milliseconds and found that even very small delays would result in substantial and costly drops in revenue.”
-Greg Linden, Amazon.com

There are a number of tools and best-practices available to improve web-site performance. I particularly like the work of Steve Souders. Steve was the Chief Performance Yahoo! (at Yahoo! obviously) and is now at Google doing web performance and open source initiatives.

While at Yahoo!, Steve published a benchmark and tool called YSlow, which is a good indicator of how well the front-end web technology (HTML, JavaScript, images, etc.) of your site is implemented. The front end makes up almost 90% of the page load time at most e-commerce sites.

At Unibet, our old HTML had a YSlow score of 56/100 on average. This is about average in the e-gaming industry. However, the facelifted version just out scores 96/100. As a comparison, eBay’s start page is at 97/100 and Yahoo!’s start page at 95/100. This should result in reduced waiting, and based on the research above it will help drive revenue and customer satisfaction.

We have worked extremely hard in order to lower page loading times. We have invested a substantial amount of time and money in doing so. Is this really justified? YES! I am confident that our new site will contribute to increased sales and increased customer lifetime value.

Facelift and EDA

I’m getting back in shape! Since my last post I’ve become a dad again and life is good. I’ve taken up exercise again, and run almost every day. I’ve lost 10 kg and am starting to look somewhat fit again…

Operation Facelift

Fortunately for the company – we’re getting in shape at work too! We’re currently moving to XHTML 1.1 strict and a floating layout with skinning support. We’re also moving to 100% YUI and optimizing for SEO and performance. TTM and TCO will decrease significantly too. Man, I love to work with great front-end people. This is going to rock!

Event Driven Architecture

I’ve also kicked off a huge push where all the product teams will start aligning their architectures to an Event Driven Architecture (EDA) model. EDA is great for separation of concerns (the registration module of the customer system will let everyone who cares know that a new customer has registered) and also for getting a scalable architecture (async async async!).

Maven2 and Hudson

We’ve gotten all the product teams up on Maven2, away from the godforsaken shell script/Ant mess we had a year ago. Hudson is used for continuous building. I love Hudson – highly recommended! We also migrated to Subversion (finally).

The (almost) perfect (rich) website

I am personally a fan of light-weight web pages that use W3C standards based elements and layout. However, many commercial web sites seem to want to move to a more “print-like” experience.

The cost of moving to a richer experience is usually higher maintenance cost and longer round-trip times – you need the graphics or Flash guys for many changes. SEO (Search Engine Optimization) suffers, as the graphics can’t be indexed by the web crawlers, and you usually take a hit on page load times too.

Wouldn’t it be great if you could make a web site that is:

  • Great looking
  • SEO friendly
  • Quick to load and render
  • and is XHTML compliant

We have come a long way at unibet.com, but we made some compromises in look and feel for speed, and we do still have article headers using generated images. This has bothered me for some time. One of our consultants mentioned that he knew of someone who used Flash for rendering headlines, and it sounded like a good idea to me. I did some research and stumbled upon sIFR.

sIFR (or Scalable Inman Flash Replacement) is a technology that allows you to replace text elements on screen with Flash equivalents. Put simply, sIFR allows website headings, pull-quotes and other elements to be styled in whatever font the designer chooses – be that Foundry Monoline, Gill Sans, Impact, Frutiger or any other font – without the user having it installed on their machine. sIFR provides some javascript files and a Flash movie in source code format (.fla) that you can embed your fonts into. It’s really easy to set up.

To use sIFR on your website you embed the font (be careful to encode all, but only, the characters you will need) to minimize the size of the Flash movie. Typically the SWF movie is between 8 and 70 kB. This may seem like a lot more than an image, but remember that the SWF will be cached for a very long time in the browser if you’ve set up your web server correctly. Effectively, the font Flash movie will only be downloaded once, or not at all, per site visit.
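For example, with Apache httpd and mod_expires enabled, something along these lines does the trick (the one-year expiry is just an illustration):

 <IfModule mod_expires.c>
   ExpiresActive On
   ExpiresByType application/x-shockwave-flash "access plus 1 year"
 </IfModule>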

When you have made the SWFs you need, just add a few lines of sIFR code to the web page and that’s it.

The following explains the sIFR process in the browser:

  1. A web page is requested and loaded by the browser.
  2. Javascript detects if Flash 6 or greater is installed.
  3. If no Flash is detected, the page is drawn as normal.
  4. If Flash is detected, the HTML element of the page is immediately given the class “hasFlash”. This effectively hides all text areas to be replaced but keeps their bounds intact. The text is hidden because of a style in the style sheet which only applies to elements that are children of the html.hasFlash element.
  5. The javascript traverses through the DOM and finds all elements to be replaced. Once found, the script measures the offsetWidth and offsetHeight of the element and replaces it with a Flash movie of the same dimensions.
  6. The Flash movie, knowing its textual content, creates a dynamic text field and renders the text at a very large size (96pt).
  7. The Flash movie reduces the point size of the text until it all fits within the overall size of the movie.

sIFR is a clever hack, but nonetheless a hack. The result is really amazing, however: it’s hardly noticeable to the end user and it meets all four requirements I set up in my “what if…” list above, so we’re moving to sIFR for the next release of unibet.com.

While sIFR gives us better typography today, it is clearly not the solution for the next 20 years.

Further reading:

The sIFR3 Wiki