High Performance Webapps – Data URIs

Oleg Varaksin | April 26th, 2012 | Last Updated: October 21st, 2012


I continue to write tips for performance optimization of websites. The last post was about jQuery objects; this post is about data URIs. Data URIs are an interesting concept on the Web. Please read “Data URIs explained” if you don’t know what they are. Data URIs are a technique for embedding resources as base64-encoded data, avoiding the need for extra HTTP requests. They give you the ability to embed files, especially images, inside other files, especially CSS. Data URIs are not limited to images, but embedded inline images are the most interesting part of this technique. It allows separate images to be fetched in a single HTTP request rather than in multiple HTTP requests, which can be more efficient. Decreasing the number of requests results in better page performance. “Minimize HTTP requests” is actually the first rule of the “Yahoo! Exceptional Performance Best Practices”, and it specifically mentions data URIs.

“Combining inline images into your (cached) stylesheets is a way to reduce HTTP requests and avoid increasing the size of your pages… 40-60% of daily visitors to your site come in with an empty cache. Making your page fast for these first time visitors is key to a better user experience.”

Data URI format is specified as

data:[<mime type>][;charset=<charset>][;base64],<encoded data>

We are only interested in images, so the MIME type can be, e.g., image/gif, image/jpeg or image/png. The charset should be omitted for images. Base64 encoding is indicated by ;base64. One example of a valid data URI:

<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA
        AAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO
        9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot">
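
To make the format concrete, here is a minimal Python sketch that splits a data URI into its MIME type and decoded bytes. The function name parse_data_uri is hypothetical, and the sketch simply skips the charset parameter; it illustrates the format above and is not meant as a complete parser.

```python
import base64
from urllib.parse import unquote_to_bytes


def parse_data_uri(uri):
    """Split a data URI into (mime_type, raw_bytes). Hypothetical helper."""
    if not uri.startswith("data:") or "," not in uri:
        raise ValueError("not a data URI")
    # Everything before the first comma is the header, the rest is the payload.
    header, _, payload = uri[len("data:"):].partition(",")
    parameters = header.split(";")
    # An omitted MIME type defaults to text/plain.
    mime_type = parameters[0] or "text/plain"
    if "base64" in parameters[1:]:
        data = base64.b64decode(payload)
    else:
        # Without ;base64 the payload is percent-encoded text.
        data = unquote_to_bytes(payload)
    return mime_type, data


# "iVBORw0KGgo=" is the base64 form of the 8-byte PNG file signature.
mime_type, data = parse_data_uri("data:image/png;base64,iVBORw0KGgo=")
print(mime_type, data)
```

Every base64-encoded PNG starts with the characters iVBORw0KGgo, which is why the example URI above begins that way.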

HTML fragments with inline images like the example above are not really interesting because they are not cached independently of the page that contains them. Data URIs in CSS files (style sheets) are cached along with the CSS files, and that brings benefits. Some advantages described in Wikipedia:

HTTP request and header traffic is not required for embedded data, so data URIs consume less bandwidth whenever the overhead of encoding the inline content as a data URI is smaller than the HTTP overhead. For example, the required base64 encoding for an image 600 bytes long would be 800 bytes, so if an HTTP request required more than 200 bytes of overhead, the data URI would be more efficient.
For transferring many small files (less than a few kilobytes each), this can be faster. TCP transfers tend to start slowly. If each file requires a new TCP connection, the transfer speed is limited by the round-trip time rather than the available bandwidth. Using HTTP keep-alive improves the situation, but may not entirely alleviate the bottleneck.
When browsing a secure HTTPS web site, web browsers commonly require that all elements of a web page be downloaded over secure connections, or the user will be notified of reduced security due to a mixture of secure and insecure elements. On badly configured servers, HTTPS requests have significant overhead over common HTTP requests, so embedding data in data URIs may improve speed in this case.
Web browsers are usually configured to make only a certain number (often two) of concurrent HTTP connections to a domain, so inline data frees up a download connection for other content.
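
The 600-byte example from the first point can be checked with a few lines of Python. This is only a sketch of the arithmetic: the helper names are hypothetical, and the HTTP overhead values of 250 and 150 bytes are assumed figures chosen to show both outcomes.

```python
import base64


def base64_length(raw_size):
    """Length in characters of the padded base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def inlining_saves_bytes(raw_size, http_overhead):
    """True if the base64 growth is smaller than the per-request HTTP overhead."""
    return base64_length(raw_size) - raw_size < http_overhead


# Cross-check the formula against the real encoder for a 600-byte payload.
assert base64_length(600) == len(base64.b64encode(bytes(600)))
print(base64_length(600))              # 800
print(inlining_saves_bytes(600, 250))  # True
print(inlining_saves_bytes(600, 150))  # False
```

Base64 turns every 3 bytes into 4 characters, so the encoded form is always about one third larger than the original file.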

Furthermore, data URIs are better than sprites. Images organized as CSS sprites (many small images combined into one big image) are difficult to maintain, and maintenance costs are high. Imagine you want to change some small images in the sprite: their position, size, color or whatever. There are tools that generate sprites, but later changes are not easy. Changes in size especially shift all positions and require a lot of CSS changes. And don’t forget: a sprite still requires one HTTP request :-).

Which browsers support data URIs? Data URIs are supported by all modern browsers: Gecko-based (Firefox, SeaMonkey, Camino, etc.), WebKit-based (Safari, Google Chrome), Opera, Konqueror, and Internet Explorer 8 and higher. In Internet Explorer 8, data URIs must be smaller than 32 KB. Internet Explorer 9 does not have this 32 KB limitation. IE versions 5-7 lack support for data URIs, but there is MHTML for when you need data URIs in IE7 and under.

Are there tools that help with automatic data URI embedding? Yes, there are some. The most popular is the command line tool CSSEmbed. It can also deal with MHTML, which is useful if you need to support old IE versions. The Maven plugin for web resource optimization, which is part of the PrimeFaces Extensions project, now supports data URIs too. The plugin embeds data URIs for images referenced in style sheets at build time. This Maven plugin doesn’t support MHTML. MHTML is problematic because you would need to include separate CSS files via conditional comments: one for IE7 and under and one for all other browsers.

How does the conversion to data URIs work?

The plugin reads the content of CSS files. A special java.io.Reader implementation looks for #{resource[…]} tokens in the CSS files. This is the syntax for image references in JSF 2. A token should start with #{resource[ and end with ]}. The content in between is the image path in JSF syntax. Theoretically we could also support other tokens (they are configurable), but we’re not interested in that kind of support :-) Examples:
```
.ui-icon-logosmall {
    background-image: url("#{resource['images/logosmall.gif']}") !important;
}

.ui-icon-aristo {
    background-image: url("#{resource['images:themeswitcher/aristo.png']}") !important;
}
```
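
The token lookup can be illustrated with a short Python sketch. It is not the plugin's java.io.Reader implementation: the regular expression and the function name referenced_images are my own, and mapping library:name to the folder library/name is an assumption based on the JSF 2 resource convention.

```python
import re

# Matches the JSF 2 token #{resource['...']} and captures the quoted reference.
TOKEN = re.compile(r"#\{resource\['([^']+)'\]\}")


def referenced_images(css):
    """Return the relative image path for every token found in the CSS text."""
    paths = []
    for reference in TOKEN.findall(css):
        # Assumption: 'library:name' addresses "name" inside the folder "library".
        paths.append(reference.replace(":", "/", 1))
    return paths


css = """
.ui-icon-logosmall {
    background-image: url("#{resource['images/logosmall.gif']}") !important;
}
.ui-icon-aristo {
    background-image: url("#{resource['images:themeswitcher/aristo.png']}") !important;
}
"""
print(referenced_images(css))
```
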
In the next step the image resource for each background image is located. Image directories are specified according to the JSF 2 specification and suit WAR as well as JAR projects. These are ${project.basedir}/src/main/webapp/resources and ${project.basedir}/src/main/resources/META-INF/resources. The plugin tries to find every image in those directories.
If the image is not found in the specified directories, it is not transformed. Otherwise, the image is encoded into a base64 string. The encoding is performed only if the data URI string is less than 32 KB in order to support IE8. Images larger than that are not transformed. The resulting data URIs look like this:
```
.ui-icon-logosmall {
    background-image: url("data:image/gif;base64,iVBORw0KGgoAAAANSUhEUgA ... ASUVORK5CYII=") !important;
}

.ui-icon-aristo {
    background-image: url("data:image/png;base64,iVBORw0KGgoAAAANSUhEUgA ... BJRU5ErkJggg==") !important;
}
```
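
The lookup and encoding steps can be sketched in Python as follows. This is an approximation of the behavior described above, not the plugin's code: all function names are hypothetical, the MIME type is passed in by the caller, and I assume the 32 KB limit is checked against the complete data URI string.

```python
import base64
from pathlib import Path

MAX_DATA_URI_LENGTH = 32 * 1024  # keep data URIs below the IE8 limit


def find_image(relative_path, search_dirs):
    """Return the first existing file for relative_path, or None if not found."""
    for directory in search_dirs:
        candidate = Path(directory) / relative_path
        if candidate.is_file():
            return candidate
    return None


def encode_data_uri(data, mime_type):
    """Build a base64 data URI, or return None if it would be too long."""
    encoded = base64.b64encode(data).decode("ascii")
    uri = f"data:{mime_type};base64,{encoded}"
    return uri if len(uri) < MAX_DATA_URI_LENGTH else None


def convert(relative_path, search_dirs, mime_type):
    """Return a data URI for the image, or None to leave the reference as it is."""
    image = find_image(relative_path, search_dirs)
    if image is None:
        return None
    return encode_data_uri(image.read_bytes(), mime_type)


# The two resource directories named above, relative to the project base directory.
dirs = ["src/main/webapp/resources", "src/main/resources/META-INF/resources"]
print(convert("images/logosmall.gif", dirs, "image/gif"))
```

Returning None in both failure cases mirrors the rule that images which are missing or too large are simply not transformed.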

Configuration in pom.xml is simple. To enable this feature, set the useDataUri flag to true. Example:

<plugin>
    <groupId>org.primefaces.extensions</groupId>
    <artifactId>resources-optimizer-maven-plugin</artifactId>
    <configuration>
        <useDataUri>true</useDataUri>
        <resourcesSets>
            <resourcesSet>
                <inputDir>${project.build.directory}/webapp-resources</inputDir>
            </resourcesSet>
        </resourcesSets>
    </configuration>
</plugin>

Enough theory. Now I will describe the practical part. I will show some measurements and screenshots, and give tips on how large images should be, where the CSS should be placed, how large a CSS file with data URIs becomes and whether a GZIP filter can help here. Read on.

The first question is whether it’s worth putting data URIs in style sheets. Yes, it is. First, I would like to point you to the great article “Data URIs for CSS Images: More Tests, More Questions”, where you can test all three scenarios from your location. Latency differs depending on your location, but you can see a tendency: a web page containing data URIs loads faster. We can see one of the main tricks to achieve better performance with data URIs:

Split your CSS into two files, one with the main styles and one with data URIs only, and place the second one in the footer. “In the footer” means close to the closing HTML body tag. Page rendering then feels faster because of progressive rendering. In the second article you can see that this technique really accelerates page rendering. A style sheet in the footer has the nice effect that large images download in parallel with the data URI style sheet. Why? The browser assumes that anything placed in the footer cannot affect the page structure above it, so it doesn’t block resource loading. I have also read that in this case all browsers (except old IE versions) render the page immediately without waiting until the CSS with data URIs has been loaded. The same is valid for JavaScript files, as far as I know. Is it valid at all to put CSS files in the page footer? Well, it’s not recommended in the HTML specification. But it works in practice and it’s not bad at all in special cases. There is an interesting discussion on Stack Overflow: “How bad is it to put a CSS include in the middle of the body?”
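
Splitting a style sheet this way could be automated with something like the following Python sketch. It is a simplification that assumes flat CSS rules without nested blocks such as @media, and the function name split_css is hypothetical.

```python
import re

# One flat CSS rule: a selector followed by a single { ... } block.
RULE = re.compile(r"[^{}]+\{[^{}]*\}")
DATA_URI = re.compile(r"url\(\s*['\"]?data:")


def split_css(css):
    """Return (main_css, data_uri_css) as two separate style sheet strings."""
    main_rules, data_uri_rules = [], []
    for match in RULE.finditer(css):
        rule = match.group(0).strip()
        # Rules that embed a data URI go into the second file.
        if DATA_URI.search(rule):
            data_uri_rules.append(rule)
        else:
            main_rules.append(rule)
    return "\n".join(main_rules), "\n".join(data_uri_rules)


css = """
.header { color: #333; }
.ui-icon-aristo { background-image: url("data:image/png;base64,AAAA") !important; }
"""
main_css, data_uri_css = split_css(css)
print(main_css)
print(data_uri_css)
```

The first string would go into the style sheet in the head section, the second into the one included in the footer.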

The second tip is to use data URIs for small images, up to 1-2 KB. It’s not worth using data URIs for large images. A large image has a very long data URI string (base64-encoded string), which increases the size of the CSS file. Large files can block the loading of other files. Remember, browsers have connection limits. They can normally open 2-8 connections to the same domain, which means only 2-8 files can be loaded in parallel. After reading some comments on the internet, I found my assumption about 1-2 KB images confirmed.

We can soften this behavior by using a GZIP filter. A GZIP filter reduces the size of resources. I have read that sometimes the size of an image encoded as a data URI is even smaller than the size of the original image. A GZIP filter is applied to web resources like CSS, JavaScript and (X)HTML files, but it’s not recommended to apply it to images and PDF files, for example. So images that are not encoded don’t go through the filter, but CSS files do. In 99% of cases, if you gzip your CSS file, the resulting size is about the same as with regular image files referenced by URL! And that was the third tip: use a GZIP filter.
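
The effect of gzip on base64 data can be tried out with a small Python experiment. It is only a sketch: the 2 KB of seeded random bytes are an assumed stand-in for an already compressed image such as a PNG, and the CSS rule is made up for the test.

```python
import base64
import gzip
import random

random.seed(0)  # fixed seed so the run is repeatable
# Assumption: random bytes compress about as poorly as real PNG data.
raw = bytes(random.getrandbits(8) for _ in range(2048))

encoded = base64.b64encode(raw)
css = b'.icon { background-image: url("data:image/png;base64,' + encoded + b'"); }'
gzipped = gzip.compress(css)

print("raw image bytes:  ", len(raw))
print("CSS with data URI:", len(css))
print("CSS after gzip:   ", len(gzipped))
```

Base64 uses only 64 different characters, so gzip can win back most of the one-third growth that the encoding adds.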

I would now like to show my test results. My test environment: Firefox 11 on Kubuntu Oneiric. I prepared the PrimeFaces Extensions showcase with 31 images, which I added to the start page. These images display small theme icons in PNG format. Every image has the same dimensions, 30 x 27 px. File sizes lie in the range of 1.0 to 4.6 KB. The CSS file was 4.8 KB without data URIs and 91.6 KB with data URIs. The CSS files were included quite normally in the HTML head section, by the way. I deployed the showcases with and without data URIs on my VPS with a Jetty 8 server, first without a GZIP filter. I cleared the browser cache and opened Firebug for each showcase. Here are the results:

Without data URIs:
65 requests. Page loading time 3.84s (onload: 4.14s).

That means the document ready event occurred after 3.84 s and window onload after 4.14 s. Subsequent calls for the same page (resources were fetched from the browser cache) took 577 ms, 571 ms, 523 ms, …

With data URIs:
34 requests. Page loading time 3.15s (onload: 3.33s).

That means fewer requests (remember the 31 embedded images); the document ready event occurred after 3.15 s and window onload after 3.33 s. Subsequent calls for the same page (resources were fetched from the browser cache) took 513 ms, 529 ms, 499 ms, …

There isn’t much difference for subsequent calls (page refreshes), but there is a significant difference for the first visit. The onload event in particular occurs sooner with data URIs. No wonder: images are loaded after the document is ready, and because they cannot all be loaded in parallel (the number of open connections is limited), they get blocked. I took some screenshots from the Google Chrome Web Inspector. Below you can see the timing for an image (vader.png) in the first (regular) case without data URIs.

And the second case for the same image encoded as data URI.

You can see in the second picture that there isn’t any blocking at all. Tests with a GZIP filter didn’t have much impact in my case (I don’t know why; maybe I don’t have enough resources). Average times after a couple of tests with an empty cache:

Without data URIs:
65 requests. Page loading time 3.18s (onload: 3.81s).

With data URIs:
34 requests. Page loading time 3.03s (onload: 3.19s).
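
To put the two sets of numbers side by side, the relative reductions can be computed with a few lines of Python. The timings are copied from the results above; the helper name reduction_percent is hypothetical.

```python
def reduction_percent(before, after):
    """Relative reduction from before to after, in percent."""
    return 100.0 * (before - after) / before


# (page loading time, onload time) in seconds, copied from the results above.
measurements = {
    "without GZIP": {"plain": (3.84, 4.14), "data URIs": (3.15, 3.33)},
    "with GZIP": {"plain": (3.18, 3.81), "data URIs": (3.03, 3.19)},
}

for label, runs in measurements.items():
    (load_before, onload_before), (load_after, onload_after) = runs["plain"], runs["data URIs"]
    load_gain = reduction_percent(load_before, load_after)
    onload_gain = reduction_percent(onload_before, onload_after)
    print(f"{label}: page load reduced by {load_gain:.1f}%, onload by {onload_gain:.1f}%")
```

In both runs the onload time improves more than the page loading time, which matches the observation that the separately requested images are what gets blocked.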

Reference: “High Performance Webapps. Use Data URIs. Practice” and “High Performance Webapps. Use Data URIs. Theory” from our JCG partner Oleg Varaksin at the Thoughts on software development blog.