An introduction to REST

REST, or Representational State Transfer, is an architectural style or, more simply, a set of constraints.

We will look at the constraints REST imposes for web apps, but some highlights are:

  • Uniform interfaces: all resources are identified by URIs (think: links)
  • It relies on a stateless, client-server, cacheable communications protocol (think: HTTP).
  • Interaction with resources is via a set of standard methods (think: HTTP verbs)

REST can be viewed as a lightweight alternative to mechanisms like RPC (Remote Procedure Calls) and Web Services protocols (SOAP, WSDL, etc.), but it is much more than that too! It is not an exaggeration to say that REST has been used to guide the design and development of the architecture for the modern Web.

The term REST was defined in 2000 by Roy Fielding in his doctoral dissertation at UC Irvine.

  • Background
  • What is REST?
  • HTTP
  • HATEOAS
  • Summary
  • Terminology
  • Sources, references, bibliography

Background

A brief history of WWW

Back in 1989, Tim Berners-Lee first proposed the “WorldWideWeb” project. Berners-Lee was a software engineer working at CERN, the large particle physics laboratory in Switzerland. Many scientists worked at CERN for periods of time and then returned to their own labs around the world, so there was a need for them to be able to share and link their research documents. To facilitate this, Berners-Lee proposed three technologies that would become the foundation of the Web:

  • HTTP: Hypertext Transfer Protocol. HTTP is a protocol, or a formal set of rules, for exchanging information over the web. It allows for the retrieval of linked resources from across the Web.
  • HTML: HyperText Markup Language. The publishing format for the Web, including the ability to format documents and link to other documents and resources.
  • URI: Uniform Resource Identifier. A kind of “address” that is unique to each resource on the Web.

(We are not going to delve into HTML here; instead, the focus is on HTTP and a little on URIs.)

HTTP 1.0

The first documented version of HTTP was HTTP 0.9 (1991). It had only one method, GET, which made a request to a server, and the server responded with an HTML page. It was a good start, but it needed many enhancements to support the exploding popularity of the Web.

So, Berners-Lee teamed up with researcher Roy Fielding, and others, to develop HTTP 1.0. HTTP 1.0 transformed HTTP from a trivial request/response application to a true messaging protocol. It described a complete message format for HTTP, and explained how it should be used for client requests and server responses, and supported multiple media types.

Unfortunately, some of the limitations of HTTP 1.0 increasingly caused problems as web usage grew. For example, a separate connection to the server was made for every resource request. There was also a lack of support for caching and proxying.

HTTP 1.1

Jump forward to 1994. The web was growing really fast. It was an exciting time. The WWW was becoming a buzzword and getting a huge amount of press. Sites like Hotmail, Yahoo and AltaVista were taking off. Google didn’t even exist yet.

But the architecture and technologies on which the web was built were beginning to creak at the seams. So Berners-Lee and Fielding, who were researchers at MIT and UCI respectively, and a number of other leading technologists, including folks from Compaq, Xerox and Microsoft, got together to specify and improve the WWW infrastructure through the IETF working groups on URI, HTTP, and HTML.

Through this work, HTTP 1.1 was born.

Some of the big improvements introduced in HTTP 1.1 were:

  • Multiple Host Name Support: Allows one Web server to handle requests for many different virtual hosts.
  • Persistent Connections: Allows a client to send multiple requests for documents in a single TCP session.
  • Partial Resource Selection: A client can ask for only part of a resource rather than the entire document, reducing load and required bandwidth
  • Better Caching and Proxying Support
  • Content Negotiation: Allows the client and server to exchange information to help select the best resource when multiple are available.
  • Better Security: Defines authentication methods and is generally more “security aware”

Work began on HTTP 1.1 in 1994, and it was officially released in 1997.

And is HTTP 1.1 still in use today? Yes, over 25 years later, alongside its successors HTTP/2 and HTTP/3! Considering how quickly technology changes, that is an incredible achievement. How many projects have you worked on that have stood the test of time so well?

Lessons learned

Fielding had been involved in the web from its infancy and experienced first hand its rapid growth, both as a user and as an architect. He understood better than most the reasons for its success, and so after the release of HTTP 1.1, Fielding began to write about what he had learned working on HTTP and the other web technologies (Fielding was also involved in the development of HTML and URIs, and was a co-founder of the Apache HTTP Server project). He took his knowledge of the web’s architectural principles and presented them as a framework of constraints, or as he called them, an architectural style. Specifically, Fielding wrote a PhD thesis focused on the rationale behind, and key architectural principles of, the design of the modern Web architecture.

Fielding’s thesis was published in 2000, and was called Architectural Styles and the Design of Network-based Software Architectures. I have to admit that I have not read many PhD theses, but his must be among the most readable of them. It even contains Monty Python quotes!

In it, Fielding discusses Network-based Application Architectures and Architectural Styles, before introducing and defining the term REST. Although introduced in Fielding’s paper, Fielding noted that “REST has been used to guide the design and development of the architecture for the modern Web”. So, while the term REST didn’t come about until afterwards, it is the design style behind HTTP. Fielding didn’t ‘invent’ REST in his paper, instead he developed it in collaboration with his colleagues while working on HTTP and URIs, but it was in his paper that the term was coined and defined.

Fielding tried to answer the question of why the Web has been such a successful platform by explaining its guiding principles and how they can be correctly applied when building distributed systems.

So, want to build a distributed web app? Not sure what architecture to use? Why not base it on the Web’s architecture!

Before diving in to what REST is, feel free to read the terminology section at the end.

What is REST?

REST is an architectural style, or a set of constraints, for distributed hypermedia systems.

Constraints

Imagine you were designing a freeway. You might impose rules such as cars only (no trucks, pedestrians or bicycles), all traffic must travel between 40 and 70 mph, and no traffic lights (only on and off ramps). Although these rules constrain the system, they make it work better overall; in this case, they allow more traffic to flow more freely and faster.

REST imposes constraints on web apps, or distributed hypermedia systems, in order to enable those apps to scale and perform as desired.

What were the constraints that Fielding suggested?

1) Client Server

By separating the user interface concerns from the data storage concerns, we improve the portability of the user interface across multiple platforms and improve scalability by simplifying the server components. Separation also allows the components to evolve independently.

2) Stateless

Communication must be stateless. Each request from client to server must contain all of the information necessary to understand the request. Session state is kept entirely on the client. Reliability is improved because it eases the task of recovering from partial failures. Scalability is improved because not having to store state between requests allows the server component to quickly free resources, and simplifies implementation.
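To make the stateless constraint concrete, here is a minimal sketch in Python. The names `ITEMS` and `handle_request` and the page/size parameters are made up for illustration; the point is only that the handler reads everything it needs from the request itself and remembers nothing about the client between calls.

```python
# Hypothetical data held by the server: resources, not per-client session state.
ITEMS = [f"item-{n}" for n in range(1, 11)]


def handle_request(request: dict) -> dict:
    """Answer a paging request using only what the request itself carries."""
    page = request["page"]
    size = request["size"]
    start = (page - 1) * size
    return {"page": page, "items": ITEMS[start:start + size]}


# The client, not the server, remembers where it is in the listing.
first = handle_request({"page": 1, "size": 3})
second = handle_request({"page": 2, "size": 3})
# Replaying a request gives the same answer, whatever was asked in between.
replay = handle_request({"page": 1, "size": 3})
```

Because no request depends on an earlier one, any server instance could answer any of these calls, which is where the scalability benefit comes from.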

3) Cache

Cache constraints require that the data within a response to a request be labeled as cacheable or non-cacheable. If a response is cacheable, then a client cache is given the right to reuse that response data for later, equivalent requests.

4) Uniform Interface

The central feature that distinguishes the REST architectural style from other network-based styles is its emphasis on a uniform interface between components. Implementations are decoupled from the services they provide, which encourages independent evolvability.

5) Layered System

The layered system style allows an architecture to be composed of layers by constraining component behavior such that each component cannot “see” beyond the immediate layer with which they are interacting.

6) Code-On-Demand

The final addition to our constraint set for REST comes from the code-on-demand style. REST allows client functionality to be extended by downloading and executing code in the form of applets or scripts. This simplifies clients by reducing the number of features required to be pre-implemented. Allowing features to be downloaded after deployment improves system extensibility. However, it also reduces visibility, and thus is only an optional constraint within REST.

Those are the constraints that make up REST. Next, HTTP.

HTTP

HTTP has a very special role in web architecture, and with REST in particular.

Note, however, that REST doesn’t have to use HTTP. There are other application-level protocols that could, possibly, be candidates for use with REST: Gopher was widely used in the early days of the web, although it was overtaken by HTTP; Fielding himself has been working on a new HTTP-like protocol called waka; there is also a Google-developed protocol called SPDY that has the goals of reducing web page load latency and improving web security.

However in practice REST and HTTP are closely related. Fielding not only introduced REST, he was also one of the principal authors of the HTTP specification, so it is not too surprising that the two are closely linked.

We will dive in to HTTP and look at some example requests & responses and the HTTP methods and response codes that are commonly used.

Example Request

An example of an HTTP request:
GET /index.html HTTP/1.1
Host: www.example.com

This is made up of the following components:

  • Method: GET
  • URI:  /index.html
  • Version: HTTP/1.1
  • Headers: Host: www.example.com
  • Body: empty in this case
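The breakdown above can be sketched as a few lines of Python. This is an illustrative parser only, not a complete HTTP implementation; the function name `parse_request` is an assumption made for this example.

```python
RAW_REQUEST = "GET /index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n"


def parse_request(raw: str) -> dict:
    """Split a raw HTTP/1.1 request into method, URI, version, headers and body."""
    # A blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, uri, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {"method": method, "uri": uri, "version": version,
            "headers": headers, "body": body}


parsed = parse_request(RAW_REQUEST)
```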

Example Response

The response is made up of a status line (version, status code and reason phrase), followed by the headers, a blank line and the body:

HTTP/1.1 200 OK
Date: Mon, 23 May 2005 22:38:34 GMT
Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
ETag: "3f80f-1b6-3e1cb03b"
Content-Type: text/html; charset=UTF-8
Content-Length: 131
Accept-Ranges: bytes
Connection: close

<html>
  <head>
    <title>An Example Page</title>
  </head>
  <body>
    Hello World
  </body>
</html>

In the above request example, the verb is GET. HTTP verbs are also known as methods, and there are 8 defined in HTTP 1.1 (RFC 2616). First we will look at the 4 most commonly used verbs: GET, PUT, DELETE and POST. Then we will look at the less commonly used ones: HEAD, OPTIONS, TRACE and CONNECT.

However, before we dive in to the methods, let’s take a look at some characteristics, or groupings, of the messages. Specifically, the concept of safe methods and idempotency.

HTTP Methods

Safe Methods

Safe methods are methods that do not modify resources; they are used only for retrieval. (Strictly speaking, some things may change, e.g. logs, caches etc., but the representation of the resource in question must not.)
Safe methods are: HEAD, GET, OPTIONS and TRACE. By contrast, non-safe methods such as POST, PUT, DELETE and PATCH are intended to cause side effects on the server.

Idempotent Methods

Idempotent methods can be called many times without different outcomes. Call it once, or 1 thousand times, the result will be the same. For example, multiplying by 1 is an idempotent operation. So is the assignment ‘a=4;’

More formally “Methods can also have the property of idempotence in that (aside from error or expiration issues) the side-effects of N>0 identical requests is the same as for a single request.” [7]

The methods GET, HEAD, PUT and DELETE share this property. Also, the methods OPTIONS and TRACE SHOULD NOT have side effects, and so are inherently idempotent.
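As a rough illustration of idempotence, the Python sketch below applies an operation once and then a thousand times and compares the results. All names here (`apply_n`, `put_book`, `delete_book` and so on) are hypothetical, and the PUT-like and DELETE-like functions are simple stand-ins for the HTTP methods, not real requests.

```python
def apply_n(operation, state, times):
    """Apply the same operation to the state `times` times in a row."""
    for _ in range(times):
        state = operation(state)
    return state


def assign_four(_a):
    return 4                      # like the assignment a = 4


def times_one(a):
    return a * 1                  # multiplying by 1


def increment(a):
    return a + 1                  # not idempotent: every call changes the result


def put_book(store):
    # PUT-like: set the resource at a known URI to a given representation.
    return {**store, "/books/1234": "REST"}


def delete_book(store):
    # DELETE-like: remove the resource if it is there.
    return {uri: rep for uri, rep in store.items() if uri != "/books/1234"}


# Once or a thousand times, the idempotent operations end in the same state.
for op, start in [(assign_four, 7), (times_one, 7),
                  (put_book, {}), (delete_book, {"/books/1234": "REST"})]:
    assert apply_n(op, start, 1) == apply_n(op, start, 1000)

# The non-idempotent operation does not.
assert apply_n(increment, 7, 1) != apply_n(increment, 7, 1000)
```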

Common Methods

And now, a look at the 4 most commonly used verbs: GET, PUT, DELETE, POST

GET

Retrieve the resource identified by the URI. The simplest and most common method! The one you use every time you access a web page.

PUT

Store the supplied entity under the supplied URI. If the resource already exists, update it (and return either 200 OK or 204 No Content). If not, create it at that URI (and return a ‘201 Created’ response).

POST

Request to accept the entity as a new subordinate of the resource identified by the URI. For example

  • Submit data from a form to a data-handling process;
  • Post a message to a mailing list or blog

In plain English, create a resource.

DELETE

Requests that the server delete the resource identified by the URI.

PUT vs POST

OK, before we go on to the other, less commonly used HTTP verbs, let’s take a look at the 2 commonly used verbs above that are most often confused: PUT and POST.

The official HTTP 1.1 doc (RFC 2616) states:

“The fundamental difference between the POST and PUT requests is reflected in the different meaning of the Request-URI. The URI in a POST request identifies the resource that will handle the enclosed entity. That resource might be a data-accepting process, a gateway to some other protocol, or a separate entity that accepts annotations. In contrast, the URI in a PUT request identifies the entity enclosed with the request — the user agent knows what URI is intended and the server MUST NOT attempt to apply the request to some other resource.”

That however is a bit of a mouthful!

PUT and POST can both be used to create or update a resource, but here are some (sometimes contradictory!) rules of thumb:

  • PUT is for update; POST is for create;
  • PUT is idempotent; POST is not;
  • Who creates the URL of the resource?
    • PUT is for creating when you know the URL of the thing you will create;
    • POST is for creating when the server decides the URL for you (you just know the URL of the “factory” or manager that does the creation)
  • There is also a recent argument (from Thoughtworks for example) that says don’t use PUT, always POST (and post events instead).
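The “who creates the URL” rule of thumb can be sketched in Python as follows. This is a toy in-memory model, assuming a hypothetical `books` store and an integer id scheme chosen by the server; it returns (status code, URI) pairs rather than real HTTP responses.

```python
import itertools

books = {}                          # hypothetical store: URI -> representation
_ids = itertools.count(1)           # server-side id generator used by POST


def put(uri, representation):
    """The client names the URI. Returns 201 on create, 200 on update."""
    status = 200 if uri in books else 201
    books[uri] = representation
    return status, uri


def post(collection_uri, representation):
    """The client names only the collection; the server picks the new URI."""
    uri = f"{collection_uri}/{next(_ids)}"
    books[uri] = representation
    return 201, uri


created = put("/books/1234", {"title": "REST"})      # client chose the URI
updated = put("/books/1234", {"title": "REST 2e"})   # same URI, now an update
first = post("/books", {"title": "HTTP"})            # server chose the URI
second = post("/books", {"title": "HTTP"})           # same request, another new resource
```

Repeating the PUT leaves a single resource at `/books/1234`, while repeating the POST creates a second resource, which is the idempotency difference noted in the list.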

Short answer? There is no short answer! Use your best judgement.

See some useful discussions at this stackoverflow posting.

Less Common Methods

The other 4, less commonly used HTTP verbs are: HEAD, OPTIONS, TRACE and CONNECT.

OPTIONS

Request for information about the capabilities of a server, e.g. request a list of HTTP methods that may be used on this resource. The response would look something like this:

200 OK
Allow: HEAD,GET,PUT,DELETE,OPTIONS

A somewhat obscure part of the HTTP standard. Potentially useful, but few web services actually seem to make it available.

HEAD

Identical to GET except that the server MUST NOT return a message-body in the response. Used for obtaining meta-information about the entity implied by the request without transferring the entity-body itself.

Why use it? It is useful for testing links, e.g. for validity or accessibility.

TRACE

Used to invoke a remote, application-layer loop-back of the request message. In plain English: it echoes back the received request so that a client can see what (if any) changes or additions have been made by intermediate servers. TRACE is often disabled since it can represent a security risk.

CONNECT

Connect is for use with a proxy that can dynamically switch to being a tunnel.

Converts the request connection to a transparent TCP/IP tunnel, usually to facilitate SSL-encrypted communication (HTTPS) through an unencrypted HTTP proxy.

HTTP Response codes

See http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

Code | Meaning | Plain English (from user perspective)
-----|---------|--------------------------------------
1xx | Informational; indicates a provisional response, e.g. 100 | FYI, OK so far and client should continue with the request
2xx | Successful | All good
3xx | Redirection | Something moved
4xx | Client Error | You messed up
5xx | Server Error | We messed up
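Since the class of a response is given by the first digit of its status code, the grouping above can be sketched in a couple of lines of Python. The helper name `status_class` is made up for this example; `HTTPStatus` comes from the Python standard library.

```python
from http import HTTPStatus

CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Map a status code to its class using the first digit."""
    return CLASSES[code // 100]


ok = status_class(200)
not_found = status_class(HTTPStatus.NOT_FOUND)   # HTTPStatus members behave as ints
reason = HTTPStatus(201).phrase                  # standard reason phrase for 201
```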

Why REST and HTTP?

Because HTTP provides all the characteristics required by REST.

  1. Client Server

    HTTP is a “protocol in the client-server computing model”, so it meets the first requirement of REST. With HTTP, often the client is a web browser and the server is a piece of software serving content such as Apache, IIS or Nginx. With the “Internet of Things” however, things are becoming less conventional. The client could be your toaster!

  2. Stateless

    HTTP is a stateless protocol. HTTP servers are not required to keep any information or state between requests.

    This can be circumvented by using things like cookies and sessions, but Fielding makes it clear in his dissertation that he strongly disagrees with cookies.

  3. Cache

    HTTP supports caching via three basic mechanisms: freshness, validation, and invalidation.
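    As a very rough sketch of those three mechanisms, consider the Python below. It assumes a simplified model in which freshness comes from a max-age value, validation compares ETags, and invalidation is triggered by unsafe methods; the class and function names are hypothetical, and real HTTP caching has many more rules than this.

```python
from dataclasses import dataclass


@dataclass
class CachedResponse:
    body: str
    etag: str
    stored_at: float   # seconds, on whatever clock the caller uses
    max_age: float     # freshness lifetime in seconds


def is_fresh(entry: CachedResponse, now: float) -> bool:
    """Freshness: the stored response may be reused without asking the server."""
    return now - entry.stored_at < entry.max_age


def revalidate(entry: CachedResponse, current_etag: str) -> int:
    """Validation: 304 means the stored copy is still good, 200 means refetch."""
    return 304 if entry.etag == current_etag else 200


def invalidate(cache: dict, method: str, uri: str) -> None:
    """Invalidation: an unsafe method on a URI evicts what is stored for it."""
    if method in {"PUT", "POST", "DELETE"}:
        cache.pop(uri, None)


entry = CachedResponse(body="<html>...</html>", etag='"3f80f-1b6-3e1cb03b"',
                       stored_at=0.0, max_age=60.0)
cache = {"/index.html": entry}

fresh_at_30 = is_fresh(entry, now=30.0)
fresh_at_90 = is_fresh(entry, now=90.0)
unchanged = revalidate(entry, '"3f80f-1b6-3e1cb03b"')
invalidate(cache, "PUT", "/index.html")
```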

  4. Uniform Interface

    Using interfaces to decouple a client/caller from the implementation is a common concept in software.

    • Identification of resources

      HTTP supports hyperlinks. Anything of interest can be a resource, and those resources can be identified uniquely by a URI.

      How do you identify a book? example.com/books/1234

      How do you identify a user? example.com/users/sabram

      All resources are identified by a uniform interface – the URI

    • Manipulation of resources through these representations

      URIs, in conjunction with the HTTP methods, can be used to manipulate resources.

    • Self-descriptive messages

      In HTTP, messages can describe themselves using media (MIME) types, status codes, and headers to, for example, indicate their cacheability.

    • Hypermedia as the engine of application state (A.K.A. HATEOAS)

    More later! See below.

    Uniform Interface, in plain English.

    OK, that covers what Fielding had to say in his dissertation about Uniform Interfaces, but what does it all mean in plain English?

    I mentioned earlier that using interfaces to decouple a client/caller from the implementation is a common concept in software. Similarly, when designing GUIs, you ideally have a very simple user interface, but one that still allows the user to carry out complex tasks. Generally, a simple interface that provides the client/user all the capabilities they need while hiding the underlying complexities of the implementations is the ideal goal, but tough to achieve. But that is exactly what Fielding achieved with REST. The interface is simply a link (or more specifically, a URI)! Which is about the simplest interface you can think of.

    Combine that with the other HTTP capabilities such as methods and media types, and suddenly you have an incredibly powerful but deceptively simple and widely understood method of communicating intentions.
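    To show how a URI plus a standard method is enough to say what should happen to which resource, here is a small Python sketch. The `handle` function and the `books` data are hypothetical, and an `http://` scheme is assumed on the example URIs so that the standard library's `urlsplit` can separate host from path.

```python
from urllib.parse import urlsplit

books = {"1234": {"title": "An Example Book"}}   # hypothetical data


def handle(method: str, uri: str):
    """Dispatch on the standard method plus the resource the URI identifies."""
    segments = urlsplit(uri).path.strip("/").split("/")
    if len(segments) != 2 or segments[0] != "books":
        return 404, None
    resource_id = segments[1]
    if method == "GET":
        found = books.get(resource_id)
        return (200, found) if found is not None else (404, None)
    if method == "DELETE":
        books.pop(resource_id, None)
        return 204, None
    return 405, None     # method not allowed on this resource


got = handle("GET", "http://example.com/books/1234")
gone = handle("DELETE", "http://example.com/books/1234")
missing = handle("GET", "http://example.com/books/1234")
```

    The client needs no book-specific API: the same handful of methods applies to every resource, and only the URI changes.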

  5. Layered System

    The idea behind a layered system is that a client doesn’t know (or care) whether it is connected to the end server, or to an intermediary one. This feature can improve scalability via load-balancing and caches etc. Layers may also enforce security policies.

    HTTP supports layering via proxy servers and caching.
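    A layered system can be sketched in Python by treating each layer as a callable with the same interface. Everything here is illustrative: `origin`, `caching_layer` and `client_fetch` are made-up names, and a real proxy would of course speak HTTP rather than call functions.

```python
origin_calls = []        # records which URIs actually reached the origin server


def origin(uri):
    """The end server: produces the representation for a URI."""
    origin_calls.append(uri)
    return f"representation of {uri}"


def caching_layer(next_layer):
    """An intermediary that exposes the same interface as the layer behind it."""
    cache = {}

    def layer(uri):
        if uri not in cache:
            cache[uri] = next_layer(uri)
        return cache[uri]

    return layer


def client_fetch(server, uri):
    # The client only sees the one layer it talks to, nothing behind it.
    return server(uri)


direct = client_fetch(origin, "/books/1234")
proxy = caching_layer(origin)
via_proxy_1 = client_fetch(proxy, "/books/1234")
via_proxy_2 = client_fetch(proxy, "/books/1234")   # served from the cache
```

    The client code is identical whether it is handed `origin` or `proxy`, and the second proxied request never reaches the origin.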

  6. Code-On-Demand

    This is actually an optional constraint in REST. For example, you may request a resource, and get that resource with some JavaScript.

HATEOAS

Clients know a few simple fixed entry points to the application but have no knowledge beyond that. Instead, they transition (states) by using those links, and the links they lead to. In other words, state transitions are driven by the client based on options the server presents.

If you think of Hypermedia as simply links, then “Hypermedia as the engine of application state” is simply using the links you discover to navigate (or transition state) through the application.

And remember that it doesn’t need to be a user clicking on links; it can just as easily be another software component that is initiating the state transitions.

To quote from Fielding himself:

“Representational State Transfer is intended to evoke an image of how a well-designed Web application behaves:

a network of web pages (a virtual state-machine), where the user progresses through an application by selecting links (state transitions), resulting in the next page (representing the next state of the application) being transferred to the user and rendered for their use.”
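As a sketch of that idea with a software client rather than a person clicking, the Python below models a few representations that each carry their own links. The URIs, link names and the `follow` helper are all invented for this example; the client is given only the entry point and the link names it wants to follow.

```python
# Hypothetical representations, keyed by URI. Each one carries its own links.
SERVER = {
    "/": {"links": {"books": "/books"}},
    "/books": {"links": {"first": "/books/1234"}},
    "/books/1234": {"title": "An Example Book",
                    "links": {"author": "/users/sabram"}},
    "/users/sabram": {"name": "sabram", "links": {}},
}


def get(uri):
    """Stand-in for an HTTP GET against the server."""
    return SERVER[uri]


def follow(entry_point, rels):
    """Start at the one known URI and move only via links the server offers."""
    uri = entry_point
    visited = [uri]
    for rel in rels:
        uri = get(uri)["links"][rel]
        visited.append(uri)
    return visited


path = follow("/", ["books", "first", "author"])
```

The client never builds a URI itself, so the server is free to change its URI layout as long as the entry point and link names stay the same.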

Summary

What is REST?

  • Pretty URLs?
  • An alternative to SOAP or RPC?

Really it is an architectural style, or a set of constraints, that captures the fundamental principles that underlie the Web.

The emphasis of REST is on simplicity, and utilizing the power of the existing web technologies and standards such as HTTP and URI

  • Uniform interfaces: All resources are identified by URIs
  • HTTP Methods: All resources can be created/accessed/updated/deleted by standard HTTP methods
  • Stateless: No client session state is kept on the server

Terminology

Let’s define some useful terminology that is relevant in any discussion of REST.

Architecture

Wikipedia: Software architecture refers to the high level structures of a software system, the discipline of creating such structures, and the documentation of these structures. The architecture of a software system is a metaphor, analogous to the architecture of a building.

Fielding: A software architecture is an abstraction of the run-time elements of a software system during some phase of its operation [1]

Fowler: Architecture is a shared understanding of the system design, including how the system is divided into components and how the components interact through interfaces. [3]

Architectural style

Fielding: An architectural style is a named, coordinated set of architectural constraints that restricts the roles and features of architectural elements [1]

An architectural style is a named collection of architectural design decisions that (1) are applicable in a given development context, (2) constrain architectural design decisions that are specific to a particular system within that context, and (3) elicit beneficial qualities in each resulting system [4]

REST or RESTful?

What is the difference between the terms REST and RESTful? From what I have read, there is not a lot of difference. We know that REST is an architectural style for distributed software. Services conforming to that architectural style, that is, to the REST constraints, are referred to as being ‘RESTful’. Or to put it another way: REST is a noun, RESTful is an adjective.

Hypertext

In plain English: Hypertext is text with links.

Wikipedia: Hypertext is text displayed on a computer display or other electronic devices with references (hyperlinks) to other text which the reader can immediately access, or where text can be revealed progressively at multiple levels of detail.

Roy Fielding: The simultaneous presentation of information and controls such that the information becomes the affordance through which the user obtains choices and selects actions [slide #50]

Hypermedia

In plain English: Interactive multimedia. If you see a booth at a mall with video, sound etc that is multimedia. If you can interact with it – click links, or control the content using buttons or the like, it is hypermedia.

Wikipedia: Hypermedia, an extension of the term hypertext, is a nonlinear medium of information which includes graphics, audio, video, plain text and hyperlinks.

Roy Fielding: Hypermedia is defined by the presence of application control information embedded within, or as a layer above, the presentation of information. [1]

Resource

In plain English: A resource can be anything real, but typical examples would be files, web pages, customers, accounts etc.

Wikipedia: any physical or virtual component of limited availability within a computer system.

Roy Fielding: Any information that can be named can be a resource: a document or image, a collection of other resources, a non-virtual object (e.g. a person). In other words, any concept that might be the target of an author’s hypertext reference must fit within the definition of a resource. [1]

REST in practice: A resource is anything we expose to the Web, from a document or video clip to a business process or device. From a consumer’s point of view, a resource is anything with which that consumer interacts while progressing toward some goal.[6]

URI – Uniform Resource Identifier

Wikipedia: a string of characters used to identify a name of a resource

W3: Uniform Resource Identifiers (URIs, aka URLs) are short strings that identify resources in the web: documents, images, downloadable files, services, electronic mailboxes, and other resources.

What is the difference between a URI and a URL?

The difference between a URI and a URL is subtle, and I don’t think terribly important. A URI identifies a resource by location, by name, or both. A URI does not have to specify the location of a specific representation. If it does, it is also a URL.

“A Uniform Resource Locator (URL) is a subset of the Uniform Resource Identifier (URI) that specifies where an identified resource is available and the mechanism for retrieving it.”

So all URLs are URIs, but not all URIs are URLs. URIs can also be URNs (Uniform Resource Names).

Or: URLs and URNs are special forms of URIs.

For the most part, I think you can think of URIs and URLs as being the same thing. I may be flamed for saying that, but it keeps things simpler!

Sources, references, bibliography

  1. Architectural Styles and the Design of Network-based Software Architectures (Fielding, 2000)
  2. A little REST and relaxation (Fielding)
  3. Who Needs An Architect? (Fowler)
  4. Software architecture: Foundations, Theory and Practice; R. N. Taylor, N. Medvidović and E. M. Dashofy. Wiley, 2009.
  5. Representational state transfer (Wikipedia)
  6. REST in practice (Webber; Parastatidis; Robinson)
  7. HTTP 1.1 (RFC 2616)
Reference: An introduction to REST from our JCG partner Shaun Abram at the Shaun Abram blog.