Simplifying RESTful Search
Overview
The REST architectural style is based on two basic principles:
- Resources as URLs: A resource is something like an entity or a noun in modelling lingo. Anything on the web is identified as a resource, and each unique resource is identified by a unique URL.
- Operations as HTTP methods: REST leverages existing HTTP methods, particularly GET, POST, PUT, and DELETE, which map to a resource’s read, create, modify, and remove operations respectively.
Any action performed by a client over HTTP contains a URL and an HTTP method. The URL represents the resource, and the HTTP method represents the action to be performed on that resource.
Being a broad architectural style, REST is open to different interpretations. The ambiguity is exacerbated by the fact that there aren’t nearly enough HTTP methods to support common operations. One of the most common examples is the lack of a ‘search’ method. Search is one of the most extensively used features across applications, yet there is no standard for implementing it, so different people tend to design search in different ways. Given that REST aims to unify service architecture, any ambiguity must be seen as weakening the argument for REST.
The rest of this document discusses how search over REST can be simplified. The aim is not to develop a standard for RESTful search, but to discuss how the problem can be approached.
Search Requirements
Search is one of the most used features in web applications, and it offers much the same capabilities from one application to the next. Below is a list of some common constituents of search features:
- Search based on one or more criteria at a time
  - Search for red hatchback cars
  - color=red && type=hatchback
- Relational and conditional operator support
  - Search for red or black cars with mileage greater than 10
  - color=red|black && mileage > 10
- Wildcard search
  - Search for cars from manufacturers whose company name starts with M
  - company=M*
- Pagination
  - List all cars, but fetch 100 results at a time
  - upperLimit=200 && lowerLimit=101
- Range searches
  - Get all the cars launched between 2000 and 2010
  - launch year between (2000, 2010)
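The pagination item is easy to get wrong by one row, so here is a minimal sketch of how the lowerLimit/upperLimit pair might be turned into a result window. It assumes the limits are 1-based and inclusive (as lowerLimit=101, upperLimit=200 suggests for the second page of 100 results); the class name PageWindowSketch and the LIMIT/OFFSET clause (MySQL/PostgreSQL style) are illustrative choices, not something the article prescribes.

```java
// Hypothetical helper: converts 1-based, inclusive limits into offset/limit.
final class PageWindowSketch {
    final int offset; // number of rows to skip
    final int limit;  // number of rows to return

    private PageWindowSketch(int offset, int limit) {
        this.offset = offset;
        this.limit = limit;
    }

    // Assumption: lowerLimit and upperLimit are 1-based and inclusive.
    static PageWindowSketch fromLimits(int lowerLimit, int upperLimit) {
        if (lowerLimit < 1 || upperLimit < lowerLimit) {
            throw new IllegalArgumentException("Expected 1 <= lowerLimit <= upperLimit");
        }
        return new PageWindowSketch(lowerLimit - 1, upperLimit - lowerLimit + 1);
    }

    // Renders the window in LIMIT/OFFSET syntax (MySQL/PostgreSQL style).
    String toSqlClause() {
        return "LIMIT " + limit + " OFFSET " + offset;
    }

    public static void main(String[] args) {
        // prints "LIMIT 100 OFFSET 100"
        System.out.println(fromLimits(101, 200).toSqlClause());
    }
}
```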
Supporting all of these features makes the design of the search interface itself complex. Meeting all these requirements in a REST framework, while still conforming to REST, is challenging.
Coming back to the basic REST principles, we are left with the following two questions:
- Which HTTP method should be used for “search”?
- How do we create an effective resource URL for search?
  - Query parameters versus embedded URLs
  - Modelling filter criteria
HTTP Method Selection
Effectively, REST categorizes operations by their nature and associates well-defined semantics with these categories. The idempotent operations are GET, PUT, and DELETE (GET for read-only, PUT for update, DELETE for remove), while the POST method is used for non-idempotent procedures like create.
By definition, search is a read-only operation that requests a collection of resources filtered by some criteria. So GET is the obvious choice of HTTP method for search. However, with GET we are constrained by URL length limits if we put complex criteria in the URL.
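As a rough illustration of search as a GET, the sketch below builds (but does not send) a request whose query string carries the filter criteria. It uses the java.net.http API from Java 11; the host localhost:8080, the /cars path, and the class name SearchRequestSketch are assumptions made for the example.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class SearchRequestSketch {

    // Builds a GET request; each criterion becomes a URL-encoded name=value pair.
    static HttpRequest buildSearch(String baseUrl, Map<String, String> criteria) {
        String query = criteria.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return HttpRequest.newBuilder(URI.create(baseUrl + "?" + query)).GET().build();
    }

    public static void main(String[] args) {
        Map<String, String> criteria = new LinkedHashMap<>();
        criteria.put("color", "blue");
        criteria.put("type", "sedan");
        // Hypothetical endpoint; nothing is sent over the network here.
        HttpRequest request = buildSearch("http://localhost:8080/cars", criteria);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Because every criterion ends up in the URI, a long or deeply nested filter makes the URL grow, which is the size constraint mentioned above.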
URL Representation
Let’s discuss this using an example: a user wishes to search for blue four-door sedans; what should the resource URL for this request look like? The two URLs below are syntactically different but semantically the same:
- /cars/?color=blue&type=sedan&doors=4
- /cars/color:blue/type:sedan/doors:4
Both of the above URLs conform to the RESTful way of representing a resource query, but they are represented differently. The first uses URL query parameters to add filtering details, while the latter uses an embedded URL approach.
The embedded URL approach is more readable and can take advantage of the native caching mechanisms that exist on the web server for HTTP traffic. But this approach requires the user to provide parameters in a specific order; a wrong parameter position will cause an error or unwanted behaviour. The two URLs below look the same but may not both give correct results:
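To show that the two forms carry the same information, here is a small sketch that parses each one into a name-to-value map and compares the results. The class CriteriaParsingSketch and its method names are hypothetical, and the parsing is deliberately naive: it does no URL decoding and assumes every criterion has the shape name=value or name:value.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class CriteriaParsingSketch {

    // Parses the query form, e.g. "color=blue&type=sedan&doors=4".
    static Map<String, String> fromQuery(String query) {
        return split(query, "&", "=");
    }

    // Parses the embedded form, e.g. "color:blue/type:sedan/doors:4".
    static Map<String, String> fromEmbeddedPath(String path) {
        return split(path, "/", ":");
    }

    private static Map<String, String> split(String text, String pairSep, String kvSep) {
        Map<String, String> criteria = new LinkedHashMap<>();
        for (String pair : text.split(Pattern.quote(pairSep))) {
            if (pair.isEmpty()) {
                continue; // tolerate leading or doubled separators
            }
            int idx = pair.indexOf(kvSep);
            if (idx < 0) {
                throw new IllegalArgumentException("Malformed criterion: " + pair);
            }
            criteria.put(pair.substring(0, idx), pair.substring(idx + 1));
        }
        return criteria;
    }

    public static void main(String[] args) {
        Map<String, String> q = fromQuery("color=blue&type=sedan&doors=4");
        Map<String, String> p = fromEmbeddedPath("color:blue/type:sedan/doors:4");
        System.out.println(q.equals(p)); // prints "true"
    }
}
```

Map equality ignores insertion order, so a server that keys criteria by name in this way would not depend on the position of each segment; a server that reads embedded segments positionally would.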
- /cars/color:red/type:sedan
- /cars/type:sedan/color:red
Also, since there’s no standard for embedding criteria, people tend to devise their own representations.
So we prefer the query criteria approach over the embedded URL approach, even though its representation is a bit more complex and less readable.
Modeling Filter Criteria: A search-results page is fundamentally RESTful even though its URL identifies a query. The URL should be able to incorporate SQL-like elements. While SQL is meant to filter relational data, the new modelling language should be able to filter data from a hierarchical set of resources. Such a language helps in devising a mechanism to communicate complex search requirements over URLs. The rest of this section discusses two such styles in detail.
- Feed Item Query Language (FIQL): The Feed Item Query Language (FIQL, pronounced “fickle”) is a simple but flexible, URI-friendly syntax for expressing filters across the entries in a syndicated feed. These filter expressions can be mapped onto any RESTful service and can help in modelling complex filters. Below are some sample URLs alongside their equivalent SQL statements.
SQL | REST Search URL |
select * from actors where firstname='PENELOPE' and lastname='GUINESS' | /actors?_s=firstname==PENELOPE;lastname==GUINESS |
select * from actors where lastname like 'PEN%' | /actors?_s=lastname==PEN* |
select * from films where filmid=1 and rentalduration <> 0 | /films?_s=filmid==1;rentalduration!=0 |
select * from films where filmid >= 995 | /films?_s=filmid=ge=995 |
select * from films where releasedate < '27/05/2005' | /films?_s=releasedate=lt=2005-05-27T00:00:00.000%2B00:00 |
- Resource Query Language (RQL): The Resource Query Language (RQL) defines a syntactically simple query language for querying and retrieving resources. RQL is designed to be URI-friendly, particularly as the query component of a URI, and highly extensible. RQL is a superset of HTML’s URL encoding of form values and a superset of the Feed Item Query Language (FIQL). RQL consists of a set of nestable named operators, each of which takes a set of arguments and operates on a collection of resources.
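To make the FIQL-to-SQL mapping in the table above more concrete, here is a toy translator for a small subset of FIQL: the comparison operators ==, !=, =gt=, =ge=, =lt=, and =le=, the * wildcard, and the ; (AND) and , (OR) combinators without parentheses. It is only a sketch under those assumptions, not how any particular library parses FIQL; the class name FiqlSketch is hypothetical, every value is emitted as a quoted SQL string, and the table name is assumed to be trusted. Production code would bind values through prepared statements instead of concatenating them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FiqlSketch {

    // selector, operator, argument; for example "filmid=ge=995"
    private static final Pattern COMPARISON =
            Pattern.compile("([A-Za-z_][A-Za-z0-9_]*)(==|!=|=gt=|=ge=|=lt=|=le=)(.+)");

    static String toSql(String table, String fiql) {
        List<String> orGroups = new ArrayList<>();
        for (String group : fiql.split(",")) {       // ',' means OR
            List<String> andTerms = new ArrayList<>();
            for (String term : group.split(";")) {   // ';' means AND and binds tighter
                andTerms.add(comparisonToSql(term));
            }
            orGroups.add(String.join(" AND ", andTerms));
        }
        return "SELECT * FROM " + table + " WHERE " + String.join(" OR ", orGroups);
    }

    private static String comparisonToSql(String term) {
        Matcher m = COMPARISON.matcher(term);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported FIQL term: " + term);
        }
        String column = m.group(1);
        String value = m.group(3).replace("'", "''"); // escape single quotes
        switch (m.group(2)) {
            case "==":
                // A '*' in the argument is treated as a wildcard and mapped to LIKE.
                return value.contains("*")
                        ? column + " LIKE '" + value.replace('*', '%') + "'"
                        : column + " = '" + value + "'";
            case "!=":   return column + " <> '" + value + "'";
            case "=gt=": return column + " > '" + value + "'";
            case "=ge=": return column + " >= '" + value + "'";
            case "=lt=": return column + " < '" + value + "'";
            default:     return column + " <= '" + value + "'"; // "=le="
        }
    }

    public static void main(String[] args) {
        System.out.println(toSql("actors", "firstname==PENELOPE;lastname==GUINESS"));
        System.out.println(toSql("actors", "lastname==PEN*"));
        System.out.println(toSql("films", "filmid=ge=995"));
    }
}
```

Because SQL's AND also binds tighter than OR, the generated WHERE clause keeps FIQL's precedence without extra parentheses.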
Case study: Apache CXF advanced search features
To support advanced search capabilities, Apache CXF has included FIQL support in its JAX-RS implementation since the 2.3.0 release. With this feature, users can express complex search expressions in a URI. Below is a detailed note on how to use this feature:
To work with FIQL queries, a SearchContext needs to be injected into the application code and used to retrieve a SearchCondition representing the current FIQL query. This SearchCondition can be used in a number of ways to find the matching data.
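import java.util.List;
import java.util.Map;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.core.Context;
import org.apache.cxf.jaxrs.ext.search.SearchCondition;
import org.apache.cxf.jaxrs.ext.search.SearchContext;

@Path("books")
public class Books {
    private Map<Long, Book> books; // key type is an assumption

    @Context
    private SearchContext searchContext;

    @GET
    public List<Book> getBook() {
        SearchCondition<Book> sc = searchContext.getCondition(Book.class);
        // SearchCondition#isMet can also be used to build the list of matching beans.
        // findAll iterates over the values in the books map and returns the matches.
        List<Book> found = sc.findAll(books.values());
        return found;
    }
}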
SearchCondition can also be used to get at all the search requirements (originally expressed in FIQL) and do some manual comparison against the local data. For example, SearchCondition provides a utility toSQL(String tableName, String... columnNames) method, which introspects all the search expressions constituting the current query and converts them into an SQL expression:
// find all conditions with names starting with 'ami'
// and levels greater than 10:
// ?_s="name==ami*;level=gt=10"
SearchCondition<Book> sc = searchContext.getCondition(Book.class);
assertEquals("SELECT * FROM table WHERE name LIKE 'ami%' AND level > '10'",
             sc.toSQL("table"));
Conclusion
Data querying is a critical component of most applications. With the rise of rich, client-driven Ajax applications and document-oriented databases, new querying techniques are needed; these techniques must be simple but extensible, designed to work within URIs, and able to query collections of resources. The NoSQL movement is opening the way for a more modular approach to databases, separating modelling, validation, and querying concerns from storage concerns, but we need new querying approaches to match more modern architectural designs.