Application Scalability: Still elusive for Enterprise Apps
The advent of Consumer Business applications like Facebook and Twitter has changed the definition of Application Scalability. A decade back, 10 million+ was a large user base; Facebook will touch 1 billion+ users by the end of this year, and there are hordes of applications in the 100+ million user range. The techniques and approaches employed by these large Consumer Business Applications are different from traditional enterprise application design and architecture techniques.
If we study the architecture principles employed by these large Consumer Business Applications, we can conclude the following:
- All the large Consumer Business Applications are built on and make use of Cloud Computing
- Applications are built using a combination of open source products and platforms
- They create their own solutions where the current set of solutions does not meet requirements or scale (e.g. HipHop, Hadoop, Chaos Monkey, etc.)
- There is constant knowledge sharing within the community (Facebook/Twitter/Google open source a lot of their internal products)
From the application scalability perspective, the following key architecture patterns have emerged and are being used to scale these applications:
- Stateless Applications – Modern web applications maintain state, which means they remember what you did as part of your last request, and they keep all of this data in the session. The stateful nature of the application means that if the server holding that state goes down, the entire user state is lost. Traditional techniques like synchronizing user session data across server nodes or persisting it in a data store do not scale very well. To overcome this shortcoming, the concept of stateless applications has emerged, where the servers themselves do not maintain state. The application uses cookies and makes use of RESTful APIs to craft the user experience. Newer frameworks like Play Framework, Node.js and Vertx.io all promote this stateless style of application development, which scales very well with increasing user load. Another option is to offload session state to a high-throughput, memory-based key-value store that is shared across the stateless nodes (a minimal sketch of this approach appears after this list).
- Data Sharding – Another issue encountered at scale is the increasing load on DB servers. DB servers do employ techniques like master/slave replication or clustering, coupled with large, powerful boxes, but beyond 100+ million users these techniques also fall short. Further, the costly hardware and software licenses make the traditional DB options very expensive. So companies have used MySQL as the base and created lots of solutions/topologies around it. One of those techniques is data sharding. A data shard is a horizontal partition of the data, meaning rows of the tables are held separately. There are numerous advantages to this partitioning approach. Since the tables are divided and distributed across multiple servers, the total number of rows in each table in each database is reduced. This reduces index size, which generally improves search performance. A database shard can be placed on separate hardware, and multiple shards can be placed on multiple machines. This enables distribution of the database over a large number of machines, allowing the load to be spread across them and greatly improving overall performance. Many new non-relational databases (commonly known as NoSQL databases) such as Cassandra, MongoDB and HBase have been created with database sharding as a key feature (a shard-routing sketch appears after this list).
- Bring Data Closer to the User (Caching) – Caching has been adopted and used very well by most of the Consumer Business Apps. Open source products like memcached provide caching options across the tiers (web tier, app tier and data tier). Memcached provides a reliable alternative to traditional caching solutions such as Oracle Coherence and Terracotta. Further, NoSQL solutions employ a caching engine to speed up database read and write operations; Couchbase, for example, uses memcached to provide an in-memory read/write layer (a cache-aside sketch appears after this list).
- Service Provider/Consumer Model – Another pattern that has emerged is the service provider and consumer model. It is a derivative of the SOA model, but leaner and simpler. The business functionality is exposed as a set of services via RESTful APIs with JSON as the data format, and the presentation layers are the consumers that use these services to build up the user experience. The provider and the consumer can be built with completely different technologies, chosen based on what is best for each use case; contracts are enforced by the service version and definition only. Front ends are typically built using PHP or Ruby on Rails, while services are built using Scala, Akka, C++, Java, etc. (a minimal consumer sketch appears after this list).
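
To make the stateless pattern concrete, here is a minimal sketch in Java, assuming the session token travels in a cookie and session data lives in a shared key-value store. The `SessionStore` interface and its in-memory stand-in are hypothetical; a real deployment would point every application node at the same memcached or Redis cluster so that any node can serve any request.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StatelessServer {

    // Stand-in for a shared key-value store (memcached/Redis in practice).
    // Because session data lives here, no server node holds user state itself.
    interface SessionStore {
        String get(String token);
        void put(String token, String value);
    }

    // Hypothetical in-memory implementation for illustration; real nodes would
    // all talk to the same external store instead of a local map.
    static final SessionStore store = new SessionStore() {
        private final Map<String, String> map = new ConcurrentHashMap<>();
        public String get(String token) { return map.get(token); }
        public void put(String token, String value) { map.put(token, value); }
    };

    public static void main(String[] args) throws IOException {
        store.put("abc123", "alice"); // pretend a login already created this session
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/whoami", StatelessServer::handle);
        server.start();
    }

    static void handle(HttpExchange exchange) throws IOException {
        // The session identity travels with the request (cookie); the node stays stateless.
        String cookie = exchange.getRequestHeaders().getFirst("Cookie"); // e.g. "session=abc123"
        String token = (cookie != null && cookie.startsWith("session=")) ? cookie.substring(8) : null;
        String user = (token != null) ? store.get(token) : null;
        byte[] body = ("{\"user\":\"" + (user != null ? user : "anonymous") + "\"}")
                .getBytes(StandardCharsets.UTF_8);
        exchange.getResponseHeaders().set("Content-Type", "application/json");
        exchange.sendResponseHeaders(200, body.length);
        exchange.getResponseBody().write(body);
        exchange.close();
    }
}
```

Because the node keeps nothing in memory between requests, it can be killed or added at any time without losing user state, which is exactly what elastic scaling needs.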
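The sketch below illustrates the sharding idea, assuming the user id is the shard key and each shard is a separate MySQL instance (the JDBC URLs are hypothetical). Simple modulo routing shows the principle; real systems usually prefer consistent hashing or a lookup directory so that shards can be added without remapping every key.

```java
import java.util.List;

// Minimal sketch of hash-based shard routing: each user id maps to exactly one
// shard, so every shard holds only a fraction of the rows and its indexes stay small.
public class ShardRouter {

    private final List<String> shardUrls; // hypothetical JDBC URLs, one per physical database

    public ShardRouter(List<String> shardUrls) {
        this.shardUrls = shardUrls;
    }

    // Route by a stable hash of the shard key (here: the user id).
    public String shardFor(long userId) {
        int index = (Long.hashCode(userId) & 0x7fffffff) % shardUrls.size();
        return shardUrls.get(index);
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(List.of(
                "jdbc:mysql://shard0/users",
                "jdbc:mysql://shard1/users",
                "jdbc:mysql://shard2/users"));
        // The same user always lands on the same shard; different users spread out.
        System.out.println(router.shardFor(42L));
        System.out.println(router.shardFor(1_000_001L));
    }
}
```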
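Here is a minimal cache-aside sketch of the "bring data closer to the user" idea. The `ConcurrentHashMap` stands in for a shared cache such as memcached, and the `database` function stands in for a slow data-tier query; both are assumptions for illustration only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal cache-aside sketch: read through the cache before touching the database,
// and populate the cache on a miss so subsequent reads skip the data tier entirely.
public class CacheAside {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> database; // hypothetical slow data-tier lookup

    public CacheAside(Function<String, String> database) {
        this.database = database;
    }

    public String getProfile(String userId) {
        String cached = cache.get(userId);
        if (cached != null) {
            return cached;                     // cache hit: no database round trip
        }
        String fresh = database.apply(userId); // cache miss: go to the data tier...
        cache.put(userId, fresh);              // ...and keep the result close to the user
        return fresh;
    }

    public static void main(String[] args) {
        CacheAside profiles = new CacheAside(id -> "profile-of-" + id); // stand-in for a real query
        System.out.println(profiles.getProfile("alice")); // loads from the "database"
        System.out.println(profiles.getProfile("alice")); // served from the cache
    }
}
```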
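Finally, a minimal consumer-side sketch of the service provider/consumer model, using the JDK's `HttpClient`. The endpoint `https://api.example.com/v1/users/42` is hypothetical; the point is that the consumer depends only on the versioned URL and the JSON contract, not on how or in what language the provider is implemented.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// The front end only knows the versioned REST contract (URL + JSON shape);
// the provider behind it can be written in Scala, C++, Java or anything else.
public class ProfileConsumer {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // The version lives in the path, so contracts can evolve side by side.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/v1/users/42"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        // The consumer builds the user experience from the JSON payload alone.
        System.out.println(response.statusCode());
        System.out.println(response.body()); // e.g. {"id":42,"name":"..."}
    }
}
```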
Why are Enterprises looking at these patterns?
- Enterprises have, by and large, started adopting the cloud (public/private). Elastic scaling means responding to changing load patterns in real time. If the application itself cannot scale, all that investment in the cloud is wasted
- Traditional CPU/core-based licensing models are proving costly in the cloud world. Enterprises need to adopt OSS frameworks/solutions that do not come with licensing baggage
- User expectations have gone up a few notches. Enterprise applications are compared with, and expected to have the same resiliency and performance as, any other Internet-scale application
- The scale-up model (adding more CPUs) for enterprise applications comes with its own headache of additional capital expenditure. Scale-out models (adding more, smaller servers) that rely on commodity hardware or can run over a hybrid cloud are becoming more cost-effective solutions
- Enterprises are going global, which requires systems to be up and available 24x7. Current tightly coupled application designs do not lend themselves very well to these scalability patterns
Reference: Application Scalability: Still elusive for Enterprises Apps from our JCG partner Munish K Gupta at the Tech Spot blog.