Why You Need a Strategic Data Service

It’s no longer even a question that data is a strategic advantage. Every business is a data business now, and it’s no longer sufficient to store and archive data; you need to be able to act on it: protect, nurture, develop, buy, and sell it. Billion-dollar businesses are built around it. But many businesses are running into the reality that their legacy platforms were not built to treat data as such a valuable asset. We continually see companies that are boxed out of opportunities because of software design decisions made years ago without the foresight to anticipate this trend.

If you refer back to classic software design principles and best practices, you’ll see blueprints for building data layer abstractions and compartmentalizing data functionality from the rest of the system. Yet to this day I see developers questioning why these abstractions are needed and wondering what the payoff is. But the day of reckoning is either here or approaching fast for most companies, and if you don’t have a properly constructed data architecture, your platform won’t be capable of supporting the business as it responds to this transition.

Based on what I’ve seen, here are my observations on why data services are necessary for just about every business today.
 
Multiple Data Stores are Key

One of the main reasons why any software abstraction exists is to allow you to easily swap one component out for another. You may outgrow a database or realize new business requirements that are outside the capabilities of your current solution, and have to switch. Writing your software to an interface whose underlying implementation can be swapped out allows you to do this. This is called decoupling and it’s just good software design.
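
As a minimal sketch of that decoupling, assume a hypothetical `MessageStore` interface and two toy implementations standing in for real databases. None of these names come from a real library; they only illustrate application code that depends on the interface rather than on a particular store.

```python
from typing import Optional, Protocol


class MessageStore(Protocol):
    """Hypothetical interface that the rest of the system codes against."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class InMemoryStore:
    """Toy stand-in for the database you start with."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)


class AppendLogStore:
    """Toy stand-in for a replacement store with a different internal layout."""

    def __init__(self):
        self._log = []

    def put(self, key, value):
        self._log.append((key, value))

    def get(self, key):
        # The most recent entry for a key wins.
        for logged_key, value in reversed(self._log):
            if logged_key == key:
                return value
        return None


def save_greeting(store: MessageStore) -> Optional[str]:
    """Application code: it only knows about the interface."""
    store.put("greeting", "hello")
    return store.get("greeting")
```

Calling `save_greeting(InMemoryStore())` and `save_greeting(AppendLogStore())` runs the same application code against either store, which is the whole point of writing to the interface.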

But we’re entering a world of data store specialization now. Different data stores are good at different things, and some exist for the sole purpose of storing very specific types of data and doing specific things with it. We’re seeing this diversification in the marketplace, particularly in the open source world. Eventually you’ll probably want or need to use those unique capabilities as a competitive advantage or even a key value proposition, which means you will probably want to use more than one data store at some point, if not now.
 
A perfect example use case for this is a message in a social network. This piece of data has a number of potential uses, and not all of them are easily achievable using a single data store. But that’s ok, because you’re decoupled (right?). Now you can record the message in your social graph database so that you can cluster users by interest and predict relationships. You can search for the message later, after you’ve written it to your distributed search data store, which is perfect for that. And you can do analytics, trending, and dashboards on top of your relational database, which holds the output of your machine learning models.
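
The social network example can be sketched roughly as follows. The three store classes are toy, in-memory stand-ins (assumptions for illustration, not real database clients), and `MessageDataService` is a hypothetical name for the layer that fans one write out to all of them.

```python
from dataclasses import dataclass


@dataclass
class Message:
    msg_id: str
    sender: str
    recipient: str
    text: str


class GraphStore:
    """Toy stand-in for a social graph database: who has messaged whom."""

    def __init__(self):
        self.edges = {}

    def record(self, msg):
        self.edges.setdefault(msg.sender, set()).add(msg.recipient)


class SearchStore:
    """Toy stand-in for a search store: inverted index of word -> message ids."""

    def __init__(self):
        self.index = {}

    def record(self, msg):
        for word in msg.text.lower().split():
            self.index.setdefault(word, set()).add(msg.msg_id)

    def search(self, word):
        return sorted(self.index.get(word.lower(), set()))


class AnalyticsStore:
    """Toy stand-in for a relational store: message counts per sender."""

    def __init__(self):
        self.counts = {}

    def record(self, msg):
        self.counts[msg.sender] = self.counts.get(msg.sender, 0) + 1


class MessageDataService:
    """Callers save a message once; the service writes it to every store."""

    def __init__(self, stores):
        self._stores = list(stores)

    def save(self, msg):
        for store in self._stores:
            store.record(msg)
```

The caller only ever sees `save`, so adding a fourth specialized store later means changing the list passed to the service, not the code that produces messages.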

Aside from just features, from a technical perspective you’ll often have to trade off between consistency, availability, and partition tolerance, the three properties described by the CAP theorem. So far, no one data store has been able to have its cake and eat it too. With a Data Service you can get closer, by combining stores that make different trade-offs and using each one where its strengths matter.

Service with a Smile…

A properly built data abstraction layer will probably end up being a stateful service (as opposed to stateless services, which don’t really do anything on their own). These services stand alone in your architecture as components that are capable of talking with other components and having their own behavior. This comes in VERY handy when dealing with data. For example, some data stores will require you to verify write persistence after the fact if you care about availability. If your service stands on its own it can do this work at the appropriate time, transparently to whatever or whoever is using it. Or, you might want to mine the data as it comes in by having the Data Service pipe the data to machine learning models to categorize it or do sentiment analysis. Maybe you want to look up customer demographic data in Census data based on their location and predict income level using that information.
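
One way to picture the write verification case is the sketch below. `FlakyStore` is a deliberately artificial store that loses the first write of every key, and `VerifyingDataService` is a hypothetical service that remembers unverified writes and checks them later. Both are assumptions made for illustration; a real service would run the check on a timer or background worker.

```python
class FlakyStore:
    """Toy store that silently drops the first write of every key."""

    def __init__(self):
        self._data = {}
        self._seen = set()

    def write(self, key, value):
        if key not in self._seen:
            self._seen.add(key)  # pretend this write was lost
            return
        self._data[key] = value

    def read(self, key):
        return self._data.get(key)


class VerifyingDataService:
    """Accepts writes immediately and verifies them after the fact."""

    def __init__(self, store):
        self._store = store
        self._pending = {}  # key -> value that has not been verified yet

    def save(self, key, value):
        self._store.write(key, value)
        self._pending[key] = value

    def pending_count(self):
        return len(self._pending)

    def verify_pending(self):
        """Read back each unverified write and re-issue it if it is missing.

        Returns the keys that had to be written again.
        """
        rewritten = []
        for key, value in list(self._pending.items()):
            if self._store.read(key) == value:
                del self._pending[key]
            else:
                self._store.write(key, value)
                rewritten.append(key)
        return rewritten
```

The caller of `save` never learns that a write was lost and repaired; that bookkeeping stays inside the service.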
 
For implementing this type of data-related behavior, I’m a huge fan of using an actor system in a Data Service. Your Data Service can host an actor system (or whatever executes your workflow logic) to handle your entire data workflow–ensuring availability, mining, transmitting, whatever you need to do with it. You will eventually want to take data you receive and enrich it (if you don’t already today): geolocate it, classify it, compute on it, raise alerts, and so on. For example, you may want to take transactional data as it comes into the system and roll it up at different intervals so that you can run machine learning models on it to predict future trends. This is the perfect place to do it.
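
To make the actor idea concrete, here is a small sketch built only on the Python standard library rather than on any particular actor framework. The `Actor` base class, `EnrichActor`, and `RollupActor` are hypothetical names, and the "large transaction" threshold of 100 is an arbitrary assumption.

```python
import queue
import threading
from collections import defaultdict

_STOP = object()  # sentinel telling an actor to shut down


class Actor:
    """Minimal actor: private state, a mailbox, and one thread draining it."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        self._mailbox.put(message)

    def stop(self):
        """Process everything already queued, then shut down."""
        self._mailbox.put(_STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is _STOP:
                return
            self.receive(message)

    def receive(self, message):
        raise NotImplementedError


class RollupActor(Actor):
    """Rolls transaction amounts up into fixed-width time buckets."""

    def __init__(self, interval_seconds):
        self.interval = interval_seconds
        self.totals = defaultdict(float)
        self.large_count = 0
        super().__init__()

    def receive(self, txn):
        bucket = txn["ts"] - (txn["ts"] % self.interval)
        self.totals[bucket] += txn["amount"]
        if txn["size"] == "large":
            self.large_count += 1


class EnrichActor(Actor):
    """Classifies each transaction, then forwards it downstream."""

    def __init__(self, downstream):
        self.downstream = downstream
        super().__init__()

    def receive(self, txn):
        enriched = dict(txn, size="large" if txn["amount"] >= 100 else "small")
        self.downstream.tell(enriched)
```

A caller would create `RollupActor(interval_seconds=60)`, wrap it in `EnrichActor(downstream=...)`, and `tell` the enrich actor about each incoming transaction. Stopping the actors in pipeline order (enrich first, then roll-up) lets queued work drain before the totals are read.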

The brains of your data service don’t have to be an actor model; there are plenty of other options out there for carrying out data work. Hadoop is a classic example, but newcomers like Spark and Storm will accomplish many of the same things. Most of these frameworks have hooks available to extend them, which is super important if they’re going to serve you well into the future. (Again, though, the fact that it’s compartmentalized into a Data Service will let you use even more than one of these if you need to.) The key is that the data processing and workflow should be controlled and orchestrated by the Data Service itself. The users of the Data Service shouldn’t need to worry about what happens to the data; they should just have the ability to get the data in and read it back out in some form.

If you like it then you shoulda put an API on it

Want to be a pure-play data company? These are the companies who only provide a public API and don’t have to support a complex user interface. Having a properly-built Data Service allows you to do this very easily. Many application frameworks will let you turn an interface into a standards-compliant REST API with almost zero work. Just stand up the service in a Web server and let the framework look at it and turn it into an API. Even if your company isn’t selling the API outright your customers will surely love it, if not demand it.
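
The sketch below illustrates the general idea of deriving endpoints from a service interface. It is not a real framework and not a complete REST API: it skips HTTP entirely and only shows how public methods on a hypothetical `CustomerDataService` could be discovered and dispatched to. A real framework would add HTTP binding, verbs, and validation on top.

```python
import inspect
import json


class CustomerDataService:
    """Hypothetical data service; its public methods become API operations."""

    def __init__(self):
        self._customers = {"42": {"name": "Ada"}}

    def get_customer(self, customer_id):
        return self._customers.get(customer_id)

    def list_customers(self):
        return sorted(self._customers)


def build_routes(service):
    """Map '/<method name>' to every public method on the service."""
    return {
        "/" + name: method
        for name, method in inspect.getmembers(service, inspect.ismethod)
        if not name.startswith("_")
    }


def handle(routes, path, params):
    """Dispatch one request and return (status_code, json_body)."""
    handler = routes.get(path)
    if handler is None:
        return 404, json.dumps({"error": "unknown path"})
    try:
        result = handler(**params)
    except TypeError:
        # Simplification: treats any TypeError as a bad-parameter error.
        return 400, json.dumps({"error": "bad parameters"})
    return 200, json.dumps(result)
```

With `routes = build_routes(CustomerDataService())`, a call such as `handle(routes, "/get_customer", {"customer_id": "42"})` returns a status code and a JSON body without any endpoint having been written by hand.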

It’s always disappointing to see companies building an API as a separate project when they could have had it almost for nothing. It’s an indication of a code base that wasn’t properly built in the first place–technical debt that has to be addressed before the business can move forward.

The End?

Certainly these are not the only reasons to locate your Data Service centrally in your architectural blueprint. (But seriously, you need more reasons?) I’d love to hear your comments and thoughts on this in the Hacker News thread.

Reference: Why You Need a Strategic Data Service from our JCG partner Jason Kolb at the Jason Kolb blog.
