Software Development

Time to break free from the SQL-92 mindset

Are you stuck in the 90s?

If you are only using the SQL-92 language reference, then you are overlooking so many great features like:

Some test data

In my previous article I imported some CSV Codahale metrics into PostgreSQL for further analysis.

Our time series data consists of a total request count and a data recording time stamp:

countt
1511398778568
1691398778583
1691398778598
1691398778613
1691398778628
1691398778643
15871398778658
64731398778673
119851398778688

Arrival velocity

I want to calculate the arrival velocity which can be defined as:

λ = arrival_velocity = Δcount / Δt

For each time event we need to subtract the current and previous count and time stamp values.

Window functions allow us to aggregate/reference previous/next rows without restricting the SELECT clause to a single result row:

SELECT
	t as "Current time stamp",
	prev_t as "Previous time stamp",
	current_count as "Current total request count",
	prev_count as "Previous total request count",
	ROUND(((current_count - prev_count)::numeric/(t - prev_t)::numeric), 3) as "Velocity [req/sec]"	
FROM (
	SELECT
		t,
		lag(t, 1) over () as prev_t,				
		count as current_count,
		lag(count, 1) over () as prev_count
	FROM
		connection_lease_millis
	WINDOW grouping AS (			
		ORDER BY t
		ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
	)
) raw_data

Giving us the arrival velocity:

Current time stampPrevious time stampCurrent total request countPrevious total request countVelocity [req/sec]
1398778568151
139877858313987785681691511.200
139877859813987785831691690.000
139877861313987785981691690.000
139877862813987786131691690.000
139877864313987786281691690.000
13987786581398778643158716994.533
1398778673139877865864731587325.733
13987786881398778673119856473367.467

Arrival acceleration

But what if we want to calculate the arrival acceleration (e.g. so we can figure out how the arrival rate flactuates), which is

arrival_acceleration = Δarrival_velocity/ Δt

This is how we can do it:

SELECT
	t as "Current time stamp",
	prev_t as "Previous time stamp",
	velocity "Velocity [Req/sec]",
	ROUND((velocity - lag(velocity, 1) over ())::numeric/(t - prev_t)::numeric, 3) as "Acceleration [req/sec2]"		
FROM (
	SELECT
		t,
		prev_t,
		current_count,
		prev_count,
		ROUND(((current_count - prev_count)::numeric/(t - prev_t)::numeric), 3) as velocity	
	FROM (
		SELECT
			t,
			lag(t, 1) over () as prev_t,				
			count as current_count,
			lag(count, 1) over () as prev_count
		FROM
			connection_lease_millis
		WINDOW grouping AS (			
			ORDER BY t
			ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
		)
	) raw_data	
) velocity_data

Giving us:

Current time stampPrevious time stampVelocity [Req/sec]Acceleration [req/sec2]
1398778568
139877858313987785681.200
139877859813987785830.000-0.080
139877861313987785980.0000.000
139877862813987786130.0000.000
139877864313987786280.0000.000
1398778658139877864394.5336.302
13987786731398778658325.73315.413
13987786881398778673367.4672.782

Giving us a nice overview over the arrival rate distribution:

arrival_velocity_acceleration1

Conclusion

SQL has more to offer than the standard aggregate functions. The window functions allow you to group rows while still retaining the select criteria.

How many of you are still using the 1.0 versions of Java, C# or Python? Shouldn’t we benefit from the latest SQL features the same way we do with any other programming language we use on a daily basis?

In case you’re still skeptic, this great article may shatter your doubts.

 

Vlad Mihalcea

Vlad Mihalcea is a software architect passionate about software integration, high scalability and concurrency challenges.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Back to top button