-
Software Development
R: Vectorising all the things
After my last post about finding the distance a date/time is from the weekend, Hadley Wickham suggested I could improve…
-
Software Development
R: Time to/from the weekend
In my last post I showed some examples using R’s lubridate package and another problem it made really easy to…
-
Software Development
R: Cleaning up and plotting Google Trends data
I recently came across an excellent article written by Stian Haklev in which he describes things he wishes he’d been…
-
Software Development
R: Applying a function to every row of a data frame
In my continued exploration of London’s meetups I wanted to calculate the distance from meetup venues to a centre point…
-
Scala
Spark: Write to CSV file
A couple of weeks ago I wrote how I’d been using Spark to explore a City of Chicago Crime data…
-
Scala
Spark: Write to CSV file with header using saveAsFile
In my last blog post I showed how to write to a single CSV file using Spark and Hadoop and…
-
Scala
Spark: Parse CSV file and group by column value
I’ve found myself working with large CSV files quite frequently and realising that my existing toolset didn’t let me explore…
-
Enterprise Java
Neo4j: Cypher – Avoiding the Eager
Although I love how easy Cypher’s LOAD CSV command makes it to get data into Neo4j, it currently breaks…
-
Software Development
Conceptual Model vs Graph Model
We’ve started running some sessions on graph modelling in London and during the first session it was pointed out that…