-
Software Development
Neo4j: Cypher – Detecting duplicates using relationships
I’ve been building a graph of computer science papers on and off for a couple of months and now that…
-
Software Development
Neo4j vs Relational: Refactoring – Extracting node/table
In my previous blog post I showed how to add a new property/field to a node with a label/record in…
-
Software Development
Neo4j: A procedure for the SLM clustering algorithm
In the middle of last year I blogged about the Smart Local Moving algorithm which is used for community detection…
-
Clojure
Clojure: First steps with reducers
I’ve been playing around with Clojure a bit today in preparation for a talk I’m giving next week and found…
-
Enterprise Java
Neo4j: Specific relationship vs Generic relationship + property
For optimal traversal speed in Neo4j queries we should make our relationship types as specific as possible. Let’s take a…
-
Enterprise Java
Hadoop: HDFS – java.lang.NoSuchMethodError: org.apache.hadoop.fs.FSOutputSummer.<init>(Ljava/util/zip/Checksum;II)V
I wanted to write a little program to check that one machine could communicate with an HDFS server running on the…
-
Software Development
R: Querying a 20 million line CSV file – data.table vs data frame
As I mentioned in a couple of blog posts already, I’ve been exploring the Land Registry price paid data set…
-
Software Development
SparkR: Add new column to data frame by concatenating other columns
Continuing with my exploration of the Land Registry open data set using SparkR I wanted to see which road in…
-
Software Development
Unix: Redirecting stderr to stdout
I’ve been trying to optimise some Neo4j import queries over the last couple of days and as part of the…