DSL based approach to input Graph data in graph theory based java programs
Most of us have coded some programs which deal with graph theory algorithms like finding the shortest path between two vertices, finding the minimum spanning tree for a given graph and so on. In each of these algorithms the programatic way to represent the graph is to used either adjacency matrix or adjacency list. Both of them are not very intuitive way of defining the graph input. The adjacency matrix for example can lead to errors if the entries are not made in the right columns and rows. Moreover at run time you are not very sure which row/column represents which edge and things get even more complicated when it comes to input for a graph with large number of vertices.
During my engineering studies I had implemented quite a few graph algorithms in Java and in all of them I had nested for-loops for taking the adjacency matrix input. Recently when I was reading Martin Fowlers DSL book, I hit upon the idea of create a DSL for providing graph input i.e a DSL which would allow the user to specify the vertices, edges and their weights. I picked the graph algorithms which I had implemented and just stripped of the adjacency matrix input and instead made use of the DSL which I created. The algorithm worked like a charm.
In this post I will show the valid syntax for the DSL by taking different graph inputs and showing the DSL for them. Then I will show you the library which I created which consists of the Semantic Model of the Graph, the parser and lexer of the DSL and a simple builder API which populates the semantic model from the DSL script. The parser and lexer were generated by using ANTLR and hence this library requires ANTLR Jar to be available in the classpath. And finally I will show how this DSL can be used to find the minimum spanning tree using Kruskals Algorithm.
DSL Syntax and some examples
The DSL for the below graph (g1):

1 2 3 4 5 6 | Graph { A1 -> B2 (12.3) B2 -> C3(0.98) C3->D4 (2) D4 ->E5 (12.45) } |
Note that there are different spaces between the elements in the DSL above. This is just to show the different ways in which the DSL can be written.
The DSL for the below graph(g2) would be:

1 2 3 4 5 6 | Graph{ A1 -> B2 (12.3) B2 -> C3 (0.98) C3 -> D4 (2) E5 } |
Note that there is no space between ‘Graph’ and ‘{‘. This is just to show different ways it can be written.
The DSL for the below graph (g3) would be:

1 2 3 4 5 6 | Graph { A -> B (12.3) B -> C (0.98) C -> D (2) D -> E (12.45) } |
Now to show some of the invalid DSL scripts:
1 2 3 4 | Graph { 1A -> B (12.3) B -> C (0.98) } |
The above is invalid because the vertex name begins with a digit.
1 2 | Graph { } |
The above is invalid because the Graph expects atleast one vertex to be defined. But it can have zero or more edges.
Library for DSL based graph input
I have made use of ANTLR which does all the task of creating lexer and parser for the grammar which I defined for my DSL. This way I need not worry about creating parser or worry about creating tokens from the DSL input script.
The parser and lexer classes along with the semantic model classes are package into a jar and this jar along with the ANTLR jar have to be included to make use of writing a DSL for graph input.
The structure of the DSL jar can be seen in the screenshot below:

The classes in the graph package correspond to the semantic model i.e these are generic classes and can be used irrespecitve of whether someone is using a DSL or not. The classes in graph.dsl correspond to the ANTLR generated java classes for lexer and parser.
The grammar which is used by ANTLR for lexical analysis and parsing is:
01 02 03 04 05 06 07 08 09 10 11 12 13 14 | grammar Graph; graph: GRAPH_START (edge|vertex)+ GRAPH_END; edge: (vertex) TO (vertex) weight; vertex: ID; weight: '(' NUM ')' ; GRAPH_START : 'Graph' ([ ])* '{' ; GRAPH_END : '}' ; WEIGHT_START: '(' ; WEIGHT_END: ')' ; TO: '->' ; ID: ^[a-zA-Z][a-zA-Z0-9]*; NUM: [0-9]*[.]?[0-9]+; WS: [ \t\r\n]+ -> skip; |
The above grammar has scope for improvement but as my first attempt I have tried to keep it to this level.
- Download the DSL jar from here(GraphDSL.jar).
- Download the ANTLR jar from here(antlr-4.1-complete.jar).
Note: This DSL is developed using ANTLR Version 4.
A recommended book for external DSLs using ANTLR is Language Implementation Patterns: Create Your Own DomainSpecific and General Programming Languages
Kruskals Algorithm to find minimum spanning tree
The graph which is used to test this algorithm implementation is:

and the DSL for the same is:
01 02 03 04 05 06 07 08 09 10 11 12 13 | Graph { A -> B (7) B -> C (8) A -> D (5) B -> D (9) D -> E (15) D -> F (6) E -> F (8) E -> C (5) B -> E (7) E -> G (9) F -> G (11) } |
Lets look at the implementation:
001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 064 065 066 067 068 069 070 071 072 073 074 075 076 077 078 079 080 081 082 083 084 085 086 087 088 089 090 091 092 093 094 095 096 097 098 099 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 | package kruskalsalgo; import java.util.ArrayList; import java.util.Collections; import java.util.Scanner; import graph.Edge; import graph.Graph; import graph.Vertex; import graph.GraphBuilder; import java.io.IOException; import java.util.Comparator; public class KruskalsAlgorithm { public static void main(String[] args) throws IOException { //Load the graph data from the DSL Graph g = new GraphBuilder().buildGraph( "graph.gr" ); ArrayList<Set> forest = new ArrayList<Set>(); ArrayList<Edge> finalEdgeSet = new ArrayList<Edge>(); //Creating disjoint set of vertices which represents the initial forest for (Vertex v : g.getVertices()) { Set newSet = new Set(); newSet.getVertexList().add(v); forest.add(newSet); //Creating Disjoint Sets } //sort the edges in the graph based on their weight Collections.sort(g.getEdges(), new Comparator<Edge>(){ public int compare(Edge o1, Edge o2) { return o1.getWeight().compareTo(o2.getWeight()); } }); for (Edge edge : g.getEdges()) { //Find in which set the vertices in the edges belong int rep1 = Set.findRep(edge.getFromVertex(), forest); int rep2 = Set.findRep(edge.getToVertex(), forest); //If in different sets then merge them into one set and pick the edge. if (rep1 != rep2) { finalEdgeSet.add(edge); Set.Union(rep1, rep2, forest); } } System.out.println( "The Minimum Spanning tree is" ); for (Edge edge : finalEdgeSet) { System.out.println( "Vertex: " + edge.getFromVertex().getLabel() + " to Vertex: " + edge.getToVertex().getLabel()); } System.out.println( "" ); } //End of Main } class Set { private ArrayList<Vertex> vertexList; private int representative; static int count; public Set() { vertexList = new ArrayList<Vertex>(); this .representative = ++(Set.count); } //Find the set identifier in which the given vertex belongs to. public static int findRep(Vertex vertex, ArrayList<Set> forest) { int rep = 0 ; for (Set set : forest) { for (Vertex v : set.getVertexList()) { if (v.getLabel().equals(vertex.getLabel())) { return set.getRepresentative(); } } } return rep; } //Find the set given the step identifier. public static Set findSet( int rep, ArrayList<Set> forest) { Set resultSet = null ; for (Set set : forest) { if (set.getRepresentative() == rep) { return set; } } return resultSet; } //Merge the set into another and remove it from the main set. public static void Union( int rep1, int rep2, ArrayList<Set> forest) { Set set1 = Set.findSet(rep1, forest); Set set2 = Set.findSet(rep2, forest); for (Vertex v : set2.getVertexList()) { set1.getVertexList().add(v); } forest.remove(set2); } public ArrayList<Vertex> getVertexList() { return vertexList; } public int getRepresentative() { return representative; } } |
The above code loads the graph data from the dsl- graph.gr. The DSL scripts have to be placed in the resources package so that the DSL library can locate it.
The output for the above code:
1 2 3 4 5 6 7 | The Minimum Spanning tree is Vertex: A to Vertex: D Vertex: E to Vertex: C Vertex: D to Vertex: F Vertex: A to Vertex: B Vertex: B to Vertex: E Vertex: E to Vertex: G |
and showing the same diagramatically

Some suggestions: wouldn’t it be better to choose something else than Set as class name? Also, VertexList should be a List, not an ArrayList.