All You Ever Need to Know About Recursive SQL
Oracle SYNONYMs are a great feature. You can implement all sorts of backwards-compatibility tweaks simply by creating SYNONYMs in your database. Consider the following schema:
CREATE TABLE my_table (col NUMBER(7)); CREATE SYNONYM my_table_old FOR my_table; CREATE SYNONYM my_table_bak FOR my_table_old;
Now you can query your same old table through three different names, it’ll all result in the same output:
SELECT * FROM my_table; -- Same thing: SELECT * FROM my_table_old; SELECT * FROM my_table_bak;
The trouble is, when you see my_table_bak
in code (or some even more obfuscated name), do you immediately know what it really is?
Use this query to find out
We can use the ALL_SYNONYMS table to figure this one out. This query will already give a simple overview:
SELECT * FROM ALL_SYNONYMS WHERE TABLE_OWNER = 'PLAYGROUND'
The output is:
OWNER SYNONYM_NAME TABLE_OWNER TABLE_NAME --------------------------------------------------- PLAYGROUND MY_TABLE_BAK PLAYGROUND MY_TABLE_OLD PLAYGROUND MY_TABLE_OLD PLAYGROUND MY_TABLE
But as you can see, this is boring, because we have transitive synonyms in there and I don’t want to go through the complete table to figure out that MY_TABLE_BAK -> MY_TABLE_OLD -> MY_TABLE
.
So let’s use CONNECT BY!
Oracle (as well as Informix and CUBRID) have this awesome CONNECT BY
clause for hierarchical SQL. There is also the possibility to express hierarchical SQL using the more powerful common table expressions, if you dare.
But let’s see how we can transitively resolve our tables. Here’s how:
SELECT s.OWNER, s.SYNONYM_NAME, -- Get to the root of the hierarchy CONNECT_BY_ROOT s.TABLE_OWNER TABLE_OWNER, CONNECT_BY_ROOT s.TABLE_NAME TABLE_NAME FROM ALL_SYNONYMS s WHERE s.TABLE_OWNER = 'PLAYGROUND' -- The magic CONNECT BY clause! CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER AND s.TABLE_NAME = PRIOR s.SYNONYM_NAME
First off, there is CONNECT BY
, which allows to “connect” hierarchies by their hierarchical predecessors. On each level of the hierarchy, we’ll connect the TABLE_NAME
with its previous (“PRIOR”) SYNONYM_NAME
. This will recurse as long as the chain doesn’t end (or if it runs into a cycle).
What’s also interesting is the CONNECT_BY_ROOT
keyword, which, for each path through the hierarchy, displays the root of the path. In our case, that’s the target TABLE_NAME
.
The output can be seen here:
OWNER SYNONYM_NAME TABLE_OWNER TABLE_NAME --------------------------------------------------- PLAYGROUND MY_TABLE_OLD PLAYGROUND MY_TABLE PLAYGROUND MY_TABLE_BAK PLAYGROUND MY_TABLE PLAYGROUND MY_TABLE_BAK PLAYGROUND MY_TABLE_OLD <-- Useless
If you’re confused by the records that are displayed, just add the LEVEL
pseudo-column to display the recursion level:
SELECT -- Add level here LEVEL, s.OWNER, s.SYNONYM_NAME, CONNECT_BY_ROOT s.TABLE_OWNER TABLE_OWNER, CONNECT_BY_ROOT s.TABLE_NAME TABLE_NAME FROM ALL_SYNONYMS s WHERE s.TABLE_OWNER = 'PLAYGROUND' CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER AND s.TABLE_NAME = PRIOR s.SYNONYM_NAME
LEVEL OWNER SYNONYM_NAME TABLE_OWNER TABLE_NAME ---------------------------------------------------------- 1 PLAYGROUND MY_TABLE_OLD PLAYGROUND MY_TABLE 2 PLAYGROUND MY_TABLE_BAK PLAYGROUND MY_TABLE 1 PLAYGROUND MY_TABLE_BAK PLAYGROUND MY_TABLE_OLD ^^^^^^ Awesome!
Getting rid of “bad records” using START WITH
As you can see, some of the results are now synonyms pointing directly to the target table, whereas the last record still points to an intermediate element from the synonym path. This is because we’re recursing into the path hierarchies from every record in the table, also from the “intermediate” synonym references, whose TABLE_NAME
is yet another synonym.
Let’s get rid of those as well, using the optional START WITH
clause, which allows to limit tree traversals to those trees whose roots fulfil a given predicate:
SELECT s.OWNER, s.SYNONYM_NAME, CONNECT_BY_ROOT s.TABLE_OWNER TABLE_OWNER, CONNECT_BY_ROOT s.TABLE_NAME TABLE_NAME FROM ALL_SYNONYMS s WHERE s.TABLE_OWNER = 'PLAYGROUND' CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER AND s.TABLE_NAME = PRIOR s.SYNONYM_NAME -- Start recursing only from non-synonym objects START WITH EXISTS ( SELECT 1 FROM ALL_OBJECTS WHERE s.TABLE_OWNER = ALL_OBJECTS.OWNER AND s.TABLE_NAME = ALL_OBJECTS.OBJECT_NAME AND ALL_OBJECTS.OWNER = 'PLAYGROUND' AND ALL_OBJECTS.OBJECT_TYPE <> 'SYNONYM' )
So, essentially, we’re requiring the TABLE_NAME
to be any object from ALL_OBJECTS
that is in our schema, but not a SYNONYM
. (yes, synonyms work for all objects, including procedures, packages, types, etc.)
Running the above query gets us the desired result:
OWNER SYNONYM_NAME TABLE_OWNER TABLE_NAME --------------------------------------------------- PLAYGROUND MY_TABLE_OLD PLAYGROUND MY_TABLE PLAYGROUND MY_TABLE_BAK PLAYGROUND MY_TABLE
What about PUBLIC synonyms?
Most often, you will not use local synonyms, though, but PUBLIC ones. Oracle has this quirky PUBLIC
pseudo-schema, in which you cannot create objects, but in which you can create synonyms. So, let’s create some more synonyms for backwards-compatibility purposes:
CREATE PUBLIC SYNONYM my_table_bak2 FOR my_table_bak; CREATE SYNONYM bak_backup_old FOR my_table_bak2;
Unfortunately, this will break our chain, because for some reason only Oracle and the Oracle of Delphi knows, PUBLIC
is well reported as a OWNER
of the synonym, but not as the TABLE_OWNER
. Let’s see some raw data with:
SELECT * FROM ALL_SYNONYMS WHERE TABLE_OWNER = 'PLAYGROUND'
… and thus:
OWNER SYNONYM_NAME TABLE_OWNER TABLE_NAME ------------------------------------------------------ PLAYGROUND MY_TABLE_OLD PLAYGROUND MY_TABLE PLAYGROUND MY_TABLE_BAK PLAYGROUND MY_TABLE_OLD PUBLIC MY_TABLE_BAK2 PLAYGROUND MY_TABLE_BAK PLAYGROUND BAK_BACKUP_OLD PLAYGROUND MY_TABLE_BAK2 <-- Not PUBLIC
As you can see, the PUBLIC SYNONYM MY_TABLE_BAK2
is reported to be in the PLAYGROUND
schema! This breaks recursion, of course. We’re missing a record:
OWNER SYNONYM_NAME TABLE_OWNER TABLE_NAME ------------------------------------------------------ PLAYGROUND MY_TABLE_OLD PLAYGROUND MY_TABLE PLAYGROUND MY_TABLE_BAK PLAYGROUND MY_TABLE PUBLIC MY_TABLE_BAK2 PLAYGROUND MY_TABLE <-- Hmm?
In order to work around this issue, we’ll have to tweak our original data set. Any object reported as (TABLE_OWNER, TABLE_NAME)
might in fact be a synonym called ('PUBLIC', TABLE_NAME)
. The trick is thus to simply duplicate all input data as such:
SELECT s.OWNER, s.SYNONYM_NAME, CONNECT_BY_ROOT s.TABLE_OWNER TABLE_OWNER, CONNECT_BY_ROOT s.TABLE_NAME TABLE_NAME -- Tweaked data set FROM ( SELECT OWNER, SYNONYM_NAME, TABLE_OWNER, TABLE_NAME FROM ALL_SYNONYMS UNION ALL SELECT OWNER, SYNONYM_NAME, 'PUBLIC', TABLE_NAME FROM ALL_SYNONYMS ) s -- Add the synthetic PUBLIC TABLE_OWNER as well WHERE s.TABLE_OWNER IN ( 'PLAYGROUND', 'PUBLIC' ) CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER AND s.TABLE_NAME = PRIOR s.SYNONYM_NAME START WITH EXISTS ( SELECT 1 FROM ALL_OBJECTS WHERE s.TABLE_OWNER = ALL_OBJECTS.OWNER AND s.TABLE_NAME = ALL_OBJECTS.OBJECT_NAME AND ALL_OBJECTS.OWNER = 'PLAYGROUND' AND ALL_OBJECTS.OBJECT_TYPE <> 'SYNONYM' )
There it is, our missing record!
OWNER SYNONYM_NAME TABLE_OWNER TABLE_NAME --------------------------------------------------- PLAYGROUND MY_TABLE_OLD PLAYGROUND MY_TABLE PLAYGROUND MY_TABLE_BAK PLAYGROUND MY_TABLE PUBLIC MY_TABLE_BAK2 PLAYGROUND MY_TABLE PLAYGROUND BAK_BACKUP_OLD PLAYGROUND MY_TABLE <-- Yep!
Displaying the hierarchy
There is also a quirky function called SYS_CONNECT_BY_PATH
, which can be used to actually display the whole hierarchy in a string form (VARCHAR2, with max 4000 characters!). Here’s how:
SELECT -- Magic function SUBSTR( sys_connect_by_path( s.TABLE_OWNER || '.' || s.TABLE_NAME, ' <- ' ) || ' <- ' || s.OWNER || '.' || s.SYNONYM_NAME, 5 ) FROM ( SELECT OWNER, SYNONYM_NAME, TABLE_OWNER, TABLE_NAME FROM ALL_SYNONYMS UNION ALL SELECT OWNER, SYNONYM_NAME, 'PUBLIC', TABLE_NAME FROM ALL_SYNONYMS ) s WHERE s.TABLE_OWNER IN ( 'PLAYGROUND', 'PUBLIC' ) CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER AND s.TABLE_NAME = PRIOR s.SYNONYM_NAME START WITH EXISTS ( SELECT 1 FROM ALL_OBJECTS WHERE s.TABLE_OWNER = ALL_OBJECTS.OWNER AND s.TABLE_NAME = ALL_OBJECTS.OBJECT_NAME AND ALL_OBJECTS.OWNER = 'PLAYGROUND' AND ALL_OBJECTS.OBJECT_TYPE <> 'SYNONYM' )
The above query will now output the following records:
PLAYGROUND.MY_TABLE <- PLAYGROUND.MY_TABLE_OLD PLAYGROUND.MY_TABLE <- PLAYGROUND.MY_TABLE_OLD <- PLAYGROUND.MY_TABLE_BAK PLAYGROUND.MY_TABLE <- PLAYGROUND.MY_TABLE_OLD <- PLAYGROUND.MY_TABLE_BAK <- PUBLIC.MY_TABLE_BAK2 PLAYGROUND.MY_TABLE <- PLAYGROUND.MY_TABLE_OLD <- PLAYGROUND.MY_TABLE_BAK <- PUBLIC.MY_TABLE_BAK2 <- PLAYGROUND.BAK_BACKUP_OLD
Impressive, eh?
Remark: In case you have stale synonyms
If you have “stale” synonyms, i.e. synonyms that point to nowhere, Oracle may report them to be pointing to themselves. That’s unfortunate and creates a CYCLE in CONNECT BY
. To prevent this from happening, simply add another predicate like so:
SELECT SUBSTR( sys_connect_by_path( s.TABLE_OWNER || '.' || s.TABLE_NAME, ' <- ' ) || ' <- ' || s.OWNER || '.' || s.SYNONYM_NAME, 5 ) FROM ( SELECT * FROM ( SELECT OWNER, SYNONYM_NAME, TABLE_OWNER, TABLE_NAME FROM ALL_SYNONYMS UNION ALL SELECT OWNER, SYNONYM_NAME, 'PUBLIC', TABLE_NAME FROM ALL_SYNONYMS ) s -- Add this predicate to prevent cycles WHERE (s.OWNER , s.SYNONYM_NAME) != ((s.TABLE_OWNER , s.TABLE_NAME)) ) s CONNECT BY s.TABLE_OWNER = PRIOR s.OWNER AND s.TABLE_NAME = PRIOR s.SYNONYM_NAME START WITH EXISTS ( SELECT 1 FROM ALL_OBJECTS WHERE s.TABLE_OWNER = ALL_OBJECTS.OWNER AND s.TABLE_NAME = ALL_OBJECTS.OBJECT_NAME AND ALL_OBJECTS.OWNER = 'PLAYGROUND' AND ALL_OBJECTS.OBJECT_TYPE <> 'SYNONYM' )
Can the above query be written in jOOQ?
Yes of course. In jOOQ, pretty much everything is possible, if you can write it in SQL. Here’s how we use a query similar to the above to resolve Oracle Synonmys in the jOOQ code generator:
// Some reusable variables AllObjects o = ALL_OBJECTS; AllSynonyms s1 = ALL_SYNONYMS; AllSynonyms s2 = ALL_SYNONYMS.as("s2"); AllSynonyms s3 = ALL_SYNONYMS.as("s3"); Field<String> dot = inline("."); String arr = " <- "; // The actual qeury DSL .using(configuration) .select( s3.OWNER, s3.SYNONYM_NAME, connectByRoot(s3.TABLE_OWNER).as("TABLE_OWNER"), connectByRoot(s3.TABLE_NAME).as("TABLE_NAME"), substring( sysConnectByPath( s3.TABLE_OWNER.concat(dot) .concat(s3.TABLE_NAME), arr ) .concat(arr) .concat(s3.OWNER) .concat(dot) .concat(s3.SYNONYM_NAME), 5 )) .from( select() .from( select( s1.OWNER, s1.SYNONYM_NAME, s1.TABLE_OWNER, s1.TABLE_NAME) .from(s1) .union( select( s1.OWNER, s1.SYNONYM_NAME, inline("PUBLIC"), s1.TABLE_NAME) .from(s1)) .asTable("s2")) .where(row(s2.OWNER, s2.SYNONYM_NAME) .ne(s2.TABLE_OWNER, s2.TABLE_NAME)) .asTable("s3")) .connectBy(s3.TABLE_OWNER.eq(prior(s3.OWNER))) .and(s3.TABLE_NAME.eq(prior(s3.SYNONYM_NAME))) .startWith(exists( selectOne() .from(o) .where(s3.TABLE_OWNER.eq(o.OWNER)) .and(s3.TABLE_NAME.eq(o.OBJECT_NAME)) .and(o.OBJECT_TYPE.ne("SYNONYM")) .and(o.OWNER.in(getInputSchemata())) )) .fetch();
Download jOOQ today and try it yourself!
Conclusion
If you have an intrinsically hierarchical data set, then you will be very unhappy with these simplistic hierarchical SQL features (also with commont table expressions). They don’t perform very well, and they’re very hard to express if hierarchies get more complex. So you may as well consider using an actual graph database like Neo4j.
But every now and then, a little hierarchy may sneak into your otherwise “standard” relational data model. When it does, be sure to have this useful CONNECT BY
clause ready for action.
CONNECT BY
is supported by (at least):
- CUBRID
- Informix
- Oracle
Recursive common table expressions (the SQL standard’s counterpart for CONNECT BY
are supported by (at least):
- DB2
- Firebird
- HSQLDB
- Oracle
- PostgreSQL
- SQL Server
- Sybase SQL Anywhere
and…
- H2 has some experimental support
In a future post, we’re going to be looking into how to do the same thing with recursive CTE.
Reference: | All You Ever Need to Know About Recursive SQL from our JCG partner Lukas Eder at the JAVA, SQL, AND JOOQ blog. |