One segment of the Horse Racing Information System I worked on in college involved finding horses from a given race that had raced against each other.
This is the more general example where I have a given horse and wish to find all the horses they have run against:
SELECT h2.horsename FROM horse h1 JOIN raceentrants r1 ON h1.horseid = r1.horseID JOIN raceentrants r2 ON r1.raceID = r2.raceID JOIN horse h2 ON h2.horseid = r2.horseID WHERE r1.horseID =1 AND h2.horsename NOT IN ( 'Horse A' );
My representation of the steps involved are:
- Select the correct horse from the Horse table
- Join this to the RaceEntrants table on horseId to find all the horse's races
- Join this to the RaceEntrants table on raceId to find all other entrants in these races
- Join this to the Horse table to find the other horses
- Make sure we do not return the original horse
If I really wantd just the names I could have used 'distinct' but it is more likely that in a ful database you would be returning on all horse-race combinations.
This is quite cumbersome but it gets worse ...
The following query is the similar to the query above but for all the horses in a given race. I preconfigured 'arace' as a view of all the horses to reduce the clunkiness.
SELECT h1.horsename AS Horse, h3.horsename AS Raced_Against, race.* FROM horse h1 JOIN arace a1 ON h1.horseid = a1.horseID JOIN raceentrants r1 ON a1.horseid = r1.horseID JOIN raceentrants r2 ON r1.raceID = r2.raceID JOIN raceentrants r3 ON r2.horseID = r3.horseID JOIN raceentrants r4 ON r4.raceID = r3.raceID JOIN arace a2 ON r4.horseID = a2.horseID JOIN horse h2 ON h2.horseid = a2.horseID JOIN horse h3 ON h3.horseid = r2.horseid JOIN race on race.raceID = r2.raceID WHERE h2.horsename <> h1.horsename AND r2.horseID NOT IN ('1','2')
That got scary quick and I thought I had noticed a mistake in line 9 but transitivity saved the day. I actually relish working on stuff this complex but, unfortunately, my processor does not.
Joins are functions that produce Cartesian Products. This means that growth is exponential. In contrast, with Neo4j, I can set up the database with relationships from horses to the races. Now I can add all of each horse's races to a set and use some variant of the following:
public Setcompare(Horse otherHorse){ TreeSet bothRaced = new TreeSet (); for (Race race : thisHorsesRaces){ if (otherHorse.hasRacedIn(race)){ bothRaced.add(race); } } return bothRaced; }
Nice examples! Thanks a lot for sharing.
ReplyDeleteDo you still use Neo4j? Would be interesting where you've gotten in they year in between.
Did you look into Cypher?