Site icon Benji's Blog

Joins in pure-java database queries

I’ve been thinking about how to express joins in my pure Java query library.

Hibernate and other ORMs allow you to map collection fields via join tables. They will take care of cascading inserts and deletes from multiple tables for you.

While this is nice and easy, it can to cause problems with bigger object graphs because it hides the performance implications of what it is going on from you. You may appear to be saving a single object but that might translate into modifying millions of rows.

There is still more complexity, as you can map the same relational structure into different Java datatypes. Different hibernate collections have different performance characteristics. Sets have a query performance penalty from ensuring uniqueness, Lists have an ordering penalty, and bags have a persistence cost from recreating the collection on save.

This complexity makes me think that transparently doing joins and mapping collection properties is not a good idea at least without a lot of thought.

The nice thing about what I had so far was that there were no Strings needed (Except for values). All properties were referred to as Java properties. There was also no code generation needed (Unlike JOOQ).

This is what I’ve come up with for creating and querying many to many relationships. Let’s say we have Person and Conspiracy entities. Persisted as per my last post.

class Person {
	String getFirstName();
	String getLastName();
	// ...
}
class Conspiracy {
	String getName();	
	// ...
}

A person can be in multiple conspiracies, and conspiracies can have many people involved, so we need another relationship table in the database schema.

I’ve got the following code creating a conspiracy_person table with appropriate foreign keys. It’s reasonably nice. Unfortunately we have to have separate fieldLeft and fieldRight methods for fields referencing the right and left hand side of the relationship. It’s type erasure’s fault as usual. We can’t have a methods that differ only by their generic type parameters.

create(relationship(Conspiracy.class, Person.class))
    .fieldLeft(Conspiracy::getName)
    .fieldRight(Person::getFirstName)
    .fieldRight(Person::getLastName)
    .execute(this::openConnection);

Equivalent to

CREATE TABLE IF NOT EXISTS conspiracy_person ( 
    conspiracy_name text, 
    person_first_name text, 
    person_last_name text 
); 

We can delete in the same manner as we did with simple tables. Again I can’t see a way to avoid having left/right in the method names.

delete(relationship(Conspiracy.class, Person.class))
    .whereLeft(Conspiracy::getName)
    .equalTo("nsa")
    .andRight(Person::getLastName)
    .equalTo("smith")
    .execute(this::openConnection);

Equivalent to

DELETE FROM conspiracy_person WHERE conspiracy_name = ? AND person_last_name = ?;

Now for saving a collection. Saving all the items in a collection at once is something we’re likely to want to do. We can do the following and for each person in our conspiracy, persist the relationship between the conspiracy and the person. This is going to be slow, but at least we’ve made it obvious what we’re doing.

nsa.getMembers().forEach(agent -> {
    insert(nsa, agent)
        .valueLeft(Conspiracy::getName)
        .valueRight(Person::getLastName)
        .valueRight(Person::getFirstName)
        .execute(this::openConnection);
});

Equivalent to repeated

INSERT INTO conspiracy_person (
    conspiracy_name, 
    person_last_name, 
    person_first_name
) VALUES ( ?, ?, ? )

Finally, let’s query using our new join tables.

Mapper personMapper = Mapper.mapper(Person::new)
    .set(Person::setFirstName)
    .set(Person::setLastName)
    .set(Person::setFavouriteNumber);

Optional person = from(Person.class)
    .where(Person::getLastName)
    .equalTo("smith")
    .join(relationship(Conspiracy.class, Person.class).invert())
    .using(Person::getFirstName, Person::getLastName)
    .join(Conspiracy.class)
    .using(Conspiracy::getName)
    .where(Conspiracy::getName)
    .equalTo("nsa")
    .select(personMapper, this::openConnection);

This generates and executes the following SQL, and returns us an instance of a Person.

SELECT * FROM person 
JOIN conspiracy_person 
ON person.first_name = conspiracy_person.person_first_name 
AND person.last_name = conspiracy_person.person_last_name 
JOIN conspiracy 
ON conspiracy_person.conspiracy_name = conspiracy.name 
WHERE person.last_name = ? 
AND conspiracy.name = ?

There are some slightly clunky things. You have to specify any “where” conditions before joining to another table. This is so the type system can help you pass properties on the correct type. In the above example the first where call that takes a Person::getLastName would fail to compile if passed a Conspiracy::getName. This annoyingly means the query reads in a different order to the SQL generated.

You can also only join on tables in an chainable order. If you are querying table A you can only join to table B if there is a key to join to A with. If you query table A, then join to B, you can not then join to another table that could join to A. This is obviously a problem for many queries and needs addressing, but type erasure has made my head hurt enough for one afternoon :)

There’s lots of other things to think about too. e.g. More complex boolean join conditions. More complex join tables; One-to-many relationships; Pluggable naming conventions/Database support.

The code is on github here.