Saturday, January 07, 2012

equals and hashCode on @Entity classes: just say no

I've come to the conclusion that you should avoid defining equals and hashCode on JPA @Entity objects unless you have a really good reason to.

Doing it right is non-trivial, with all the gotchas and caveats.  First, the logistics:
  • You can't use the primary key because multiple unsaved objects all have a null PK and would be "equal".  
  • Using the PK with object identity as fallback means an object won't be equal to itself before and after being persisted, and hash-based collections won't work properly if the hash code changes during the collection's lifespan.
  • A unique business key could work, if that's an option - but you have to remember to maintain the equals and hashCode method if those properties or relations change.  Also if that key isn't immutable, you have a similar problem as with PK + fallback above.
  • Creating a UUID in the constructor just to make equals and hashcode work is... yucky.
  • Often these methods are insufficiently covered by unit tests.
Second: there's the deeper question of what should "equals" mean in the context of a mutable business entity?  What comparison makes sense is often context-dependent.  Sometimes you might want to compare based on primary key, other times you might want to compare on some other property.

So what's the worst thing that happens if you just don't do anything, and take the default based on object identity?

Usually, you won't even miss these methods.  The default implementation will work fine unless you're trying to compare objects loaded in two different persistence sessions/transactions,with either equals() or contains().   I've found this is the exception case - much of my collection manipulation in practice is manipulating objects all loaded in the same session to hand off to the view layer, so HashSet recognizes duplicates and contains() works just fine.  And I almost never use an entity as a key in a hashmap.

The price you pay is more verbosity when you do need to compare objects between a session-scoped cache and the current request.  You have to remember that obj1.equals(obj2) doesn't work, and needs to be replaced with obj1.getId().equals(obj2.getId()).  Contains is a bit more verbose, and a utility method like containsById(collection, obj) might be helpful.  Some people will say it's confusing that equals doesn't "just work" but I find it less confusing to be explicit about what you're comparing on - and less confusing than a broken or unmaintained equals() method.

Finally, if this were C#, we wouldn't even have this discussion.  With closures and LINQ extension methods we would just say "collection.Where(obj => obj.id = foo.id)" and be done with it!

Related links:
http://community.jboss.org/wiki/EqualsAndHashCode

http://onjava.com/pub/a/onjava/2006/09/13/dont-let-hibernate-steal-your-identity.html?page=3

http://stackoverflow.com/questions/1929445/to-equals-and-hashcode-or-not-on-entity-classes-that-is-the-question

http://stackoverflow.com/questions/1638723/equals-and-hashcode-in-hibernate

https://forum.hibernate.org/viewtopic.php?t=928172

http://www.ibm.com/developerworks/java/library/j-jtp05273/index.html: "For mutable objects, the answer is not always so clear. Should equals() and hashCode() be based on the object's identity (like the default implementation) or the object's state (like Integer and String)? There's no easy answer -- it depends on the intended use of the class... It is not common practice to use a mutable object like a List as a key in a HashMap."

http://burtbeckwith.com/blog/?p=53
Post a Comment