Published: 14 Jul 2008
By: Dino Esposito

Dino Esposito discusses object equality.

Introduction

The .NET Framework has existed since early 2002, not to mention the pre-release versions that were available for two years before the official release. In the past six years, five different versions of the .NET Framework have succeeded one another. Six years in software development is like a geological era; nonetheless, in all this time I never cared about the Equals method of a .NET object.

Like many of you, in six years I have probably written thousands of lines of code that compare .NET objects for equality. Most of the time I did that through the equality operator, in both C# and Visual Basic. Only a few times did I decide to use the Equals method to check two objects for equality. The equality operator (= or ==) is not necessarily the same as Equals but, at the end of the day, it worked well for me for over six years.

So comparing .NET objects for equality has never been a problem for me, and I never felt the need to override the Equals method. Not until recently, at least. I happened to be involved in the creation of a custom object model to represent the domain of a problem. At some point, we had to implement a caching mechanism and needed to know whether a given object just fetched from the database had a match in the cache. Clearly, taken as individual instances of some type, the two objects were different objects. However, looking at the real content stored in each, they could be the same object. How would you check which is the case?

The simple answer entails checking all the values in the members of the object. This approach works and is the only possible approach to the problem. The key point is something else entirely: how do you make it simple to code and easy to use for developers? Enter the Equals method and its surroundings.

Background on Equals

The Equals method is defined on the System.Object class and it determines whether the specified object is equal to the current object. Here’s the method’s signature:

The Equals method on the System.Object class has a relatively general implementation that, for reference types, only looks at the memory address of the objects being compared. If the objects happen to have the same address they are considered the same object and Equals returns true. This is also the default behavior of the C# == operator on reference types and, honestly, it represents an approach that works well in the vast majority of situations.
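For reference, System.Object declares the method as public virtual bool Equals(object obj). The default behavior described above can be observed with a minimal sketch; the Customer class and the demo method names are hypothetical and exist only for illustration.

```csharp
// System.Object declares the method as:
//     public virtual bool Equals(object obj)

// Hypothetical class that does NOT override Equals,
// so it inherits the implementation from System.Object.
public class Customer
{
    public string Name;
}

public static class DefaultEqualsDemo
{
    // Two separate instances with identical content:
    // the inherited Equals only compares references, so this returns false.
    public static bool SameContentDifferentInstances()
    {
        Customer a = new Customer { Name = "Contoso" };
        Customer b = new Customer { Name = "Contoso" };
        return a.Equals(b);
    }

    // Two variables pointing at the same instance: returns true.
    public static bool SameInstance()
    {
        Customer a = new Customer { Name = "Contoso" };
        Customer alias = a;
        return a.Equals(alias);
    }

    // The == operator on a class without operator overloads
    // also compares references, so this returns false.
    public static bool OperatorOnDifferentInstances()
    {
        Customer a = new Customer { Name = "Contoso" };
        Customer b = new Customer { Name = "Contoso" };
        return a == b;
    }
}
```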

If we delve deeper into the details, though, we find a couple of interesting points. First, the Equals method is designed as a tool for checking object equality. It is not really a tool to check object identity. To check object identity (that is, whether two references point to the same memory address) you are better off using the static ReferenceEquals method. As you can see, the implementation of the method couldn’t be clearer:
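In the Framework source of that era, the body of Object.ReferenceEquals amounts to a plain reference comparison of its two arguments (return objA == objB). The sketch below does not reproduce the Framework code; it only exercises the public method, and the class and method names are assumptions made for the example.

```csharp
using System;

public static class IdentityDemo
{
    // Two variables that refer to the same instance: identity holds.
    public static bool SameInstance()
    {
        object a = new object();
        object b = a;
        return Object.ReferenceEquals(a, b);
    }

    // Two separately created instances: identity does not hold.
    public static bool DistinctInstances()
    {
        object a = new object();
        object b = new object();
        return Object.ReferenceEquals(a, b);
    }

    // A value type is boxed once per argument, so even the "same"
    // variable yields two different boxes and the check fails.
    public static bool BoxedValueAgainstItself()
    {
        int n = 5;
        return Object.ReferenceEquals(n, n);
    }
}
```

The last case is a reminder that ReferenceEquals is only meaningful for reference types.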

The Equals method is designed to be overridden in derived classes to provide a class-specific test for equality that is fully consistent with the class’s internal implementation. As an example, consider how the String class implements Equals. The code below is an excerpt from .NET Reflector:

The EqualsHelper method is an internal method that just compares the characters of the two strings one by one. The Equals method is also explicitly bound to the == operator, as the excerpt below demonstrates:
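The observable effect of String overriding Equals and overloading == can be checked with a short sketch. This is not the Reflector excerpt; it only uses the public API, and the helper Build is a hypothetical name introduced to obtain string instances that are created at run time rather than shared literals.

```csharp
using System;

public static class StringEqualityDemo
{
    // Creates a fresh string instance on every call, so two results
    // have the same content but live at different addresses.
    private static string Build()
    {
        return new string(new char[] { 'a', 'b', 'c' });
    }

    // Identity check: two distinct instances, so this returns false.
    public static bool SameReference()
    {
        return Object.ReferenceEquals(Build(), Build());
    }

    // String.Equals compares content, so this returns true.
    public static bool EqualsMethod()
    {
        return Build().Equals(Build());
    }

    // The == operator on string is bound to the same content check.
    public static bool EqualityOperator()
    {
        return Build() == Build();
    }
}
```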

So far so good; but what’s the whole point here? I want to be able to check my objects for equality using either the == operator or the Equals method. In doing so, I want to use a content-aware algorithm over which I have full control. As a result, I had to override the Equals method on all the classes in my object model where equality was a key factor. Note that in C# overriding Equals does not by itself change what == does; for the operator to follow the same logic, the class also has to overload == and !=, as String does.

Overriding Equals on a Custom Class

There are quite a few rules to fulfill when you override Equals on a custom class. In particular, your method implementation must guarantee that objects of different types are never reported as equal. In addition, if the object to compare is null, the two objects must be reported as different. Finally, if the objects share the same memory address (i.e., they are the same instance), then they must be equal. Here’s some sample code for a class named Order:
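A minimal sketch of the three guard checks on an Order class might look as follows. It is an illustration under stated assumptions, not the article's original listing: the final return false is a placeholder for the class-specific comparison, and GetHashCode simply defers to the base class, which is consistent as long as only the same instance counts as equal.

```csharp
using System;

public class Order
{
    public override bool Equals(object obj)
    {
        // Rule: comparing against null always yields "different".
        if (obj == null)
            return false;

        // Rule: the very same instance is always equal to itself.
        if (Object.ReferenceEquals(this, obj))
            return true;

        // Rule: objects of different types are never equal.
        if (GetType() != obj.GetType())
            return false;

        // Placeholder: class-specific comparison goes here. Until it is
        // written, two distinct instances are treated as different.
        return false;
    }

    // Overridden to avoid the compiler warning; the base implementation
    // is adequate while equality is still limited to the same instance.
    public override int GetHashCode()
    {
        return base.GetHashCode();
    }
}
```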

In addition to these checks, you should implement your own logic for object equality. This logic largely depends on the class involved and mostly consists of checking a few key properties for value equality. For an Order type, it may be sufficient to check the OrderID property. In this case, you leverage your knowledge of the class and the underlying object model: the OrderID property represents a table primary key, and two orders with the same ID are to be considered the same object. Here’s a revised version of the previous code:
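One way to sketch the ID-based comparison is shown below. The int type of OrderID is an assumption, and the == and != overloads are an addition not described in the text above; they are included because overriding Equals alone does not change what the == operator does for a class.

```csharp
using System;

public class Order
{
    // Assumed to map to the table primary key; int is an assumption.
    public int OrderID { get; set; }

    public override bool Equals(object obj)
    {
        if (obj == null)
            return false;
        if (Object.ReferenceEquals(this, obj))
            return true;
        if (GetType() != obj.GetType())
            return false;

        // Class-specific logic: same primary key means same order.
        Order other = (Order)obj;
        return OrderID == other.OrderID;
    }

    // Equal orders share the same OrderID, so they share the same hash.
    public override int GetHashCode()
    {
        return OrderID.GetHashCode();
    }

    // The static Object.Equals handles nulls and then calls a.Equals(b),
    // so the operator follows the same logic as the method.
    public static bool operator ==(Order a, Order b)
    {
        return Object.Equals(a, b);
    }

    public static bool operator !=(Order a, Order b)
    {
        return !Object.Equals(a, b);
    }
}
```

With this in place, new Order { OrderID = 7 } and a second instance with the same ID compare as equal through both Equals and ==.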

Needless to say, the logic that determines object equality is up to you. You can make it as complex as you wish and need. In the case of a domain model, it also depends on the business rules you have to build into the objects. For an Order class, it is indisputable that two objects with a different ID are different. But what if you have two Order instances with the same ID but different values for some other properties, such as OrderDate and Status? Should Equals return true or false? Again, it depends on your view of the model and how you’re using it. In general, I would feel a bit unsafe letting Equals return true if only the OrderID property matches. I would rather employ more sophisticated logic in Equals to check the value of all properties and return true only when all properties match. At the same time, it might be useful to know when two instances point to the same logical record, that is, when only the OrderID property matches. This can be obtained by defining an extra method, as below:
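The split between full equality and logical identity can be sketched as follows. The method name IsSameEntity is hypothetical, as are the property types (int, DateTime, string); the property names come from the discussion above.

```csharp
using System;

public class Order
{
    public int OrderID { get; set; }
    public DateTime OrderDate { get; set; }
    public string Status { get; set; }

    // Full equality: every property has to match.
    public override bool Equals(object obj)
    {
        if (obj == null)
            return false;
        if (Object.ReferenceEquals(this, obj))
            return true;
        if (GetType() != obj.GetType())
            return false;

        Order other = (Order)obj;
        return OrderID == other.OrderID
            && OrderDate == other.OrderDate
            && Status == other.Status;
    }

    // Logical identity: same record, possibly in a different state.
    // The name IsSameEntity is an assumption made for this sketch.
    public bool IsSameEntity(Order other)
    {
        return other != null && OrderID == other.OrderID;
    }

    // Objects that are equal necessarily share the OrderID,
    // so hashing on it alone keeps Equals and GetHashCode consistent.
    public override int GetHashCode()
    {
        return OrderID.GetHashCode();
    }
}
```

Two instances with the same OrderID but a different Status are then reported as different by Equals, while IsSameEntity still recognizes them as versions of the same record.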

A method to check whether two instances of the same type refer to the same logical entity is not a far-fetched idea. In the implementation of the domain model, it can help you to quickly figure out whether the two instances refer to different states and versions of the same entity.

Hash Codes

If you override the Equals method on a custom class and then compile, you receive a warning if you do not also override the GetHashCode method. Why is that? What’s the inner relationship between the two methods?

In the first place, there’s no direct relationship between the behavior of Equals and GetHashCode. A subtle dependency exists between the two methods because of a guideline for the .NET Base Class Library: all objects in the .NET Framework should be easy to place and manage in a hash table. For this to happen, each object must return a hash code that reflects its content and state. Hash tables require that two contained objects return the same hash value if they’re equal. This is why a dependency comes into existence between Equals and GetHashCode. When you override Equals, the compiler expects you to override GetHashCode too, so that the generation of the hash code respects the new algorithm for equality.

GetHashCode is expected to return an integer, and it is your responsibility to build that integer out of a bunch of properties, not necessarily of type string. The value cannot be truly unique, since there are only so many integers, but equal objects must return the same value and different objects should collide as rarely as possible. One possibility is creating a string that includes identifying information such as type, properties and values (e.g., a JSON serialization of the object) and returning the hash code of the resulting string.
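The string-based approach can be sketched as follows. Instead of a JSON serializer, this example joins the type name and the property values with a separator, which is a simplification of the idea; the property types and the demo class are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public class Order
{
    public int OrderID { get; set; }
    public DateTime OrderDate { get; set; }
    public string Status { get; set; }

    public override bool Equals(object obj)
    {
        if (obj == null)
            return false;
        if (Object.ReferenceEquals(this, obj))
            return true;
        if (GetType() != obj.GetType())
            return false;

        Order other = (Order)obj;
        return OrderID == other.OrderID
            && OrderDate == other.OrderDate
            && Status == other.Status;
    }

    // Builds a string from the type name and every property that takes
    // part in Equals, then returns the hash code of that string.
    public override int GetHashCode()
    {
        string key = string.Format(
            CultureInfo.InvariantCulture,
            "{0}|{1}|{2}|{3}",
            GetType().FullName, OrderID, OrderDate.Ticks, Status);
        return key.GetHashCode();
    }
}

public static class HashDemo
{
    // An equal but distinct instance is found as a dictionary key
    // only because Equals and GetHashCode agree with each other.
    public static bool FoundInDictionary()
    {
        Dictionary<Order, string> cache = new Dictionary<Order, string>();
        cache[new Order { OrderID = 1, OrderDate = new DateTime(2008, 7, 14), Status = "Open" }] = "cached";

        Order probe = new Order { OrderID = 1, OrderDate = new DateTime(2008, 7, 14), Status = "Open" };
        return cache.ContainsKey(probe);
    }
}
```

Because the hash depends on mutable properties, changing an Order after it has been used as a dictionary key would make it impossible to find again; hashing on an immutable key such as OrderID alone avoids that at the cost of more collisions.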

Summary

Equals and GetHashCode are two methods on the System.Object class that few developers have cared about for years. Working with domain models (either handmade models or models created using O/RM tools such as NHibernate) raises the need to use custom algorithms to check object equality. In this article, I provided an overview of the issue and what it really means to override Equals and GetHashCode.

A fuller explanation of the whys and wherefores of object equality in the .NET Framework is provided by Jeffrey Richter in his “Applied Microsoft .NET Framework Programming” book from Microsoft Press. An NHibernate-specific perspective on the problem can be read here.

About Dino Esposito

Dino Esposito is one of the world's authorities on Web technology and software architecture. Dino published an array of books, most of which are considered state-of-the-art in their respective areas. His most recent books are “Microsoft ® .NET: Architecting Applications for the Enterprise” and “...

Discussion

Damon Carr, 7/25/2008 5:38 PM: Dino, Thanks for getting this critical knowledge out there