C# provides many ways to compare objects, and they apply not only to class instances but also to structures. There are so many, in fact, that they need to be put in order. People who don’t understand how these options differ, or how each can be implemented, easily get confused.
Let’s go. System.Object provides the following methods:
[code lang=”csharp”]
public static bool ReferenceEquals(object objA, object objB)
{
    return objA == objB;
}

public static bool Equals(object objA, object objB)
{
    return objA == objB || (objA != null && objB != null && objA.Equals(objB));
}

public virtual bool Equals(object obj)
{
    return RuntimeHelpers.Equals(this, obj);
}
[/code]
And of course there is the equality operator:
[code lang=”csharp”]
public static bool operator == (Foo left, Foo right);
[/code]
You can also implement the IEquatable<T> and IStructuralEquatable interfaces.
ReferenceEquals
ReferenceEquals compares two references and returns true if they are identical. In other words, it compares identity rather than equality. When it is given two value-type instances it always returns false, because each argument is boxed separately and therefore gets a different reference.
It is also important to mention string comparison with this method. For example:
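The boxing effect can be seen in a minimal sketch like the one below. The class and method names are illustrative only.

[code lang=”csharp”]
using System;

static class BoxingDemo
{
    // Each int argument is boxed into its own object, so the references differ.
    public static bool CompareSameValue()
    {
        int x = 42;
        return object.ReferenceEquals(x, x);
    }

    // Boxing once and reusing the box gives the same reference.
    public static bool CompareSameBox()
    {
        object boxed = 42;
        return object.ReferenceEquals(boxed, boxed);
    }

    static void Main()
    {
        Console.WriteLine(CompareSameValue()); // False
        Console.WriteLine(CompareSameBox());   // True
    }
}
[/code]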
[code lang=”csharp”]
using System;

class Program
{
    static void Main(string[] args)
    {
        string a = "Hello";
        string b = "Hello";
        if (object.ReferenceEquals(a, b))
            Console.WriteLine("Same objects");
        else
            Console.WriteLine("Not the same objects");
        Console.ReadLine();
    }
}
[/code]
This program can output “Same objects”. The reason is string interning: identical string literals may share a single object. That is a slightly different story, so I won’t go into it here.
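Without going deep into interning, here is a small sketch of the difference between a literal and a string built at run time. The names are illustrative, and the second result assumes the runtime pools the literal, which is the usual behavior.

[code lang=”csharp”]
using System;

static class InterningDemo
{
    public static bool LiteralVsBuilt()
    {
        string literal = "Hello";
        // Built at run time, so it is a separate object and is not interned automatically.
        string built = new string(new[] { 'H', 'e', 'l', 'l', 'o' });
        return object.ReferenceEquals(literal, built);
    }

    public static bool LiteralVsInterned()
    {
        string literal = "Hello";
        string built = new string(new[] { 'H', 'e', 'l', 'l', 'o' });
        // string.Intern returns the pooled instance that has the same contents.
        return object.ReferenceEquals(literal, string.Intern(built));
    }

    static void Main()
    {
        Console.WriteLine(LiteralVsBuilt());    // False
        Console.WriteLine(LiteralVsInterned()); // True when the literal is in the intern pool
    }
}
[/code]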
Static Equals
[code lang=”csharp”]
public static bool Equals(object objA, object objB)
[/code]
This method first checks the instances for identity. If they are not identical, it checks both for null and then passes them to the virtual Equals method for comparison.
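A short sketch of the three paths through the static method follows. The helper names are made up for the illustration.

[code lang=”csharp”]
using System;

static class StaticEqualsDemo
{
    // Both null: the identity check succeeds and no virtual call is made.
    public static bool BothNull()
    {
        string a = null;
        string b = null;
        return object.Equals(a, b);
    }

    // One side null: returns false instead of throwing NullReferenceException.
    public static bool OneNull()
    {
        string a = null;
        return object.Equals(a, "text");
    }

    // Neither null and different instances: delegates to the virtual Equals,
    // which System.String overrides to compare contents.
    public static bool SameContents()
    {
        string c = "text";
        string copy = new string(c.ToCharArray());
        return object.Equals(c, copy);
    }

    static void Main()
    {
        Console.WriteLine(BothNull());     // True
        Console.WriteLine(OneNull());      // False
        Console.WriteLine(SameContents()); // True
    }
}
[/code]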
Virtual Equals
[code lang=”csharp”]
public virtual bool Equals(object obj)
[/code]
By default this method behaves exactly like ReferenceEquals. However, it is overridden for value types, and in System.ValueType it looks like this:
[code lang=”csharp”]
public override bool Equals(object obj)
{
    if (obj == null)
    {
        return false;
    }
    RuntimeType runtimeType = (RuntimeType)base.GetType();
    RuntimeType left = (RuntimeType)obj.GetType();
    if (left != runtimeType)
    {
        return false;
    }
    if (ValueType.CanCompareBits(this))
    {
        return ValueType.FastEqualsCheck(this, obj);
    }
    FieldInfo[] fields = runtimeType.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
    for (int i = 0; i < fields.Length; i++)
    {
        object obj2 = ((RtFieldInfo)fields[i]).InternalGetValue(this, false);
        object obj3 = ((RtFieldInfo)fields[i]).InternalGetValue(obj, false);
        if (obj2 == null)
        {
            if (obj3 != null)
            {
                return false;
            }
        }
        else
        {
            if (!obj2.Equals(obj3))
            {
                return false;
            }
        }
    }
    return true;
}
[/code]
Heaven forbid you use this implementation on a big number of objects. BCL developers know nothing about our value types, so unless the fast bitwise check applies, they compare the fields using reflection. It’s pretty obvious that this can degrade performance. That’s why you should override this method for your value types: no one except you knows how to compare them logically.
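For a value type, such an override might look like the sketch below. Point2D is a hypothetical struct, and the hash formula is just one common choice, not something prescribed by the BCL.

[code lang=”csharp”]
using System;

// Hypothetical value type used only for illustration.
struct Point2D : IEquatable<Point2D>
{
    public readonly int X;
    public readonly int Y;

    public Point2D(int x, int y) { X = x; Y = y; }

    // Strongly typed comparison: no boxing and no reflection.
    public bool Equals(Point2D other)
    {
        return X == other.X && Y == other.Y;
    }

    // Keeps object-based callers consistent with the typed overload.
    public override bool Equals(object obj)
    {
        return obj is Point2D && Equals((Point2D)obj);
    }

    // Equal values must produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked { return (X * 397) ^ Y; }
    }
}

static class Program
{
    static void Main()
    {
        var a = new Point2D(1, 2);
        var b = new Point2D(1, 2);
        Console.WriteLine(a.Equals(b));         // True
        Console.WriteLine(a.Equals((object)b)); // True
        Console.WriteLine(a.Equals("1,2"));     // False
    }
}
[/code]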
It’s not necessary to override this method for reference types if you don’t want to compare them by their internal field values.
Let’s look at an example that overrides this method correctly and also implements IEquatable<T>:
[code lang=”csharp”]
class Vehicle : IEquatable<Vehicle>
{
    protected int speed;
    public int Speed
    {
        get { return this.speed; }
        set { this.speed = value; }
    }

    protected string name;
    public string Name
    {
        get { return this.name; }
        set { this.name = value; }
    }

    public Vehicle() { }

    public Vehicle(int speed, string name)
    {
        this.speed = speed;
        this.name = name;
    }

    public override bool Equals(object other)
    {
        // The checks must go in exactly this order.
        // Without the null check, "other.GetType()" below can
        // throw NullReferenceException.
        if (other == null)
            return false;
        // If both references point to the same object, equality is guaranteed.
        if (object.ReferenceEquals(this, other))
            return true;
        // If this type is at the top of a class hierarchy, or simply has no
        // inheritors, you can instead do the following:
        // Vehicle tmp = other as Vehicle; if (tmp == null) return false;
        // After that you can immediately call this.Equals(tmp).
        if (this.GetType() != other.GetType())
            return false;
        return this.Equals(other as Vehicle);
    }

    public bool Equals(Vehicle other)
    {
        if (other == null)
            return false;
        // The reference check is optional here. Keep it if you expect many
        // comparisons to be between the same instance.
        if (object.ReferenceEquals(this, other))
            return true;
        // If base and derived instances may be treated as equal, skip this
        // type check and go straight to comparing the fields.
        if (this.GetType() != other.GetType())
            return false;
        return string.Compare(this.Name, other.Name, StringComparison.CurrentCulture) == 0
            && this.speed.Equals(other.speed);
    }
}
[/code]
The comment about the top of the hierarchy is motivated by the following situation.
If you create a class derived from Vehicle (Bike, for example) that overrides the virtual Equals method and casts the argument like this
[code lang=”csharp”]
Bike tmp = other as Bike;
if (tmp != null)
    return this.Equals(tmp);
[/code]
rather than checking the types with GetType, then the following code can cause a problem:
[code lang=”csharp”]
Vehicle vehicle = new Vehicle();
Bike bike = new Bike();
object vehicleObj = vehicle;
object bikeObject = bike;
// A base-type instance can’t be cast to the inheritor, so this call returns
// false even if the reverse comparison treats the two as equal. Thus, the
// symmetry property of comparing objects is violated.
bike.Equals(vehicleObj);
[/code]
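The asymmetry is easiest to see in a self-contained sketch. The classes below are simplified stand-ins for Vehicle and Bike, and the sketch assumes that both of them compare through an “as” cast instead of GetType; the Gears field is hypothetical.

[code lang=”csharp”]
using System;

// Simplified stand-in: uses an "as" cast instead of comparing GetType().
class Vehicle
{
    public string Name = "X";

    public override bool Equals(object other)
    {
        Vehicle tmp = other as Vehicle;   // a Bike passes this cast
        return tmp != null && Name == tmp.Name;
    }

    public override int GetHashCode() { return Name.GetHashCode(); }
}

class Bike : Vehicle
{
    public int Gears = 21;                // hypothetical extra field

    public override bool Equals(object other)
    {
        Bike tmp = other as Bike;         // a plain Vehicle fails this cast
        return tmp != null && base.Equals(tmp) && Gears == tmp.Gears;
    }

    public override int GetHashCode() { return base.GetHashCode() ^ Gears; }
}

static class Program
{
    static void Main()
    {
        object vehicleObj = new Vehicle();
        object bikeObject = new Bike();
        Console.WriteLine(vehicleObj.Equals(bikeObject)); // True
        Console.WriteLine(bikeObject.Equals(vehicleObj)); // False
    }
}
[/code]

Comparing this.GetType() with other.GetType() in both classes makes both calls return false, which restores symmetry.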
Equality Operator ==
[code lang=”csharp”]
public static bool operator == (Foo left, Foo right)
[/code]
As a rule of thumb, for value types you should always overload the equality operator (together with !=, which C# requires to be overloaded as a pair) as well as override the virtual Equals method.
It’s better not to overload the equality operator for reference types, because developers expect it to behave like ReferenceEquals.
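For a value type, the operators can simply forward to Equals, as in this sketch. Money is a hypothetical struct; note that without these overloads the expression a == b would not even compile for a user-defined struct.

[code lang=”csharp”]
using System;

// Hypothetical value type; the operators forward to the typed Equals.
struct Money : IEquatable<Money>
{
    public readonly decimal Amount;
    public readonly string Currency;

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    public bool Equals(Money other)
    {
        return Amount == other.Amount
            && string.Equals(Currency, other.Currency, StringComparison.Ordinal);
    }

    public override bool Equals(object obj)
    {
        return obj is Money && Equals((Money)obj);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return (Amount.GetHashCode() * 397) ^ (Currency == null ? 0 : Currency.GetHashCode());
        }
    }

    // == and != must be overloaded as a pair.
    public static bool operator ==(Money left, Money right) { return left.Equals(right); }
    public static bool operator !=(Money left, Money right) { return !left.Equals(right); }
}

static class Program
{
    static void Main()
    {
        var a = new Money(10m, "USD");
        var b = new Money(10m, "USD");
        var c = new Money(10m, "EUR");
        Console.WriteLine(a == b); // True
        Console.WriteLine(a != c); // True
    }
}
[/code]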
IStructuralEquatable
IStructuralEquatable goes hand in hand with the IEqualityComparer interface. IStructuralEquatable is implemented by classes such as System.Array and System.Tuple. As Bill Wagner says, IStructuralEquatable declares that an implementer can compose larger objects with value-type semantics. You are unlikely ever to implement this interface yourself, though there is nothing special about its implementation. It’s enough to look at how System.Array implements it:
[code lang=”csharp”]
bool IStructuralEquatable.Equals(object other, IEqualityComparer comparer)
{
    if (other == null)
    {
        return false;
    }
    if (object.ReferenceEquals(this, other))
    {
        return true;
    }
    Array array = other as Array;
    if (array == null || array.Length != this.Length)
    {
        return false;
    }
    for (int i = 0; i < array.Length; i++)
    {
        object value = this.GetValue(i);
        object value2 = array.GetValue(i);
        if (!comparer.Equals(value, value2))
        {
            return false;
        }
    }
    return true;
}
[/code]
The algorithm here is the following:
- Check for null.
- Check for identity.
- Cast to the underlying type and compare the lengths.
- If the lengths are equal, compare item by item, delegating each comparison to the IEqualityComparer.Equals method.
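From the caller’s side, this algorithm can be exercised as in the sketch below, which passes the comparer that the BCL exposes as StructuralComparisons.StructuralEqualityComparer. The class and method names are illustrative.

[code lang=”csharp”]
using System;
using System.Collections;

static class StructuralDemo
{
    // Array does not override the virtual Equals, so this compares references.
    public static bool VirtualEquals(int[] first, int[] second)
    {
        return first.Equals(second);
    }

    // The explicit interface implementation compares element by element,
    // using the comparer passed in.
    public static bool StructuralEquals(int[] first, int[] second)
    {
        IStructuralEquatable structural = first;
        return structural.Equals(second, StructuralComparisons.StructuralEqualityComparer);
    }

    static void Main()
    {
        int[] first = { 1, 2, 3 };
        int[] second = { 1, 2, 3 };
        Console.WriteLine(VirtualEquals(first, second));    // False
        Console.WriteLine(StructuralEquals(first, second)); // True
    }
}
[/code]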
That’s all I can say about comparing objects in C#. One thing is left, though: the GetHashCode() method.
GetHashCode
[code lang=”csharp”]
public virtual int GetHashCode()
[/code]
Basically, the standard implementation of this method returns an identifier tied to the object instance rather than to its contents; it is not guaranteed to be unique.
The disadvantage of this approach is that semantically identical objects may return different hash values. Richter also complains that the standard implementation is slow.
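Hash-based collections make the consequence visible. In the sketch below, Tag is a hypothetical class whose GetHashCode is derived from the same field that Equals compares, so a second instance with the same contents can be found in a HashSet.

[code lang=”csharp”]
using System;
using System.Collections.Generic;

// Hypothetical class: equality is by Name, and GetHashCode follows the same rule.
sealed class Tag
{
    public readonly string Name;

    public Tag(string name) { Name = name; }

    public override bool Equals(object obj)
    {
        Tag other = obj as Tag;
        return other != null && string.Equals(Name, other.Name, StringComparison.Ordinal);
    }

    // Derived from the field that Equals compares, so equal tags hash alike.
    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}

static class Program
{
    static void Main()
    {
        var tags = new HashSet<Tag> { new Tag("csharp") };
        // A different instance with the same Name is found only because
        // GetHashCode is overridden consistently with Equals.
        Console.WriteLine(tags.Contains(new Tag("csharp"))); // True
    }
}
[/code]

With only Equals overridden, the lookup would start from the default, identity-based hash code and would most likely miss the stored item.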
Implementing GetHashCode correctly is tricky. It has to calculate a hash value very fast and spread the values widely enough that collisions are rare. In most cases, though, the implementation is simple: it relies on bit shifts, bitwise OR, bitwise AND and XOR. Richter himself gives an example of a type with two Int32 fields and suggests implementing GetHashCode like this:
[code lang=”csharp”]
sealed class Point
{
    private int a;
    private int b;

    public override int GetHashCode()
    {
        return a ^ b;
    }
}
[/code]
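One property of plain XOR is worth keeping in mind: it ignores the order of the fields, and any pair of equal values hashes to zero. The sketch below compares it with a multiply-then-XOR variant; that variant and the constant 397 are an assumption for illustration, not Richter’s code.

[code lang=”csharp”]
using System;

static class HashCombineDemo
{
    // Plain XOR of the two fields.
    public static int Xor(int a, int b)
    {
        return a ^ b;
    }

    // Multiply the first field by an odd constant before XOR,
    // so that swapping the fields changes the result.
    public static int MultiplyXor(int a, int b)
    {
        unchecked { return (a * 397) ^ b; }
    }

    static void Main()
    {
        Console.WriteLine(Xor(1, 2) == Xor(2, 1));                 // True: a collision
        Console.WriteLine(Xor(5, 5));                              // 0
        Console.WriteLine(MultiplyXor(1, 2) == MultiplyXor(2, 1)); // False
    }
}
[/code]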
And here is how GetHashCode is overridden in System.Char:
[code lang=”csharp”]
public override int GetHashCode()
{
    return (int)this | (int)this << 16;
}
[/code]
ReSharper generates something similar by default, combining the fields with multiplication and XOR.
That’s all. Thank you for your attention.