Understanding Hashable, Equatable, and Set Membership

Posted 7/1/2018.

The Hashable protocol in the Swift Standard Library allows us to use our own custom types as a key in a dictionary or as a member of a set. Conforming to Hashable where appropriate can make our code safer and improve performance. However, it's important to understand how Hashable and Equatable work together in dictionary lookups and when determining set membership in order to avoid unintended behavior.

Value Types

First, let's look at an implementation of Hashable with a value type, for example a struct that holds an RGB color value:

struct RGBColor: Hashable {

    let red: Double
    let green: Double
    let blue: Double
    let alpha: Double

    static func ==(lhs: RGBColor, rhs: RGBColor) -> Bool {
        return lhs.red == rhs.red && lhs.green == rhs.green
            && lhs.blue == rhs.blue && lhs.alpha == rhs.alpha
    }

    var hashValue: Int {
        return red.hashValue ^ green.hashValue ^ blue.hashValue ^ alpha.hashValue
    }

}

This is pretty standard. The Equatable conformance checks that each component value is identical. To compute the hash value, we just XOR the hash values of the components. In fact, as of Swift 4.1 the compiler automatically synthesizes implementations of Equatable and Hashable for structs whose stored properties all conform to those protocols (and for enums whose associated values do), so in most cases we only need to declare the conformance rather than write it ourselves. Taking advantage of automatic synthesis is recommended, since Swift uses a better hashing function than the common XOR approach used here.
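As a minimal sketch of what relying on synthesis looks like, declaring the conformance is enough when every stored property is itself Hashable. The type name SynthesizedColor is hypothetical, used only to keep it distinct from RGBColor above:

```swift
// Hypothetical type for illustration; not part of the original example.
// All stored properties are Hashable, so the compiler synthesizes
// both == and the hashing implementation (Swift 4.1 or later).
struct SynthesizedColor: Hashable {
    let red: Double
    let green: Double
    let blue: Double
    let alpha: Double
}

let a = SynthesizedColor(red: 0.5, green: 0.5, blue: 0.5, alpha: 1.0)
let b = SynthesizedColor(red: 0.5, green: 0.5, blue: 0.5, alpha: 1.0)
let c = SynthesizedColor(red: 0.5, green: 0.5, blue: 0.5, alpha: 0.5)

print(a == b)                      // true: every component matches
print(a.hashValue == b.hashValue)  // true: equal values always hash equally
print(a == c)                      // false: alpha differs
```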

Let's see how this works with Swift sets:

let gray = RGBColor(red: 0.5, green: 0.5, blue: 0.5, alpha: 1.0)
let identicalGray = RGBColor(red: 0.5, green: 0.5, blue: 0.5, alpha: 1.0)

let colors: Set<RGBColor> = [gray]
print(colors.contains(identicalGray))

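This code prints true. So why does Hashable inherit from Equatable? Because hash values can collide: two different values may produce the same hash value, so the hash alone cannot tell us whether two values are the same. Let's take two different colors, gray with 50% alpha and white: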

let gray = RGBColor(red: 0.5, green: 0.5, blue: 0.5, alpha: 0.5)
let white = RGBColor(red: 1, green: 1, blue: 1, alpha: 1)

The equality check should print false:

print(gray == white)

However, the following will print true:

print(gray.hashValue == white.hashValue)

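Our simple XOR approach results in hash values of 0 for both colors, because XORing four identical values cancels them out. If sets relied only on the hash value to determine membership, this collision would result in unintended behavior: a set containing gray would report that it also contains white. Instead, the hash value only narrows the search, and == makes the final decision. This is also why I recommend using Swift's synthesized Hashable conformance if you're working with value types, since the default hashing function makes collisions like this one far less likely.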
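To see the collision handled correctly in practice, here is a self-contained sketch. It assumes Swift 4.2 or later, where hashing is written with hash(into:) rather than hashValue. The type name XORColor is hypothetical, and it feeds the same XOR result into the hasher so that the collision is preserved:

```swift
// Hypothetical type for illustration. It deliberately keeps the weak XOR
// hash so that the two colors below collide.
struct XORColor: Hashable {
    let red: Double
    let green: Double
    let blue: Double
    let alpha: Double

    static func == (lhs: XORColor, rhs: XORColor) -> Bool {
        return lhs.red == rhs.red && lhs.green == rhs.green
            && lhs.blue == rhs.blue && lhs.alpha == rhs.alpha
    }

    func hash(into hasher: inout Hasher) {
        // XORing four identical values yields 0, so both colors below
        // feed the same input to the hasher.
        hasher.combine(red.hashValue ^ green.hashValue ^ blue.hashValue ^ alpha.hashValue)
    }
}

let halfGray = XORColor(red: 0.5, green: 0.5, blue: 0.5, alpha: 0.5)
let opaqueWhite = XORColor(red: 1, green: 1, blue: 1, alpha: 1)
let grays: Set<XORColor> = [halfGray]

print(halfGray.hashValue == opaqueWhite.hashValue) // true: the hashes collide
print(grays.contains(opaqueWhite))                 // false: == settles the question
```

The set still answers correctly, but every colliding value forces an extra equality check, which is where a weak hash function costs performance.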

Reference Types & Identity

We saw above the potential pitfalls of Hashable, but in practice you'll rarely run into issues when working with basic values. With values like RGB colors, geometric points, geographic coordinates, etc. we usually want the default implementation, where all component values are checked for equality and incorporated into the hashing function. However, what about the common case where you have a client-server application and are fetching updates to data you have locally?

For example, consider a Comment class:

class Comment: Hashable {

    let id: String
    let text: String
    let likeCount: Int

    init(id: String, text: String, likeCount: Int) {
        self.id = id
        self.text = text
        self.likeCount = likeCount
    }

    var hashValue: Int {
        return id.hashValue
    }
    
    static func == (lhs: Comment, rhs: Comment) -> Bool {
        return lhs.id == rhs.id && lhs.text == rhs.text && lhs.likeCount == rhs.likeCount
    }

}

Here we are representing a comment with three component values: a unique identifier, the text of the comment, and its like count (in billions). This is a standard implementation for an object with identity. Since Comment has a unique identifier, it makes sense to use that to compute the hash value. At the same time, we may reasonably decide to only treat two Comment instances as equal if all component values are equal.

So where does this go wrong? Let's say you have a comment locally and fetch an update from the server:

let id = UUID().uuidString
let text = "Swift is the greatest programming language in the world."
let comment = Comment(id: id, text: text, likeCount: 1)
let updatedComment = Comment(id: id, text: text, likeCount: 2)

You also have a set containing the now outdated comment and check if it contains the updated version:

let comments: Set<Comment> = [comment]
print(comments.contains(updatedComment))

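This will print false. The two comments share an id and therefore a hash value, but == returns false because their like counts differ. This may be what you want. However, if we had assumed that the hash value was all that mattered and expected this code to print true, we would see unexpected behavior in our app.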
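If the intent is instead that a comment with the same id counts as the same member, one option is to base both == and the hash on id alone. The following is a sketch under that assumption, not a recommendation for every model. IdentifiedComment is a hypothetical name, and hash(into:) assumes Swift 4.2 or later:

```swift
// Hypothetical variant of Comment in which identity alone defines equality.
final class IdentifiedComment: Hashable {
    let id: String
    let text: String
    let likeCount: Int

    init(id: String, text: String, likeCount: Int) {
        self.id = id
        self.text = text
        self.likeCount = likeCount
    }

    // Equality and hashing both use only the identifier.
    static func == (lhs: IdentifiedComment, rhs: IdentifiedComment) -> Bool {
        return lhs.id == rhs.id
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(id)
    }
}

let stale = IdentifiedComment(id: "comment-1", text: "Hello", likeCount: 1)
let fresh = IdentifiedComment(id: "comment-1", text: "Hello", likeCount: 2)

var stored: Set<IdentifiedComment> = [stale]
print(stored.contains(fresh))  // true: same id, so treated as the same member

// update(with:) replaces the stored member with the new instance.
stored.update(with: fresh)
print(stored.first?.likeCount ?? -1)  // 2
```

The trade-off is that == no longer detects changes to text or likeCount, so code that needs to know whether a comment has changed must compare those fields separately.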

When done correctly, conforming our custom types to Hashable can make our code safer and more efficient. However, it's important to understand the relationship between Hashable and Equatable and to think carefully about what equality means for each type. With basic values we can almost always take advantage of autosynthesis, as Swift's default implementations will generally do what we want. When dealing with reference types, we need to be more careful. The concept of equality for objects is more complicated, and we need to be deliberate about the behavior we want.

Next Steps

For more on autosynthesized hash functions in Swift 4.1, see Daniel Lemire's article on the subject.

Todd Kramer
