Defines #= but not #hash

Identity is such a fundamental Smalltalk notion that if you override the `==` method Smalltalk ignores your override. By contrast, you can override the `=` method at will.
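A small playground sketch of the difference, using two separately created strings. The variable names are illustrative, and the comments describe what a Pharo- or Squeak-style image is expected to answer:

```smalltalk
| a b |
a := 'hash' copy.
b := 'hash' copy.

a = b.     "true: both strings hold the same characters"
a == b.    "false: they are two distinct objects"
a == a.    "true: an object is always identical to itself"
```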

# Exercise: Anomaly of the Disappearing Element. dmx, books (p. 76–77), pdf (p. 92), github

Write a subclass `Book` of `Object` with an instance variable `isbn` and methods `setIsbn:` and `getIsbn` that simply set and answer the instance variable.

* Variable Types in Smalltalk
* Redefining instance variables of a Smalltalk class. stackoverflow
* Fred Rivard, Smalltalk: a Reflective Language

Override the `=` method so that it compares ISBN numbers:

    = anotherBook
        ^ self getIsbn = anotherBook getIsbn

**Note**: Class defines #=, but not a corresponding #hash method. This may cause problems when using hashed collections.



The following code will construct a library with an initial capacity of 100 holdings, add a holding, then test the library for the holding. *Execute* the first two lines, one at a time, then display the last line:

    library := Set new: 100.
    library add: (Book new setIsbn: '0-671-20158-1').
    library includes: (Book new setIsbn: '0-671-20158-1').

The result, `false`, demonstrates the anomaly of the disappearing element. (The Set is large enough that it is statistically unlikely for the result to be `true`, but if it is, adjust the size of the set.) Now amend class `Book` so that the anomaly does not occur.

    hash
        ^ self getIsbn hash
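One way to check the amended class from a playground, assuming `Book` now defines both `=` and `hash` as given in this exercise. The temporaries `first` and `second` are illustrative names only:

```smalltalk
| library first second |
first := Book new setIsbn: '0-671-20158-1'.
second := Book new setIsbn: '0-671-20158-1'.

first = second.             "true: the ISBNs match"
first hash = second hash.   "true: equal strings answer equal hashes"

library := Set new: 100.
library add: first.
library includes: second.   "true: the lookup probes the slot where first was stored"
```

The middle check states the rule the exercise is after: whenever two objects are `=`, they must also answer the same `hash`.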

⇒ Hash Analysis Tool. page (August 15, 2013 by 'valloud')

⇒ Hashing in Smalltalk: Theory and Practice. page

⇒ Databases for testing hash functions. groups

* Why Simple Hash Functions Work: Exploiting the Entropy in a Data Stream. pdf, page
* Empirical evaluation of hash functions for multipoint measurements. acm

Once you’ve solved the preceding exercise, here is an additional wrinkle. An object’s identity may occasionally change. Perhaps the book has been reassigned a different ISBN. This change affects future searches through the library: the book will again not be found. Why not?


**Redefining both = and hash**. A difficult error to spot is when you redefine `=` but not `hash`. The symptoms are that elements you put into sets seem to disappear, or that other strange behaviour occurs. One solution proposed by Kent Beck is to use `bitXor:` to redefine `hash`. post, pdf (p. 220–221 and Listing 14-3)

Suppose that we want two books to be considered equal if their titles and authors are the same. Then we would redefine not only `=` but also `hash` as follows:

    Book >> = aBook
        self class = aBook class ifFalse: [ ^ false ].
        ^ title = aBook title and: [ authors = aBook authors ]

    Book >> hash
        ^ title hash bitXor: authors hash

Another nasty problem arises if you use a mutable object, i.e., an object that can change its hash value over time, as an element of a Set or as a key to a Dictionary. Don’t do this unless you love debugging!
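A minimal playground sketch of this hazard, assuming a Pharo- or Squeak-style image in which `OrderedCollection` computes its `hash` from its elements and hashed collections understand `rehash`. The names `shelf` and `titles` are illustrative only:

```smalltalk
| shelf titles |
titles := OrderedCollection with: 'Dune'.
shelf := Set new.
shelf add: titles.
shelf includes: titles.   "true: stored under the hash of the current contents"

titles add: 'Emma'.       "mutating the element changes its hash"
shelf includes: titles.   "usually false: the lookup probes the slot for the new hash"

shelf rehash.             "re-insert every element under its current hash"
shelf includes: titles.   "true again"
```

Rehashing repairs only the one collection it is sent to, so the safer habit is to base `=` and `hash` on state that does not change while the object sits in a hashed collection.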


Constrained Counting and Sampling: Bridging the Gap between Theory and Practice. pdf

> In this thesis, we introduce a novel hashing-based algorithmic framework for constrained sampling and counting that combines the classical algorithmic technique of universal hashing with the dramatic progress made in combinatorial reasoning tools, in particular, SAT and SMT, over the past two decades.

> By exploiting the connection between definability of formulas and variance of the distribution of solutions in a cell defined by 3-universal hash functions, we introduced an algorithmic technique, MIS, that reduced the size of XOR constraints employed in the underlying universal hash functions by as much as two orders of magnitude.