Review of Kanerva’s SDM

Here, we present a short overview of SDM. A deeper review of the motivations behind SDM and the features that make it biologically plausible can be found in [13, 15]. SDM provides an algorithm for how memories (patterns) are stored in, and retrieved from, neurons in the brain. There are three primitives that all exist in the space of n-dimensional binary vectors:

Patterns (p) - have two components: the pattern address, $p_a^\mu \in \{0, 1\}^n$, is the vector representation of a memory; the pattern “pointer”, $p_p^\mu \in \{0, 1\}^n$, is bound to the address and points to itself when autoassociative or to a different pattern address when heteroassociative. A heteroassociative example is memorizing the alphabet, where the pattern address for the letter a points to pattern address b, b points to c, etc. For tractability in analyzing SDM, we assume our pattern addresses and pointers are random. There are m patterns and they are indexed by the superscript $\mu \in \{1, \dots, m\}$.
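The alphabet example can be made concrete with a minimal sketch. It uses exact dictionary lookup rather than SDM's read operation, and the helper names (`random_binary_vector`, `make_alphabet_patterns`, `follow_chain`), the dimension n = 64, and the choice to make the last letter point to itself are all assumptions made for illustration.

```python
import random


def random_binary_vector(n, rng):
    """Draw a uniformly random vector from {0, 1}^n, returned as a tuple."""
    return tuple(rng.randint(0, 1) for _ in range(n))


def make_alphabet_patterns(letters, n, rng):
    """Build heteroassociative patterns: each address points to the next one.

    Returns (address_of, patterns), where address_of maps a letter to its
    random pattern address and patterns maps a pattern address to its pointer.
    The last letter points to itself (an arbitrary choice for this sketch).
    """
    address_of = {letter: random_binary_vector(n, rng) for letter in letters}
    patterns = {}
    for i, letter in enumerate(letters):
        next_letter = letters[min(i + 1, len(letters) - 1)]
        patterns[address_of[letter]] = address_of[next_letter]
    return address_of, patterns


def follow_chain(start, patterns, address_of, steps):
    """Follow pointers from the start letter, decoding each address to a letter."""
    letter_of = {addr: letter for letter, addr in address_of.items()}
    addr = address_of[start]
    visited = [start]
    for _ in range(steps):
        addr = patterns[addr]  # the pointer is itself a pattern address
        visited.append(letter_of[addr])
    return "".join(visited)


rng = random.Random(0)
address_of, patterns = make_alphabet_patterns("abcde", n=64, rng=rng)
print(follow_chain("a", patterns, address_of, steps=4))
```

In the autoassociative case the same structure applies, with every pointer equal to its own address.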

Neurons (x) - in showing SDM’s relationship to Attention it is sufficient to know there are r neurons with fixed addresses $x_a^\tau \in \{0, 1\}^n$ that store a set of all patterns written to them. Each neuron will sum over its set of patterns to create a superposition. This creates minimal noise interference between patterns because of the high-dimensional nature of the vector space and enables all patterns to be stored in an n-dimensional storage vector denoted $x_v^\tau \in \mathbb{Z}_+^n$, constrained to the positive integers. Their biologically plausible features are outlined in [13, 15]. When we assume our patterns are random, we also assume our neuron addresses are randomly distributed. Of the $2^n$ possible vectors in our binary vector space, SDM is “sparse” because it assumes that $r \ll 2^n$ neurons exist in the space.
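As a rough sketch of the superposition step, the snippet below sums binary pattern pointers element-wise into a single neuron's storage vector. The `Neuron` class and its fields are hypothetical names, and which neurons a given pattern is written to is not covered in this overview, so the sketch simply writes every pattern to one neuron.

```python
import random


class Neuron:
    """Hypothetical container for one SDM neuron (names are illustrative)."""

    def __init__(self, address):
        self.address = address               # fixed address in {0, 1}^n
        self.storage = [0] * len(address)    # storage vector of non-negative integers
        self.num_written = 0                 # how many patterns were summed in

    def write(self, pointer):
        """Add a binary pattern pointer into the storage vector element-wise."""
        if len(pointer) != len(self.storage):
            raise ValueError("pointer and storage must have the same dimension")
        for i, bit in enumerate(pointer):
            self.storage[i] += bit
        self.num_written += 1


rng = random.Random(0)
n = 16
neuron = Neuron(tuple(rng.randint(0, 1) for _ in range(n)))

# Assumption for this sketch: every pattern is written to this one neuron.
pointers = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(5)]
for pointer in pointers:
    neuron.write(pointer)

print(neuron.storage)
```

Each entry of the storage vector counts how many of the written pointers had a 1 in that position, so it lies between 0 and the number of patterns written.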

Query ($\xi$) - is the input to SDM, denoted $\xi \in \{0, 1\}^n$. The goal in the Best Match Problem is to return the pattern pointer stored at the closest pattern address to the query. We will often care about the maximum noise corruption that can be applied to our query, while still having it read out the correct pattern. An autoassociative example is wanting to recognize familiar faces in poor lighting. Images of faces we have seen before are patterns stored in memory and our query is a noisy representation of one of the faces. We want SDM to return the noise-free version of the queried face, assuming it is stored in memory.
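To illustrate the Best Match Problem itself, the sketch below solves it by brute force: it compares the query against every stored pattern address and returns the pointer of the closest one. This exhaustive search is a stand-in for illustration, not SDM's read operation. The function names, the sizes n = 256 and m = 20, and the 40 flipped bits are assumptions.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def best_match(query, patterns):
    """Return the pointer whose pattern address is closest to the query.

    patterns is a list of (address, pointer) pairs.
    """
    _, pointer = min(patterns, key=lambda pair: hamming(query, pair[0]))
    return pointer


def corrupt(vector, num_flips, rng):
    """Flip num_flips distinct, randomly chosen bits of a binary vector."""
    flipped = set(rng.sample(range(len(vector)), num_flips))
    return tuple(1 - bit if i in flipped else bit for i, bit in enumerate(vector))


rng = random.Random(0)
n, m = 256, 20
addresses = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(m)]
patterns = [(addr, addr) for addr in addresses]  # autoassociative: pointer == address

# A noisy version of one stored pattern plays the role of the poorly lit face.
query = corrupt(addresses[3], num_flips=40, rng=rng)
recovered = best_match(query, patterns)
print(recovered == addresses[3])
```

Random addresses in a high-dimensional binary space sit near n/2 bits from each other, so a query corrupted in far fewer than n/2 positions remains closest to its original address.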

SDM uses the Hamming Distance metric between any two vectors, defined as $d(a, b) := \mathbf{1}_n^T |a - b|$. The all-ones vector $\mathbf{1}_n$ is of n dimensions and $|a - b|$ takes the absolute value of the element-wise difference between the binary vectors. When it is clear what two vectors the Hamming distance is between, we will sometimes use the shorthand $d_v := d(a, b)$.
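The definition translates directly into code: take the element-wise absolute difference and sum it, which is what the inner product with the all-ones vector does. The function name `hamming_distance` and the example vectors are chosen here for illustration.

```python
def hamming_distance(a, b):
    """Hamming distance between two binary vectors of equal length.

    Sums the absolute element-wise differences, which equals the inner
    product of the all-ones vector with |a - b|.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(abs(x - y) for x, y in zip(a, b))


a = [1, 0, 1, 1, 0, 0, 1, 0]
b = [1, 1, 0, 1, 0, 1, 1, 0]
print(hamming_distance(a, b))  # the vectors differ at indices 1, 2 and 5, so this prints 3
```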

[…]

pattern pointer ⇒

Pile Systems Inc. wayback

Address busses work well and are at the heart of both processor and memory design on a variety of scales. Address busses make computers logical machines: when they are properly clocked, we can reason knowing that all elements have been considered. But this pattern is rare or nonexistent in nature. Let's understand why.

⇒ Hamming Distance