In 1982, John Hopfield proposed a neat idea: a network that could store memories and recall them when presented with a partial cue. There are many tutorials on the Hopfield network (my personal favorite is https://neuronaldynamics.epfl.ch/online/Ch17.S1.html; the original paper is a great read as well).
These networks are heavily studied. Despite their limitations and non-biological nature, they remain popular among neuroscientists for their simplicity and tractability. Their storage capacity is modest, though: a network with N neurons stores only roughly 0.14N random patterns (still linear in N).
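To make the store-and-recall idea concrete, here is a minimal NumPy sketch of the classical network: Hebbian weights, a sign update rule, and recall from a corrupted cue. The sizes and seed are my own illustration, chosen to sit well below the ~0.14N capacity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Store P random +/-1 patterns in a network of N neurons.
# Load P/N = 0.05, comfortably below the ~0.14 capacity limit.
N, P = 100, 5
patterns = rng.choice([-1, 1], size=(P, N))

# Hebbian learning: W_ij = (1/N) * sum_mu xi_i^mu xi_j^mu, no self-connections
W = patterns.T @ patterns / N
np.fill_diagonal(W, 0)

# Partial cue: the first stored pattern with 10% of its bits flipped
cue = patterns[0].copy()
cue[rng.choice(N, size=10, replace=False)] *= -1

# Iterate the sign update rule until a fixed point
state = cue
for _ in range(50):
    new = np.sign(W @ state)
    new[new == 0] = 1          # break ties toward +1
    if np.array_equal(new, state):
        break
    state = new

overlap = (state @ patterns[0]) / N
print(overlap)  # close to 1: the corrupted cue falls back into the stored memory
```

At this low load the corrupted cue reliably relaxes to the stored pattern; push P toward 0.14N and beyond, and recall degrades sharply, which is the capacity limit the dense networks below are designed to escape.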
This note is about recent progress on these networks by Krotov and Hopfield in 2016, which has since shown great promise in practice, e.g., Hopfield Networks is All You Need (vis-à-vis Attention is All You Need).
There are two major feats achieved in this work.
- Dense networks can store many more memories than the number of neurons by using rectified polynomials (EE note: think of a polynomial signal generator and a diode in series) in the update rule, probably because higher-order polynomials can easily tease apart two patterns that look tightly correlated to the traditional update rule. These are very non-biological!
- They point out a connection, or duality, between recurrent networks of the Hopfield type and feedforward networks. Specifically, they work out how the activation function of the feedforward network is related to the update rule of the Hopfield network.
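A sketch of the first point, the dense update rule with a rectified polynomial F(x) = max(x, 0)^n, may help. Each spin is set to whichever sign lowers the energy, i.e. to the sign of sum_mu [F(+xi_i^mu + h_i^mu) - F(-xi_i^mu + h_i^mu)], where h_i^mu is the overlap of pattern mu with the state excluding neuron i. The sizes below are my own illustration, chosen so that P exceeds the classical 0.14N limit; n = 2 would recover the quadratic (classical) model.

```python
import numpy as np

def F(x, n=3):
    # Rectified polynomial: zero for negative arguments, x**n otherwise
    return np.maximum(x, 0) ** n

def dense_update(state, patterns, n=3):
    """One asynchronous sweep of the dense (Krotov-Hopfield style) update rule."""
    s = state.copy()
    for i in range(len(s)):
        # Pattern overlaps with the current state, excluding neuron i
        h = patterns @ s - patterns[:, i] * s[i]
        plus = F(patterns[:, i] + h, n).sum()    # energy score with s_i = +1
        minus = F(-patterns[:, i] + h, n).sum()  # energy score with s_i = -1
        s[i] = 1 if plus >= minus else -1
    return s

rng = np.random.default_rng(1)
N, P = 64, 40                   # P/N ~ 0.6, far beyond the classical 0.14N
patterns = rng.choice([-1, 1], size=(P, N))

# Corrupt ~10% of the first pattern's bits and try to recall it
cue = patterns[0].copy()
cue[rng.choice(N, size=6, replace=False)] *= -1

s = cue
for _ in range(5):
    s = dense_update(s, patterns, n=3)

print((s @ patterns[0]) / N)    # overlap with the stored pattern, near 1
```

The higher-order F is what does the work here: raising the overlaps to the n-th power makes the one pattern that nearly matches the state dominate the sum, so even tightly correlated memories stop interfering with each other.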
A sidenote
A long time ago, the Russian psychologist Alexander Luria wrote a book, The Mind of a Mnemonist, about a person who couldn’t forget anything. He could recall all the details of any meeting from years ago almost flawlessly. The cost of this seemingly endless memory capacity was the inability to generalize or to see patterns. For him, there was no difference between these two lists of numbers: [1,2,3,4,5,6,7…,9] and [9,1,2,4,7,8..]. If he were a deer, he would never learn that “all tigers are dangerous”; only that particular tiger with that particular stripe pattern who attacked him on that particular day would be dangerous!
The point is that there is a trade-off between the ability to generalize (feature learning) and the ability to remember examples (prototype learning). Animals with higher cognitive faculties, such as mice, learn prototypes first and then quickly learn the features. Flies, however, learn prototypes but can’t generalize well (perhaps not at all).
Today’s neural networks, AFAIK, seem to take the reverse direction from mice and humans: they learn features first and, when overtrained, learn prototypes.
https://github.com/dilawar/algorithms/tree/master/MemoryNetwork/DenseAssociativeNetworks has a Python 3 implementation of this paper’s network for the XOR function.