We humans (and other animals) possess the amazing ability to experience and remember things, along with higher modes of cognition like thoughts, ideas and feelings.
The name engram has been given to the physical substrate and mechanisms required to process and store these bits of information. The theory seems sound: there are places in our brains where the aforementioned information is stored (where else would it be?), but accessing it is still not possible, at least not directly. We make do with our communication skills to convey what we experience and know internally. The next best thing would be to take a peek inside the brain as it is working with information, but that has proven to be quite invasive and so far lacks the resolution needed to make sense of the data (see Neuroimaging). So we are left with the next next best thing: we can model parts of the brain artificially, create analogs and see how they experience things. The end product would be an artificial engram, and that's what we are after here.
Note: This is an excerpt/draft from a book section I've been working on, part 2 of Conscious Artificial Intelligence. Why is this important? Current AIs borrow and combine elements from neuroscience, statistics and other fields; the end results are specialized tools and systems that, while effective at certain tasks, generalize poorly. Our human trait, in contrast, is that we generalize easily, so understanding engrams could give us a better chance at creating new AIs. There is also value in answering the questions of how, where and when we store information.
The big picture
We experience the environment in real time, a never-ending influx of images, sounds and other types of information. We are mostly and blissfully unaware that our brain is doing a lot of processing all the time, discarding most information and being selective about what gets stored. To complicate matters, we don't store information in a database of films or photos (let's stick with vision for now); we seem to store sparse, distributed and invariant representations of the information we encounter. Beyond this immediate experience, we can revisit the past and experience both old and new things with our minds, so engrams should ideally share a format we can trace from the initial perception and encoding through later retrieval and manipulation.
High-level schematic of engrams (the diagonally shaded squares), three instances of them. As you experience the environment, a succession of engrams is perceived by your receptors and cortex; we generally perceive the latest influx (#5 in the diagram) as our current reality in time (the black arrow). Some of the information is stored as memories, which you can later retrieve (this process can be both conscious and unconscious). A third instance is ideas or complex thoughts, believed to be composed of combinations of engrams or engram fragments. Note that these are not the only ones; as cognitive tasks increase in complexity, an unknown number of engrams might be recruited and formed (see cognitive neuroscience for an ever-growing list).
Vision is the preferred domain here because it is both practical to convey and well studied (we'll also cover other domains later). At this point we can start with neural networks as a means to translate biology into artificial perception.
Our first input is the humble symbol
Upon entering your eyes, the above image is converted into action potentials across several neural networks or regions in your brain; the sum of them presumably gives you the experience of it. Our artificial neural network, in contrast, is a simple grid of bits that stand for neurons, connected to a camera.
Some finer points: our artificial neural network (ANN) is a 60x60 (3,600 neurons) grid where light is registered as white pixels or 0s and darker areas as black pixels or 1s (the video feed is thresholded to only allow black and white). It is not equivalent to a retina, mainly because it lacks resolution and doesn't deal with color or even grayscale/brightness. Additionally, encoding of visual information is not done through zeros and ones but through action potentials (think sputtering sparks without the actual sparks) which code for diverse information, so both light and dark can be encoded as repeating patterns of activity (or spike trains). This point will be clearer later.
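A minimal sketch of how such a grid could be produced. Since the original code is not released, this is an illustration: it uses plain NumPy in place of the OpenCV pipeline, and a synthetic frame stands in for the camera feed.

```python
import numpy as np

def to_binary_grid(frame, size=60, threshold=128):
    """Downsample a grayscale frame to a size x size grid and binarize it:
    dark pixels (below threshold) become 1s ("firing" neurons), light become 0s."""
    h, w = frame.shape
    # Nearest-neighbour downsample by striding (a stand-in for cv2.resize).
    ys = (np.arange(size) * h) // size
    xs = (np.arange(size) * w) // size
    small = frame[np.ix_(ys, xs)]
    return (small < threshold).astype(np.uint8)

# A synthetic 240x240 "frame": a dark square on a light background.
frame = np.full((240, 240), 255, dtype=np.uint8)
frame[80:160, 80:160] = 0

grid = to_binary_grid(frame)
print(grid.shape)   # (60, 60)
print(grid.sum())   # number of "active" neurons: 400
```

With a real camera you would grab frames in a loop and rebuild the grid each tick; the thresholding rule stays the same.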
And here’s a bit of the world as experienced by this neural network:
The first and probably most profound gotcha of biological vision is that we don't perceive the world in the same way as a camera, and there is no screen in our brains (yet we experience what is considered the illusion of one). We see borders, shapes, colors and perspective, and make adjustments on the fly for things like brightness and selecting foreground/background. Biologically there is much we don't know, but we do know that we process information via a number of maps in areas far away from our eyes (cortical areas V1–V5, although a lot of processing seems to happen before/after).
These maps take the shape of basic primitives like lines at different angles or blobs. A basic one that helps in understanding the concept is simply the inverse or negative image (one codes for light, the other for dark, if you will), so the previous stimulus can be described as a dual input:
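A tiny sketch of this dual representation: the complementary map codes for light wherever the first codes for dark, so together the pair covers every location exactly once.

```python
import numpy as np

grid = np.zeros((5, 5), dtype=np.uint8)
grid[2, :] = 1                 # a dark horizontal line on a light background

negative = 1 - grid            # the inverse map: 1s wherever the original has 0s

# Every location is active in exactly one of the two maps.
assert np.all(grid + negative == 1)
print(grid.sum(), negative.sum())   # 5 20
```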
We'll come back to how these parallel representations could be integrated later, but for now the next problem in visualizing engrams is the unknown number of maps we possess, how we acquire them (most likely a mix of experience and biology) and how they tie up with engrams.
Some, but not all, of the maps we possess. If you stare at any one, you will activate that map (and possibly some image after-effects).
Maps in more detail
Your field of vision or view (FOV) is roughly a 200-degree horizontal arc and a 130-degree vertical one in front of your eyes. Imagine that, like a camera, it simply has a square shape through which the world is perceived, and that we can divide it into sections which roughly correspond to groups of neurons.
Now let's add a simple stimulus (a diagonal line in one corner of this FOV). This diagonal line (a group of neurons, really) activates a neuron somewhere in your visual cortex:
Importantly, the visual cortex retains the spatial arrangement from early processing, hence the word map. Here is the same stimulus in different regions:
And a real-time implementation using computer vision:
Notes/Explanation: Here we are recreating a map for vertical lines surrounded by white space with a 3x3 shape. In array/list format, for the code savvy:

template = [[0, 1, 0],
            [0, 1, 0],
            [0, 1, 0]]

The middle image is downsampled and is not part of the map, but I included it so we can better understand what's happening. When a vertical line 3 big pixels tall is detected (in red), a corresponding neuron is activated in the Neuron Map (in golden/yellow), so when six receptor neurons are activated (the template in green), only one in the corresponding map is: just like the previous example, but with a real stimulus and in real time.
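The matching step can be sketched as follows. Since the original code is not released, this is an illustration under an assumed exact-match rule: a map neuron fires wherever the 3x3 patch under it matches the template exactly.

```python
import numpy as np

template = np.array([[0, 1, 0],
                     [0, 1, 0],
                     [0, 1, 0]], dtype=np.uint8)

def activate_map(grid, template):
    """Slide the template over the binary grid; a map neuron fires (1)
    wherever the patch under it matches the template exactly."""
    th, tw = template.shape
    gh, gw = grid.shape
    out = np.zeros((gh - th + 1, gw - tw + 1), dtype=np.uint8)
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            if np.array_equal(grid[y:y+th, x:x+tw], template):
                out[y, x] = 1
    return out

grid = np.zeros((6, 6), dtype=np.uint8)
grid[1:4, 2] = 1                 # a vertical line, 3 pixels tall

neuron_map = activate_map(grid, template)
print(neuron_map.sum())          # 1 -- several receptors collapse to one map neuron
```

This is essentially a hand-rolled convolution with a hard threshold; OpenCV's `cv2.matchTemplate` would do the sliding comparison for you on real frames.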
Maps, though, seem to come in a variety of sizes and shapes, so to have a more complete picture of the world out there we would have to include, say, short and long lines at different angles, as well as square blobs or shapes in different sizes: perhaps hundreds or more primary maps in total. To keep things simple yet meaningful, let's focus on 4 basic maps: vertical, horizontal, 135- and 45-degree small lines:
A note on resolution: while these examples hopefully work to convey the concept of maps, in reality things are much more complex, since we don't have a uniform distribution of receptors. There is a greater density around the center of your visual field, and your eyes dart or jump around, scanning various regions of an image in higher resolution, sometimes mediated by what you are attending to. If you focus on this point >> . << you'll notice that you can't focus on the first words in this block of text, and there is less acuity the farther out things are. So in some sense your vision is like a narrow flashlight that can only illuminate a fraction of the environment at any time.
Integration and engrams
To start making sense of engrams, we need another layer of neurons. This layer's job is to integrate or codify the previous maps into something meaningful in the environment, for storage and/or further processing.
Here an asterisk-like stimulus is perceived by receptors, divided in the map layer and later integrated in the association layer. About 32 receptor neurons connect (or synapse) with just 8 in the map layer (some connections shown in dotted lines), and those finally converge onto 1 (one) neuron in the association/integration layer (symbolized by black arrows). There are significant savings in the number of neurons, but at the cost of increasing connections.
In real time/computer vision this arrangement looks like this:
Notes/Explanation: The top right map (in blue) is the sum of the previous maps; once more, this visual is included to help us make sense of the arrangement, but it has no biological counterpart. The association layer is the bottom right grid (lighter grey), and the encoding neuron is shown in dark orange (when it is firing). How we encode is worthy of a longer discussion; here I am simplifying the encoding rules by telling the association layer to fire a single neuron when roughly half the maps are active. In other words, there is a threshold of things in the environment that fit these maps and that in turn this cell deems worthy enough to fire. If you've heard about grandma or famous-artist neurons, this (the single orange neuron) is a similar concept and a digital analog.
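The simplified encoding rule ("fire when roughly half the maps are active") can be sketched as a plain threshold over map activity; the 0.5 threshold and the map sizes here are illustrative, not the released implementation.

```python
import numpy as np

def association_fires(map_activity, threshold=0.5):
    """The association neuron fires when at least `threshold` of the
    feature maps report any activity (the simplified encoding rule)."""
    active = sum(1 for m in map_activity if m.any())
    return active >= threshold * len(map_activity)

# Four feature maps, all silent: nothing worth encoding.
maps = [np.zeros((4, 4), dtype=np.uint8) for _ in range(4)]
print(association_fires(maps))   # False

maps[0][1, 1] = 1   # the vertical map lights up
maps[1][2, 2] = 1   # the horizontal map lights up
print(association_fires(maps))   # True -- half the maps are active
```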
I am stopping just short of calling this arrangement an engram. From the original definition we now have a physical location (or virtual, in this case), plus processes for acquiring (or encoding) and storing the information, which our single neuron in the association layer handles.
We still need to account for what happens after we encode and store stimuli (decoding). In diagram form, this is how a previously encoded stimulus can be reconstituted or decoded:
To retrieve (or represent) a previously encoded stimulus, the order or flow of information is reversed: starting from the association layer(s), a neuron or group of neurons gets activated, and in turn the corresponding maps get activated next. The result is that the original stimulus is rebuilt or reconstituted.

--- * ---

Of note here is the location of the representation layer: where does it reside? If stimuli enter through receptors, are these receptors responsible for recreating the stimulus, or is the representation layer located somewhere else? It's still unclear, but I believe the answer might be both. There are almost as many connections going downstream as upstream in the visual system (and other systems); REM sleep is considered to "replay" stimuli (at least partially) at or close to the receptor level, yet higher cognitive processes like working memory can still be performed after losing primary receptors, hinting at higher cortical areas as the location. There could also be a combination of both, or even more representation layers; we still don't know. In any case, for our discussion it is sufficient to acknowledge that there is an internal representation layer (or layers).
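The reversed flow can be sketched by recording, at encoding time, which map cells fired and then replaying them onto a representation layer. The engram record format below is my assumption for illustration.

```python
import numpy as np

# What encoding stored: which maps fired, and which cells within each.
engram = {
    "vertical":   [(1, 2), (2, 2), (3, 2)],
    "horizontal": [(2, 1), (2, 2), (2, 3)],
}

def decode(engram, size=5):
    """Run the flow in reverse: the association-level record reactivates each
    map, and their union rebuilds the stimulus on the representation layer."""
    representation = np.zeros((size, size), dtype=np.uint8)
    for cells in engram.values():
        for y, x in cells:
            representation[y, x] = 1
    return representation

print(decode(engram))   # a small plus-shaped stimulus, rebuilt from the record
```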
While higher modes of cognition are outside these basic examples, the engram we just described can display some basic biology. Here, for instance, we store the last thing that was deemed worthy of remembering, something we do with our short-term memory.
Extending the previous example, here we are storing the stimulus once detected. This is done by replaying the association neuron (in black) and the corresponding maps (in white); note there is overlap between active and stored maps (something that also happens in real short-term memory and can cause interference).

More simplifications: Neurons in the brain are, for the most part, not on/off switches; rather, they spike, so a more accurate model would have neurons active at intervals (or firing). In the above example, just imagine the white and black neurons are blinking. And lastly, this type of memory as represented is not realistic: short-term memory decays, so the blinking would slow down first and then eventually stop. There are also other ways of encoding, by firing rate or quantity, for instance.
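The decaying "blinking" could be modeled as an exponentially weakening replay; the decay rate and cut-off below are arbitrary illustrative values, not measured ones.

```python
def replay_with_decay(strength=1.0, decay=0.8, floor=0.1, max_steps=50):
    """Replay an association neuron's activity with exponential decay: each
    "blink" is weaker than the last, until it drops below a firing floor."""
    trace = []
    steps = 0
    while strength >= floor and steps < max_steps:
        trace.append(round(strength, 3))
        strength *= decay
        steps += 1
    return trace

trace = replay_with_decay()
print(len(trace))           # 11 replays before the memory fades
print(trace[0], trace[-1])  # 1.0 0.107
```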
As an engram is both the physical (or digital) apparatus needed to perceive the environment along with how it is organized (how information flows) and codified, you unfortunately can't really see one without some extra help. Consider the asterisk-like stimulus we've been using: if we were to show just the constituent parts, active neurons and connections, we would end up with something like this:
To better “see” one, we’d need to first show the neural maps along with the association layer:
And finally, we need to separate the flow of information into encoding and decoding through time:
While I believe this is a better basic description of an engram, we don't see the world in black and white pixels, and we are still missing a lot of modal information (i.e. touch, smell, audio, etc.). We also need to briefly talk about how an engram could fit into higher cognitive processes, so let's wrap things up by touching on these complexities.
Engrams as the basic unit of cognition and generalization.
The first expansion to the engram theory we've been talking about is that of modality; in other words, what about color? What about hearing/sound? Well, the basic structure is the same:
Encoding: receptor > map > association... Decoding: association > map > representation. What's different here is the receptor type and behavior, along with what the map is encoding, so you could think about a spatial map of red things, a spatial map of sound frequencies and a spatial map of touch:
Here we have 3 stimuli and the corresponding mapping and association layers. The first stimulus, a red square, uses the same spatial mapping we've been using. Sound is a bit harder to encode, since there are time, frequency and loudness components, but to keep things simple here the map just encodes frequency. The third is a touch stimulus: run your finger over your arm hair, and somewhere in your somatosensory cortex a row of neurons (the map) will become active. Of note also is the parallel nature of all these maps: the 3 stimuli can happen simultaneously.
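These parallel, per-modality maps can be sketched as independent arrays sharing the same encode/decode flow; the map names and sizes below are made up for illustration.

```python
import numpy as np

# Three modality maps feeding the same receptor > map > association flow.
modality_maps = {
    "vision_red": np.zeros(9, dtype=np.uint8),   # spatial map of red things
    "sound_freq": np.zeros(9, dtype=np.uint8),   # frequency bins (tonotopic map)
    "touch_arm":  np.zeros(9, dtype=np.uint8),   # somatosensory row of neurons
}

# Three simultaneous stimuli, one per modality:
modality_maps["vision_red"][4] = 1     # a red square in the centre of the FOV
modality_maps["sound_freq"][2] = 1     # a low-frequency tone
modality_maps["touch_arm"][0:3] = 1    # a finger brushing along the arm

# Each map encodes independently and in parallel; a downstream association
# layer can read all three at once.
active = {name: int(m.sum()) for name, m in modality_maps.items()}
print(active)   # {'vision_red': 1, 'sound_freq': 1, 'touch_arm': 3}
```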
The second and bigger complexity involves what happens to engrams later on. Once we have a packet of sorts representing a stimulus, we (the brain/the AI) can then do more interesting things, like combining, creating, storing and recalling them; the map heuristic we looked at is thought to be part of higher cognitive tasks, like giving complex and multimodal stimuli meaning. For AI purposes, the rich associations of these mapping systems and networks could help bridge the gap toward general AIs, in contrast with the current rigid/narrow AIs.
A basic example of how engrams (text and images) could be combined into a richer association; since the connections here go both ways, any of the stimuli presented can elicit the other associations. As for how engrams integrate into new ideas, I'll leave you with the following quote:
“There is no such thing as a new idea. It is impossible. We simply take a lot of old ideas and put them into a sort of mental kaleidoscope. We give them a turn and they make new and curious combinations. We keep on turning and making new combinations indefinitely; but they are the same old pieces of colored glass that have been in use through all the ages.”
― Mark Twain
I hope this short overview of engrams can help you better understand them (it helped me) and how they are related to AI and other subjects in neuroscience.
Thanks for reading !
Post Data:
Bibliography, sources - The main ideas used for this post are based on the book:
Seeing, second edition: The Computational Approach to Biological Vision, by James V. Stone and John P. Frisby. Especially the early chapters.
The idea of hierarchical maps, though, can be found in nearly every neuroscience book; Networks of the Brain (Sporns) goes deeper into the subject and can be paired with Rhythms of the Brain (Buzsáki) to incorporate time/dynamics. For the rest, I'd point you to my ever-growing Neuroscience/AI book recommendations:
Code: The code used throughout was written as fast prototypes rather than production code, so at this point I am not comfortable releasing it, but it is mostly Python + OpenCV + PySimpleGUI. If you are trying to recreate something here and get stuck, let me know. There's considerable overlap with current ML and ANNs, especially in concepts like segmentation and convolutions, but since these are focused on solving specific problems and have taken on a life of their own, I think we are better off starting from scratch.