Cognitive Neuroscience and Embodied Intelligence
Memory and Learning

Based on the book Cognition, Brain and Consciousness, ed. Bernard J. Baars; courses taught by Prof. Randall O'Reilly, University of Colorado, and Prof. Włodzisław Duch, Uniwersytet Mikołaja Kopernika; and http://wikipedia.org/
http://grey.colorado.edu/CompCogNeuro/index.php/CECN_CU_Boulder_OReilly
http://grey.colorado.edu/CompCogNeuro/index.php/Main_Page
Janusz A. Starzyk
EE141

Introduction
Learning is the process by which we acquire knowledge about the world.
Learning involves memory to store representations that reflect experience, behavior and values.
Human memory has surprising limitations and impressive capacities.
The brain evolved around tasks of survival, so it is well prepared to deal with ill-defined problems and challenges in the real world.
Its ability to remember academic information is quite recent and not as well developed in terms of storage capacity.
Humans are exceptionally flexible in learning new skills. It is amazing that practically the same brain that served humans in the Stone Age is able to learn the skills needed in the age of computers and the Internet.

Memory
Memory is the process by which that knowledge of the world is encoded, stored, and later retrieved. (Kandel 2000)
Memory storage involves synaptic changes in the cortex.
Correlated activity between neurons leads to strengthening of the connections between them.
Temporary cell activity maintains immediate memories.
The medial temporal lobes (MTL) are important for building memories.

General remarks
Memory is any persistent effect of experience.
Memory is seemingly uniform, but in reality it is highly differentiated: spatial, visual, aural, recognition, declarative, semantic, procedural, explicit, implicit …
Here we test mechanisms, so the primary division is:
Synaptic memory (physical changes in synapses): long-term, and requiring activation to have any influence on functioning.
Dynamic memory: active, temporary activations that affect current functioning.
Long-term priming: based on synaptic memory, yielding to fast modification; semantic and procedural memory are the result of slow processes.
Short-term priming: based on active memory.

General remarks
[Figure: memory types. Labels: STM, LTM, working memory, short-term memory, long-term memory, nondeclarative, declarative, facts, events, parietal cortex, prefrontal cortex, limbic system, manual skills, conditioning, emotional, nuclei, priming, motor, cerebellum, neocortex.]

Memory
The MTL (perirhinal cortex) includes the two hippocampi and the olfactory area.
The MTL interacts with the higher-level visual area, the inferior temporal lobe (IT).
Close to the MTL are the auditory cortex and the amygdala, which is responsible for emotions.

Memory
Thus the MTL (perirhinal cortex) integrates multiple brain inputs. It is a “hub of hubs”.
The hippocampus combines cognitive information from the neocortex with emotional information from limbic areas and binds this information into memory that codes consciously experienced events.

Memory
The MTL helps to store and retrieve episodic memories.
When the visual cortex is activated by an image of a coffee cup, it activates memory traces through the MTL.
These include semantic associations of the coffee cup, such as coffee beans or the aroma of coffee.
Visual features such as the cup handle are also activated.
This may activate the episodic memory of yesterday’s coffee with a friend in the cafeteria and traces of the conversation.

Memory
Sensory input goes to working memory (WM).
Working memory temporarily retains small amounts of information; only 4-7 items can be held in immediate WM.
WM interacts with cognitive processes to perform explicit learning and retrieval as well as implicit learning.
Explicit learning involves semantic memory (facts), episodic memory (episodes) and perceptual memory (learning music, art).
Implicit memory (fear, habits, biases, goals).

Memory
Explicit memory is first acquired through association areas of the cerebral cortex, namely prefrontal, limbic and parieto-occipital-temporal. The information is then transferred to the parahippocampal cortex, entorhinal cortex, dentate gyrus, hippocampus, subiculum and back to the entorhinal cortex.
Damage to the parahippocampal and entorhinal cortices produces greater deficits in memory storage for object recognition than hippocampal damage does.
Right hippocampal damage produces greater deficits in memory for spatial representation, whereas left hippocampal damage produces greater deficits in memory for words, objects or people.
In either case, the deficits are in the formation of new long-term memories; old memories are spared.
www.unmc.edu/physiology/Mann/mann19.html

Memory
[Figure: the relative positions of parts of the limbic system involved in learning and memory. (Kandel, 2000, Principles of Neural Science.)]
Current thought is that the hippocampal system performs the initial steps in long-term memory storage, with different parts being more important for different kinds of memory.
The results of hippocampal machinations, presumably memories, are transferred to the association cortex for storage.
www.unmc.edu/physiology/Mann/mann19.html

Memory
Implicit memory contains procedural, emotional and motor skills.
Implicit memory is often tested using priming, where subjects receive subconscious perceptual or conceptual information.
Procedural memory refers to sensorimotor habits (skills), which are largely unconscious and involve the basal ganglia.
Imagine riding a bike and starting to fall to the right: WHAT TO DO?
The conscious answer is to lean to the left (many cyclists say this).
However, when they actually ride a bike they instead turn the handlebar in the direction of the fall, expressing unconscious procedural knowledge.

Amnesia
Clive Wearing suffered a viral infection that destroyed his hippocampi and some frontal lobe areas.
He retained his skills as a musician, but he did not remember the most recent past.
Some of his short-term memory was preserved, so he could converse and be aware of the present.
However, he could not remember events from the recent past. For instance, he would talk to his wife and a few minutes later forget that she was there.
He couldn't register episodic or semantic memory.
He couldn't recall episodic memory.

Amnesia
The most important patient in cognitive neuroscience is known as HM.
His medial temporal lobes were surgically removed by a surgeon who was unaware of their importance for memory.
HM cannot remember any events in his life after the surgery. He cannot even recognize his own face, because it has changed over the years.
He also suffers from retrograde amnesia and does not remember events from the years immediately before the surgery.
His other cognitive functions are intact: he can reason, solve problems and carry on a normal conversation.

Amnesia
HM represents amnesia in its pure form.
In general, amnesia is any loss of episodic memory with otherwise normal cognitive functions.
The causes include infections, stroke, tumor, drugs, oxygen deprivation, epilepsy and degenerative disease (like Alzheimer's), or may be psychogenic in nature.
Amnesia results from damage to the MTL, including the hippocampi, and causes:
Impaired memory but preserved perception, cognition, intelligence, and action.
Impaired long-term but not working memory
– Amnesic people can perform normally on standard tests of intelligence
– They can play chess, solve crossword puzzles, comprehend instructions, and reason logically
Impaired recent but not remote memories (anterograde amnesia).
Impaired explicit but not implicit memory
– Learning, retention, and retrieval of memory without awareness is normal.

Amnesia
Implicit and procedural memories are not damaged in amnesia.
Perceptual priming involves the sensory cortex.
Conceptual priming includes word association.
Patients with amnesia perform well on perceptual and conceptual priming tasks.
Patients with Alzheimer's disease perform well on perceptual but not on conceptual priming tasks.
Procedural memory depends on perceptual-motor regions like the basal ganglia.
HM was able to learn and retain some motor tasks even though he did not remember learning them.
Patients with impaired basal ganglia due to Parkinson’s or Huntington’s disease show no improvement after practicing sensorimotor tasks.
In serial reaction time (SRT) tasks, subjects are asked to retrace a series of dots on a computer screen.
Amnesic patients do well on the implicit SRT task but poorly on explicit tasks.
Patients with basal ganglia disorders like Parkinson’s disease do poorly on both tasks.

3 regions
PC – posterior parietal cortex and motor cortex; distributed representations, spatial memory, long-term priming, associations, deductions, schemes.
FC – prefrontal cortex; isolated representations, disruption control, working memory.
HC – hippocampal formation; episodic memory, spatial memory, declarative memory, sparse representations, good image separation.
Slow learning of statistically relevant relationships => procedural and semantic memory, cortical; fast learning => episodic memory, HC.
Retaining active information while simultaneously accepting new information, e.g. multiplying 12*6 in your head, requires the FC.

Slow/rapid learning
A neuron learns situational probability, the correlations between the desired activity and the input signals; the optimal value of 0.7 is reached rapidly only with a small learning constant of 0.005.
Every experience is a small fragment of uncertain, potentially useful knowledge about the world => stability of one's image of the world requires slow learning; integration leads to forgetting individual events.
Relevant new information is learned after a single exposure.
Lesions of the hippocampal formation cause subsequent (anterograde) amnesia.
The neuromodulation system reaches a compromise between stability and plasticity.

Complementary learning systems
[Figure]

Active memory and priming
Distributed overlapping representations in the PC can efficiently record information about the world, but this record is not very precise and blurs with the passage of time.
FC – prefrontal cortex; stores isolated representations and increases memory stability.
The effects of priming are evident in people with a damaged hippocampus, so cortical priming in the PC is possible.
We will differentiate many forms of priming by duration (short-term, long-term), type of information (visual, lexical), and similarity (repetition, semantic).

Priming
Standard task: stem completion. After reading a list of words we are given a stem and must add the ending, e.g. rea--. If "reaction" was on the list earlier, it is usually chosen.
The time interval can be about an hour, so active memory can't be responsible for this.
Homophones: read, reed.
Completion: "It was found that the ...eel is on the ...", in which the last word is one of "orange, wagon, shoe, table", is heard as "peel is on the orange", "wheel is on the wagon", "heel is on the shoe", "meal is on the table".

Priming model
Project wt_priming.proj, Chapter 9 (http://grey.colorado.edu/CompCogNeuro/index.php/CECN1_Wt_Priming)
View Events: the first 3 have the same input images but different output images; in total 13 pairs x 2 outputs = 26 combinations, IA - IB.
Note: we are not yet learning the AB-AC lists, just the effect of learning.

Exploring the model
View TrainLog and evaluate the result: similarity of the output image, summarized as a yellow line; the name of the most similar event, measured by sm_nm = binary errors in the names of the closest events, the part of the result not very similar to the given A, B.
In blue, both_err = 1 only if the output isn't one of the two acceptable output images.
Noise helps to break through impasses, but it also slightly destabilises already-learned images.

Further tests
Test_logs: first we check whether there are any tendencies, and then whether we can teach the network to change its preference after the presentation of IA and then IB.
wt_update = Test; Test does one epoch. Check Trial1_TextLog: ev_nm is either IA or IB, and sm_nm is either 0 or 1, randomly.
In Epoch1_TextLog we can see that there is always one of the two results, in sum 13/26, or half the time: there is no tendency.
We check whether one exposure changes anything. wt_update => On_Line, learning after every event; Run Test: the frequency increases significantly, to 18 and then 25 times.
Conclusions: error reduction alone gives mixed outputs A and B; a network without kWTA won't learn this task.
The parietal cortex can be responsible for long-term priming.

AB-AC Learning
People are able to learn two lists of word pairs, A-B and then A-C, e.g.
window-mind
bike-trash ....
and then:
window-train
bike-cloud
without much interference, doing well on tests for both AB and AC.
Networks with only error correction forget catastrophically!
Interference results from using the same elements and weights to learn different associations.
It's necessary to use different units, or to learn with context.

AB-AC Model
Project ab_ac_interference.proj (http://grey.colorado.edu/CompCogNeuro/index.php/CECN1_ABAC_List_Learning)
View Events_AB, Events_AC. Output: either A, or C; the context differentiates.
Replication of catastrophic forgetting: View Train_graph_log, red = errors, yellow = tests for AB.
The test shows that after learning AC the network forgets AB; many units in the hidden layer take part in the learning of both lists.

AB-AC Model
hid_kwta 12=>4 to decrease the number of active elements. Run the test: no changes.
Increase the variance of initial values:
wt_var 0.25=>0.4
Stronger influence of context: fm_context 1=>1.5
Hebbian learning: hebb 0.01=>0.05
Decrease the rate of learning: lrate => 0.1, Batch
Nothing here clearly helps, but the catastrophes are less likely...
Two systems of learning are clearly necessary, a slow one (cortex) and a fast one (hippocampus).

How memories are made?
The traditional view of memory as a permanent record of past events that can be played back, examined and retrieved is false.
Memories of past events are in fact rarely accurate.
Two people experiencing the same event may have different memories of it.
The process view considers memory to be the result of a dynamic process: a reconstruction of the past influenced by the present, by anticipation of future events, and by other cognitive processes.
We forget most of what happened within minutes or hours, and what remains is distorted by our knowledge and biases.
Try to reconstruct what you did two weeks ago with as much detail and exact order as you can.
Most of us will try to search for cues to figure out the sequence of events.
– Did I go shopping, and which stores did I visit?
– What merchandise did I look at?

How memories are made?
You may confuse what happened two weeks ago with what happened some other time.
Patients with a disorder called confabulation make up false memories without any intention of lying or awareness that they are not true.
Memories influence how other memories are formed and retrieved.
They influence our thoughts and actions, and are influenced by them.
Stimulation of the temporal lobe sometimes results in a flood of conscious memories.
One patient experienced the following memories during brain stimulation:
At four electrodes, locations 1-2 and 9-10: re-experiencing Flintstones cartoons from childhood.
At locations 8-9 and 13-14: hearing the rock band Pink Floyd.
At locations 9-10: a baseball announcer.
At locations 7-8 and 12-13: a female voice singing.

How memories are made?
What happens in the nervous system to produce habituation?
If the siphon of the animal (Aplysia californica) is stimulated mechanically, the animal withdraws the gill, presumably for protection.
That action is known to occur because the stimulus activates receptors in the siphon, which activate, directly or indirectly through an interneuron, the motoneuron that withdraws the gill. This is a simple reflex circuit.
With repeated activation, the stimulus leads to a decrease in the number of dopamine-containing vesicles that release their contents onto the motor neuron.
From www.unmc.edu/physiology/Mann/mann19.html

How memories are made?
[Figure: autobiographical memories evoked by temporal lobe stimulation]

How memories are made?
A possible explanation for this electrically stimulated recall of memories involves the temporal lobe in the neocortex.
If some neurons are activated in the neocortex, this evokes an overlapping pattern of neural activation in the hippocampal system (MTL).
The flow of information from the neocortex to the MTL causes the hippocampal system to resonate with the original memory traces, producing the original episodic experience in the neocortex.

How memories are made?
Most synapses in the cortex are excitatory and use the neurotransmitter glutamate.
A large minority are inhibitory and use neurotransmitters like GABA (gamma-aminobutyric acid).
Synaptic connections can be strengthened or weakened; these two processes are called long-term potentiation (LTP) and long-term depression (LTD).
LTP has been observed in the hippocampus using single-cell recording.
[Figure: a schematic of a single-cell recording in the hippocampus]

Hippocampus
Anatomy and connections of the structures of the hippocampal formation: signals arrive from uni- and multimodal association areas through the entorhinal cortex (EC).

More anatomy
Hippocampus = king of the cortex
Bidirectional connections with the entorhinal cortex: olfactory bulb, cingulate cortex, superior temporal gyrus (STG), insula, orbitofrontal cortex.

More anatomy
Sporadic activation
Representations in CA3 and CA1 are focused on specific stimuli, while in the subiculum and the entorhinal cortex they are strongly distributed.

Hippocampal formation
The model contains the following structures: dentate gyrus (DG), areas CA1 and CA3, entorhinal cortex (EC).
Pct Act = % of activation.

How memories are made?
Many millions of neurons and billions of synapses are involved in LTP or LTD.
Based on evidence from EEG, ERP, and fMRI, we can suppose that the formation of long-term memories involves the following:
Episodic input is presented via the neocortex.
It is integrated for memory purposes in the MTL (medial temporal lobes), involving the hippocampi and related structures, and perhaps the thalamus and surrounding regions.
Consolidation: the MTL and related regions bind and integrate a number of neocortical regions in a process that transforms temporary synaptic connections into longer-lasting memory traces in both the MTL and the neocortex.
The main mechanisms used are LTP and LTD.
Normal sleep is important for forming long-lasting memory traces.
More permanent memories require protein synthesis, such as the growth of dendritic spines.

How memories are made?
The steps of learning, binding, consolidation and remembering:
When a new event is learned, the cortex activates the MTL.
The cortex and MTL resonate to establish the memory traces in a binding step.
In consolidation, the resonance continues without external support.
Upon presentation of the original event's cue, the MTL and cortex resonate to recall the stored memories.

How memories are made?
Consolidation turns active neuronal connections into lasting ones.
There are two kinds of consolidation: cellular and system consolidation.

How memories are made?
Rapid consolidation occurs within minutes to hours of the learning event.
It correlates with morphological changes in synapses.
If the stimulus is intense or repeated, then gene transcription and protein formation lead to long-lasting changes, including the creation of new synapses, to form long-lasting memory.

How memories are made?
Nadel and Moscovitch concluded that, contrary to the standard consolidation model, the MTL is needed to represent even old episodic memories for as long as these memories exist.
MTL neurons act as pointers to the neocortical neurons that represent the experience.
The neocortex, on the other hand, is sufficient to represent repeated experiences with words, objects, people and environments.
The MTL may help in the initial formation of these neocortical traces, but once formed they can exist on their own.

Varieties of Memories
Declarative memory can be divided into episodic and semantic memory.
Episodic memory has a specific source in time, space and events.
It allows us to go back in time and relive the experience.
Semantic memory involves facts about the world, ourselves and other knowledge.
We know which city is the capital of France or where the great pyramids are.

Varieties of Memories
Episodic memories:
1. Have reference to oneself
2. Are organized around a specific time period
3. Are remembered consciously, such that we can relive them
4. Are susceptible to forgetting
5. Are context dependent w.r.t. time, place, relationships etc.

Varieties of Memories
Semantic memories:
1. Have reference to shared knowledge
2. Are not organized around a specific time period
3. Give a feeling of knowing rather than recollection of a specific event
4. Are less susceptible to forgetting than events
5. Are relatively context independent

Varieties of Memories
In a study, subjects were asked to tell whether they remember the item or “know” the item.
The act of remembering (episodes) resulted in higher brain activation than the “feeling of knowing” (semantic).

Varieties of Memories
Episodic memories may turn into semantic memories over time.
Initially memories are episodic and context dependent.
Over time, episodic memories are transformed into semantic memories.
The MTL is important for recovering episodic memories, which are linked by a specific autobiographical context.
The episodic memories in the figure show a man cooking on a barbecue, giving flowers to a lady, painting a picture and playing golf.
A semantic network above combines all these specific episodes into simplified knowledge of a man who BBQs, loves, paints, and plays golf.

Varieties of Memories
Learning is often thought to require consciousness and paying attention.
It certainly helps to learn by being aware of it.
It is a basic learning strategy for humans.
However, there is some evidence for learning without consciousness, especially with emotional stimuli.
Fig. from: http://universe-review.ca/R10-16-ANS.htm
The terms explicit and implicit memory are used in the context of remembering.
Explicit (declarative) memory requires conscious awareness.

Varieties of Memories
The prefrontal cortex (PFC) is critical for working memory.
Lesions of the PFC impair performance in delayed response tasks.
Fuster (1971) experimented with monkeys: they were trained to remember a color for a short period of time and then point to the correct color when presented with two choices.
Through implanted electrodes he observed sustained neuronal activity over the delay period in the dorsolateral (DL) PFC.

Varieties of Memories
The prefrontal cortex (PFC) serves to support the mental work performed on stored information rather than as a site of storage itself.
Its primary function is to modulate activity in other cortical areas where the items in memory are stored.
The PFC enhances the relevant information in memory and inhibits irrelevant information.
When the information is relevant to a specific item in memory, the ventral part of the PFC is involved.
When the information concerns the relations between many items, the dorsal part of the PFC is involved.
Anterior (frontal) regions of the PFC are involved in coordinating and monitoring activity among different PFC regions to implement higher-order functions such as planning.

Varieties of Memories
Combined brain regions work together for visual working memory.
The hippocampus may encode new memories, while the MTL may combine them with other modalities, and IT is involved in high-level visual object representation.
The DL-PFC and anterior PFC are involved in short-term maintenance of relations.

Varieties of Memories
Clive Wearing knows that something is wrong, as he always lives in the present.
He has a metacognitive concept of his own cognitive functions.
A person may recall an episode using a semantic cue, and vice versa.
For effective retrieval, the retrieved information must overlap with what was learned and encoded; the person must have a goal to retrieve it.
The MTL is mostly involved in retrieving episodic memory.
Poor frontal function impairs performance on tests of the source of a memory and of temporal order.
Both learning and retrieval of semantic memory depend more on left-hemisphere functions.

Varieties of Memories
Other kinds of memory may involve other brain structures.
The amygdala mediates fear conditioning.
The cerebellum and basal ganglia are needed for habits and skills, and for subconscious conditioning.
The thalamus is an information hub, constantly trading signals with the cortex.
Perceptual and motor learning involve the dynamic organization of cortical maps.
Brain surgery can alter body maps; this is related to brain plasticity.
Life is a developmental process of learning, adaptation and memory formation.
New neurons are born throughout the lifetime, starting from stem cells.
The ongoing placement of the neurons involves a dynamic learning and adaptation process.

Varieties of Memories
[Figure: overview of multiple learning systems in the brain]

Separation and conjunction of images
The hippocampus rapidly associates various representations of the cortex.
It creates episodic memory.
It completes activations recreated from memory and separates them into clearly distinct meanings.
Sparse encoding eases the separation of meanings.
CA1 separates by conjunction of images (representations).
It is also able to recreate the original activation from the EC through reversible connections.

Model of the hippocampus
Project hip.proj (http://grey.colorado.edu/CompCogNeuro/index.php/CECN1_Hippocampus)
Input signals enter through the entorhinal cortex (EC_in) and pass to the dentate gyrus (DG) and the CA3 area; DG also influences CA3, where received signals can be completed through associations.
CA3 has strong internal connections.
CA1 has more distributed sparse representations => EC_out.
EC: 144 el = 4*36; 1 of 4 active.
DG: 625 el, CA3: 240 el
CA1: 384 el = 12 col * 32 el

Exploration of the hippocampal model
Learning of AB – AC associations without interference.
Autoassociations: EC_in = EC_out, reversible transformations.
BuildNet; View_Train_Trial_Log will show the statistics.
The input includes information about the input and output images and the list.
StepTrain: units chosen in the previous step have white outlines.
Partial overlapping of images in EC_in, DG, CA3, CA1.
Training epoch: 10 list elements + 3 test sets: AB, AC, new.
View Test_Logs => text and graph log.
train_updt = no_updt to the test log; Run will do 3 epochs, and the results are in Text_log: 70% remembered from the AB list and 100% from the AC list.
Set test_updt = no_updt, and the network will finish 3 training/test epochs more rapidly.
Test analysis: test_updt = Cycle_updt, Clear Trial1_1_Text_log.
StepTest: we see only A + context, and we see how the image completes.
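
The claim that sparse encoding eases separation can be illustrated with a small NumPy sketch. This is not the hip.proj model: it is an untrained random projection followed by a k-winners-take-all step, written only to show the effect. The layer sizes 144 and 625 and the 36 active input units follow the EC and DG figures above; the number of active hidden units (25), the number of shared input units (27), the Gaussian weights, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def kwta(net, k):
    """Binary pattern with the k most strongly driven units switched on."""
    out = np.zeros(net.shape[0], dtype=int)
    out[np.argsort(net)[-k:]] = 1
    return out

def overlap(a, b):
    """Fraction of the active units of a that are also active in b."""
    return float(np.sum(a & b)) / float(np.sum(a))

def separation_demo(n_in=144, k_in=36, n_hid=625, k_hid=25,
                    n_shared=27, n_nets=50, seed=0):
    """Return (input overlap, mean hidden overlap over n_nets random nets)."""
    rng = np.random.default_rng(seed)
    active = rng.choice(n_in, size=k_in, replace=False)
    inactive = np.setdiff1d(np.arange(n_in), active)
    a = np.zeros(n_in, dtype=int)
    a[active] = 1
    # b keeps n_shared of a's active units and moves the rest elsewhere
    b = a.copy()
    moved = k_in - n_shared
    b[active[:moved]] = 0
    b[rng.choice(inactive, size=moved, replace=False)] = 1
    hidden = []
    for _ in range(n_nets):
        w = rng.normal(size=(n_hid, n_in))  # untrained random weights (assumption)
        hidden.append(overlap(kwta(w @ a, k_hid), kwta(w @ b, k_hid)))
    return overlap(a, b), float(np.mean(hidden))

input_overlap, hidden_overlap = separation_demo()
print(f"input overlap:  {input_overlap:.2f}")
print(f"hidden overlap: {hidden_overlap:.2f}")
```

Two input patterns that share most of their active units end up sharing a smaller fraction of active units in the larger, sparser layer, which is the sense of "separation" used on these slides. Averaging over several random networks only makes the comparison less noisy.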

Further exploration
Targ in Network shows what image was learned (act, targ).
In TextLog, stim_er_on = proportion of units erroneously activated in EC_out, stim_er_off = proportion erroneously not activated in EC_out.
In Trial_1_GraphLog we can see these two numbers after every test: for known images they are small (correct memories); for new ones they are large, on ~0.5 and off ~0.8; the network rarely fails.
To move to list AC we turn off Test_updt = Trial_updt (or no_updt) and StepTest until epc_ctrl changes to 1 in text_log.
These are events for list AC: the network does not recognize them (rmbr=0) because it hasn't learned them yet.
Train_Epcs=5, train_env=Train_AC, Run and check the results.

Summary
The hippocampal model can rapidly and sequentially learn AB – AC associations without excessive interference.
For this it was sufficient to use the Hebbian contrast rule, CPCA, and the correct architecture.
Interference results from using the same units; in CA3, identical images (representations) learned in another context become separated.
Separation of images doesn't allow associations, inferences based on similarity, or efficient encoding of multidimensional information.
The conjunction of images happens in CA1.
This suggests a complementary role for the hippocampus, supplementing the slow learning mechanisms of the cortex.
The hippocampus can remember episodes, helping in spatial orientation, and can create conjunctive representations connecting different stimuli together more quickly than the cortex.

Memory
Memory is not uniform
1. Weights (long-term, require activation) vs activations (short-term, already activated, can influence processing)
2. Based on weights: the cortex has initial states but suffers from catastrophic interference. The hippocampus can learn fast without interference, using sparse distributed representations of images.
3. Based on activation: the cortex shows initial states but isn't good for short-term memory.
4. Cooperation of activation-based and weight-based memory
5. Video
1. Short-term memory in chimpanzees, 30 sec
2. Comparison with students, 30 sec

Active short-term memory
Short-term priming: attention and influence on reaction speed.
Apart from the duration, the memory content and the effects resulting from similarity are like those of long-term priming.
Project act_priming.proj (http://grey.colorado.edu/CompCogNeuro/index.php/CECN1_Act_Priming)
Stem completion or homophony, but without learning; only the influence of what remains after the last activation.
The network has learned the series IA-IB.
The test has a series of images with results A and B. We present A with its output, and the network responds A; then we present the image for B with only the minus (–) phase turned on (no learning), and the network's result is sometimes A, sometimes B.
LoadNet, View TestLogs, Test.
The correlations with the previous results A and B depend on how fast the activation fades; check the effect of act_decay 1 => 0: a tendency to stay with A.
Analyze the influence on the results in test_log.

Active maintenance
Project act_maint.proj (http://grey.colorado.edu/CompCogNeuro/index.php/CECN1_Active_Maintenance): active maintenance of information in working memory despite interference; quickly accessible, doesn't require synaptic changes.
Recurrence is necessary: an attractor network with a large basin of attraction, resistant to noise.
Video: remembering with delay, 30 sec
The processes that analyse environmental data don't require such networks, because they are steered by incoming information.
Activations should be diverse, enabling associations and inferences; as long as we have external signals this will suffice, e.g. if we note the results of intermediate operations on paper.
Without external activations, we have to rely on actively maintained representations in working memory, which has serious limits (Miller's famous 7±2, and even 4±2 for complex objects).
First we look at a model without attractors, which requires external signals; then at distributed representations with shallow attractors, not very resistant to noise; and finally at deep but localised attractors, which disable associations.

Maintenance model
Project act_maint.proj.
3 objects, 3 elements (features).
r.wt, View Grid_log, Run: while there is an input, activation is maintained, but after its removal the activation disperses (the network blurs...).
Check the influence of wt_mean = 0.5, wt_var = 0.1, 0.25, 0.4.
Net_Type Higher_order: we add combinations of feature pairs.
Defaults, Run, add noise_var=0.01: the network forgets...

Isolated representations
Default to return to the initial parameters.
network = IsolatedNet
No connections between hidden units, but there is recurrence, so activation doesn't fade.
Noise = 0.01 doesn't interfere, but with 0.02 the memory sometimes gets ruined.
Is it worth learning to focus in spite of noise?
A different task: does stimulus S(t) = S(t+2)?
Parameters: input_data = MaintUpdateEnv, network Isolated, noise 0.01.
Init, Run: there are two inputs, Input 1 and 2; wt_scale 1=>2 changes the strength of local connections.
The network can be switched from fast updating to long-lasting maintenance.
How can this be done automatically? Dopamine and dynamic regulation of reward in the PFC.

Working memory
The prefrontal cortex plays the central role in maintaining active working memory and has the desired properties: isolated self-activating attractor networks with extensive basins.
Neuroanatomy, PFC connections and microcolumns => a specialized area for active memory.
A. PR – spatial.
B. PR – spatial, self-ordered tasks.
C. PR – spatial, object and verbal, self-ordered tasks and analytical thinking.
D. PR – objects, analytical thinking.
Typical experiments require delayed choice and show the differences between PC and IT, which have only temporary stimulus representations, and the PFC, which maintains them longer.
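
The idea of an isolated, self-activating unit that keeps its activation after the input is removed can be sketched in a few lines of Python. This is not the act_maint.proj network: it is a single rate unit with a logistic activation function and a self-connection, and the names and values used here (`run_unit`, `w_self`, gain 10, threshold 0.5, step size 0.2) are assumptions picked only to make the two regimes visible.

```python
import math

def logistic(x, gain=10.0, threshold=0.5):
    """Sigmoidal activation function with a fairly sharp threshold."""
    return 1.0 / (1.0 + math.exp(-gain * (x - threshold)))

def run_unit(w_self, n_steps=100, input_steps=30, dt=0.2):
    """Activation trace of one unit that gets external input
    only during the first input_steps time steps."""
    act, trace = 0.0, []
    for t in range(n_steps):
        ext = 1.0 if t < input_steps else 0.0
        net = w_self * act + ext               # self-recurrence plus input
        act += dt * (logistic(net) - act)      # leaky move toward the target
        trace.append(act)
    return trace

with_recurrence = run_unit(w_self=1.0)
without_recurrence = run_unit(w_self=0.0)
print(f"final activation with self-connection:    {with_recurrence[-1]:.3f}")
print(f"final activation without self-connection: {without_recurrence[-1]:.3f}")
```

With the self-connection the unit has two stable states and stays in the high one once the input has pushed it there; without it the activation decays back toward rest as soon as the input is gone. A stronger self-connection deepens the attractor, which is one way to read the switch from fast updating to long-lasting maintenance described above.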
Role of dopamine
Blocking dopamine impairs working memory, and enhancing it improves working memory.
TD: temporal difference learning in reinforcement learning (RL).
Dopamine (DA) arrives from the VTA (ventral tegmental area). DA strengthens internal activations, regulating access to working memory. The VTA displays such increased activity.
The basal ganglia can also regulate PFC activity.

Basal ganglia
Pathways: thalamus - basal ganglia - cortex.
Red lines: inhibition, mostly GABA.
Blue lines: excitation, mostly glutamate.
Black lines: dopamine, mostly inhibition.
Malfunctions in these pathways lead to Parkinson's, Huntington's and other diseases.
Figure labels: GP (globus pallidus), putamen, substantia nigra, subthalamic nucleus.

Working memory
Project pfc_maint_updt.proj (http://grey.colorado.edu/CompCogNeuro/index.php/CECN1_PFC_Maint_Updt).
A dynamic "gate", the AC, is added to a network with recurrence and learning based on temporal differences (TD).
Inputs: A, B, C, D; the signals Ignore, Store and Recall decide what to do with them.
The PFC is working memory; the AC (adaptive critic) is a reward system (dopamine) controlling the updating of information in the PFC; the hidden layer represents the parietal cortex; hidden layer 2 maps to the output (frontal cortex).
The AC learns to predict the next reward, modulating the strength of internal PFC connections.

PFC Model
r.wt: one-to-one connections between the input, the hidden layers and the PFC.
The AC has connections with the hidden layer and the PFC, but the reverse connections AC => PFC serve only to modulate.
Act, Step: we observe the minus and plus phases. At first the activation of the PFC and the AC is zero; there are two plus steps, the first to change the PFC weights and the second to set the correct signal propagation.
When the signal R (Recall) appears, the network will not act correctly at first and the reward in the AC is 0.
At first the network doesn't know what is going on and learns only on Store and Ignore in hidden layer 2, but sometimes noise in the PFC causes the correct result and a reward to appear.
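How an adaptive critic comes to predict reward with temporal differences can be shown on a three-step trial. This is a minimal sketch under strong assumptions, not the implementation in pfc_maint_updt.proj: the state names, the tabular value estimates, the reward of 1 delivered at Recall, and the function `train_critic` are all hypothetical, and the sketch covers only the prediction part, not the modulation of PFC connections.

```python
def train_critic(num_episodes=500, learning_rate=0.2, discount=0.9):
    """Tabular TD(0) over a fixed Store -> Delay -> Recall sequence.

    A reward of 1.0 is given at Recall (assumed). The TD error stands in
    for the dopamine-like signal: it is large when reward is unexpected
    and shrinks as the critic learns to predict it.
    Returns the learned values and the TD error at reward time per episode.
    """
    states = ["Store", "Delay", "Recall"]
    value = {state: 0.0 for state in states}
    reward_errors = []
    for _ in range(num_episodes):
        for i, state in enumerate(states):
            is_last = i == len(states) - 1
            reward = 1.0 if is_last else 0.0
            next_value = 0.0 if is_last else value[states[i + 1]]
            # TD error: what happened plus what is now expected, minus the old estimate.
            td_error = reward + discount * next_value - value[state]
            value[state] += learning_rate * td_error
            if is_last:
                reward_errors.append(td_error)
    return value, reward_errors


if __name__ == "__main__":
    value, reward_errors = train_critic()
    print("learned values:", {s: round(v, 3) for s, v in value.items()})
    print("TD error at reward, first episode:", reward_errors[0])
    print("TD error at reward, last episode: ", round(reward_errors[-1], 6))
```

After training, the error at reward time vanishes because the reward is fully predicted, and the value of the earlier Store state has risen, which is the sense in which a critic learns that storing is worth doing.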
View Epoch_log and observe the change in the weights of the AC unit (r.wt). The weights S => AC should increase and the error should decrease; the yellow line is the number of incorrect AC predictions.
View Grid_log, Clear, Act, Step. Store loads data into the PFC, but Ignore doesn't. After Recall, the PFC is reset to zero.

PFC Model
Figure: EpochOutputData.
cnt_err (black): number of errors per epoch (100 epochs), mostly errors in Recall.
S_da (red): average amount of dopamine for Store, initially decreasing (PV/LV initially gives a lot of dopamine for all inputs), then increasing when the system starts to work correctly and the number of errors goes down.
I_da (blue): amount of dopamine for Ignore; decreases to 0, no reward.
R_da (green): amount of dopamine in Recall; large fluctuations, showing the difference from expectations.

A-not-B
Interactions between active and synaptic memory: the weights have already changed but active memory is in a different state. Which wins?
These interactions are visible in the developing brains of children around 8 months old (Piaget 1954); the experiments have also been done on animals.
A toy (food) is hidden in box A and, after a short delay, the child (animal) can retrieve it from there. After several repetitions in A, the toy is hidden in box B; the children keep looking in A.
Active memory doesn't work as efficiently in children as synaptic memory does; lesions in the area of the prefrontal cortex cause similar effects in adult and infant rhesus monkeys.
Children make fewer errors when looking toward the place where the toy was hidden than when reaching for it.
There are many interesting variants of this type of experiment and explanations on different levels.

Project A-not-B
Model of the decision-making process: we know that information about place and about objects is processed separately, so it is given separately on input: place A, B, C, toy T1 or T2, and cover C1 or C2.
Synaptic memory is implemented with standard CPCA Hebbian learning, and active memory as bi-directional connections between representations in the hidden layer.
Output layers: decisions about the direction of looking and of reaching. The looking direction is activated during every experience; reaching is activated less often, only after the whole set-up is moved toward the child, so these connections rely on weaker learning.
Initial tendency: looking and reaching agree on A (weight 0.7). All inputs are connected with the hidden neurons with weight 0.3.
Project a_not_b.proj (http://grey.colorado.edu/CompCogNeuro/index.php/CECN1_A_Not_B).

Experiment 1
rect_ws = 0.3 sets the strength of recurrent activations in the hidden layer (working memory); changing this parameter simulates a child's development.
View Events: 3 types of events: the initial presentation 4x, then A 2x, then B 1x.
An event has 4 temporal segments: 1) start, pretrial: boxes covered; 2) presentation: toy hidden in A; 3) expectation: toy in A; 4) choice: reaching is possible.
Only visible elements are active.
View Grid_log; Run performs the entire experiment and turns off the display.
ViewPre shows the pretrial events in Grid_log; A is activated.
ViewA shows the A tests, after learning.
ViewB shows the B tests: the network makes an error.

Further experiments
Activation in the hidden layer flows toward the representation associated with A.
rect_ws 0.3 => 0.75 for a more mature child. Run, ViewB.
Although synaptic memory didn't change, more efficient working memory makes the correct action possible.
Try rect_ws = 0.47 and 0.50. What happens? There is no activity: hesitation?
The results depend on the length of the delay; with a shorter delay there are fewer errors.
Delay 3 => 1: run the tests for rect_ws = 0.47 and 0.50.
What happens with a very young child? rect_ws = 0.15, delay = 3: weak recurrence, weak learning for A.
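The competition between a weight-based habit and an actively maintained trace can be caricatured in a few lines. This is a toy sketch under stated assumptions and not the a_not_b.proj network: the saturating habit rule, the exponential decay of the active trace, and the names `habit_strength`, `memory_after_delay`, `reach` and `recurrent_strength` are all hypothetical. `recurrent_strength` only loosely plays the role of rect_ws, so the numerical values do not correspond to those in the project.

```python
def habit_strength(num_a_trials, learning_rate=0.2, max_bias=0.5):
    """Weight-based bias toward A that grows and saturates with each A trial."""
    return max_bias * (1.0 - (1.0 - learning_rate) ** num_a_trials)


def memory_after_delay(recurrent_strength, delay_steps):
    """Active trace of 'toy at B': scaled by recurrent_strength on every delay step."""
    return recurrent_strength ** delay_steps


def reach(recurrent_strength, delay_steps=3, num_a_trials=2):
    """Return the box chosen on a B trial: the stronger of the two influences wins."""
    trace_b = memory_after_delay(recurrent_strength, delay_steps)
    bias_a = habit_strength(num_a_trials)
    return "B" if trace_b > bias_a else "A"


if __name__ == "__main__":
    for strength in (0.3, 0.5, 0.75):
        for delay in (3, 1):
            choice = reach(strength, delay_steps=delay)
            print(f"recurrent_strength={strength}, delay={delay}: reaches to {choice}")
```

In this caricature weak recurrence produces the A-not-B error, stronger recurrence gives the correct reach to B without any change in the learned bias, and shortening the delay rescues intermediate cases, which mirrors the qualitative pattern described in the experiments.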
Other types of memory
The traditional approach to memory assumes functional, cognitive, monolithic, canonical representations in memory.
Modeling shows instead that memory depends on many interacting systems with different characteristics, variable representations and types of information.
Recognition memory: was an element of the list seen earlier? A "recognition" signal is enough; recall is not necessary. A hippocampus model is also useful here, since it allows for recall, but this is more than is needed: in recognition memory the central role seems to be played by the perirhinal cortex.
Cued recall: completion of missing information.
Free recall: effects of position on the list (best at the beginning and the end), as well as grouping (chunking) of information.

Learning categories
Categorization in psychology: many theories.
Classic experiments: Shepard et al. (1961), Nosofsky et al. (1994).
Problems of increasing complexity: division into categories C1, C2 using 3 binary properties: color (black/white), size (small/large), shape (two values).
Type I: one property defines the category.
Type II: two properties, XOR, e.g. Cat A: (black, large) or (white, small), any shape.
Type III-V: one property plus increasingly many exceptions.
Type VI: no rules, enumeration.
Difficulty and speed of learning: Type I < II < III ~ IV ~ V < VI.

Canonical dynamics
What happens in the brain while category definitions are learned from examples?
Complex neurodynamics <=> the simplest (canonical) dynamics.
For all logical rules we can write corresponding equations. For type II problems, or XOR:

$$V(x, y, z) = 3xyz + \frac{1}{4}\left(x^2 + y^2 + z^2\right)^2$$

$$\dot{x} = -\frac{\partial V}{\partial x} = -3yz - \left(x^2 + y^2 + z^2\right)x$$

$$\dot{y} = -\frac{\partial V}{\partial y} = -3xz - \left(x^2 + y^2 + z^2\right)y$$

$$\dot{z} = -\frac{\partial V}{\partial z} = -3xy - \left(x^2 + y^2 + z^2\right)z$$

The minima of this potential lie at the four points with x, y, z = ±1 and xyz = -1, i.e. (-1, -1, -1), (1, 1, -1), (1, -1, 1) and (-1, 1, 1), which encode the XOR relation.
Figure: feature space.

Against majority
List: diseases C or R, symptoms PC, PR, I.
Disease C is associated with symptoms (PC, I), disease R with (PR, I); C happens 3 times more often than R.
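The 3:1 frequency ratio stated above fixes what a purely base-rate-driven observer would predict for any combination of symptoms. The sketch below works this out with Bayes' rule; it is not a model from the lecture. The function `posterior_c` and the reliabilities `hit` and `miss` are hypothetical, and the sketch assumes that PC and PR are equally reliable indicators of their own disease and that I, which occurs with both diseases, carries no evidence.

```python
def posterior_c(symptoms, prior_c=0.75, hit=0.9, miss=0.1):
    """Probability of disease C given the listed symptoms (Bayes' rule).

    prior_c = 0.75 follows from C being 3 times more frequent than R.
    hit and miss are assumed, symmetric reliabilities of PC and PR.
    """
    like_c = 1.0
    like_r = 1.0
    for symptom in symptoms:
        if symptom == "PC":        # distinctive for C
            like_c *= hit
            like_r *= miss
        elif symptom == "PR":      # distinctive for R
            like_c *= miss
            like_r *= hit
        # "I" occurs with both diseases, so it leaves the ratio unchanged
    joint_c = prior_c * like_c
    joint_r = (1.0 - prior_c) * like_r
    return joint_c / (joint_c + joint_r)


if __name__ == "__main__":
    for case in (["I"], ["PC", "I"], ["PR", "I"], ["PC", "PR"], ["PC", "PR", "I"]):
        print(f"{'+'.join(case):10s} P(C) = {posterior_c(case):.3f}")
```

Under these assumptions the evidence from PC and PR cancels whenever both are present, so the prediction for the conflicting combination falls back on the base rate and favours the more frequent disease C.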
(PC, I) => C, PC => C, I => C.
Predictions "against majority" (Medin, Edelson 1988): although PC + I + PR => C (60%), PC + PR => R (60%).
Neurodynamic attractor basins? PDF in the space {C, R, I, PC, PR}.
Psychological interpretation (Kruschke 1996): PR matters because it is a distinctive symptom, although PC is more common.
Activation of PR + PC leads more often to the result R because the gradient in the direction of R is greater.

Learning
Point of view: neurodynamics and psychology.
1. Neurodynamics: I+PC is more common => stronger synaptic connections, larger and deeper attractor basins.
   Psychology: symptoms I, PC are typical for C since they happen more often.
2. Neurodynamics: to avoid the attractors around I+PC leading to C, a deeper and more localized attractor around I+PR is created.
   Psychology: for the rare disease R, symptom I is not distinctive, so attention focuses on PR, which is associated with R.

Testing
Point of view: neurodynamics and psychology.
1. Neurodynamics: activating only I leads to C, since the more numerous examples of I+PC create a larger shared attractor basin than I+PR.
   Psychology: I => C, in accordance with expectations; the more frequent stimuli I+PC are recalled more often.
2. Neurodynamics: activation by I+PC+PR frequently leads to C, because I+PC puts the system in the middle of the large C basin, and even with PR the gradients still lead to C.
   Psychology: I+PC+PR => C because all symptoms are present and C is more frequent (base rates again).
3. Neurodynamics: activation by PR+PC leads more frequently to R because the attractor basin for R is deeper, and the gradient at (PR, PC) leads to R.
   Psychology: PC+PR => R because PR is a distinctive symptom, although PC is more common.

Summary
Knowledge formed in memory is constructed, dynamic, continuous and emergent.
Behavior and the inhibition of knowledge result from dynamic information processing rather than from interaction structures imposed from the top.
Recognition is based on the ability to differentiate earlier-learned activations from new, unknown activations.
The hippocampus ensures high-quality recognition, with a high threshold guaranteeing association of earlier-learned activations.
Priming contributes to the slow building of invariant representations.
Two learning mechanisms: based on connection weights, and based on neuron activation.

Summary
The cortex helps recognition by priming.
The cortex leads to unstimulated associations.
The cortex is responsible for working memory, cooperating with the hippocampus.
Sequences of grouped representations are stored in long-term memory.
Memory based on activation requires combining fast updating with stable representations.
The hippocampus uses sparse distributed representations for fast learning without mixing ideas.
Priming memory can be long-term (based on weights) or short-term (based on activation).