Short-term Hebbian learning can implement transformer-like attention

Abstract
Transformers have revolutionized machine learning models of language and vision, but their connection with neuroscience remains tenuous. Built from attention layers, they require a mass comparison of queries and keys that is difficult to perform using traditional neural circuits. Here, we show that neurons can implement attention-like computations using short-term, Hebbian synaptic potentiation. We call our mechanism the match-and-control principle. It proposes that when activity in an axon is synchronous, or matched, with the somatic activity of a neuron that it synapses onto, the synapse can be briefly strongly potentiated, allowing the axon to take over, or control, the activity of the downstream neuron for a short time. In our scheme, the keys and queries are represented as spike trains and comparisons between the two are performed in individual spines, allowing for hundreds of key comparisons per query and roughly as many keys and queries as there are neurons in the network.
Author summary
Many of the most impressive recent advances in machine learning, from generating images from text to human-like chatbots, are based on a neural network architecture known as the transformer. Transformers are built from so-called attention layers, which perform large numbers of comparisons between the vector outputs of the previous layers, allowing information to flow through the network in a more dynamic way than previous designs. This large number of comparisons is computationally expensive and has no known analogue in the brain. Here, we show that a variation on a learning mechanism familiar in neuroscience, Hebbian learning, can implement a transformer-like attention computation if the synaptic weight changes are large and rapidly induced. We call our method the match-and-control principle. It proposes that when presynaptic and postsynaptic spike trains match up, small groups of synapses can be transiently potentiated, allowing a few presynaptic axons to control the activity of a neuron. To demonstrate the principle, we build a model of a pyramidal neuron and use it to illustrate the power and limitations of the idea.
Citation: Ellwood IT (2024) Short-term Hebbian learning can implement transformer-like attention. PLoS Comput Biol 20(1): e1011843. https://doi.org/10.1371/journal.pcbi.1011843
Editor: Emma Claire Robinson, Kings College London, UNITED KINGDOM
Received: June 5, 2023; Accepted: January 19, 2024; Published: January 26, 2024
Copyright: © 2024 Ian T. Ellwood. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All Python code and NEURON channel MOD files, including code for generating the data and figures, are available from a public GitHub repository, https://github.com/iellwood/MatchAndControlPaper.
Funding: This work was supported by the Brain and Behavior Research Foundation (grant 139526 to ITE). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
The transformer architecture [1] has demonstrated remarkable, seemingly cognitive-like abilities, including few-shot and single-shot learning, puzzle solving, computer programming, image generation and style transformations [2–8]. While still falling short of human abilities on many tasks, transformer-based models continue to improve, and it is natural to ask if their architecture may have relevance to the computations performed by neural circuits in the brain.
The essential ingredient in transformers that differs from alternative deep learning methods like recurrent neural networks and convolutional neural networks is the attention layer [1, 9–14]. This layer acts on three collections of vectors known as queries, keys and values. Each query is compared with each key and the output associated with each query is a weighted sum of the values associated with the best matching keys (Fig 1A and 1B). In state-of-the-art transformers, the number of queries and keys can be in the tens of thousands.
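The computation performed by an attention layer can be sketched in a few lines. The following is a minimal illustration, not code from this study: the function name `attention` and the toy vectors are hypothetical, and the usual scaling of the scores by the square root of the vector dimension is omitted for brevity.

```python
import math

def attention(queries, keys, values):
    """Return one output vector per query: a softmax-weighted sum of the values."""
    outputs = []
    for q in queries:
        # Dot-product similarity between this query and every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        # Numerically stable softmax over the scores.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of the value vectors.
        dim = len(values[0])
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs

# Toy data (hypothetical): each query points strongly at one key.
queries = [[10.0, 0.0], [0.0, 10.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
outputs = attention(queries, keys, values)
```

When one key matches a query much better than the others, as in this toy data, the softmax weights concentrate on that key and the output is close to its associated value.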
Fig 1. Overview of the match-and-control principle.
A: In an attention layer, each query is matched with the closest key. B: The output associated with the query is given by the value associated with the closest matching key. C: The Match Phase: We propose that somatic activity, driven by basal dendrites, represents a query. This spike train backpropagates into the apical tuft, where it is compared in dendritic spines with axonal activity, representing keys. D: The Control Phase: Spines that received matching somatic and axonal activity potentiate. Subsequent axonal activity at potentiated spines represents the key-associated value and is transmitted to the soma via dendritic spiking. E: The ion channels in the apical tuft and spines of the pyramidal neuron model. F: The dimensions of the full pyramidal neuron model.
This matching of multiple queries and keys has no known analogue in the brain, but a related idea is the comparison between a single query and many keys. This simpler computation is similar to models of memory recall, where a query is analogous to a contextual cue and is used to find the best stored key or memory in an associative neural network, such as a modern Hopfield network [15, 16]. This analogy has been explored thoroughly in [17], while [18] made the connection with hippocampal circuits more explicit. The connection between transformers and a writable memory has also been made in machine learning [13] via the equivalence of linear transformers [11] and fast weight programmers [19–21]. (See also [22, 23] for two very different proposals and [24, 25] for efforts to find transformer-like representations in the brain.)
We note, however, two important differences between models based on the hippocampus [17, 18] and transformers. First, only a single query is compared with the collection of keys. While one can imagine sequentially comparing queries or that multiple regions of the brain can receive different queries simultaneously, in either case, the number of queries will be far smaller than the number of keys. In contrast, in transformers, queries and keys are typically of a similar quantity. A second issue is that the keys in these models are not dynamic, but fixed vectors learned by experience, as they represent stored memories in the hippocampus. While new keys can be added or modified by learning, this largely fixed collection of keys is a departure from the spirit of the transformer, where the keys, queries and values are generated on the fly with each pass through the network.
In the hippocampal-inspired models of transformers, the vector-valued queries are represented as a collection of firing rates in a bundle of axons, as is typical for machine-learning-inspired models of the brain. However, using firing-rate vectors runs into difficulties when many queries and keys need to be compared simultaneously. For example, suppose one built a model in which a bundle of axons, representing a key, is sent to multiple brain regions so that it can be compared with multiple queries. Even though the activity in each copy of the bundle of axons is identical, each brain region will receive a different message, transformed by the synaptic weight matrix between the axons and the downstream neurons. To implement an attention-like computation, these weight matrices need to be nearly identical, or each comparison will be made with a transformed key. Solving this issue would require sophisticated learning rules that are unknown in the brain.
An alternative, which we examine in this study, is that queries and keys are represented by short trains of spikes, as has been explored in recent attempts to implement transformers on neuromorphic chips [26–28]. An advantage of this choice is that every neural receiver of a train of spikes will receive the same message, regardless of the synaptic strength of the connection (provided that the strength is not zero). In addition, if one wishes to send a vector to many receivers in the brain, one can use the natural arborization of axons without any additional machinery.
A final advantage of using spike trains is that biological mechanisms that compare two spike trains are well-established in neuroscience. When a neuron receives a train of spikes from an excitatory axon, glutamate is released from the axon terminal and binds with glutamate receptors on the postsynaptic spine. Of particular interest, NMDA receptors only open and allow calcium entry into the spine when they are bound to glutamate and the spine is depolarized, which can occur when somatic action potentials backpropagate into the dendritic tree [29]. Under the right conditions, the amount of calcium entry into a spine can thus act as a proxy for how similar the pattern of presynaptic action potentials is to the pattern of somatic spikes.
In standard theories of synaptic plasticity, when large amounts of calcium enter the spine, the spine will undergo long-term potentiation (LTP), strengthening the synapse [30, 31]. This mechanism is believed to be the biological basis for Hebbian learning in the brain. Here, we consider a modification of these rules in which the potentiation following calcium entry is larger than what is typically found in LTP experiments, but is transient, lasting only seconds, making it a form of short-term plasticity. If this potentiation is large enough to allow a single axon to initiate dendritic spikes, we show that a computation similar to transformer attention can be performed. We emphasize that because the changes to synaptic weights are not long lasting, our scheme is an example of synaptic computation [32], not learning or memory.
Results
The match-and-control principle
In attention layers, when a key matches with a query (i.e., the vectors have a large dot-product), the value associated with the key becomes the output associated with the query. We emulate this process using pyramidal neurons as follows. We suppose that a train of spikes in the soma of a neuron represents a query vector. These somatic spike trains are generated by axonal inputs to the basal dendrites of the neuron. We represent multiple queries with multiple neurons. The key vectors are represented by trains of action potentials carried by individual axons (one axon per key) that synapse onto the apical dendritic tree of the neuron.
During the match phase (Fig 1C), the firing patterns in the axons synapsing onto the apical dendrites are compared with the back-propagated action potentials (BAPs) evoked by basal-driven activity, using the amount of calcium entry into the apical dendritic spines. In detail, in each spine, NMDA receptors open when depolarizations from the BAPs occur just after glutamate is released onto the postsynaptic density of the spine. As the pattern of glutamate release is determined by the pattern of presynaptic spikes, which we identify with the key, the total amount of calcium that enters the spine acts as a measure of the similarity of the timing of spikes in the query and key.
One subtlety with this discussion is that small fluctuations of the dendritic membrane potential can allow NMDA receptors to leak sizable amounts of calcium into the spine, spoiling this connection. This can be resolved by considering a calcium detector that is a non-linear function of the concentration of calcium, suppressing responses to small amounts of calcium entry. Such non-linearities can occur when the molecule that senses calcium has multiple binding sites for calcium, all of which need to be bound for the molecule to be active. For example, one of the primary molecules that detects spine calcium levels is calmodulin (CaM), which requires four calcium ions to become fully active.
We thus consider a simple polynomial non-linearity and use the fourth power of the calcium concentration as our measure of similarity. Raising the calcium concentration to this power suppresses small fluctuations in calcium and has the nice feature that it narrows the brief calcium entry events when BAPs coincide with glutamate release. We note that in vivo mechanisms for calcium detection are likely more complex than this simple model. For example, CaM’s calcium-binding affinity can be affected by CaM-binding proteins and is not independent for the four calcium binding sites [33, 34]. Moreover, there are multiple spine mechanisms for the detection of calcium, including calcium-dependent ion channels that may play a role in this process, each of which can have a different calcium-dependent activity. Nonetheless, all we need for our mechanism to function is any non-linearity that suppresses the response to low levels of calcium entry.
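The effect of the fourth-power non-linearity can be illustrated with a toy calculation. This is only a sketch under made-up assumptions: the two calcium traces, their arbitrary units and the helper `integrate` are hypothetical and are not taken from the biophysical model.

```python
def integrate(trace, dt_ms, power=1):
    """Riemann-sum integral over time of the trace raised to the given power."""
    return sum(c ** power for c in trace) * dt_ms

dt_ms = 1.0
# Hypothetical traces in arbitrary units, 100 ms each:
leak = [0.2] * 100                    # small sustained leak through NMDA receptors
coincidence = [0.0] * 95 + [2.0] * 5  # brief, large entry when a BAP meets glutamate

linear_leak = integrate(leak, dt_ms)
linear_coincidence = integrate(coincidence, dt_ms)
fourth_leak = integrate(leak, dt_ms, power=4)
fourth_coincidence = integrate(coincidence, dt_ms, power=4)
```

In this toy case, the plain integral of the leak exceeds that of the coincidence event, whereas after raising the traces to the fourth power the coincidence event dominates by more than two orders of magnitude.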
Following the match phase, we propose that spines with high levels of calcium rapidly and transiently potentiate their synaptic strength with their respective axons. In our model, we will remain agnostic about how exactly this potentiation is implemented at the synapse and, as far as we are aware, there is no experimental evidence of rapid large increases in synaptic strength that depend on the postsynaptic potential. In the discussion, we will examine the relative merits of several possible mechanisms, including rapid AMPA phosphorylation, calcium-dependent non-specific ion channels, presynaptic facilitation, post-tetanic potentiation and NMDA spikes. For now, though, we simply assume that some mechanism exists and show its advantages for implementing the attention computation.
In attention layers, a softmax function is used to output a weighted sum of the values that closely match the query. In early attention layers, this process typically returns an average of many values, but in the middle layers, the values are often associated with a single key [17]. It is this highly selective form of attention that we wish to emulate, as we feel that it is the most “transformer-like”. Moreover, implementing a softmax across spines is challenging because it requires knowledge of the total amount of potentiation across all the spines to normalize the net output. While this is likely possible using a more sophisticated setup, here we simply pick a high threshold for integrated calcium entry so that only a very small number of spines ever become potentiated. While this implies that sometimes we will fail to have any match, such failures are a familiar situation in neuroscience. For example, an attempt to recall a memory from a cue can result in no memory being recalled, or an attempt to initiate an action can fail to initiate any action.
Once any spines that cross our threshold are potentiated, we enter the control phase (Fig 1D), in which axons presynaptic to potentiated synapses are able to drive spiking in the soma of the postsynaptic neuron, thus taking control of its activity. The same pre-dendritic axons that previously transmitted the keys now transmit spike trains representing key-associated values. The keys and values are thus temporally concatenated in the train of spikes delivered by the axons. To transmit the values associated with the best matching keys, the axons that synapse onto the potentiated spines elicit dendritic spikes that propagate to the soma of the neuron, leading to somatic and then axonal spiking. In this way, the somatic activity of the neuron transitions from being the query in the match phase to being the value associated with the best matched spines in the control phase.
We note that the number of queries in this scheme is equal to the number of participating neurons, while the number of keys is given by the number of presynaptic axons, allowing for large numbers of queries and keys. However, not every key is compared with every query. Dendritic trees typically have thousands of spines, but each axon can make multiple synapses with a single neuron [35–37], making the number of comparisons as much as an order of magnitude smaller than the number of spines and far smaller than the number of neurons in a typical neural circuit, which can range from the thousands to millions. Our proposal should thus be thought of as a type of “sparse attention” where the number of keys and queries can be vast, potentially far larger than in modern transformers, but only a fraction of them are compared with each other.
Biophysical model implementing the match-and-control principle
To test if our scheme is biophysically plausible, we implemented a cable-theory based pyramidal neuron model with Hodgkin-Huxley-style ion channels using the NEURON simulator [38] (Fig 1E and 1F). In the construction of our model, we aimed to build as simple a model as possible that could implement the match-and-control principle. Real pyramidal neurons include numerous additional channels not considered here, including calcium channels, calcium-activated non-specific ion channels, multiple species of potassium channels and chloride channels. Our model should thus be considered a proof-of-principle, as well as a demonstration that very few ingredients are necessary to produce our desired effects, but not an attempt to implement any specific pyramidal neuron type in the brain.
For the soma, basal dendrites and axon, we used a reduced model of a layer V pyramidal neuron by Bahl et al. [39]. For the dendrites, we built a 500 μm apical dendrite based on a model by Hay et al. [40] connected to a branching tuft consisting of six 250 μm segments with 250 spines per branch, adding up to a total of 1500 spines. For AMPA and NMDA channels we used models based on [41–43]. We assumed that each of 150 axons synapsed 10 times on the dendritic tree, and that they synapsed on neighboring spines to give them the best ability to depolarize the dendrites [37, 44, 45]. Our model thus allows 150 unique keys and associated values per neuron. The amount of redundant axonal input on neighboring spines does not have a large impact on the model’s performance, as it merely increases the size of the dendritic EPSPs by a factor of 10 for each presynaptic spike in both the match and control phases. This increase can be achieved in a model where each axon only synapses once onto the dendritic tree by rescaling the AMPA conductance, and it is only the product of the AMPA conductance and axon synapse redundancy that is important for the simulations. Nonetheless, we include this feature as it has been observed in vivo and proposed as a mechanism for dendritic spike initiation by single axons.
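The bookkeeping of spines and axons described above can be written out explicitly. This is a sketch only: the constant names and the consecutive-index assignment of spines to axons are assumptions made for illustration, and the actual wiring is defined in the model code in the repository.

```python
N_BRANCHES = 6            # tuft branches
SPINES_PER_BRANCH = 250
N_AXONS = 150             # one key (and associated value) per axon
SYNAPSES_PER_AXON = 10    # redundant synapses on neighboring spines

n_spines = N_BRANCHES * SPINES_PER_BRANCH

# Assumed layout: spines are indexed branch by branch, and each axon contacts
# a run of 10 consecutive (neighboring) spines.
spine_to_axon = [spine // SYNAPSES_PER_AXON for spine in range(n_spines)]
spine_to_branch = [spine // SPINES_PER_BRANCH for spine in range(n_spines)]

axons_per_branch = SPINES_PER_BRANCH // SYNAPSES_PER_AXON
```

With these numbers every spine receives exactly one axon, and since 250 is a multiple of 10, no axon's run of spines straddles two branches in this layout.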
We used the same ion channels for the apical shaft and tuft dendrites as Hay et al. [40], but increased the conductance of the fast sodium channels and voltage-dependent potassium channels, as their model does not support propagating dendritic spikes. In a short segment connecting the apical dendrite with the tuft branches, we used a slightly higher density of fast sodium channels to allow the dendritic spikes to propagate into the apical shaft. We also increased the axial resistance in this segment to prevent the larger apical dendrite shaft from shunting any currents generated in nearby spines. We note that in some pyramidal neurons, this region would contain a high density of high voltage activated calcium channels [46], but to keep our model simple, we have not modelled calcium dynamics outside of the spines.
Calcium entry as a measure of spike train similarity
We began by testing whether calcium can be used as a proxy for spike train similarity in our model, as in models of spike-timing dependent plasticity [41, 47]. An example simulation is shown in Fig 2A for a spine whose inputs match the somatic action potentials and for a spine whose inputs are random. As expected, matching the glutamatergic inputs to the spine with the BAPs led to increased calcium entry, but we also observed sizable amounts of calcium entry away from BAPs in the spine with unmatched inputs. As described above, using the fourth power of the calcium concentration suppressed these events (Fig 2A, bottom two traces).
Fig 2. Calcium as a measure of spike train similarity.
A: An example simulation of the model is shown. The left column of traces are recordings from a spine receiving presynaptic spikes that occurred 7 ms before the back-propagating action potentials arrived from the soma. The right column shows recordings from a spine that received random glutamatergic inputs drawn from a 6 Hz Poisson process. Note that even unmatched glutamatergic inputs produced calcium entry (third trace from bottom, right column), but that raising the calcium concentration to the fourth power suppressed these events (bottom two traces). B: The latency for a back-propagated action potential to reach each of the 1500 spines in the dendritic tree (average of 10 simulations). Note that slight differences in time to reach the symmetric branches of the dendritic tree arose in the simulation because the dendrites were receiving random axonal inputs. C: Plot of the best-fit kernel used to model the integral of the fourth power of the calcium concentration as a bilinear overlap between the presynaptic and postsynaptic spike trains. Note that the kernel included a double sigmoidal window that forced its values to exponentially decay to zero outside of the interval [-30 ms, 30 ms] with a time-constant of 10 ms. D: A comparison of the simple kernel model with the simulated integrated calcium signal on a test dataset not included in the fit. 1000/15000 randomly selected test data points are shown.
In attention layers, the similarity of a key and query is measured using a dot product. We thus tested if the net calcium entry could be approximated by a similar bilinear form. To account for the temporal kinetics of NMDA channels and the finite width of BAPs, we used a model with an arbitrary temporal kernel,
(1)
where t_pre and t_post are the times of the presynaptic spikes and postsynaptic BAPs and K is the kernel. We note that for t_post we used the time when the BAP reached the spine, accounting for the transmission delays from the long dendrites of the apical tree shown in Fig 2B.
To estimate K(t), we used 1 second spike trains sampled from a Poisson process with an average spike rate of 6 Hz and a spike refractory period of 50 ms. We ran 100 simulations, giving a dataset of 15000 presynaptic spike trains and 100 postsynaptic spike trains. Using gradient descent on a least squares loss yielded the best fit for K(t) shown in Fig 2C. The quality of the fit is shown for a separate test dataset in Fig 2D, where the model explained 81% of the variance.
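One way to sample spike trains of this kind is sketched below. The text does not specify how the refractory period was combined with the Poisson process, so this is an assumption: each inter-spike interval is the refractory period plus an exponentially distributed gap, with the mean gap chosen so that the long-run rate equals the requested average rate. The function name `refractory_poisson_train` is hypothetical.

```python
import random

def refractory_poisson_train(rate_hz, refractory_s, duration_s, rng):
    """Sample spike times (in seconds) with an absolute refractory period.

    Assumption: inter-spike interval = refractory period + exponential gap,
    with the mean gap set so that the mean interval is 1 / rate_hz.
    """
    mean_gap_s = 1.0 / rate_hz - refractory_s
    if mean_gap_s <= 0:
        raise ValueError("rate is too high for this refractory period")
    spikes = []
    t = rng.expovariate(1.0 / mean_gap_s)  # first spike has no preceding spike
    while t < duration_s:
        spikes.append(t)
        t += refractory_s + rng.expovariate(1.0 / mean_gap_s)
    return spikes

rng = random.Random(0)
# A 1 second train at an average of 6 Hz with a 50 ms refractory period.
train = refractory_poisson_train(rate_hz=6.0, refractory_s=0.05, duration_s=1.0, rng=rng)
```

Under this construction no two spikes in a train are closer than 50 ms, while the average rate over long durations remains 6 Hz.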
If the kernel were a delta-function, it would imply that the calcium integral is approximately equal to the ordinary Euclidean inner product between the two spike trains considered as temporal vectors (i.e., vectors with ones where the spikes are and zeros where there are no spikes). Because of the finite width of the peak around zero, the model may be considered a temporally smeared version of a simple dot product.
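The relationship between a kernel overlap and a dot product can be checked on a small example. The sketch below assumes that the bilinear form is a double sum of the kernel over all pairs of pre- and postsynaptic spike times, evaluated at the lag t_post - t_pre. The sign convention, the integer-millisecond spike times and the triangular kernel centered at zero are all illustrative assumptions (the fitted kernel in Fig 2C is not reproduced here).

```python
def kernel_overlap(pre_times_ms, post_times_ms, kernel):
    """Sum the kernel over every pair of presynaptic and postsynaptic spikes."""
    return sum(
        kernel(t_post - t_pre)
        for t_pre in pre_times_ms
        for t_post in post_times_ms
    )

def delta_kernel(lag_ms):
    """Discrete stand-in for a delta function: only exact coincidences count."""
    return 1.0 if lag_ms == 0 else 0.0

def triangular_kernel(lag_ms, half_width_ms=5.0):
    """Hypothetical smeared kernel with a finite-width peak at zero lag."""
    return max(0.0, 1.0 - abs(lag_ms) / half_width_ms)

def as_binary_vector(times_ms, n_bins):
    """Temporal vector with ones in the 1 ms bins that contain a spike."""
    vec = [0] * n_bins
    for t in times_ms:
        vec[t] = 1
    return vec

pre = [3, 10, 25]    # presynaptic spike times (ms)
post = [3, 12, 25]   # times at which BAPs reach the spine (ms)

dot_product = sum(
    a * b for a, b in zip(as_binary_vector(pre, 30), as_binary_vector(post, 30))
)
delta_overlap = kernel_overlap(pre, post, delta_kernel)
smeared_overlap = kernel_overlap(pre, post, triangular_kernel)
```

With the delta kernel the overlap equals the dot product of the binary vectors, while the wider kernel also gives partial credit to the pair of spikes that are 2 ms apart.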
In this comparison, we removed a small collection of outliers (84/15000 (0.6%) and 77/15000 (0.5%) for training and test data) whose integrated calcium entry was 4 standard deviations larger than the average. Such outliers can occur when the random fluctuations of the membrane potential allow for non-trivial NMDA channel conductance or when spontaneous dendritic spikes occur. Although such events are rare, they are unavoidable in our models because of the large number of random synaptic inputs to the dendrites, and they reduce the selectivity of our matching procedure.
Finding the best matched spike train
We measured the optimal time for the presynaptic spikes to arrive by computing the integral of the fourth power of the calcium concentration as a function of the offset between pre- and postsynaptic spikes (Fig 3A). This computation showed that, ideally, presynaptic spikes arrive around 7 ms before the BAPs arrive, a value we use throughout our simulations.
Fig 3. Testing if a threshold can distinguish between matched and random spike trains.
A: The average over 300 simulations of the integral of the fourth power of the calcium concentration as a function of the time separation between pre- and postsynaptic spikes. Maximum at 7 ms. B-D: Kernel density estimate of the distribution of calcium integrals for matched spike trains (blue) and random spike trains (purple). Match windows of size 0.5, 1, and 2 s shown. The calcium integrals were normalized by a threshold that rejected all non-matched spines 90% of the time (dotted line). E: Receiver Operator Characteristic (ROC) plot of the true positive rate vs. false-positive rate for various calcium integral thresholds. Note that the false-positive rate is for any of the spines receiving random input to have crossed the threshold. The 45-degree line, representing no discrimination ability, is shown for comparison. F: Kernel density estimate of the distribution of calcium integrals for matched spike trains (blue) and random spike trains (faint pink) when a small jitter is added to the times of the matched spike train. The time offsets were drawn from a Gaussian with standard deviation of 0 (blue), 1 (purple) and 2 (green) ms. G: A ROC plot as in panel E, but for different amounts of spike-time jitter and a 1 s match window.
For the match-and-control principle to work, it is essential that a threshold can distinguish between a matching spike train and a random spike train. While this is easy to achieve when comparing a single matching train with a single random train, it becomes non-trivial when there are many random trains (our model has 149), none of which should potentiate. There is thus an unavoidable trade-off between rejecting as many random spike trains as possible and accepting as many matching spike trains as possible.
In this study, we selected an arbitrary rule that our threshold should reject all 149 non-matched spike trains 90% of the time, and we normalize the calcium integrals by this threshold to make it easier to compare different conditions. Because of this, in every plot, 1 is the threshold for potentiation. We considered three different matching window sizes, 0.5, 1 and 2 seconds. To compare the ability of a threshold to distinguish between matched and random spines, we selected one group of spines to have a matched spike train input and the rest to be random spike trains, as described above. We ran 1500 simulations and computed a kernel density estimation of the distribution of calcium integrals, as shown in Fig 3B, 3C and 3D. Unsurprisingly, as the window size increases, more and more of the matched spines cross the threshold. This effect can also be seen in the receiver operator characteristic (ROC) plot (Fig 3E), where longer matching windows push the curve to the upper left. Finally, using a 1 s window, we tested if adding temporal jitter to the presynaptic spike times would affect the performance of our threshold. As can be seen in Fig 3F and 3G, shifting each spike by a random amount drawn from a Gaussian with standard deviation of 1 or 2 ms only had modest effects on the distribution of calcium integrals, but, as expected, made the random and matching distributions have more overlap.
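A threshold obeying the 90% rule can be computed from simulated calcium integrals as sketched below. This is one possible reading of the rule, not necessarily the procedure used to produce the figures: for each simulation we take the largest calcium integral among the random spines and place the threshold at the 90th percentile of these maxima. All function names are hypothetical.

```python
import math

def rejection_threshold(random_integrals_per_sim, reject_fraction=0.9):
    """Smallest observed value that keeps ALL random spines at or below
    threshold in at least `reject_fraction` of the simulations."""
    maxima = sorted(max(sim) for sim in random_integrals_per_sim)
    index = math.ceil(reject_fraction * len(maxima)) - 1
    return maxima[index]

def normalize(integrals, threshold):
    """Rescale calcium integrals so that 1 is the threshold for potentiation."""
    return [x / threshold for x in integrals]

def true_positive_rate(matched_integrals, threshold):
    """Fraction of matched spines whose calcium integral exceeds the threshold."""
    return sum(x > threshold for x in matched_integrals) / len(matched_integrals)

# Toy data (hypothetical): 10 simulations with 3 random spines each.
toy_random = [[0.1 * k, 0.5 * k, 1.0 * k] for k in range(1, 11)]
toy_threshold = rejection_threshold(toy_random)
```

Because the criterion is applied to the maximum over all random spines in a simulation, the threshold needed to meet it rises as more random inputs are included.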
With our threshold that rejects the random spike trains 90% of the time, we found that the matched spines would cross threshold 45%, 82% or 98% of the time for 0.5, 1 or 2 second matching windows. Adding noise to spike trains in the 1 second window decreased the true positive rates to 78% and 73% for 1 and 2 ms standard deviation noise, respectively. As should be expected, using longer windows and less noise allows for easier separation of the matched keys from random keys.
Note that, although we did not explore it in this study, increasing the number of axonal inputs to the dendrites likely increases the false positive rate. This can happen in two ways. First, with more random inputs, it becomes increasingly likely that a random spike train is very close to a matched spike train. Having similar keys can happen in an attention layer as well, but a major difference between a machine learning implementation and our system is that calcium is a noisy measure of spike train similarity, making it impossible to distinguish between nearly and exactly matching spike trains, even with a high threshold. Second, there is a small chance that non-matching spike trains will produce large amounts of calcium entry because of random fluctuations of the dendritic membrane or even spurious dendritic spikes. This issue has no analogue in transformers, but is thankfully only a rare occurrence in our model.
Because both issues have a roughly independent chance of happening for each axonal input, we expect that the true positive rate will decay exponentially to zero (holding the false positive rate constant) as the number of spines and presynaptic axons increases. We thus speculate that it would be difficult or impossible to implement our scheme, without major modifications, if the apical dendrite tuft received thousands of independent axonal inputs.
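The role of independence in this argument can be made concrete with a back-of-the-envelope calculation. The sketch assumes that each random input crosses a fixed threshold independently with the same small probability, calibrated so that all 149 random inputs of the model are rejected 90% of the time. It shows how quickly the chance of rejecting every random input falls at a fixed threshold; it is not a simulation of the true positive rate itself, and the names are hypothetical.

```python
N_MODEL = 149              # random axonal inputs in the model
ALL_REJECTED_MODEL = 0.9   # all of them are rejected 90% of the time

# Per-input crossing probability implied by the independence assumption.
p_cross = 1.0 - ALL_REJECTED_MODEL ** (1.0 / N_MODEL)

def prob_all_rejected(n_random_inputs, p=p_cross):
    """Probability that none of n independent random inputs crosses threshold."""
    return (1.0 - p) ** n_random_inputs

tenfold = prob_all_rejected(10 * N_MODEL)  # equals 0.9 ** 10, about 0.35
```

Under these assumptions, ten times as many inputs at the same threshold would all be rejected only about 35% of the time, so the threshold would have to be raised to restore the 90% criterion, at the cost of fewer matched spines crossing it.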
The control phase
To potentiate matching spines, we multiplied the AMPA-component of the EPSP size by a sigmoid shown in Fig 4A, increasing it by a factor of eight when the calcium integral crossed our threshold. The baseline and potentiated EPSP sizes were tuned by hand to ensure that potentiated spines were able to drive dendritic spikes, while random fluctuations in the membrane potential rarely caused them. Because it is unrealistic for EPSP sizes to instantaneously increase, we added a time lag to the effects of the sigmoid with a time constant of 500 ms. This time constant was arbitrary, and we found that it had little effect on our simulations, though a minor benefit of the lag was that it reduced premature potentiation during the matching phase as well as spurious potentiation during the control phase when non-matched values happened to coincide with the value of the matched spine.
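The potentiation rule can be sketched as a sigmoidal gain followed by a first-order lag. The factor of eight and the 500 ms time constant come from the text; the steepness of the sigmoid, the use of a simple exponential low-pass filter for the lag and the function names are assumptions made for illustration, and the calcium integral is taken to be normalized so that 1 is the threshold.

```python
import math

def target_gain(calcium_integral, max_gain=8.0, steepness=20.0):
    """Sigmoidal AMPA gain: about 1 below threshold, about max_gain above it.

    `steepness` is a hypothetical value; the text does not give the slope.
    """
    s = 1.0 / (1.0 + math.exp(-steepness * (calcium_integral - 1.0)))
    return 1.0 + (max_gain - 1.0) * s

def lagged_gain(calcium_integrals, dt_ms, tau_ms=500.0):
    """Low-pass filter the target gain so potentiation builds up gradually."""
    gain = 1.0
    history = []
    for c in calcium_integrals:
        gain += (dt_ms / tau_ms) * (target_gain(c) - gain)
        history.append(gain)
    return history

# A spine whose normalized calcium integral sits at twice the threshold.
history = lagged_gain([2.0] * 5000, dt_ms=1.0)
gain_after_500_ms = history[499]
gain_after_5_s = history[-1]
```

With this lag the gain has covered roughly 63% of the way from 1 to 8 after one time constant, and is essentially fully potentiated after a few seconds.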
Fig 4. Simulations with potentiation and a control phase.
A: Plot of the sigmoid used for potentiation of spine EPSPs as a function of integrated calcium. The threshold is shown as a vertical line. B: An example simulation showing a spine receiving input from an axon whose spikes coincided with the timing of BAPs (matched spine) and a second spine which received inputs from one of the other 149 axons with random spike trains (random spine). The random spine was chosen as the most potentiated of all spines not receiving input from the one matched axon. The somatic stimulation was only delivered during the match phase, but somatic spikes appear in the control phase that coincide with inputs to the matched spine (somatic voltage, purple trace), but not the random spine. The integral of the fourth power of calcium is shown for both spines and the threshold is shown as a dotted blue line. Only the matched spine crossed the threshold. C: A set of repeated simulations with identical somatic and dendritic stimulation but with varying thresholds for potentiation. Only one of the 150 axons had a spike train that matched the BAPs, while the rest had random spike trains. Voltages are recorded in the soma. The numbers on the left represent fractions of our standard threshold that rejects all random axonal inputs 90% of the time. D: An example simulation where two spines had inputs from axons whose spike trains matched with the somatic stimulation. Both spines were able to transmit spikes to the soma, producing an output that was the sum of the two input spike trains. E: Failure example 1: Despite one of the axons having a spike train that matches with the BAPs, the spines that receive this axonal input do not potentiate strongly enough to trigger dendritic spikes. F: Failure example 2: Multiple spines potentiated, only one of which had axonal input that matched the BAPs. Note that the matched spine was able to produce dendritic spikes, but most of the dendritic spikes originated from one of the other spines.
A typical simulation is shown in Fig 4B, which shows a spine receiving input from an axon whose spike train matches the somatic spike train. After the match phase, no somatic stimulation is delivered, but the soma is driven to spike in time with the pre-dendritic axon's action potentials, showing that the axon is able to control somatic activity. Also shown is the second-most-potentiated spine, which received input from a pre-dendritic axon with a random spike train; there is no correlation between this axon's spike train and the somatic spike train. In our simulation, we left a small gap of 500 ms between the matching and control phases, but this is not necessary. We expect that in a real neuron the control phase would begin immediately, but found that including this gap made it easier to understand our plots.
To see the effect of the calcium integral threshold, an identical set of pre- and postsynaptic inputs was simulated with different thresholds (Fig 4C). At the highest threshold, no potentiation occurs, while at lower thresholds spurious spikes occur during the control phase. At the lowest threshold, a cascading catastrophe occurs as more and more spines potentiate, and the dendrites are driven to spike wildly. In a more detailed ion-channel model, we expect that the dendrite would become persistently depolarized in this situation, rather than support ultra-high-frequency spiking. Regardless, we do not expect to see such behavior in real neurons, as homeostatic processes in the neuron could detect excessive depolarization or spiking and increase the spine potentiation threshold to prevent it in the future.
We also tested the case where two groups of spines match the BAP pattern, as shown in Fig 4D. In some such cases, as shown here, the spikes of the presynaptic spike trains are simply added together, but we note that the chance of both trains crossing the threshold is the square of the chance of one of them crossing, making this occurrence less likely than a successful potentiation of a single spine group. We note, though, that when multiple spines are weakly potentiated, they will continue to potentiate during the control phase, since they are often depolarizing the spine enough to open NMDA channels. This continuing potentiation during the control phase is interesting as it translates the quality of the match into a non-trivial time-dependence, though we are unsure if it has biological relevance.
We also observe cases in which an axon with a spike train matched to the somatic activity fails to take control of the postsynaptic output. In Fig 4E, we see a failed potentiation of a spine that barely missed crossing the threshold (dotted line). In Fig 4F, we see an example where one of the random spines also potentiated. In this case, spikes originating from the spines receiving matched input only account for a small fraction of the spikes generated.
To evaluate the overall performance of the model, we performed 1000 simulations for each of the three match window sizes, 0.5, 1 and 2 s. In each simulation, one of the axonal inputs was again randomly chosen to match the somatic spike pattern during the matching phase, while the other 149 axons were drawn from a 6 Hz Poisson process. We then characterized each of the matched axon's presynaptic inputs during the control phase as “successful” if it was followed within 20 ms by a somatic spike. In contrast, each somatic spike that was not preceded by an input from the matched axon in this same window was considered “spurious”. The percent of successful and spurious somatic spikes is shown in Fig 5A.
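The scoring rule described above can be sketched as follows. This is an illustrative reimplementation, not the analysis code from the repository: the function names are hypothetical, spike times are assumed to be in seconds, and the 20 ms window is treated as inclusive at both ends.

```python
import numpy as np


def poisson_spike_train(rate_hz, duration_s, rng):
    """Homogeneous Poisson spike times: draw a Poisson count, then uniform times."""
    n = rng.poisson(rate_hz * duration_s)
    return np.sort(rng.uniform(0.0, duration_s, size=n))


def score_control_phase(matched_inputs, somatic_spikes, window_s=0.020):
    """Return (% successful inputs, % spurious somatic spikes).

    An input is successful if a somatic spike follows within window_s.
    A somatic spike is spurious if no matched input precedes it within window_s.
    """
    inputs = np.sort(np.asarray(matched_inputs, dtype=float))
    somatic = np.sort(np.asarray(somatic_spikes, dtype=float))

    # Next somatic spike at or after each input (inf if there is none).
    nxt = np.append(somatic, np.inf)[np.searchsorted(somatic, inputs, side="left")]
    successful = (nxt - inputs) <= window_s

    # Last matched input at or before each somatic spike (-inf if there is none).
    prev = np.concatenate(([-np.inf], inputs))[
        np.searchsorted(inputs, somatic, side="right")
    ]
    spurious = (somatic - prev) > window_s

    pct_successful = 100.0 * successful.mean() if inputs.size else float("nan")
    pct_spurious = 100.0 * spurious.mean() if somatic.size else float("nan")
    return pct_successful, pct_spurious


# Toy example: two inputs are followed by a somatic spike, one is not,
# and one somatic spike has no preceding matched input.
pct_ok, pct_bad = score_control_phase([0.100, 0.300, 0.500], [0.105, 0.310, 0.700])

# A 6 Hz random train like those used for the non-matching axons.
random_train = poisson_spike_train(6.0, 2.0, np.random.default_rng(0))
```

In the toy example two of the three inputs are successful and one of the three somatic spikes is spurious.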
Fig 5. Performance of the model during the control phase.
A: The results from 1000 simulations of the model for each of the three window sizes are shown. In each simulation, one axonal input was matched with the somatic spiking pattern and the rest were random. The percent of axonal inputs from the matched axon that produced a somatic spike is shown in blue (% successful). The percent of spikes that could not be explained by axonal stimulation from the matched axon is shown in purple (% spurious). B&C: The percent of spikes that were successful and spurious is broken down by query spike rate.
We observed that the spike rate of the query had a strong impact on the percent of successful spikes, especially for the 0.5 and 1 s match windows (Fig 5B), while having a more modest effect on the percent of spurious spikes (Fig 5C). This finding is not surprising, since each somatic spike gives a chance for NMDA receptors to open and allow calcium entry. When there are fewer spikes, less calcium will enter each spine, reducing the chance that the spine will cross the threshold for potentiation. It is thus very natural to fix the spike rate of the query to be the average spike rate of the axonal inputs. Indeed, doing so for 1 s match windows raises the percent of successful spikes up to that of the 2 s windows.
Given that the match-and-control principle can implement a computation similar to transformer attention, it is natural to ask if the biophysical model could substitute for an attention layer in a working transformer and whether a quantitative measure of the performance of this substitution could be performed. Here we describe two obstacles that must be overcome before such a procedure could be implemented. First, our model uses spike trains, instead of rate vectors, to describe the keys, queries and values. Although we demonstrated that spines can compute a smoothed overlap between the key and query spike trains when represented as vectors, the corresponding vectors are sparse and non-negative, unlike the arbitrary vectors used in most transformers. There has already been some effort to build transformers out of such vectors [26–28], but additional work is needed. Second, in this paper we used a simple threshold to select the best matching key, rather than a softmax. This choice allows for the possibility that no matching key is found or that multiple keys match the query and their values are transmitted with equal weight. It may be possible to use a more sophisticated model with feedback inhibition to better approximate the normalization of a softmax, but we leave this to future work.
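The difference between threshold selection and a softmax can be made concrete with a rate-vector toy example. The sketch below is an abstraction for illustration only and does not involve spike trains or the biophysical model; the binary keys, the identity value matrix and the function names are all hypothetical choices.

```python
import numpy as np


def softmax_attention(query, keys, values):
    """Standard scaled dot-product attention for a single query."""
    scores = keys @ query / np.sqrt(query.size)
    weights = np.exp(scores - scores.max())  # subtract max for numerical stability
    weights /= weights.sum()
    return weights @ values, weights


def threshold_attention(query, keys, values, threshold):
    """Every key whose overlap reaches the threshold passes its value with weight 1."""
    overlaps = keys @ query
    weights = (overlaps >= threshold).astype(float)
    return weights @ values, weights


# Sparse, non-negative (binary) keys; the query is identical to key 2.
keys = np.array([
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 1],
])
query = keys[2]
values = np.eye(4)

out_soft, w_soft = softmax_attention(query, keys, values)
out_one, w_one = threshold_attention(query, keys, values, threshold=2.5)   # one match
out_all, w_all = threshold_attention(query, keys, values, threshold=0.5)   # all match
out_none, w_none = threshold_attention(query, keys, values, threshold=3.5)  # no match
```

With a well-placed threshold only the value of key 2 is transmitted. With a low threshold all four values are summed with equal weight, and with a high threshold nothing is transmitted, whereas the softmax weights always sum to one.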
Discussion
The match-and-control principle is a simple mechanism for implementing something akin to the attention layers of transformers. Attention is computationally expensive, but our model allows for large numbers of key and query comparisons because it uses one of the smallest computational units of the brain, small groups of spines, to perform each overlap. Here we briefly discuss the limitations of this proposal as well as possible circuits in the brain where it might plausibly occur.
In our view, the most troublesome aspect of our proposal is the large potentiation of synapses required for the axonal inputs to control somatic activity. In our model we used a factor of eight increase in EPSP size, which is much larger than what is typically seen following LTP induction. Remarkably, such large increases in EPSP size have been observed in short-term potentiation as far back as Bliss and Lomo [48], but this effect is believed to be presynaptic [49] and thus cannot depend on the synchrony of pre- and postsynaptic spikes. One possible mechanism that we have contemplated is that presynaptic strengthening occurs broadly across the dendritic tree following the match phase, allowing presynaptic spikes to drive dendritic spiking, but that postsynaptic depression of non-matching spines occurs simultaneously, blocking this effect for all but a few spines. A similar idea is that the spiking patterns of the value could differ qualitatively from those of the key. If the value spike train consists of bursts of action potentials, this would make it easier for single axons to initiate dendritic spikes, but would again require depression of non-matching synapses.
Calcium-dependent ion channels offer another mechanism for depolarizing the spine. Notably, transient receptor potential (TRP) channels such as TRPM4 [50] and TRPM5 [51, 52], which are impermeable to calcium, could allow for rapid depolarization of dendrites following NMDA-dependent calcium entry, allowing for easier dendritic spike initiation. Calcium-dependent potassium channels, such as SK channels, are found in spines [53], and could depress the voltage in spines with low levels of calcium entry. Considerable care must be taken in the use of these channels, as any persistent increase in membrane potential could cause NMDA channels to open in the absence of BAPs. Nonetheless, we believe that careful use of them could reduce the size of the potentiation needed for our simple model.
A related issue is the ability of dendritic spikes to faithfully propagate to the soma. We found it necessary to carefully design the interface between the small dendrites of the tuft and the much larger apical dendrite to prevent dendritic spikes from dying out. We resolved this problem by adding a region of substantially higher sodium channel density. In large pyramidal neurons this active zone may be replaced by a region of high-voltage-activated calcium channels, such as L-channels, but the slower kinetics of such channels may make it difficult to faithfully transmit the individual spikes in a spike train. This may make it advantageous to implement our mechanism in pyramidal neurons with narrower apical dendrites than we used in our model or to use a value consisting of bursts of spikes that may be easier to transmit to the soma.
While these issues may appear as great challenges for our design, we note that Larkum, in his apical dendrite “manifesto” [54], suggested that small clusters of spines on layer 2/3 neuron apical dendrites might “with explosive affect dictate the firing of the neuron”. In his formulation, this effect arises because broad dendritic inhibition silences most of the spines, while sparing a small number of them that are depolarized by NMDA spikes. We extend this hypothesis by observing that, if some Hebbian mechanism can be found that selects which groups of spines are allowed to control the neuron, it would allow for an attention-like computation in cortex.
Another major objection that one might raise to our proposal is that NMDA-dependent plasticity already has a purpose in the brain, LTP, and our new mechanism would seem to crowd in on previous theories of learning. However, transient increases in synaptic strength that are larger than what will remain asymptotically are common in LTP experiments, and many theories have been proposed that might use them [32, 49]. Furthermore, if the synapses in our model undergo LTP slowly after each match phase, this would imply that axons that commonly match the activity of a neuron become preferred in future comparisons. While this would break the democratic matching process of attention layers in transformers, such a bias could be advantageous. Since our matching procedure involves a noisy comparison between spike trains (because of random membrane fluctuations), comparing thousands of keys with a query may be nearly impossible in vivo, and a mechanism like LTP that selects likely candidates for matching may be useful. A far more challenging question is how a neural implementation of transformers could be trained by experience. This problem remains unsolved for traditional models of neural circuits as well, but we note that there are promising methods for training artificial spiking transformers [27], even if they have not yet achieved the same performance as standard rate-coded models.
A further puzzle is what mechanism might synchronize the arrival of the keys with the start of the queries. While it is tempting to assume that cortical rhythms might play a role, we have found that match windows of around 0.5 to 2 s are necessary for reducing noise in the spike train comparisons, a timescale much longer than the cycles of theta, beta or gamma rhythms found in the cortex. Bouts of cortical activity lasting around a second are observed following sensory stimulation, or during the “UP states” that occur in both sleeping and awake but resting animals [55, 56]. The match phase may be triggered by the start of these periods of high activity.
Finally, we turn to the question of which neural populations in the brain might implement our program. We selected a pyramidal neuron for our model because the apical dendrite is a natural mechanism for transmitting dendritic spikes to the soma. (See [54, 57–59] for other theories of apical dendrites.) However, large layer V pyramidal neurons in particular have several disadvantages. First, they are known to have dendritic spikes that do not reach the soma, and their dendrites are controlled by complex dendritic inhibition [60, 61]. Second, BAPs have been observed to broaden temporally as they reach the distal tuft, likely due to calcium channel participation, which would harm our precise temporal matching [61]. Finally, they are the output neurons of cortex and, as was discussed above, in many transformer models attention layers are often only highly selective in the middle layers of transformers, not in their input or output layers.
We thus conjecture that if our mechanism occurs, it is more likely to be found in either thin tufted layer V neurons [62, 63] or in layer 2/3 neurons, both of which project primarily within the cortex. Layer 2/3 neurons, in particular, may have the best ability to propagate BAPs to the dendrites for matching, and dendritic spikes to the soma for transmission, due to their relatively short apical length [64].
Materials and methods
All computations were performed on a desktop PC with a 12 core Intel Xeon Processor using the Python interface for the NEURON simulator. All Python code and NEURON channel MOD files, along with code for generating the data and figures, are available from a public GitHub repository, https://github.com/iellwood/MatchAndControlPaper.
The repository includes instructions for running the code and a script that prints the NEURON model's parameters, including the conductances of the individual ion channels.
Acknowledgments
We would like to thank Andrew Bass, Ronald Hoy, Christiane Linster, Monzilur Rahman, Weinan Sun, Melissa Warden and Ronald Harris-Warrick for discussions and comments on the manuscript.
References
1. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention Is All You Need. Advances in Neural Information Processing Systems. 2017.
2. Saharia C, Chan W, Saxena S, Li L, Whang J, Denton E, et al. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding. 2022; arXiv:2205.11487v1.
3. Ramesh A, Dhariwal P, Nichol A, Chu C, OpenAI MC. Hierarchical Text-Conditional Image Generation with CLIP Latents. 2022; arXiv:2204.06125v1.
4. Lin T, Wang Y, Liu X, Qiu X. A Survey of Transformers. AI Open. 2021;3:111–132.
5. Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL HLT 2019: 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference. 2018;1:4171–4186.
6. Polu S, Han JM, Zheng K, Baksys M, Babuschkin I, Sutskever I. Formal Mathematics Statement Curriculum Learning. International Conference on Learning Representations, 2023.
7. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale. 2020; International Conference on Learning Representations, 2021.
8. Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, et al. Language Models are Few-Shot Learners. Advances in Neural Information Processing Systems. 2020;2020-December.
9. Bahdanau D, Cho KH, Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate. 3rd International Conference on Learning Representations, ICLR 2015: Conference Track Proceedings. 2014.
10. Luong MT, Pham H, Manning CD. Effective Approaches to Attention-based Neural Machine Translation. Conference Proceedings, EMNLP 2015: Conference on Empirical Methods in Natural Language Processing. 2015; p. 1412–1421.
11. Katharopoulos A, Vyas A, Pappas N, Fleuret F. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention. 37th International Conference on Machine Learning, ICML 2020. 2020;PartF168147-7:5112–5121.
12. Choromanski K, Likhosherstov V, Dohan D, Song X, Gane A, Sarlos T, et al. Rethinking Attention with Performers. ICLR 2021: 9th International Conference on Learning Representations. 2020.
13. Schlag I, Irie K, Schmidhuber J. Linear Transformers Are Secretly Fast Weight Programmers. Proceedings of Machine Learning Research. 2021;139:9355–9366.
14. Peng H, Pappas N, Yogatama D, Schwartz R, Smith NA, Kong L. Random Feature Attention. ICLR 2021: 9th International Conference on Learning Representations. 2021.
15. Krotov D, Hopfield JJ. Dense Associative Memory for Pattern Recognition. Advances in Neural Information Processing Systems. 2016;29.
16. Demircigil M, Heusel J, Löwe M, Upgang S, Vermet F. On a Model of Associative Memory with Huge Storage Capacity. Journal of Statistical Physics. 2017;168:288–299.
17. Ramsauer H, Schäfl B, Lehner J, Seidl P, Widrich M, Adler T, et al. Hopfield Networks is All You Need. 2020; International Conference on Learning Representations; 2021.
18. Whittington JCR, Warren J, Behrens TEJ. Relating transformers to models and neural representations of the hippocampal formation. International Conference on Learning Representations; 2022.
19. Schmidhuber J. Learning to control fast-weight memories: An alternative to recurrent nets. Technical Report FKI-147-91, TU Munich. 1991.
20. Schmidhuber J. Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks. Neural Computation. 1992;4:131–139.
21. Schmidhuber J. Reducing the ratio between learning complexity and number of time varying variables in fully recurrent nets; International Conference on Artificial Neural Networks, 1993. p. 460–463.
22. Kozachkov L, Kastanenka KV, Krotov D. Building Transformers from Neurons and Astrocytes. Proceedings of the National Academy of Sciences of the United States of America. 2023;120(34):e2219150120. pmid:37579149
23. Irie K, Schmidhuber J. Learning to Control Rapidly Changing Synaptic Connections: An Alternative Type of Memory in Sequence Processing Artificial Neural Networks; NeurIPS 2022.
24. Caucheteux C, King JR. Brains and algorithms partially converge in natural language processing. Communications Biology 2022 5:1. 2022;5:1–10. pmid:35173264
25. Schrimpf M, Blank IA, Tuckute G, Kauf C, Hosseini EA, Kanwisher N, et al. The neural architecture of language: Integrative modeling converges on predictive processing. Proceedings of the National Academy of Sciences of the United States of America. 2021;118:e2105646118. pmid:34737231
26. Yao M, Gao H, Zhao G, Wang D, Lin Y, Yang Z, et al. Temporal-wise Attention Spiking Neural Networks for Event Streams Classification. Proceedings of the IEEE International Conference on Computer Vision. 2021; p. 10201–10210.
27. Li Y, Lei Y, Yang X. Spikeformer: A Novel Architecture for Training High-Performance Low-Latency Spiking Neural Network. 2022; arXiv:2211.10686.
28. Zhu RJ, Zhao Q, Eshraghian JK. SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks. 2023; arXiv:2302.13939.
29. Citri A, Malenka RC. Synaptic Plasticity: Multiple Forms, Functions, and Mechanisms. Neuropsychopharmacology 2008 33:1. 2007;33:18–41. pmid:17728696
30. Markram H, Gerstner W, Sjöström PJ. Spike-timing-dependent plasticity: A comprehensive overview. Frontiers in Synaptic Neuroscience. 2012;4:2. pmid:22807913
31. Nicoll RA. A Brief History of Long-Term Potentiation. Neuron. 2017;93:281–290. pmid:28103477
32. Abbott LF, Regehr WG. Synaptic computation. Nature 2004 431:7010. 2004;431:796–803. pmid:15483601
33. Xia Z, Storm DR. The role of calmodulin as a signal integrator for synaptic plasticity. Nature Reviews Neuroscience 2005 6:4. 2005;6:267–276. pmid:15803158
34. Pepke S, Kinzer-Ursem T, Mihalas S, Kennedy MB. A Dynamic Model of Interactions of Ca2+, Calmodulin, and Catalytic Subunits of Ca2+/Calmodulin-Dependent Protein Kinase II. PLOS Computational Biology. 2010;6:1000675. pmid:20168991
35. Kasthuri N, Hayworth KJ, Berger DR, Schalek RL, Conchello JA, Knowles-Barley S, et al. Saturated Reconstruction of a Volume of Neocortex. Cell. 2015;162:648–661. pmid:26232230
36. Hiratani N, Fukai T. Redundancy in synaptic connections enables neurons to learn optimally. Proceedings of the National Academy of Sciences of the United States of America. 2018;115:E6871–E6879. pmid:29967182
37. Gal E, London M, Globerson A, Ramaswamy S, Reimann MW, Muller E, et al. Rich cell-type-specific network topology in neocortical microcircuitry. Nature Neuroscience 2017 20:7. 2017;20:1004–1013. pmid:28581480
38. Carnevale NT, Hines ML. The NEURON book. The NEURON Book. 2006; p. 1–457.
39. Bahl A, Stemmler MB, Herz AVM, Roth A. Automated optimization of a reduced layer 5 pyramidal cell model based on experimental data. Journal of neuroscience methods. 2012;210:22–34. pmid:22524993
40. Hay E, Hill S, Schürmann F, Markram H, Segev I. Models of Neocortical Layer 5b Pyramidal Cells Capturing a Wide Range of Dendritic and Perisomatic Active Properties. PLOS Computational Biology. 2011;7:e1002107. pmid:21829333
41. Badoual M, Zou Q, Davison AP, Rudolph M, Bal T, Frégnac Y, et al. Biophysical and phenomenological models of multiple spike interactions in spike-timing dependent plasticity. International journal of neural systems. 2006;16:79–97. pmid:16688849
42. Kim Y, Hsu CL, Cembrowski MS, Mensh BD, Spruston N. Dendritic sodium spikes are required for long-term potentiation at distal synapses on hippocampal pyramidal neurons. eLife. 2015;4. pmid:26247712
43. Humphries R, Mellor JR, O’Donnell C. Acetylcholine Boosts Dendritic NMDA Spikes in a CA3 Pyramidal Neuron Model. Neuroscience. 2022;489:69–83. pmid:34780920
44. Bloss EB, Cembrowski MS, Karsh B, Colonell J, Fetter RD, Spruston N. Single excitatory axons form clustered synapses onto CA1 pyramidal cell dendrites. Nature Neuroscience 2018 21:3. 2018;21:353–363. pmid:29459763
45. Wilson DE, Whitney DE, Scholl B, Fitzpatrick D. Orientation selectivity and the functional clustering of synaptic inputs in primary visual cortex. Nature Neuroscience 2016 19:8. 2016;19:1003–1009. pmid:27294510
46. Ramaswamy S, Markram H. Anatomy and physiology of the thick-tufted layer 5 pyramidal neuron. Frontiers in Cellular Neuroscience. 2015;9:233. pmid:26167146
47. Micheli P, Ribeiro R, Giorgetti A. A Mechanistic Model of NMDA and AMPA Receptor-Mediated Synaptic Transmission in Individual Hippocampal CA3-CA1 Synapses: A Computational Multiscale Approach. International Journal of Molecular Sciences. 2021;22:1–24. pmid:33546429
48. Bliss TV, Lomo T. Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path. The Journal of physiology. 1973;232:331–56. pmid:4727084
49. Zucker RS, Regehr WG. Short-term synaptic plasticity. Annual review of physiology. 2002;64:355–405. pmid:11826273
50. Launay P, Fleig A, Perraud AL, Scharenberg AM, Penner R, Kinet JP. TRPM4 is a Ca2+-activated nonselective cation channel mediating cell membrane depolarization. Cell. 2002;109:397–407. pmid:12015988
51. Bos R, Drouillas B, Bouhadfane M, Pecchi E, Trouplin V, Korogod SM, et al. Trpm5 channels encode bistability of spinal motoneurons and ensure motor control of hindlimbs in mice. Nature Communications 2021 12:1. 2021;12:1–18. pmid:34819493
52. Lei YT, Thuault SJ, Launay P, Margolskee RF, Kandel ER, Siegelbaum SA. Differential contribution of TRPM4 and TRPM5 nonselective cation channels to the slow afterdepolarization in mouse prefrontal cortex neurons. Frontiers in Cellular Neuroscience. 2014;8:267. pmid:25237295
53. Ngo-Anh TJ, Bloodgood BL, Lin M, Sabatini BL, Maylie J, Adelman JP. SK channels and NMDA receptors form a Ca2+-mediated feedback loop in dendritic spines. Nature neuroscience. 2005;8:642–649. pmid:15852011
54. Larkum ME. Are Dendrites Conceptually Useful? Neuroscience. 2022;489:4–14. pmid:35288178
55. Luczak A, Barthó P, Marguet SL, Buzsáki G, Harris KD. Sequential structure of neocortical spontaneous activity in vivo. Proceedings of the National Academy of Sciences of the United States of America. 2007;104:347–52. pmid:17185420
56. Wilson DC. Up and down states. Scholarpedia journal. 2008;3:1410. pmid:20098625
57. Larkum M. A cellular mechanism for cortical associations: an organizing principle for the cerebral cortex. Trends in Neurosciences. 2013;36:141–151. pmid:23273272
58. Payeur A, Guerguiev J, Zenke F, Richards BA, Naud R. Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits. Nature neuroscience. 2021;24:1010–1019. pmid:33986551
59. Sacramento J, Costa RP, Bengio Y, Senn W. Dendritic cortical microcircuits approximate the backpropagation algorithm. Advances in Neural Information Processing Systems. 2018;31.
60. Cichon J, Gan WB. Branch-specific dendritic Ca2+ spikes cause persistent synaptic plasticity. Nature. 2015;520:180. pmid:25822789
61. Larkum ME, Kaiser KMM, Sakmann B. Calcium electrogenesis in distal apical dendrites of layer 5 pyramidal cells at a critical frequency of back-propagating action potentials. Proceedings of the National Academy of Sciences of the United States of America. 1999;96:14600–14604. pmid:10588751
62. Morishima M, Kawaguchi Y. Recurrent connection patterns of corticostriatal pyramidal cells in frontal cortex. The Journal of neuroscience: the official journal of the Society for Neuroscience. 2006;26:4394–405. pmid:16624959
63. Hattox AM, Nelson SB. Layer V neurons in mouse cortex projecting to different targets have distinct physiological properties. Journal of Neurophysiology. 2007;98:3330–3340. pmid:17898147
64. Larkum ME, Waters J, Sakmann B, Helmchen F. Dendritic Spikes in Apical Dendrites of Neocortical Layer 2/3 Pyramidal Neurons. The Journal of Neuroscience. 2007;27:8999. pmid:17715337