We propose a model of songbird learning that focuses on avian brain areas HVC and RA, involved in song production, and area LMAN, important for generating song variability. Plasticity at HVC → RA synapses is driven by hypothetical "rules" depending on three signals: activation of HVC → RA synapses, activation of LMAN → RA synapses, and reinforcement from an internal critic that compares the bird's own song with a memorized template of an adult tutor's song. Fluctuating glutamatergic input to RA from LMAN generates behavioral variability for trial-and-error learning. The plasticity rules perform gradient-based reinforcement learning in a spiking neural network model of song production. Although the reinforcement signal is delayed, temporally imprecise, and binarized, the model learns in a reasonable amount of time in numerical simulations. Varying the number of neurons in HVC and RA has little effect on learning time. The model makes specific predictions for the induction of bidirectional long-term plasticity at HVC → RA synapses.
All Science Journal Classification (ASJC) codes