How vocal turn-taking evolved, how it develops within individuals, and how it is mediated by neural circuits are open questions. We constructed a computational model of marmoset monkey vocal production to make predictions about the mechanisms underlying vocal exchanges and their development. The model is based on the interactions among three neural structures ('drive', 'motor', and 'auditory') with feedback connectivity inspired by published physiological and anatomical data. We fitted our model to the temporal dynamics of spontaneous vocalizations produced by isolated marmosets. We then tested the model for its ability to predict the structure of vocal exchanges between two marmosets. Our results demonstrate that the interaction between two of these three-node models results in turn-taking behavior that is nearly identical to that seen in natural adult marmoset vocal exchanges. We simulated the development of the auditory node by gradually strengthening its connection with the motor node. This generated the prediction that, during turn-taking development, the maturation of this connection will lead to decreases in overlapping vocalizations and increases in alternating exchanges.