The use of one-bit analog-to-digital converters (ADCs) at the receiver is a power-efficient solution for future wireless systems. This paper presents a likelihood-function learning method that enables robust maximum a posteriori (MAP) detection for time-varying multiple-input multiple-output (MIMO) systems with one-bit ADCs. The key idea is to track the temporal variation of the likelihood functions by exploiting input-output samples obtained from data detection, each of which carries information about the likelihood function at its time slot. To cope with the uncertainty in this information caused by data detection errors, a Markov decision process (MDP) is formulated whose objective is to maximize the accuracy of the likelihood functions learned from the samples. A reinforcement learning algorithm is then developed to solve this MDP in a computationally efficient manner. Simulation results demonstrate that the proposed method significantly improves the robustness of MAP detection to both channel estimation errors and temporal channel variations.
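To make the underlying idea concrete, the following is a minimal toy sketch, not the paper's algorithm: it omits the MDP formulation and reinforcement learning entirely. It only illustrates the core mechanism the abstract describes, namely tracking empirical likelihoods p(y | x) of a one-bit-quantized output from decision-directed input-output samples, with exponential forgetting to follow a time-varying channel. All parameter values (the scalar BPSK channel, pilot length, forgetting factor) are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (hypothetical, scalar instead of MIMO): BPSK symbol x through a
# slowly time-varying channel h, observed via a one-bit ADC: y = sign(h*x + n).
symbols = np.array([-1.0, 1.0])   # candidate transmit symbols
counts = np.ones((2, 2))          # counts[i, j]: x = symbols[i], y in (-1, +1); Laplace prior
gamma = 0.995                     # forgetting factor, tracks channel variation

def p_y_given_x(i, y):
    """Empirical likelihood estimate p(y | x = symbols[i]) from the counts."""
    return counts[i, int(y > 0)] / counts[i].sum()

h, errors, n_data = 1.0, 0, 0
for t in range(5000):
    h = 1.0 + 0.3 * np.sin(2 * np.pi * t / 2000)           # slow channel drift
    i_true = rng.integers(2)
    y = np.sign(h * symbols[i_true] + 0.5 * rng.standard_normal())
    if t < 100:
        i_hat = i_true                                     # pilot phase: label known
    else:
        # Data phase: detect by maximum likelihood, then reuse the (possibly
        # wrong) decision as a training label -- the label uncertainty the
        # paper's MDP formulation is designed to handle.
        i_hat = int(np.argmax([p_y_given_x(i, y) for i in range(2)]))
        errors += int(i_hat != i_true)
        n_data += 1
    counts *= gamma                                        # forget stale statistics
    counts[i_hat, int(y > 0)] += 1.0                       # decision-directed update

print("symbol error rate:", errors / n_data)
```

In this sketch every detected label is trusted equally; the paper's contribution is precisely to weigh such samples by their reliability, which the plain decision-directed update above cannot do.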