In general, reliable communication via multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) requires accurate channel estimation at the receiver. The existing literature largely focuses on denoising methods for channel estimation that depend on either (i) channel analysis in the time-domain with prior channel knowledge or (ii) supervised learning techniques which require large prelabeled datasets for training. To address these limitations, we present a frequency-domain denoising method based on a reinforcement learning framework that does not need a priori channel knowledge and pre-labeled data. Our methodology includes a new successive channel denoising process based on channel curvature computation, for which we obtain a channel curvature magnitude threshold to identify unreliable channel estimates. Based on this process, we formulate the denoising mechanism as a Markov decision process, where we define the actions through a geometry-based channel estimation update, and the reward function based on a policy that reduces mean squared error (MSE). We then resort to Q-learning to update the channel estimates. Numerical results verify that our denoising algorithm can successfully mitigate noise in channel estimates. In particular, our algorithm provides a significant improvement over the practical least squares (LS) estimation method and provides performance that approaches that of the ideal linear minimum mean square error (LMMSE) estimation with perfect knowledge of channel statistics.