This paper studies the convergence time of federated learning (FL) deployed over a realistic wireless network. In the considered model, wireless users transmit their local FL models (trained on their locally collected data) to a base station (BS). The BS, acting as a central controller, generates a global FL model from the received local FL models and broadcasts it back to all users. Because the number of resource blocks (RBs) in a wireless network is limited, only a subset of users can be selected to transmit their local FL model parameters to the BS at each learning step. Moreover, since each user has unique training data samples and the BS must wait to receive all users' local FL models before generating the global FL model, the user selection scheme significantly affects both the FL performance and the convergence time. Consequently, it is necessary to design a user selection scheme that enables all users to execute the FL scheme and train it efficiently. This joint learning, wireless resource allocation, and user selection problem is formulated as an optimization problem whose goal is to minimize the FL convergence time while optimizing the FL performance. To address this problem, a probabilistic user selection scheme is proposed in which the BS connects, with high probability, to the users whose local FL models have large effects on the global FL model. Given the user selection policy, the uplink RB allocation can be determined. To further reduce the FL convergence time, artificial neural networks (ANNs) are used to estimate the local FL models of the users that are not allocated any RBs for local FL model transmission. This enables the BS to include more users' local FL models when generating the global FL model, improving both the FL convergence speed and performance.
Simulation results show that the proposed ANN-based FL scheme can reduce the FL convergence time by up to 53.8% compared to a standard FL algorithm.
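
To make the idea of probabilistic user selection concrete, the following is a minimal sketch and not the scheme derived in the paper. It assumes, purely for illustration, that a user's "effect" on the global FL model is scored by the Euclidean distance between its local parameters and the global parameters, and that users are sampled without replacement, one per available RB, with probability proportional to that score. The names `model_effect` and `select_users` are hypothetical.

```python
import math
import random


def model_effect(local_model, global_model):
    """Assumed effect score: Euclidean distance between the local and
    global parameter vectors (an illustrative choice only)."""
    return math.sqrt(sum((w - g) ** 2 for w, g in zip(local_model, global_model)))


def select_users(effects, num_rbs, rng=None):
    """Sample up to num_rbs distinct user indices, each draw weighted by effect.

    In this sketch, users with zero effect are never selected.
    """
    rng = rng or random.Random()
    remaining = [i for i, e in enumerate(effects) if e > 0]
    selected = []
    while remaining and len(selected) < num_rbs:
        weights = [effects[i] for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        selected.append(pick)
        remaining.remove(pick)  # sample without replacement
    return selected


# Example with five users and three RBs (all values are made up).
global_model = [0.0, 0.0]
local_models = [[0.1, 0.0], [2.0, 0.0], [0.0, 0.0], [0.0, 1.5], [0.4, 0.0]]
effects = [model_effect(w, global_model) for w in local_models]
chosen = select_users(effects, num_rbs=3, rng=random.Random(0))
```

Under these assumptions, users whose local models differ most from the global model are the most likely to be given an RB, while every user with a nonzero score retains some chance of being selected.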