We consider the problem of minimizing the convergence time of federated learning (FL) implemented over wireless networks. In such a setup, each wireless edge device transmits its local FL model parameters to a base station (BS). The BS then uses the received parameters to generate a common FL model and broadcasts it to all edge devices. Because the FL parameters must be transmitted over wireless links, the convergence time depends not only on the number of training steps but also on the parameter transmission delay at each training step, which can be substantial when a large number of parameters is conveyed. In addition, because wireless resources such as spectrum are limited, only a subset of edge devices can participate in each FL training step, which can further increase the convergence time. Our goal is therefore to optimize wireless resource management and device selection for FL, and to limit the volume of transmitted FL parameters. In this paper, three schemes for communication-efficient FL are introduced. First, a probabilistic device selection scheme is designed so that the devices that can significantly improve the convergence speed and training loss have high probabilities of transmitting their FL parameters. Second, given the subset of participating devices, an efficient wireless resource allocation scheme is developed. Third, a quantization method is proposed to reduce the size of the transmitted data. Simulation results demonstrate that, compared to conventional FL, the proposed FL method can improve handwritten digit identification accuracy by up to 3% and reduce convergence time by up to 90%.
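
To make the training loop described above concrete, the following is a minimal sketch of one FL round with probabilistic device selection, parameter quantization, and aggregation at the BS. It is an illustration under stated assumptions, not the method proposed in the paper: the selection probabilities are taken as given inputs, the quantizer is a generic stochastic uniform quantizer, the aggregation is a sample-count-weighted average, and wireless resource allocation is omitted entirely. All function and parameter names (`select_devices`, `quantize`, `aggregate`, `probs`, `bits`) are hypothetical.

```python
import random


def select_devices(probs, k, rng):
    """Sample k distinct device indices; higher-probability devices are favored.

    Assumption: `probs` holds positive per-device selection weights.
    How these weights are derived is not specified here.
    """
    candidates = list(range(len(probs)))
    weights = list(probs)
    chosen = []
    for _ in range(min(k, len(candidates))):
        pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(pick))
        weights.pop(pick)
    return chosen


def quantize(params, bits, rng):
    """Stochastic uniform quantization onto 2**bits levels over [min, max].

    Assumption: a generic quantizer standing in for the paper's scheme.
    """
    lo, hi = min(params), max(params)
    if hi == lo:
        return list(params)
    levels = 2 ** bits - 1
    step = (hi - lo) / levels
    out = []
    for p in params:
        pos = (p - lo) / step
        floor = int(pos)
        # Round up with probability equal to the fractional part,
        # so the quantized value is unbiased in expectation.
        idx = floor + (1 if rng.random() < pos - floor else 0)
        out.append(lo + min(idx, levels) * step)
    return out


def aggregate(models, sample_counts):
    """Average the received models, weighted by local sample counts."""
    total = sum(sample_counts)
    dim = len(models[0])
    return [
        sum(n * m[j] for m, n in zip(models, sample_counts)) / total
        for j in range(dim)
    ]
```

In one round, the BS would call `select_devices` to pick the participants, each selected device would apply `quantize` to its local parameters before uplink transmission, and the BS would pass the received vectors to `aggregate` and broadcast the result. Fewer bits per parameter shorten the per-step transmission delay at the cost of added quantization noise.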