This paper proposes a multiband, multiagent, reinforcement-learning-based distributed sensing policy for cognitive radio networks. Under the proposed policy, the secondary users (SUs) collaborate with neighboring users by exchanging information locally. The objective is to maximize the amount of free spectrum found for secondary use while guaranteeing a given probability of detection. The SUs exploit spatial diversity through collaborative sensing to control the false alarm rate and thus the probability of finding available spectrum opportunities. Each SU in the cognitive radio network makes local decisions based on its own and its neighbors' local test statistics to identify spectrum that is unused in its vicinity. Simulation results show that the proposed sensing policy provides a straightforward way to obtain a good tradeoff between sensing more spectrum and the reliability of the sensing results.
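To illustrate why collaboration helps control the false alarm rate when the probability of detection is held fixed, the following minimal sketch may be useful. It is not the policy proposed here and contains no reinforcement learning. It assumes, purely for illustration, that each SU uses energy detection over `n_samples` real-valued samples, that all collaborating SUs see the same SNR, that the fused statistic is the plain sum of the local test statistics, and that the Gaussian (central limit) approximation of that sum holds. All function names and parameter values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the Gaussian approximation


def fused_threshold(k, n_samples, snr, p_detect):
    """Threshold on the summed statistic of k SUs that meets the target
    detection probability (assumed model: energy detection, equal SNR)."""
    mean_h1 = k * n_samples * (1.0 + snr)              # mean when the band is occupied
    std_h1 = sqrt(2.0 * k * n_samples) * (1.0 + snr)   # std. dev. when occupied
    # P(statistic > threshold | occupied) = p_detect
    return mean_h1 + std_h1 * _STD_NORMAL.inv_cdf(1.0 - p_detect)


def false_alarm_rate(k, n_samples, snr, p_detect):
    """False alarm probability that results from the threshold above."""
    threshold = fused_threshold(k, n_samples, snr, p_detect)
    mean_h0 = k * n_samples                 # mean when the band is free (noise only)
    std_h0 = sqrt(2.0 * k * n_samples)      # std. dev. when the band is free
    return 1.0 - _STD_NORMAL.cdf((threshold - mean_h0) / std_h0)


def band_is_free(own_statistic, neighbor_statistics, n_samples, snr, p_detect):
    """Local decision of one SU from its own and its neighbors' statistics."""
    k = 1 + len(neighbor_statistics)
    fused = own_statistic + sum(neighbor_statistics)
    return fused < fused_threshold(k, n_samples, snr, p_detect)


if __name__ == "__main__":
    # Hypothetical setting: 100 samples per SU, SNR of 0.1 (-10 dB), target Pd of 0.9.
    for k in (1, 4, 16):
        print(k, round(false_alarm_rate(k, 100, 0.1, 0.9), 3))
```

Under these assumptions the false alarm rate falls as the number of collaborating SUs grows, which is the mechanism behind the tradeoff: SUs assigned to the same band make more reliable decisions, whereas SUs spread over different bands cover more spectrum.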