User profile release in recommendation systems can use profile perturbation to protect user privacy: each user sends a perturbed profile, such as a list of clicked items, to a server in order to receive a recommendation service. The perturbation policy, such as the privacy budget, determines both the recommendation quality and the privacy level, but optimizing it usually requires knowledge of the attack model, which users rarely have. In this paper, we propose a reinforcement learning based user profile perturbation scheme that applies differential privacy to protect user privacy in recommendation systems. The scheme uses reinforcement learning to choose the privacy budget for perturbing the released user profile based on the features of the actual and released user profiles and the estimated user privacy level. This scheme enables a user to optimize his or her perturbation policy in terms of both the user privacy level and the received recommendation quality without being aware of the attack model. We evaluate the computational complexity of this scheme and analyze a case study of a privacy-aware movie recommendation system. Simulation results show that this scheme improves user privacy protection for a given level of recommendation quality compared with a benchmark profile perturbation scheme.
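
To make the role of the privacy budget concrete, the following is a minimal sketch of perturbing a binary click profile under differential privacy. It assumes per-item randomized response as the mechanism, which the abstract does not specify. The names `keep_probability` and `perturb_profile` are hypothetical and chosen only for illustration.

```python
import math
import random


def keep_probability(privacy_budget: float) -> float:
    """Probability of reporting a bit truthfully under randomized response.

    Equals e^eps / (1 + e^eps), written in a form that avoids overflow
    for large non-negative budgets.
    """
    return 1.0 / (1.0 + math.exp(-privacy_budget))


def perturb_profile(profile, privacy_budget, rng=None):
    """Return a perturbed copy of a binary click profile.

    Assumption (not taken from the paper): each entry is 1 if the item was
    clicked and 0 otherwise, and each entry is flipped independently with
    probability 1 - keep_probability(privacy_budget).
    """
    rng = rng or random.Random()
    p_keep = keep_probability(privacy_budget)
    return [bit if rng.random() < p_keep else 1 - bit for bit in profile]


if __name__ == "__main__":
    actual = [1, 0, 0, 1, 1, 0, 0, 0]
    # A small budget flips many entries; a large budget flips few.
    for budget in (0.1, 1.0, 5.0):
        print(budget, perturb_profile(actual, budget, random.Random(0)))
```

In this sketch a smaller budget flips more entries, which strengthens privacy but is likely to degrade the recommendations computed from the released profile. In the proposed scheme, a reinforcement learning agent would select the budget, so the fixed values in the loop above are placeholders only.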