This paper addresses the power allocation problem of achieving the upper bound of the sum-rate region in energy harvesting downlink channels. We prove that the optimal power allocation policy that maximizes the sum-rate is an increasing function of the harvested energy, the channel gains, and the remaining battery, regardless of the number of users in the downlink channel. We use this proof as the mathematical basis for constructing a shallow neural network that fully reflects the increasing property of the optimal policy. This scheme lets us avoid large neural networks, which require huge computational resources and are prone to overfitting. Through experiments, we reveal the inefficiencies and risks of deep neural networks that are not sufficiently tailored to the desired policy, and show that our approach learns a robust policy even under severe randomness in the environment.
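As a rough illustration of how a shallow network can be built to be increasing in its inputs, the following minimal sketch keeps all weights positive and uses an increasing activation. It is not the architecture proposed in this paper: the class name `MonotoneShallowNet`, the softplus reparameterization, the tanh activation, and the input layout (harvested energy, channel gains, remaining battery) are all assumptions made for the example.

```python
import numpy as np


def softplus(x):
    # Numerically stable softplus, log(1 + exp(x)); the output is always positive.
    return np.logaddexp(0.0, x)


class MonotoneShallowNet:
    """Hypothetical one-hidden-layer network, non-decreasing in every input."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # Unconstrained parameters; the effective weights are softplus(v) > 0.
        self.v1 = rng.normal(size=(n_inputs, n_hidden))
        self.b1 = rng.normal(size=n_hidden)
        self.v2 = rng.normal(size=n_hidden)
        self.b2 = 0.0

    def __call__(self, x):
        w1 = softplus(self.v1)  # positive input-to-hidden weights
        w2 = softplus(self.v2)  # positive hidden-to-output weights
        # tanh is increasing, so each hidden unit is non-decreasing in x.
        hidden = np.tanh(x @ w1 + self.b1)
        # A positive combination of non-decreasing units is non-decreasing.
        return hidden @ w2 + self.b2


# Assumed input layout: [harvested energy, gain of user 1, gain of user 2, battery].
net = MonotoneShallowNet(n_inputs=4, n_hidden=8)
low = np.array([0.2, 0.5, 0.3, 1.0])
high = low + np.array([0.1, 0.0, 0.2, 0.5])  # every input at least as large
print(net(low) <= net(high))
```

Because positive weights and an increasing activation guarantee monotonicity for any parameter values, the property holds by construction throughout training and does not have to be learned from data.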