We present a scalable approach to optimizing robot control policies for a target collective behavior in a spatially inhomogeneous robotic swarm. The approach can incorporate robot feedback to maintain system performance in an unknown environmental flow field. We consider systems in which robot motion has both deterministic and random components and in which robots transition stochastically between tasks. Our methodology abstracts the swarm to a macroscopic continuous model that describes the expected time evolution of swarm subpopulations over a discretization of the environment; the dimensionality of this model is independent of the population size. We incorporate this model into a stochastic optimization method and map the optimized model parameters onto the robot motion and task transition control policies to achieve a desired global objective. We illustrate our methodology with a scenario in which the behaviors of a swarm of robotic bees are optimized for both uniform and nonuniform pollination of a blueberry field, including in the presence of an unknown wind.
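
As a rough illustration of what a macroscopic model of this kind can look like, the sketch below evolves the expected fraction of robots in each cell of a discretized environment, split by task. Everything specific in it is an assumption made for illustration and is not taken from the paper: the one-dimensional grid, the two task states (`fly` and `hov`), the constant drift `v` and diffusion coefficient `D`, the per-cell switching rates `k_on` and `k_off`, and the explicit Euler time stepping. The state has two numbers per cell, so its size depends on the grid and not on how many robots are in the swarm.

```python
def step(fly, hov, v, D, k_on, k_off, dx, dt):
    """One explicit Euler step of the expected subpopulation fractions.

    fly[i], hov[i]: expected fraction of the swarm flying / hovering in cell i.
    v, D: assumed drift speed and diffusion coefficient of flying robots.
    k_on[i], k_off: assumed task transition rates (fly -> hover, hover -> fly).
    """
    n = len(fly)
    # flux[i] is the flow of flying robots across the interface between
    # cell i-1 and cell i (upwind drift plus diffusion). The outer
    # interfaces stay at zero, so no robots leave the domain.
    flux = [0.0] * (n + 1)
    for i in range(1, n):
        upwind = fly[i - 1] if v >= 0 else fly[i]
        flux[i] = v * upwind - D * (fly[i] - fly[i - 1]) / dx
    new_fly, new_hov = [], []
    for i in range(n):
        transport = -(flux[i + 1] - flux[i]) / dx
        switch = k_on[i] * fly[i] - k_off * hov[i]  # net fly -> hover rate
        new_fly.append(fly[i] + dt * (transport - switch))
        new_hov.append(hov[i] + dt * switch)
    return new_fly, new_hov


def simulate(n_cells=10, steps=500, v=0.5, D=0.1, k_off=0.05, dx=1.0, dt=0.1):
    """Run the model with the whole swarm initially flying in cell 0."""
    fly = [1.0] + [0.0] * (n_cells - 1)
    hov = [0.0] * n_cells
    # Hypothetical spatially varying rate: robots switch to hovering
    # only in the right half of the domain.
    k_on = [0.3 if i >= n_cells // 2 else 0.0 for i in range(n_cells)]
    for _ in range(steps):
        fly, hov = step(fly, hov, v, D, k_on, k_off, dx, dt)
    return fly, hov
```

Calling `simulate()` returns two lists of length `n_cells`. In an optimization loop of the kind described above, parameters such as `k_on` would be the quantities being tuned, with the resulting `hov` profile compared against a target distribution; the optimizer itself is not shown here.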