A good benchmark suite should provide users with inputs at multiple levels of fidelity for different use cases, such as running on real machines, register-level simulations, or gate-level simulations. Although input reduction has been explored in the past, little is understood about how to systematically scale the input sets of a benchmark suite. This paper presents a framework built on the novel view that benchmark inputs should be considered approximations of their original, full-sized inputs. It formulates input selection for a benchmark as an optimization problem that maximizes the accuracy of the benchmark subject to a time constraint. The paper demonstrates how to use the proposed methodology to create several simulation input sets for the PARSEC benchmarks and how to quantify and measure their approximation error. It also shows which parts of the inputs are most likely to distort their original characteristics. Finally, it provides guidelines for users to create their own customized input sets.
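The input selection problem described above can be sketched as a constrained optimization. The symbols here are illustrative assumptions, not the paper's exact notation: $I$ is a candidate reduced input, $I_{\text{full}}$ the original full-sized input, $A(I)$ a measure of how accurately $I$ preserves the benchmark's characteristics, $T(I)$ its execution (or simulation) time, and $T_{\max}$ the user's time budget:

```latex
% Illustrative formulation (assumed notation, not the paper's):
% choose a reduced input that maximizes accuracy within a time budget.
\begin{aligned}
  \max_{I \subseteq I_{\text{full}}} \quad & A(I) \\
  \text{subject to} \quad & T(I) \le T_{\max}
\end{aligned}
```

Under this view, each fidelity level (real machine, register-level simulation, gate-level simulation) simply corresponds to a different budget $T_{\max}$, yielding a family of input sets from the same framework.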