When memory references are mostly non-sequential, memory access becomes the bottleneck of SIMD processor performance. In this paper, we present an approach to generating optimal H.264 motion estimation code for a SIMD platform, with the objective of minimizing memory access overhead. Specifically, we formulate code generation as a constrained optimization problem: minimize the memory access overhead subject to the data dependencies of the algorithm. The target platform is based on PLX, a native SIMD processor architecture developed at Princeton University. We illustrate the approach by generating PLX code for the H.264 variable block-size motion estimation algorithm, and we show that the optimization yields a significant performance improvement.