The Cys2His2 zinc finger (ZF) is the most frequently found sequence-specific DNA-binding domain in eukaryotic proteins. The ZF's modular protein-DNA interface has also served as a platform for genome engineering applications. Despite decades of intense study, a predictive understanding of the DNA-binding specificities of either natural or engineered ZF domains remains elusive. To help fill this gap, we developed an integrated experimental-computational approach to enrich and recover distinct groups of ZFs that bind common targets. To showcase the approach, we built several large ZF libraries and demonstrated their high diversity. As proof of principle, we used one of these libraries to select and recover thousands of ZFs that bind several 3-nt targets of interest. We then computationally clustered the recovered ZFs, revealing several distinct classes of proteins that bind the same target, all recovered from a single selection. Finally, for each target studied, we confirmed that one or more representative ZFs yield the desired specificity. In sum, the described approach enables comprehensive large-scale selection and characterization of ZF specificities and should aid in furthering our understanding of the ZF domain.