This paper proposes a 5-component detection pipeline for use in a computer vision-based animal recognition system. The end result of the pipeline is a collection of novel annotations of interest (AoIs) with species and viewpoint labels. These AoIs could, for example, be fed as focused input into an appearance-based animal identification system. The goal of our method is to increase the reliability and automation of animal censusing studies and to provide better ecological information to conservationists. Our method achieves a localization mAP of 81.67%, species and viewpoint classification accuracies of 94.28% and 87.11%, respectively, and an AoI accuracy of 72.75% across 6 animal species of interest. We also introduce the Wildlife Image and Localization Dataset (WILD), which contains 5,784 images and 12,007 labeled annotations across 28 species and a variety of challenging, real-world detection scenarios.