This paper proposes a multiple-source, multiple-sample fusion approach to identity verification. Fusion is performed at two levels: intramodal and intermodal. In intramodal fusion, the scores of multiple samples (e.g., utterances or video shots) obtained from the same modality are linearly combined, with combination weights that depend on the difference between each score value and a user-dependent reference score obtained during enrollment. This is followed by intermodal fusion, in which the mean intramodal fused scores obtained from the different modalities are fused. The final fused score is then used for decision making. This two-level fusion approach was applied to audio-visual biometric authentication, and experimental results on the XM2VTSDB corpus show that the proposed fusion approach can achieve an error rate reduction of up to 83%.
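The two-level scheme can be sketched as follows. This is a minimal illustration, not the paper's implementation: the exact form of the score-dependent weights is not specified in the abstract, so the exponential weighting below (with a hypothetical `alpha` parameter) is an assumption chosen only to show weights decreasing with distance from the user-dependent reference score.

```python
import numpy as np

def intramodal_fuse(scores, ref_score, alpha=1.0):
    """Level 1: linearly combine per-sample scores from one modality.

    Hypothetical weighting (assumption): each sample's weight decays
    with the absolute difference between its score and the
    user-dependent reference score from enrollment.
    """
    scores = np.asarray(scores, dtype=float)
    w = np.exp(-alpha * np.abs(scores - ref_score))
    w = w / w.sum()                  # normalized combination weights
    return float(np.dot(w, scores))  # linear combination of scores

def intermodal_fuse(modal_scores):
    """Level 2: fuse the intramodal fused scores across modalities
    by taking their mean."""
    return float(np.mean(modal_scores))

# Example: fuse audio utterance scores and video shot scores,
# then combine across the two modalities and threshold.
audio = intramodal_fuse([0.8, 1.1, 0.9], ref_score=1.0)
video = intramodal_fuse([1.4, 1.2], ref_score=1.3)
final = intermodal_fuse([audio, video])
accept = final > 1.0  # decision threshold is application-dependent
```

Samples whose scores lie far from the claimed user's reference score receive less weight, so an outlier sample (e.g., a noisy utterance) contributes less to the final decision than in plain averaging.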