We describe new ways to simulate 2-party communication protocols to get protocols with potentially smaller communication. We show that every communication protocol that communicates C bits and reveals I bits of information about the inputs to the participating parties can be simulated by a new protocol involving at most Õ(√CI) bits of communication. If the protocol reveals I bits of information about the inputs to an observer that watches the communication in the protocol, we show how to carry out the simulation with Õ(I) bits of communication. These results lead to a direct sum theorem for randomized communication complexity. Ignoring polylogarithmic factors, we show that for worst case computation, computing n copies of a function requires √n times the communication required for computing one copy of the function. For average case complexity, given any distribution μ on inputs, computing n copies of the function on n inputs sampled independently according to μ requires √n times the communication for computing one copy. If μ is a product distribution, computing n copies on n independent inputs sampled according to μ requires n times the communication required for computing the function. We also study the complexity of computing the sum (or parity) of n evaluations of f, and obtain results analogous to those above.