An important correctness issue for emerging multi-core and many-core shared-memory systems is ensuring that inter-processor communication through shared memory conforms to the memory ordering rules specified by the architecture's memory consistency model. This presents a significant validation challenge. Growing system complexity makes it increasingly hard to identify all deep-state logic bugs during pre-silicon verification. Further, aggressive technology scaling makes hardware more vulnerable to dynamic errors that can only be detected at runtime. In this paper, we propose an approach for runtime validation of memory ordering. This allows us to survive bugs that escape pre-silicon verification, as well as to deal with emerging dynamic errors. Our solution consists of two parts: 1) at the microarchitecture level, we add efficient hardware support to capture the observed ordering among shared-memory operations; 2) we perform online verification of the observed memory ordering by checking for cycles in the constraint graph [11, 12]. Together, these two parts provide end-to-end correctness validation of the system execution with respect to the memory ordering specification. Several challenges must be addressed to make this approach practical. We describe these challenges, along with optimization techniques for reducing the hardware overhead. Estimates obtained from preliminary chip multiprocessor simulation experiments indicate that the proposed techniques achieve acceptable hardware overhead with minimal performance impact.
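To make the cycle check in part 2 concrete, the following is a minimal software sketch, not the hardware mechanism proposed here. It assumes a sequentially consistent model, represents each memory operation as a graph node, and takes the observed ordering constraints as a list of directed edges. The function name `has_cycle`, the node labels, and the two example edge lists are hypothetical and chosen only for illustration; the example also checks a complete graph in one batch rather than online.

```python
from collections import defaultdict


def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        nodes.update((src, dst))

    # Three-color depth-first search: GRAY marks nodes on the current DFS path.
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in nodes}

    for root in nodes:
        if color[root] != WHITE:
            continue
        color[root] = GRAY
        stack = [(root, iter(graph[root]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if color[nxt] == GRAY:
                    return True  # back edge: the ordering constraints conflict
                if color[nxt] == WHITE:
                    color[nxt] = GRAY
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All successors explored; node leaves the current path.
                color[node] = BLACK
                stack.pop()
    return False


# Hypothetical two-processor test: P0 runs "St x; Ld y", P1 runs "St y; Ld x".
# Both loads return the initial value, so each load is ordered before the
# other processor's store. Combined with program order, this forms a cycle.
VIOLATING = [
    ("P0:St x", "P0:Ld y"),  # program order on P0
    ("P1:St y", "P1:Ld x"),  # program order on P1
    ("P0:Ld y", "P1:St y"),  # P0's load observed the value before P1's store
    ("P1:Ld x", "P0:St x"),  # P1's load observed the value before P0's store
]

# Same program, but P1's load observes P0's store: no cycle.
LEGAL = [
    ("P0:St x", "P0:Ld y"),
    ("P1:St y", "P1:Ld x"),
    ("P0:Ld y", "P1:St y"),
    ("P0:St x", "P1:Ld x"),  # P1's load reads the value written by P0's store
]
```

Under these assumptions, `has_cycle(VIOLATING)` reports a violation, because no single total order of the four operations satisfies all four constraints, while `has_cycle(LEGAL)` does not. Which edge types belong in the graph depends on the consistency model being checked; the edges above are only one plausible choice for sequential consistency.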