As chip multiprocessors (CMPs) are increasingly used in embedded computing, optimizing data locality in the presence of interprocessor interactions is becoming critical. To address this problem, this paper proposes a new abstraction called the interprocessor data reuse vector, which captures the reuse distance (in terms of loop iterations) between successive accesses to a given data element from different processors. Based on this reuse vector, we then present a data locality optimization scheme. A unique characteristic of this scheme is that it allows different transformations to be applied to different processors of the CMP when doing so improves the locality of data shared across processors. We automated our approach within an optimizing compiler and collected statistics using eight application codes. Our results indicate that the proposed code restructuring is very effective in practice, yielding about a 9% performance improvement over a standard data locality optimizer.
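
To make the notion of an interprocessor data reuse vector concrete, the following minimal sketch enumerates a small loop nest by brute force and records the iteration-distance vectors between accesses to the same array element made by two different processors. It is an illustration only, not the compiler analysis described in this paper: the function name `interprocessor_reuse_vectors`, the two array references, and the 4x4 iteration space are all hypothetical, and both processors are assumed to traverse the same iteration space.

```python
from itertools import product


def interprocessor_reuse_vectors(ref_a, ref_b, bounds):
    """Return the set of iteration-distance vectors (iteration on B minus
    iteration on A) for which processors A and B touch the same element.

    ref_a, ref_b: hypothetical functions mapping an iteration vector to the
                  array element accessed by each processor.
    bounds:       loop trip counts, one per loop level (assumed identical
                  for both processors).
    """
    iteration_space = list(product(*(range(n) for n in bounds)))

    # Index every element touched by processor A by the iterations touching it.
    touched_by_a = {}
    for it_a in iteration_space:
        touched_by_a.setdefault(ref_a(*it_a), []).append(it_a)

    # For each access on processor B, find matching accesses on processor A.
    vectors = set()
    for it_b in iteration_space:
        for it_a in touched_by_a.get(ref_b(*it_b), []):
            vectors.add(tuple(b - a for a, b in zip(it_a, it_b)))
    return vectors


# Hypothetical references: processor 0 accesses A[i][j],
# processor 1 accesses A[i-1][j+2].
ref_p0 = lambda i, j: (i, j)
ref_p1 = lambda i, j: (i - 1, j + 2)

vectors = interprocessor_reuse_vectors(ref_p0, ref_p1, (4, 4))
print(vectors)  # {(1, -2)}
```

In this toy setting, an element touched by processor 0 at iteration (i, j) is touched by processor 1 at iteration (i+1, j-2), so the single reuse vector is (1, -2). A locality optimizer could, under these assumptions, choose per-processor loop transformations that shorten such distances for shared data.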