Targeting network-on-chip (NoC) based manycores, we propose a novel compiler framework that optimizes the network latency experienced by off-chip data accesses on their way to the target memory controllers. Our framework consists of two main components: data access placement and computation placement. In data access placement, we separate the data access nodes from the computation nodes, with the goal of minimizing the number of links traversed by request messages. In computation placement, we introduce computation decomposition and select appropriate computation nodes, both to reduce the amount of data sent in response messages and to minimize the number of communication links traversed. Our experimental evaluation shows an average execution time improvement of 21.1%, along with a 67.3% reduction in network latency.
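
To make the link-count objective concrete, the following is a minimal sketch and not the framework itself. It assumes a 2D mesh with dimension-ordered (XY) routing, so that the number of links a message traverses equals the Manhattan distance between source and destination; the mesh topology, the routing policy, and the names `hop_count` and `pick_access_node` are all illustrative assumptions that do not come from the text above.

```python
from typing import Iterable, Tuple

# Hypothetical node coordinate: (row, col) on an assumed 2D mesh.
Node = Tuple[int, int]


def hop_count(src: Node, dst: Node) -> int:
    """Links traversed between two nodes, assuming XY routing on a 2D mesh.

    Under that assumption the count is the Manhattan distance.
    """
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def pick_access_node(candidates: Iterable[Node], controller: Node) -> Node:
    """Pick the candidate node whose requests traverse the fewest links
    on the way to the given memory controller (ties go to the first one)."""
    return min(candidates, key=lambda node: hop_count(node, controller))


if __name__ == "__main__":
    controller = (3, 4)                      # assumed memory controller position
    compute_node = (0, 0)                    # node running the computation
    candidates = [(0, 0), (2, 3), (3, 3)]    # nodes allowed to issue the access

    access_node = pick_access_node(candidates, controller)
    print("request links from compute node:", hop_count(compute_node, controller))
    print("request links from access node: ", hop_count(access_node, controller))
```

In this toy setting, issuing the request from a node other than the computation node shortens the request path; a real placement would also have to account for the response path and the data volume, which the sketch ignores.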