Increases in cache capacity are accompanied by growing wire delays due to technology scaling. Non-Unifor Cache Architecture (NUCA) is one of proposed solutions to reducing the average access latency in such cache designs.While most of the prior NUCA work focuses on data placement, data replacement, and migration related issues, this paper studies the problem of data search (access) in NUCA. In our architecture we arrange sets of banks with equal access latency into rings. Our Last Access Based (LAB) prediction scheme predicts the ring that is expected to contain the required data and checks the banks in that ring first for the data block sought. We compare our scheme to two alternate approaches: searching all rings in parallel, and searching rings sequentially. We show that our LAB ring prediction scheme reduces L2 energy significantly over the sequential and parallel schemes, while maintaining similar performance. Our LAB scheme reduces energy consumption by 15.9% relative to the sequential lookup scheme, and 53.8% relative to the parallel lookup scheme.