In this paper, we address an important problem in wireless monitoring: how to identify the channels with the best (or worst) quality in a timely and accurate manner. We consider scenarios in which one or more sniffers simultaneously monitor multiple channels in the same area. Since the channel information is initially unknown to the sniffers, we adopt learning methods that predict channel conditions from a short period of observation during monitoring. We formulate this problem as a novel variant of the classic multi-armed bandit (MAB) problem, named the exploration bandit problem, which trades off the monitoring time/resource budget against channel selection accuracy. In the multiple-sniffer cases, including the partly distributed (with limited communication) and fully distributed (without any communication) scenarios, we take communication and interference costs into account and analyze how these costs affect the accuracy of channel selection. Extensive simulations show that the proposed algorithms achieve higher channel selection accuracy than other exploration bandit approaches, demonstrating their advantages.