Peak power management in datacenters has tremendous cost implications. While numerous mechanisms have been proposed to cap power consumption, real datacenter power consumption data is scarce. Prior studies have either used a small set of applications and/or servers, or presented data at an aggregate scale that makes it difficult to design and evaluate new and existing optimizations. To address this gap, we collect power measurement data at multiple spatial and fine-grained temporal resolutions from several geo-distributed Microsoft Corporation datacenters over 6 months. We conduct an aggregate analysis of this data to study its statistical properties. We find evidence of self-similarity in power demands, statistical multiplexing effects, and correlations with the cooling power that serves the IT equipment. Because workload characterization is a key ingredient of systems design and evaluation, we note the importance of better abstractions for capturing power demands, in the form of peaks and valleys. We identify attributes of peaks and valleys, and important correlations across these attributes that can influence the choice and effectiveness of different power capping techniques. We characterize these attributes and their correlations, showing the burstiness of short-duration peaks and the importance of not ignoring the rare but more stringent or longer peaks. The correlations between peaks and valleys suggest the need for techniques that aggregate and handle them collectively. Such characteristics can be exploited widely in power provisioning and optimization, and we illustrate their benefits with two specific case studies. The first shows how peaks can be handled differentially with existing approaches, based on our peak and valley characterization, rather than with a one-size-fits-all solution. The second illustrates a simple capacity provisioning strategy for energy storage that uses the peak and valley characteristics.
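
To make the peak and valley abstraction concrete, the following minimal sketch shows one possible way to split a sampled power trace into peaks and valleys relative to a power cap and to record a few attributes for each. It is an illustration under stated assumptions, not the characterization method used in this work: the names `Segment`, `segment_trace`, and `min_storage_energy`, the choice of attributes (duration, height, area), and the use of a single fixed cap are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    kind: str        # "peak" (power above cap) or "valley" (at or below cap)
    start: int       # index of the first sample in the segment
    duration: int    # number of consecutive samples in the segment
    height: float    # largest |power - cap| within the segment
    area: float      # sum of |power - cap| * dt, i.e. energy above/below the cap


def segment_trace(power: Sequence[float], cap: float, dt: float = 1.0) -> List[Segment]:
    """Split a power trace into alternating peaks and valleys around `cap`.

    Assumption: samples are evenly spaced, `dt` apart, and a sample exactly
    at the cap counts as part of a valley.
    """
    segments: List[Segment] = []
    i, n = 0, len(power)
    while i < n:
        is_peak = power[i] > cap
        j, height, area = i, 0.0, 0.0
        # Extend the segment while samples stay on the same side of the cap.
        while j < n and (power[j] > cap) == is_peak:
            deviation = abs(power[j] - cap)
            height = max(height, deviation)
            area += deviation * dt
            j += 1
        segments.append(Segment("peak" if is_peak else "valley", i, j - i, height, area))
        i = j
    return segments


def min_storage_energy(segments: Sequence[Segment]) -> float:
    """Lower bound on the storage energy needed to shave every peak.

    Assumption: lossless storage that is fully recharged before each peak,
    so the bound is simply the largest peak area.
    """
    return max((s.area for s in segments if s.kind == "peak"), default=0.0)


if __name__ == "__main__":
    # Hypothetical trace (arbitrary power units) with a cap of 100.
    trace = [90, 95, 105, 110, 102, 98, 80, 101]
    for seg in segment_trace(trace, cap=100.0):
        print(seg)
    print("storage lower bound:", min_storage_energy(segment_trace(trace, cap=100.0)))
```

Under these assumptions, short, shallow peaks and long or tall peaks show up as segments with very different duration, height, and area values, which is the kind of distinction that could guide a differential choice of capping technique. The largest peak area gives only a crude starting point for sizing energy storage, since it ignores whether the preceding valley is long enough to recharge.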