As the underground malware industry prospers, malware developers consistently attempt to camouflage malicious code and undermine malware detection with various obfuscation schemes. Among them, metamorphism has the potential to defeat popular signature-based malware detection. A metamorphic malware sample mutates its code as it propagates, so that variants of the same family bear little resemblance to one another. As compiler and binary rewriting techniques advance, metamorphic malware will become much easier to develop, and outbreaks may eventually follow. To fully understand the metamorphic engine, the core part of metamorphic malware, we systematically study the evolution of metamorphic malware over time. Unlike previous work, we do not require any prior knowledge about the metamorphic engine in use. Instead, we perform trace-based semantic binary diffing to compare mutated code iteratively and memoize semantically equivalent basic blocks. We have developed a prototype, called MetaHunt, and evaluated it with 1,400 metamorphic malware variants. Our experimental results show that MetaHunt can accurately capture the semantics of unknown metamorphic engines, and that all of the comparisons converge in a reasonable time. In addition, MetaHunt identifies several metamorphic engine bugs that lead to semantics-breaking transformations. We summarize the lessons learned from our empirical study, hoping to stimulate the design of mutation-aware solutions that defend against this threat proactively.
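
The idea of memoizing semantically equivalent basic blocks can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not MetaHunt's actual implementation: the three-register toy machine, the `run_block` interpreter, the `EquivalenceMemo` class, and the use of random input testing as a stand-in for a real semantic equivalence check. CPU flags and memory are ignored.

```python
import random
from typing import Dict, FrozenSet, Tuple, Union

# Hypothetical toy instruction: (opcode, destination register, source register or constant).
Instr = Tuple[str, str, Union[str, int]]
Block = Tuple[Instr, ...]

REGS = ("eax", "ebx", "ecx")   # assumed register set for the toy machine
MASK = 0xFFFFFFFF              # emulate 32-bit wraparound


def run_block(block: Block, state: Dict[str, int]) -> Dict[str, int]:
    """Execute a basic block on a copy of the register state and return the result."""
    regs = dict(state)
    for op, dst, src in block:
        val = regs[src] if isinstance(src, str) else src
        if op == "mov":
            regs[dst] = val & MASK
        elif op == "add":
            regs[dst] = (regs[dst] + val) & MASK
        elif op == "sub":
            regs[dst] = (regs[dst] - val) & MASK
        elif op == "xor":
            regs[dst] = (regs[dst] ^ val) & MASK
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return regs


class EquivalenceMemo:
    """Caches block-pair verdicts so repeated comparisons cost nothing."""

    def __init__(self, trials: int = 64, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Fixed random input states; agreement on all of them is only
        # evidence of equivalence, not a proof.
        self._samples = [
            {r: rng.getrandbits(32) for r in REGS} for _ in range(trials)
        ]
        self._cache: Dict[FrozenSet[Block], bool] = {}
        self.checks = 0  # number of non-cached comparisons performed

    def equivalent(self, a: Block, b: Block) -> bool:
        key = frozenset((a, b))  # unordered pair, so (a, b) and (b, a) share an entry
        if key in self._cache:
            return self._cache[key]
        self.checks += 1
        verdict = all(run_block(a, s) == run_block(b, s) for s in self._samples)
        self._cache[key] = verdict
        return verdict


# Three syntactically different ways to zero eax, plus one block that differs.
zero_mov: Block = (("mov", "eax", 0),)
zero_xor: Block = (("xor", "eax", "eax"),)
zero_sub: Block = (("sub", "eax", "eax"),)
set_one: Block = (("mov", "eax", 1),)

memo = EquivalenceMemo()
print(memo.equivalent(zero_mov, zero_xor))  # compared by execution
print(memo.equivalent(zero_xor, zero_mov))  # served from the cache
print(memo.equivalent(zero_mov, set_one))
```

In an iterative comparison across many variants, the same block pairs recur often, so a cache of this kind is what lets later comparisons skip the expensive equivalence check.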