This paper describes the development of a proposed framework of metrics for evaluating the performance of aircraft guidance systems. The methodologies and metrics developed remain generally agnostic to whether or not the aircraft is manned. Although more complicated missions such as autonomous exploration/search, ferry, surveillance, multi-agent collaboration, and manned flight may be addressed at a later time, A-to-B flight scenarios are chosen to study the proposed metrics, which will form building blocks for those more complicated missions. Metrics development has thus far generally focused on nap-of-the-earth (NOE) flight, and in particular on the observability of the vehicle throughout its mission. That is, the main metric will be a formulation of the probability of detection by potential and generally unknown threats in the mission area. Secondary metrics, which provide insight into the vehicle's trajectory quality in terms of safety and comfort as experienced by both humans and machines, are described as well. Scalability of the benchmarking system is also important: benchmarking should be general enough to allow guidance algorithms to be graded independently of the vehicle platform, for instance. Non-dimensionalization of the metrics will address this concern.