A significant portion of Internet of Things (IoT) devices will become reliable products in our daily lives if and only if they are equipped with strong human-computer interaction (HCI) technologies, specifically visual interaction with users through affective computing. One of the major challenges in affective computing is recognizing facial expressions and the true emotions behind them. Despite numerous studies, current detection systems fail to identify facial expressions with reliable accuracy, especially in the case of negative expressions. Several research projects have attempted to extract the recognition process that humans follow when identifying facial expressions, in order to replicate it in smart machines, but without significant success. This paper describes our interdisciplinary project, whose goal is to extract and define the recognition process that humans follow when identifying the facial expressions of others. We monitor this process by identifying and analyzing the regions of interest that participants look at when they are shown static emotion samples under a specific experimental setup. This paper reports the current status of data collection, the experimental setup, and initial data visualization.