Studies show that users do not reliably click more often on headlines classified as clickbait by automated classifiers. Is this because the linguistic criteria (e.g., use of lists or questions) emphasized by the classifiers are not psychologically relevant in attracting interest, or because their classifications are confounded by unknown factors tied to the classifiers' assumptions? We address these possibilities with three studies: a quasi-experiment using headlines classified as clickbait by three machine-learning models (Study 1), a controlled experiment varying the headline of an identical news story to contain only one clickbait characteristic (Study 2), and a computational analysis of four classifiers using real-world sharing data (Study 3). Studies 1 and 2 revealed that clickbait did not generate more curiosity than non-clickbait. Study 3 revealed that while some headlines generate more engagement, the detectors agreed on a classification only 47% of the time, raising fundamental questions about their validity.