It is widely acknowledged that the language we use reflects numerous psychological constructs, including our thoughts, feelings, and desires. Can the so-called "noncognitive" traits with known links to success, such as growth mindset, leadership ability, and intrinsic motivation, be similarly revealed through language? We investigated this question by analyzing students' 150-word open-ended descriptions of their own extracurricular activities or work experiences included in their college applications. We used the Common Application-National Student Clearinghouse dataset, a six-year longitudinal dataset that includes college application data and graduation outcomes for 278,201 U.S. high-school students. We first developed a coding scheme from a stratified sample of 4,000 essays and used it to code seven traits: growth mindset, perseverance, goal orientation, leadership, psychological connection (intrinsic motivation), self-transcendent (prosocial) purpose, and team orientation, along with earned accolades. Then, to automate the coding, we used standard classifiers with bag-of-n-grams as features and deep learning techniques (recurrent neural networks) with word embeddings. The models demonstrated convergent validity with the human coding, with AUCs ranging from .770 to .925 and correlations ranging from .418 to .734. There was also evidence of discriminant validity in the pattern of inter-correlations (rs ranging from -.206 to .306) for both human- and model-coded traits. Finally, the models demonstrated incremental predictive validity for six-year graduation outcomes net of sociodemographics, intelligence, academic achievement, and institutional graduation rates. We conclude that language provides a lens into noncognitive traits important for college success, and that these traits can be captured with automated methods.
tags: college success