Results: Hutt, Stephen

This study answered novel questions about the connection between high school extracurricular dosage (number of activities and participation duration) and the attainment of a bachelor’s degree. Using data from the Common Application and the National Student Clearinghouse (N = 311,308), we found that greater extracurricular participation positively predicted bachelor’s degree attainment. However, among students who ultimately earned a bachelor’s degree, participating in more than a moderate number of high school activities (3 or 4) predicted decreasing odds of earning a bachelor’s degree on time (within 4 years). This effect intensified as participation duration increased, such that students who participated in the greatest number of high school activities for the most years were the most likely to delay college graduation.

It is widely acknowledged that the language we use reflects numerous psychological constructs, including our thoughts, feelings, and desires. Can the so-called "noncognitive" traits with known links to success, such as growth mindset, leadership ability, and intrinsic motivation, be similarly revealed through language? We investigated this question by analyzing students' 150-word open-ended descriptions of their own extracurricular activities or work experiences included in their college applications. We used the Common Application-National Student Clearinghouse dataset, a six-year longitudinal dataset that includes college application data and graduation outcomes for 278,201 U.S. high-school students. We first developed a coding scheme from a stratified sample of 4,000 essays and used it to code seven traits: growth mindset, perseverance, goal orientation, leadership, psychological connection (intrinsic motivation), self-transcendent (prosocial) purpose, and team orientation, along with earned accolades. Then, we used standard classifiers with bag-of-n-grams as features and deep learning techniques (recurrent neural networks) with word embeddings to automate the coding. The models demonstrated convergent validity with the human coding, with AUCs ranging from .770 to .925 and correlations ranging from .418 to .734. There was also evidence of discriminant validity in the pattern of inter-correlations (rs ranging from -.206 to .306) for both human- and model-coded traits. Finally, the models demonstrated incremental predictive validity in predicting six-year graduation outcomes net of sociodemographics, intelligence, academic achievement, and institutional graduation rates. We conclude that language provides a lens into noncognitive traits important for college success, which can be captured with automated methods.
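For readers unfamiliar with the two technical terms in this abstract, the following is a minimal sketch of what "bag-of-n-grams" features and an AUC score are. It is not the study's pipeline: the function names `ngrams` and `auc`, the whitespace tokenization, and the choice of unigrams plus bigrams are all assumptions made for illustration, and the labels and scores in the usage lines are made-up numbers rather than study data.

```python
from collections import Counter


def ngrams(text, n_max=2):
    """Count word n-grams (n = 1..n_max) in one text.

    Illustrative only: tokenization is a plain lowercase whitespace split,
    which is an assumption and not the preprocessing used in the study.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical usage with invented values (not data from the study).
features = ngrams("I led the team", n_max=2)
example_auc = auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
```

In this sketch each description becomes a sparse count vector keyed by n-gram, which a standard classifier could consume. The AUC then summarizes how well model scores rank human-coded positives above negatives; a value of .5 corresponds to chance ranking and 1.0 to perfect ranking.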