W09: In-Class Modelling Competitions and Language Models for Data Generation


By Adam Gilbert, Sarah Dumnich


Information

Incorporating competitive elements into classroom activities can significantly enhance the learning experience. However, traditional Kaggle competitions pit students against seasoned practitioners, so students new to modeling generally can’t expect to be competitive. Seeing the gap between their own model performance metrics and those achieved by other competitors can discourage learners new to statistical modeling.

This project explores the use of private Kaggle competitions as a method to enhance student motivation and learning. Instructors have organized, and can organize, semester-long, single-class, cross-course, or even intercollegiate competitions, providing students with engaging opportunities to construct and assess models. This presentation discusses various competition setups, lessons learned, and the impact of debriefing sessions with students. The project has been implemented at a pair of small residential colleges in the Northeastern United States with 15-20 students from a wide variety of majors who have completed an introductory applied statistics course as a prerequisite (though many have not taken calculus or linear algebra).

The presentation also discusses an additional innovation: the use of ChatGPT or similar Language Model Assistants (LMAs) to generate simulated datasets for these competitions. By leveraging an LMA’s capabilities, instructors can create realistic datasets that challenge students’ modeling skills and encourage creative thinking. Using LMAs for data generation offers a way to enrich in-class modeling competitions, making them more engaging and educational for learners.


Recording

Adam_Gilbert_Supplementary Items.pdf