Modeling in R and Weka for Course Enrollment Prediction

  • Amanda Watkins, California State University Northridge
  • Adam Kaplan, California State University Northridge
Keywords: R, Time Series Forecasting, Weka, Machine Learning

Abstract

Predicting course enrollment is a common university resource planning problem. California State University Northridge (CSUN) faces a number of distinctive challenges when predicting student enrollment in its undergraduate Computer Science (CS) and Computer Information Technology (CIT) courses. In this paper, we discuss the design of an enrollment prediction tool that applies three time series models in R and four time series models in Weka to a database of 19 semesters of enrollment data. The seven models are tested against varying amounts of holdout data to see which best predicts enrollment for undergraduate CS and CIT courses to within one standard class size of 25 students. Predictions on holdout data are compared both in modified form, with numbers rounded up and negative values set to zero, and in unmodified form. All models were most accurate when predicting three semesters of holdout data using the maximum available enrollment data, from the Spring 2010 term to the Spring 2015 term, for training. The best predictions were accurate to within one standard class size for 93.5% of Computer Science Department (CSD) courses, and the worst were accurate to within one standard class size for 77.4% of CSD courses.
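The evaluation criterion described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the authors' R or Weka code. The function names, the sample numbers, and the choice to count a difference of exactly 25 students as "within one class size" are all assumptions made for illustration.

```python
import math

CLASS_SIZE = 25  # one standard class size, as stated in the abstract


def modify(predictions):
    """Round each prediction up and set negative values to zero."""
    return [max(0, math.ceil(p)) for p in predictions]


def share_within_class_size(predicted, actual, tolerance=CLASS_SIZE):
    """Fraction of courses predicted to within `tolerance` students.

    Assumption: a difference of exactly `tolerance` counts as a hit.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(actual)


# Hypothetical holdout predictions and enrollments for four courses.
raw_predictions = [-3.2, 48.1, 120.6, 75.0]
actual_enrollment = [0, 80, 110, 70]

modified_predictions = modify(raw_predictions)
print("unmodified:", share_within_class_size(raw_predictions, actual_enrollment))
print("modified:  ", share_within_class_size(modified_predictions, actual_enrollment))
```

Scoring the modified and unmodified predictions with the same function keeps the two forms directly comparable, which is the comparison the abstract describes.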

Published
2018-03-29