Core Course Analysis for Undergraduate Students in Mathematics
In this work, we develop statistical tools to understand core courses at the university level. Traditionally, professors and administrators label courses as "core" when the courses contain foundational material. Such courses are often required to com…
Authors: Ritvik Kharkar, Jessica Tran, Charles Z. Marshak
Ritvik Kharkar, University of California, Los Angeles, ritvikmath@ucla.edu
Jessica Tran, University of California, Los Angeles, jessicatran@ucla.edu
Charles Z. Marshak, University of California, Los Angeles, cmarshak@math.ucla.edu

May 4, 2016

In this work, we develop statistical tools to understand core courses at the university level. Traditionally, professors and administrators label courses as "core" when the courses contain foundational material. Such courses are often required to complete a major and, in some cases, are allocated additional educational resources. We identify two key attributes which we expect core courses to have. Namely, we expect core courses to be highly correlated with and highly impactful on a student's overall mathematics GPA. We use two statistical procedures to measure the strength of these attributes across courses. The first of these procedures fashions a metric out of standard correlation measures. The second utilizes sparse regression. We apply these methods to student data from the University of California, Los Angeles (UCLA) department of mathematics to compare core and non-core coursework.

1. INTRODUCTION

Educators, professors, and mathematicians are continually improving mathematics education [8, 28]. There is a consensus that mathematics education divides into two areas: the design of the mathematics curriculum and the development of teaching strategies. In particular, members of the Mathematical Association of America and the American Mathematical Society [18] regularly analyze the core requirements of undergraduate mathematics and work to improve the effectiveness of mathematics teachers [7, 24, 29]. In this paper we focus on curriculum design, developing tools that use grade statistics to understand core courses.
The undergraduate math curriculum is the set of courses required for an undergraduate to earn his or her degree. Departmental faculty and counselors carefully select courses which provide a well-rounded mathematics education. Traditionally, core courses within the mathematics curriculum are those that contain the most foundational material and that inform subsequent coursework. These courses are often ones that the department encourages students to take early to ensure they get the most from their major.

We will analyze the core curriculum using techniques and tools from Educational Data Mining (EDM). EDM is an interdisciplinary research area that aims to improve education through data analysis [15, 19, 20, 31]. Although researchers have been collecting and analyzing educational data for a long time, EDM has only recently been established as its own field, due to the accessibility of statistical tools and the emergence of EDM journals and conferences [1, 21]. Tools developed in EDM research have been adopted more quickly because of the challenge of serving a growing undergraduate population [10]. Recent advancements in education made possible through data analysis tools include course recommendation systems [4, 9, 30], student clustering [16], grade prediction [3, 17], and personalized education through online media [23, 27]. Adapting such data mining tools for assessing student performance is a difficult task, but it will help administrators better serve their student body.

The focus of this paper is to provide statistical criteria that validate the department's core course selection. Specifically, we compare the grade distribution from a single math course to the distribution of GPAs. We apply this methodology to the grades of UCLA mathematics students from 2000 to 2015. The viewpoint we develop here can be easily adapted to any other department for a similar analysis.
To compare course statistics and GPA statistics we employ two fundamental statistical tools: the correlation coefficient and linear regression. The correlation coefficient is a normalized measure of how two variables covary [14]. Regression is a tool that linearly models how particular measured inputs influence a response [14]. It informs some of our most important models for business, biology, machine learning, and environmental science [5, 13, 26]. Ultimately, we will use these tools to compare the grades earned by students in a single class with these students' overall GPAs.

The data-centric viewpoint we develop offers a statistical methodology to validate the core courses within a university's curriculum. Using this methodology, we can identify those courses that especially impact a student's study. Any discrepancy between the core courses defined by the department and those found from this statistical viewpoint can prompt further discussion of how to design the curriculum effectively. Ultimately, by analyzing the university curriculum, we hope to aid departments in better meeting their goals and in better preparing graduating students for industry and academic jobs.

We now discuss how this paper is organized. In Section 2, we describe our data-centric approach to core course analysis, provide justification for the use of correlation coefficients and regression, precisely describe the statistical setup, and provide some sample output. In Section 3, we apply this methodology of determining core courses to pure mathematics majors.

2. CHARACTERIZING CORE COURSES

Core courses are those that are the most foundational for a well-rounded mathematics education and are often allocated additional educational resources. In this study, we consider the core courses selected by the UCLA mathematics department and identify statistical attributes that we expect such courses to have.
We develop statistical tools to measure the strength of these attributes across courses. Using these tools, we then compare the core and non-core courses to validate the department's core course selection. At UCLA, the core courses are Linear Algebra and Real Analysis. All mathematics students must take these courses and are advised to take them early in their course sequence.

The first attribute we consider is the correlation between a student's grade in a particular course and his or her overall mathematics GPA. By investigating this correlation, we analyze how the ranking of students determined by course grade compares with the ranking determined by overall mathematics GPA. We expect grades in the core courses defined by the department to be highly correlated with overall mathematics GPA.

The second attribute we consider is the impact of a particular course grade on overall mathematics GPA. The impact of a course grade signifies how sensitive overall mathematics GPA is to that grade. We examine impact by performing lasso regression in order to select the most important courses. When we input courses from the mathematics curriculum as variables into lasso, the algorithm selects a subset of courses which are most relevant to overall mathematics GPA. We interpret the courses assigned the highest coefficients by lasso regression as those having the greatest bearing on overall mathematics GPA. We now describe the correlation metric used to measure the first attribute.

2.1. MIXED CORRELATION METRIC

To measure the correlation between course grades and a student's overall mathematics GPA, we combine two correlation coefficients. We then analyze these correlation coefficients to measure how well the ranking determined by a single course grade compares to the ranking determined by overall mathematics GPA.
We first consider Spearman's $\rho$ [22], which we denote by $\rho_S$. Given a sample of $n$ students, we can rank students both by overall mathematics GPA and by grade in a particular course. Let $r_1^{GPA}, \ldots, r_n^{GPA}$ be the ranking of students by overall mathematics GPA and let $r_1^{C}, \ldots, r_n^{C}$ be the ranking of students by grade in course $C$. The measure $\rho_S$ is defined as

$$\rho_S = 1 - \frac{1}{Z} \sum_{i=1}^{n} \left( r_i^{GPA} - r_i^{C} \right)^2, \qquad Z = \frac{n(n^2 - 1)}{6},$$

where $Z$ normalizes the sum. The coefficient $\rho_S$ ranges from $-1$ to $1$, and positive values indicate agreement between the two rankings. In the context of our study, we use $\rho_S$ to measure how much the rank determined by a student's grade in a course agrees with the rank determined by his or her overall mathematics GPA.

We will also employ the coefficient of determination, $R_{\mathrm{sq}}$, a normalized measure of covariance between the grade in a particular course and overall mathematics GPA. Let $g_1^{C}, \ldots, g_n^{C}$ be the grade in course $C$ of each student and let $g_1^{GPA}, \ldots, g_n^{GPA}$ be the overall mathematics GPA of each student. The sample means of these quantities are defined as

$$\bar{g}^{C} = \frac{1}{n} \sum_{i=1}^{n} g_i^{C}, \qquad \bar{g}^{GPA} = \frac{1}{n} \sum_{i=1}^{n} g_i^{GPA}.$$

$R_{\mathrm{sq}}$ is defined as follows:

$$R_{\mathrm{sq}} = \frac{\sum_{i=1}^{n} \left( g_i^{C} - \bar{g}^{C} \right) \left( g_i^{GPA} - \bar{g}^{GPA} \right)}{\sqrt{\sum_{i=1}^{n} \left( g_i^{C} - \bar{g}^{C} \right)^2} \, \sqrt{\sum_{i=1}^{n} \left( g_i^{GPA} - \bar{g}^{GPA} \right)^2}}.$$

As written, this expression has the form of the sample Pearson correlation coefficient between course grade and overall mathematics GPA. Using these two correlation coefficients, we define a new holistic metric as

$$\rho_H = \sqrt{(\rho_S)^2 + (R_{\mathrm{sq}})^2}.$$

This new metric, $\rho_H$, takes into account both correlation measures and is the $\ell_2$ distance from the point $(\rho_S, R_{\mathrm{sq}})$ to the origin. We give an example of how these metrics are used to identify core courses.

Figure 1: Scatter plot of $\rho_S$ values against $R_{\mathrm{sq}}$ values for the Math/Econ, Pure Math, and Applied Math courses. The triangular points are courses which are required for all majors and the round points are courses which are not required for all majors.
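As a concrete illustration of the definitions above, the following is a minimal sketch of the three quantities in plain Python. It is not the authors' code: the function names are hypothetical, tied grades are given average ranks (an assumption; the rank-difference formula for $\rho_S$ is exact only when there are no ties), and the five-student sample is invented for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average 1-based position of the block
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(course, gpa):
    """rho_S = 1 - (1/Z) * sum of squared rank differences, Z = n(n^2 - 1)/6."""
    n = len(course)
    r_course, r_gpa = average_ranks(course), average_ranks(gpa)
    z = n * (n * n - 1) / 6
    return 1 - sum((a - b) ** 2 for a, b in zip(r_gpa, r_course)) / z


def r_sq(course, gpa):
    """Normalized covariance of course grade and GPA, as in the R_sq formula."""
    n = len(course)
    mean_c, mean_g = sum(course) / n, sum(gpa) / n
    cov = sum((c - mean_c) * (g - mean_g) for c, g in zip(course, gpa))
    spread_c = math.sqrt(sum((c - mean_c) ** 2 for c in course))
    spread_g = math.sqrt(sum((g - mean_g) ** 2 for g in gpa))
    return cov / (spread_c * spread_g)


def rho_h(rho_s, r):
    """Holistic metric: l2 distance from (rho_S, R_sq) to the origin."""
    return math.hypot(rho_s, r)


# Invented sample: grades of five students in one course and their GPAs.
course_grades = [4.0, 3.7, 3.3, 3.0, 2.7]
overall_gpa = [3.9, 3.5, 3.6, 3.1, 2.8]
rho_s = spearman_rho(course_grades, overall_gpa)
metric = rho_h(rho_s, r_sq(course_grades, overall_gpa))
```

In this sample only two students swap places between the two rankings, so the squared rank differences sum to 2 and $\rho_S = 1 - 2/20 = 0.9$. Applying `rho_h` to the first row of Table 1 ($\rho_S = 0.821$, $R_{\mathrm{sq}} = 0.655$) reproduces the reported $\rho_H$ of 1.05 after rounding.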
We will only consider students who have taken at least five mathematics courses and at least one upper division mathematics course at UCLA, so that we identify courses that might impact students who have made a significant time commitment to their mathematical development. Our example includes the three largest mathematics majors at UCLA, Mathematics/Economics (Math/Econ), Pure Mathematics (Pure Math), and Applied Mathematics (Applied Math), and considers only students admitted as freshmen. We omit transfer students since they usually have different training than students admitted directly from high school. We report the findings in Tables 1 and 2.

After generating the $\rho_S$ and $R_{\mathrm{sq}}$ values for each student group in each of the courses, we plot the two metrics to understand how well the rankings match. The scatter plot in Figure 1 plots $\rho_S$ against $R_{\mathrm{sq}}$ using the values from Tables 1 and 2.

Table 1: Mixed Correlation Coefficients

Course                        $\rho_S$   $R_{\mathrm{sq}}$   $\rho_H$
Linear Algebra                0.821      0.655      1.05
Complex Analysis              0.819      0.657      1.049
Ord. Diff. Eq.                0.818      0.654      1.047
Num. Analysis (Part II)       0.827      0.614      1.03
Probability (Part I)          0.806      0.62       1.02
Probability (Part II)         0.814      0.607      1.015
Game Theory                   0.805      0.619      1.015
Num. Analysis (Part I)        0.794      0.62       1.007
Nonlinear/Linear Diff. Eq.    0.805      0.603      1.006
Optimization                  0.801      0.601      1.001
Combinatorics                 0.797      0.601      0.998
Diff. Eq.                     0.787      0.61       0.996

Table 2: Mixed Correlation Coefficients

Course                        $\rho_S$   $R_{\mathrm{sq}}$   $\rho_H$
Diff. Geometry (Part I)       0.802      0.582      0.991
Math Modeling                 0.797      0.584      0.988
Real Analysis (Part I)        0.781      0.6        0.984
Multivar. Calc. (Part II)     0.785      0.589      0.981
Algebra for App.              0.765      0.552      0.943
Real Analysis (Part II)       0.764      0.546      0.939
Linear Algebra                0.744      0.548      0.924
Discrete Math                 0.738      0.506      0.895
Abs. Algebra (Part I)         0.721      0.512      0.885
Multivar. Calc. (Part I)      0.696      0.423      0.814
Single Var. Calc. (Part II)   0.657      0.383      0.76
Single Var. Calc. (Part I)    0.629      0.34       0.715

We can calculate $\rho_S$ and $R_{\mathrm{sq}}$ for each course. We can then calculate the $\ell_2$ distance from the origin to $(\rho_S, R_{\mathrm{sq}})$ to get a measure of how strongly grades in each course correlate with overall mathematics GPA values. Thus, the larger this distance, the stronger the correlation between course grade and overall mathematics GPA. Figure 2 shows some of the courses farthest from the origin. Since our sample population is from a diverse set of majors, we indicate overlapping required courses with an asterisk (*). In the discussion of our results, we use course labels as assigned by the UCLA mathematics department; see Appendix A.

From this bar plot, we see that the only required upper division courses that rank highly under this metric are Linear Algebra, Complex Analysis, and Real Analysis. This supports the view that Linear Algebra is a foundational course and also agrees with the department's decision to recommend it as a first upper division course. Complex Analysis, although not considered a core course by the department, follows immediately after Linear Algebra. On the other hand, Real Analysis, which the department does consider a core course, ranks further down in our list. We will next perform regression analysis to find which courses are most impactful.

Figure 2: Bar plot of $\rho_H$ values of a sample of courses from the correlation metric. The top courses were calculated based on the distance from the origin. Bars colored grey are core courses as defined by the mathematics department. The asterisk (*) indicates courses that are required.

2.2. LASSO REGRESSION METRIC

In order to further validate the department's selection of core courses, we now turn our attention to analyzing the impact of a particular course's grade on overall mathematics GPA. We model the overall mathematics GPA as a linear response to the different course grades a student earns.
We chose to use a linear model due to the versatility of such models, the fact that GPA is a linear function of all course grades, and the apparent linear relationship between core course grades and overall mathematics GPA as seen in Figure 3. We proceed by selecting the most impactful courses using sparse regression. This regression serves a dual purpose as it also assigns zero coefficient values to courses which have minimal bearing on overall mathematics GPA. This effectively allows us to disqualify certain courses from being considered core.

A key application of lasso regression is grade prediction. That is, given the impactful courses selected by lasso regression, we can input the grades in these courses into a different regression model aimed towards predicting overall mathematics GPA. This regression model, better suited to calculate the exact values of the coefficients for each course, will allow us to estimate a student's overall mathematics GPA based on his or her grade in a few courses.

Figure 3: Scatter plot of average grade in Linear Algebra, Real Analysis, and Probability Theory (Part I) against overall student GPA for all students who took all three courses. The ridge regression prediction function is drawn through the points.

2.2.1. Course Selection with Lasso

We proceed by selecting the most impactful courses with lasso regression [11]. Lasso regression uses an $\ell_1$ penalty to ensure the coefficients for the linear model are sparse. A non-zero regression coefficient indicates that the corresponding course is impactful while a zero coefficient indicates the corresponding course is not. By removing the courses that have little bearing on a student's overall mathematics GPA, we are able to extract the most impactful courses and validate core course selection by the mathematics department.

Let $C_1, \ldots, C_m$ be the $m$ available courses in the math department.
We can model student $i$'s overall GPA $g_i^{GPA}$ as

$$g_i^{GPA} = \beta_0 + \beta_1 g_i^{C_1} + \beta_2 g_i^{C_2} + \cdots + \beta_m g_i^{C_m} + \epsilon_i,$$

where

$$g_i^{C_k} = \begin{cases} \text{student } i\text{'s grade in course } C_k & \text{if student } i \text{ took course } C_k, \\ 0 & \text{otherwise.} \end{cases}$$

When considering all $n$ students, we can write this model in the matrix form

$$g^{GPA} = G\beta + \epsilon,$$

where the matrix $G \in \mathbb{R}^{n \times m}$ contains all of our grade data for all $m$ math courses at UCLA for all $n$ students. The coefficient vector $\beta$ is determined as

$$\arg\min_{\beta} \; \| g^{GPA} - G\beta \|_2^2 + \alpha \| \beta \|_1 \tag{1}$$

for $\alpha > 0$. For larger $\alpha$, we obtain sparser solutions. As $\alpha$ approaches $0$, we obtain solutions that resemble those obtained by ordinary least squares. While there is no closed-form solution for Eq. 1, the minimization can be solved using standard techniques in convex optimization [2]. Throughout this paper, we select $\alpha = 0.1$ when using a lasso model because, experimentally, it sets approximately 80% of the coefficients to zero. We note that the $\beta$ coefficients produced by lasso regression have a smaller variance than those produced by an ordinary least squares regression [25]. Unfortunately, this comes at the price of the lasso regression $\beta$'s being more biased estimators.

The results of the lasso regression for freshman mathematics students from the majors Pure Mathematics, Applied Mathematics, and Mathematics/Economics are shown in Table 3.

Table 3: Lasso regression coefficient values for Pure Mathematics, Mathematics/Economics, and Applied Mathematics freshman admits.

Course                      Lasso Coefficient
Complex Analysis            0.05734
Diff. Geometry (Part I)     0.0502
Num. Analysis (Part II)     0.04429
Diff. Eq. (Lower Div.)      0.04375
Probability (Part I)        0.03322
Math Modeling               0
Multivar. Calc. (Part I)    0
Math History                0

2.2.2. Grade Prediction with Ridge

Given the courses selected by lasso regression, we will now utilize ridge regression to assign accurate numerical weights to this subset of courses and thereby gauge precisely how impactful each course in our subset is for overall mathematics GPA. In addition, we will also have a meaningful prediction function for overall mathematics GPA based on grades in a few impactful courses.

We use ridge regression because it helps us account for the multicollinearity between the grades in particular courses, our explanatory variables. We expect multicollinearity to exist because students who do well in one course tend to do well in others, and vice versa. This multicollinearity, if unchecked, can skew our regression coefficients and produce misleading values [6].

It is important to note that while ridge regression helps to correct for multicollinearity issues by imposing an $\ell_2$ penalty, it causes the resulting regression coefficients to be biased [12], systematically shifted towards lower values. However, upon running our tests with both ridge regression and ordinary least squares regression, we find that ridge regression preserves the relative ordering of the coefficients, produces a higher accuracy level as measured by the coefficient of determination $R_{\mathrm{sq}}$, and deviates from the ordinary least squares $\beta$ values only slightly.

In our grade prediction process, we will consider three courses as input into our ridge model. The first two courses will always be Linear Algebra (Part I) and Real Analysis (Part I), because they are considered core courses by the mathematics department. We will cycle through each of the top three courses and the last course in Table 3 as our third course in the ridge model, thereby running the ridge model four separate times.
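For reference, the lasso selection step of Section 2.2.1, whose output feeds the ridge model described here, can be sketched with cyclic coordinate descent, one standard convex-optimization approach to the objective in Eq. 1. This is a minimal sketch and not the authors' implementation: the function names, the synthetic grade matrix, and the centering used to leave the intercept unpenalized are all assumptions, and the value of $\alpha$ is chosen for the synthetic data rather than taken from the paper.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink x towards zero by t, returning exactly zero when |x| <= t."""
    return np.sign(x) * max(abs(x) - t, 0.0)


def lasso_cd(G, y, alpha, n_iter=500):
    """Minimize ||y - b0 - G b||_2^2 + alpha * ||b||_1 by coordinate descent."""
    G_mean, y_mean = G.mean(axis=0), y.mean()
    Gc, yc = G - G_mean, y - y_mean  # centering handles the unpenalized intercept
    m = Gc.shape[1]
    beta = np.zeros(m)
    col_sq = (Gc ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(m):
            if col_sq[j] == 0.0:
                continue  # a constant column carries no information
            # Residual with course j's current contribution added back.
            r_j = yc - Gc @ beta + Gc[:, j] * beta[j]
            rho = Gc[:, j] @ r_j
            # The squared loss carries no factor 1/2, so the threshold is alpha / 2.
            beta[j] = soft_threshold(rho, alpha / 2.0) / col_sq[j]
    intercept = y_mean - G_mean @ beta
    return intercept, beta


# Synthetic data (assumption): six courses, only the first two drive GPA.
rng = np.random.default_rng(0)
n, m = 200, 6
G = rng.uniform(2.0, 4.0, size=(n, m))
gpa = 1.0 + 0.4 * G[:, 0] + 0.3 * G[:, 1] + rng.normal(0.0, 0.05, n)

intercept, beta = lasso_cd(G, gpa, alpha=10.0)
selected = np.flatnonzero(beta)  # indices of courses with non-zero coefficients
```

The soft-thresholding step is what produces exact zeros, which is the property used in the text to disqualify courses. The appropriate scale of $\alpha$ depends on the sample size and on how the loss is normalized, so the paper's $\alpha = 0.1$ should not be expected to transfer to this toy data.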
We run the model four times because the lasso model selected the top three courses, indicating that overall mathematics GPA is especially sensitive to grades in each selected course. We run the model a fourth time with a course that lasso indicated as non-core to check whether our accuracy level diminishes as expected. We expect our accuracy level to diminish as we pick courses with progressively lower lasso coefficients. We use a linear ridge model so that our model for student $i$ is

$$g_i^{GPA} = \beta_0 + \beta_1 g_i^{115A} + \beta_2 g_i^{131A} + \beta_3 g_i^{C_3} + \epsilon_i,$$

where $C_3$ takes on the label of each of the top three courses in Table 3 as well as the last course in Table 3. We can compactly express the implied system of equations for $n$ students as

$$g^{GPA} = G\beta + \epsilon,$$

where $G \in \mathbb{R}^{n \times 4}$ is a matrix encoding each student's grade in each of the three courses we are using (presumably alongside a column of ones for the intercept $\beta_0$). The $\beta$ coefficient vector is given by

$$\arg\min_{\beta} \; \| g^{GPA} - G\beta \|_2^2 + \alpha \| \beta \|_2 \tag{2}$$

for $\alpha > 0$. We find that $\alpha = 2.5$ gives the highest accuracy in our predicted overall mathematics GPA values, $\hat{g}^{GPA}$.

For each separate run of the ridge model, we consider all students in the population who have taken Linear Algebra, Real Analysis, and the rotating third course. We then take a random 90% sample of this group as our training set and use it to generate a coefficient vector $\beta$ using Eq. 2. The remaining 10% of the data is used to test this function. We repeat this procedure 100 times, taking a different 90% random sample of the students each time, and then take the average of the 100 resulting $\beta$ vectors to generate an averaged prediction model for overall mathematics GPA. We also average the accuracy levels, measured by $R_{\mathrm{sq}}$, over the 100 runs of a particular combination of three courses.

We find that by using Complex Analysis, the course with the highest lasso coefficient in Table 3, as our third course, we get the highest $R_{\mathrm{sq}}$ value.
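The repeated-split procedure described above can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes the standard squared $\ell_2$ ridge penalty (which has a closed-form solution), leaves the intercept unpenalized, scores each held-out 10% with $1 - SS_{res}/SS_{tot}$, and runs on synthetic grades in which a shared "ability" term induces the multicollinearity discussed in the text. All function and variable names are hypothetical.

```python
import numpy as np


def ridge_fit(G, y, alpha):
    """Solve (G^T G + alpha * P) beta = G^T y, where P skips the intercept column."""
    P = np.eye(G.shape[1])
    P[0, 0] = 0.0  # assumption: the intercept beta_0 is not penalized
    return np.linalg.solve(G.T @ G + alpha * P, G.T @ y)


def r_squared(y_true, y_pred):
    """Accuracy on held-out data: 1 - residual sum of squares / total sum of squares."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def averaged_ridge(G, y, alpha=2.5, n_runs=100, train_frac=0.9, seed=0):
    """Average beta and held-out accuracy over repeated random 90/10 splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    cut = int(train_frac * n)
    betas, scores = [], []
    for _ in range(n_runs):
        perm = rng.permutation(n)
        train, test = perm[:cut], perm[cut:]
        beta = ridge_fit(G[train], y[train], alpha)
        betas.append(beta)
        scores.append(r_squared(y[test], G[test] @ beta))
    return np.mean(betas, axis=0), float(np.mean(scores))


# Synthetic data (assumption): three correlated course grades per student.
rng = np.random.default_rng(1)
n = 300
ability = rng.normal(3.0, 0.5, n)
grades = np.clip(ability[:, None] + rng.normal(0.0, 0.3, (n, 3)), 0.0, 4.3)
gpa = 0.7 + grades @ np.array([0.3, 0.3, 0.2]) + rng.normal(0.0, 0.1, n)

G = np.column_stack([np.ones(n), grades])  # column of ones for the intercept
beta_avg, r2_avg = averaged_ridge(G, gpa)
```

Swapping the third grade column for a different course and comparing the resulting `r2_avg` values mirrors the comparison across third courses reported in the text.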
The accuracy level obtained with Complex Analysis drops by 5% when we instead use Differential Geometry (Part I), which has the second highest lasso coefficient. Furthermore, when we use History of Mathematics as our third course, listed as having a zero lasso coefficient, our accuracy drops by 17% relative to the model with Complex Analysis.

Our ridge model thus serves to validate the results of our lasso regression. The fact that courses assigned higher lasso coefficients also lead to higher grade prediction accuracy levels provides evidence that our lasso coefficients are accurately ranked. Furthermore, the dramatic decline in accuracy from using a course with zero lasso coefficient in the ridge model supports the conclusion that such courses have little bearing on a student's overall mathematics GPA.

3. CORE COURSE ANALYSIS FOR UCLA MATH STUDENTS

Now that we have established methods to validate the core courses, we apply these methods to several different math majors at UCLA. The UCLA mathematics department houses seven different majors. While each major has a different set of course requirements, the department considers Real Analysis and Linear Algebra the core courses for each major, and these two courses are required for all majors. We expect grades in these two core courses to be highly correlated with and highly impactful on a student's overall mathematics GPA, more so than non-core courses. We compare the two core courses and the other non-core courses within a major using the statistical tools outlined in the previous section. We highlight those departmental non-core courses that have high $\beta_i$ and $\rho_H$ values relative to departmental core courses and interpret this apparent discrepancy.

3.1. CORE COURSE ANALYSIS FOR PURE MATHEMATICS FRESHMEN

We now analyze core and non-core courses for those students majoring in pure mathematics who were admitted to UCLA as freshmen.
We found that our course analysis was the most interesting for these students because many non-core courses were both highly correlated to and highly impactful of these students' overall mathematics GPA, more so than the two departmental core courses. We carefully discuss the findings for this major below. We also provide some possible explanations as to why many non-core courses exhibited higher correlation and impact on overall mathematics GPA than core courses did.

First, we study $\rho_H$ of courses when considering pure mathematics students. We display the $\rho_H$-values in Figure 4. In this bar plot, we show the nine courses with the greatest $\rho_H$. We see that the two core courses Linear Algebra (Part I) and Real Analysis (Part I) are in this list, but are not the top ranked. Surprisingly, the top ranked courses are Complex Analysis, Topology, and Linear/Nonlinear Differential Equations. While every pure math student is required to take Complex Analysis, they are not required to take it early in their major as it is not required for later coursework. Topology and Linear/Nonlinear Differential Equations are not required at all.

While core courses were highly correlated with a student's overall mathematics GPA, we were surprised that Complex Analysis scored highest. Unlike most other courses in the pure mathematics major, Complex Analysis has no upper-division prerequisite courses. Indeed, until recently, it was also required for Electrical Engineering students who generally do not take the two core mathematics courses. Moreover, no other upper division mathematics course at UCLA requires Complex Analysis. On the surface, we expect that Complex Analysis could be taken in a student's last quarter without impacting their performance in other courses. As such, we also expect that such a course would have lower correlation to overall mathematics GPA.

Next, we study the lasso coefficients $\beta_i$ and display the results in Table 4.
Here, the course with the highest $\beta_i$ value is again Complex Analysis, followed by Probability Theory (Part I), Honors Abstract Algebra (Part I), and Differential Equations. Linear Algebra and Real Analysis, the core courses, both have zero as their coefficients, suggesting that non-core courses have greater impact on overall mathematics GPA. More formally, those courses with non-zero coefficients are the most important inputs for a linear model of overall mathematics GPA. We are surprised to find that Complex Analysis ranks as the top course under both the $\rho_H$ and $\beta_i$ metrics because rank under one metric does not determine rank under the other. That is, a course can have a high $\rho_H$ value, being highly correlated with overall mathematics GPA, without having a high $\beta_i$ value, which would indicate that it is highly impactful on overall mathematics GPA. The fact that Complex Analysis ranks as the top course under both metrics therefore says that it is more correlated with and more impactful on overall mathematics GPA than any other mathematics course for our pure mathematics freshman admit group.

Table 4: Summary statistics for Pure Mathematics freshman admits. Complex Analysis is the top ranked course under both correlation and regression metrics.

Course                          $\rho_H$ Rank   $\rho_H$   $\beta_i$ Rank   $\beta_i$
Complex Analysis                1               1.039      1                0.3774
Topology                        2               1.038      -                0
Linear Algebra                  5               0.951      -                0
Real Analysis (Part I)          6               0.9507     -                0
Probability (Part I)            10              0.8224     2                0.00824
Honors Abs. Algebra (Part I)    11              0.811      3                0.01919
Diff. Eqs. (Lower Div.)         14              0.8012     4                0.001573

To understand why Complex Analysis was the course with the highest $\rho_H$ and $\beta_i$ values, we looked for other explanatory statistics. We found that regular faculty teach Complex Analysis more frequently than temporary faculty do. We believe this is one possible factor in explaining the high $\rho_H$ and $\beta_i$ values for Complex Analysis. Regular faculty have more teaching experience and are better able to assess students in their course.
As such, Complex Analysis grades correlate more strongly with overall mathematics GPA. Whether students are getting more adept training from regular faculty for the rest of their math major is beyond the scope of this work. Moreover, in this data set, we cannot determine whether a regular or temporary faculty member assigned a particular student's grade, and so we do not comment on this further.

Having selected some courses from the lasso model, we can now generate a prediction function to estimate overall mathematics GPA for pure mathematics freshman admit students. We choose to run ridge regression using Linear Algebra (Part I), Complex Analysis, and Probability Theory (Part I). We choose the latter two courses because they are assigned non-zero lasso coefficients as seen in Table 4. We choose Linear Algebra (Part I) because it is a departmental core course and has a relatively high correlation rank in Table 4. From our original population of pure mathematics students who have taken at least one upper division course, we further restrict to the students who have taken all three courses. The results of our ridge regression are as follows:

$$\hat{g}^{GPA} = 0.74 + 0.29\, g^{132} + 0.27\, g^{115A} + 0.18\, g^{170A}.$$

We obtain an $R_{\mathrm{sq}}$ of 87%. Although we use grades in the above three courses to estimate overall mathematics GPA, in general students take far more courses. In this sample alone, the average number of courses taken is 16 with a standard deviation of 4.11. Taking three standard deviations below the mean of 16 ($16 - 3 \times 4.11 \approx 3.67$), we still arrive at a number higher than 3 courses, indicating that it is unlikely to find students whose overall mathematics GPAs are determined only by the three courses used in the model. In Figure 5 we can see a scatter plot of the results.

Figure 4: Bar plot of courses of pure mathematics freshman student admits. Bars colored grey are core courses as defined by the mathematics department.
Courses marked with an asterisk (*) are required courses.

Figure 5: Scatter plot of grade values in Complex Analysis, Linear Algebra, and Probability Theory (Part I) plotted against overall GPA with ridge prediction function.

3.2. SURVEY OF OTHER MATHEMATICS MAJORS

While we have focused primarily on the Pure Mathematics freshman admit group for our analyses, we have also used the outlined methods on two other large groups of mathematics majors. Namely, we have obtained results for the Math/Econ freshman admit group and the Applied Mathematics freshman admit group. We display the results for these groups in Table 5 and Table 6.

Table 5: Summary statistics for Math/Econ freshman admits.

Course                        $\rho_H$ Rank   $\rho_H$    $\beta_i$ Rank   $\beta_i$
Abs. Algebra (Part I)         1               1.08271     -                0
Single Var. Calc. (Part I)    2               1.02221     -                0
Multivar. Calc. (Part II)     3               1.01405     3                0.68589
Num. Analysis (Part II)       4               1.00849     -                0
Diff. Eq. (Low. Div.)         5               0.986724    7                0.010956
Linear Algebra (Part I)       6               0.94111     -                0
Real Analysis (Part I)        13              0.9101      1                0.08471
Optimization                  12              0.916562    2                0.081447

As we can observe from Table 5, within the Math/Econ freshman group there are two surprising results. First, the course Abstract Algebra (Part I), which is not required for Math/Econ students, has a high $\rho_H$ value. Second, Real Analysis (Part I), which is a core course, has a high $\beta_i$ value. The high impact of the grade in Real Analysis (Part I) is surprising because Math/Econ students are required to take few theoretical courses. Likewise, the high correlation between the grade in Abstract Algebra and overall mathematics GPA is surprising because the course is both theoretical and not required within the major. Although theoretical content may give students a well-rounded foundation for their future coursework, most courses within the Math/Econ curriculum tend to be applied.
As we can observe from Table 5 and Table 6, Complex Analysis does not have highly ranked $\rho_H$ and $\beta_i$ values in either the Math/Econ or the Applied Mathematics freshman group. This contrasts with the results found in Section 3.1. Another result to highlight is that in both the Applied Math freshman group and the Math/Econ freshman group, Linear Algebra is not selected as an impactful course by lasso regression. Furthermore, in the Applied Mathematics freshman group, Real Analysis (Part I) is also not selected as an impactful course through lasso regression. Although both courses are considered core by the mathematics department, other courses, such as Probability Theory (Part II) and Mathematical Modeling, are more correlated with and impactful on overall mathematics GPA.

Table 6: Summary statistics for Applied Mathematics freshman admits.

Course                        $\rho_H$ Rank   $\rho_H$   $\beta_i$ Rank   $\beta_i$
Probability (Part II)         1               1.0759     5                0.02965
Linear/Nonlinear Diff. Eq.    2               1.0714     -                0
Combinatorics                 3               1.0616     -                0
Math. Modeling                4               1.055      1                0.08497
Optimization                  5               1.0506     -                0
Num. Analysis (Part II)       11              0.9966     2                0.06327
Complex Analysis              9               1.01803    3                0.04523
Real Analysis (Part I)        13              0.98749    -                0
Linear Algebra (Part I)       15              0.95237    -                0

4. SUMMARY AND INTERPRETATION OF RESULTS

In this paper, we developed methods to analyze the core courses selected by the mathematics department. We identified statistical attributes we expect the core courses to have. The first attribute was a correlation metric, denoted by $\rho_H$, which measured the correlation between a course grade and overall mathematics GPA. The second attribute was a metric of impact, denoted by $\beta_i$, which measured the impact a course grade has on overall mathematics GPA. We tested our methods on various mathematics majors at UCLA to validate the selection of core courses made by the mathematics department.
The most surprising results came from the pure mathematics freshman group. We summarize our results for pure mathematics students in Table 4. In this table, one can see that Complex Analysis appears in the results for both the mixed correlation metric and lasso regression. This suggests that the grade in Complex Analysis is both highly correlated with and highly impactful on one’s overall mathematics GPA. Although this result is unique in that a single course is both highly correlated with and highly impactful on overall mathematics GPA within this student group, it may also suggest a weakness in our methods, as no other student group had a single course that ranked highest by both measures.

We can interpret the results from the pure mathematics freshmen by looking at the instructors throughout the 15-year period. We see that Complex Analysis is more frequently taught by regular faculty than many other upper division courses at UCLA. In particular, about 52% of the professors who taught Complex Analysis were regular faculty, compared with 46% for Real Analysis and 39% for Linear Algebra. This may help explain the high correlation attributed to Complex Analysis. For instance, regular faculty may tend to give grades more indicative of a student’s overall performance. Additionally, regular faculty may teach a course more consistently than temporary faculty, so that the grade in the course is a more accurate reflection of the student’s performance. If this is the case, the mathematics department may benefit from assigning more regular faculty to courses which are seen as foundational.

Table 7: Sample of data provided by the department of mathematics.

ID   Term   Major Code   Classification   Admit Class   Subject   Course   Grade

4.1. FUTURE WORK

Our core course analysis could be expanded upon with the inclusion of more data and by refining the set of students we consider.
We purposefully selected versatile models to illustrate this core course analysis and how it could be used to better understand the mathematics curriculum. Going forward, we will utilize more directed and specific statistical models, derive the results from these models, and compare them to the results presented in this paper. We would also like to combine the lasso and ridge models to incorporate the sparsity of the ℓ_1 norm with the metrics coming from the ℓ_2 norm. In addition, we can expand our regression models to include explanatory variables such as average course load, whether or not the student switched majors, and how long a student took to declare a major. The goal of including such variables is to introduce more sources of variation in overall GPA into our model than just grades. We can analyze course rankings under the β_i values from this more comprehensive regression model and compare the associated rankings to those outlined in this paper. We can then investigate any discrepancies between the two rankings and provide the findings to the department, which can act accordingly.

A. STUDENT DATA

In this appendix, we briefly discuss the data that we investigated in this paper. As mentioned earlier, each row represents a course a student took, the student's grade in that course, and other pertinent information. The Classification column in Table 7 indicates the student's academic standing: freshman (UFR), sophomore (USO), junior (UJR), or senior (USR). The Admit Class is the academic standing when a student first entered UCLA. A first-year non-transfer student may enter as a USO or a UFR based on the number of college-level credits earned while in high school. The grade values range from 0.0 to 4.3, where 4.3 indicates an A+ grade. Under the umbrella of the mathematics department there are two possible subject headings: MATH and COMPT, the latter representing the computer-based courses in the program for computation.
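As an illustration of the record layout described above, the sketch below represents one row of Table 7 and computes a mean grade per student for a chosen admit class. The field names, the sample values, and the use of an unweighted mean are assumptions made for this example; in particular, the actual GPA computation may weight courses by units, which this sketch does not do.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CourseRecord:
    """One row of the data set; field names are hypothetical renderings of Table 7."""
    student_id: int
    term: str
    major_code: str
    classification: str  # standing when the course was taken: UFR, USO, UJR, or USR
    admit_class: str     # standing when the student first entered
    subject: str         # MATH or COMPT
    course: str
    grade: float         # 0.0 to 4.3, where 4.3 is an A+


def mean_grade_by_student(records, admit_class="UFR"):
    """Return {student_id: unweighted mean grade} for students in one admit class."""
    totals = defaultdict(lambda: [0.0, 0])  # student_id -> [grade sum, course count]
    for record in records:
        if record.admit_class != admit_class:
            continue
        if not 0.0 <= record.grade <= 4.3:
            raise ValueError(f"grade out of range: {record.grade}")
        totals[record.student_id][0] += record.grade
        totals[record.student_id][1] += 1
    return {sid: total / count for sid, (total, count) in totals.items()}


# Hypothetical rows; the term, major, and course codes are placeholders, not real codes.
rows = [
    CourseRecord(1, "T1", "M1", "UFR", "UFR", "MATH", "C1", 4.0),
    CourseRecord(1, "T2", "M1", "USO", "UFR", "MATH", "C2", 3.0),
    CourseRecord(2, "T1", "M1", "UJR", "UJR", "MATH", "C1", 3.7),  # not a freshman admit
]

gpa = mean_grade_by_student(rows)
print(gpa)  # {1: 3.5}
```

Filtering on the admit class rather than the classification keeps a student in the same group as they advance from UFR to USR, which matches the grouping by freshman admits used in the tables.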
The Course codes can be viewed at the UCLA Department of Mathematics. The Major Codes and corresponding major names can be viewed at the UCLA Registrar.

B. ACKNOWLEDGEMENTS

We would like to thank Dimitri Shlyakhtenko and Charles Z. Marshak for all the guidance they have provided us during this project and the writing of this paper. We thank Dimitri Shlyakhtenko and Andrea Bertozzi for their help with administrative issues related to use of this dataset. We also want to thank Mihai Cucuringu for initially helping us parse through the data and M. Puck Rombach for additional oversight and helpful discussions. Finally, we would like to thank our REU group members Dillon Montag, Milica Hadži-Tanović, and Juan Carlos Apitz. Ritvik Kharkar was supported by NSF grant DMS-1045536 and Jessica Tran was supported by the UC LEADS program.

REFERENCES

(1) Ryan Baker and Kalina Yacef. The state of educational data mining in 2009: A review and future visions. JEDM-Journal of Educational Data Mining, 1(1):3–17, 2009.
(2) Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1–122, 2011.
(3) Brent Bridgeman, Laura McCamley-Jenkins, and Nancy Ervin. Predictions of freshman grade-point average from the revised and recentered SAT I: Reasoning test. ETS Research Report Series, 2000(1):i–16, 2000.
(4) Ko-Kang Chu, Maiga Chang, and Yen-Teh Hsia. Designing a course recommendation system on web based on the students’ course selection records. In World conference on educational multimedia, hypermedia and telecommunications, volume 2003, pages 14–21, 2003.
(5) Augustinos I. Dimitras, Stelios H. Zanakis, and Constantin Zopounidis. A survey of business failures with an emphasis on prediction methods and industrial applications. European Journal of Operational Research, 90(3):487–513, 1996.
(6) Donald E. Farrar and Robert R. Glauber. Multicollinearity in regression analysis: The problem revisited. The Review of Economics and Statistics, 49(1):92–107, 1967.
(7) Thomas G. Edwards. Current reform efforts in mathematics education. ERIC/CSMEE digest. 1994.
(8) Mark W Ellis and Robert Q Berry III. The paradigm shift in mathematics education: Explanations and implications of reforming conceptions of teaching and learning. Mathematics Educator, 15(1):7–17, 2005.
(9) Rosta Farzan and Peter Brusilovsky. Social navigation support in a course recommendation system. In Adaptive hypermedia and adaptive web-based systems, pages 91–100. Springer, 2006.
(10) National Science Foundation. Science and engineering indicators 2012.
(11) Chris Hans. Bayesian lasso regression. Biometrika, 96(4):835–845, 2009.
(12) David Harville. Extension of the Gauss-Markov theorem to include the estimation of random effects. The Annals of Statistics, 4(2):384–395, 1976.
(13) Sarah B. Henderson, Bernardo Beckerman, Michael Jerrett, and Michael Brauer. Application of land use regression to estimate long-term concentrations of traffic-related nitrogen oxides and fine particulate matter. Environmental Science & Technology, 41(7):2422–2428, 2007.
(14) Dennis E. Hinkle, William Wiersma, and Stephen G. Jurs. Applied statistics for the behavioral sciences. 2003.
(15) Xing Liu. Fitting stereotype logistic regression models for ordinal response variables in educational research. Journal of Modern Applied Statistical Methods, 13(2):31, 2014.
(16) Robert McGill, John W. Tukey, and Wayne A. Larsen. Variations of box plots. The American Statistician, 32(1):12–16, 1978.
(17) Yannick Meier, Jie Xu, Onur Atan, and Mihaela van der Schaar. Predicting grades. 2015.
(18) David Lindsay Roberts. E.H. Moore’s early twentieth-century program for reform in mathematics education. The American Mathematical Monthly, 108(8):689–696, 2001.
(19) Cristobal Romero and Sebastian Ventura. Educational data mining: A survey from 1995 to 2005. Expert Systems with Applications, 33(1):135–146, 2007.
(20) Cristóbal Romero and Sebastián Ventura. Educational data mining: a review of the state of the art. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 40(6):601–618, 2010.
(21) George Siemens and Ryan Baker. Learning analytics and educational data mining: towards communication and collaboration. In Proceedings of the 2nd international conference on learning analytics and knowledge, pages 252–254. ACM, 2012.
(22) C. Spearman. The proof and measurement of association between two things. The American Journal of Psychology, 15(1):72–101, 1904.
(23) Cem Tekin, Jonas Braun, and Mihaela van der Schaar. eTutor: Online learning for personalized education. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, pages 5545–5549. IEEE, 2015.
(24) Mark Hoover Thames and Deborah Loewenberg Ball. Making progress in US mathematics education: Lessons learned—past, present, and future. In Vital directions for mathematics education research, pages 15–44. Springer, 2013.
(25) Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), pages 267–288, 1996.
(26) Jack V. Tu. Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. Journal of Clinical Epidemiology, 49(11):1225–1231, 1996.
(27) Daniel S Weld, Eytan Adar, Lydia Chilton, Raphael Hoffmann, Eric Horvitz, Mitchell Koch, James Landay, Christopher H. Lin, and Mausam Mausam. Personalized online education—a crowdsourcing challenge. In Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence, pages 1–31, 2012.
(28) H Wu. The mathematics education reform: Why you should be concerned and what you can do. The American Mathematical Monthly, 104(10):946–954, 1997.
(29) H Wu. On the education of mathematics majors, volume 36. MSRI Publications, 1999.
(30) Jie Xu, Tianwei Xiang, and Mihaela van der Schaar. Personalized course sequence recommendations. arXiv preprint arXiv:1512.09176, 2015.
(31) Yonghong Jade Xu. An exploration of using data mining in educational research. Journal of Modern Applied Statistical Methods, 4(1):23, 2005.