As noted in the Introduction, the goals of RadGrad include improving degree planning, engagement, retention, research participation, diversity, and graduate success. This section describes the design of onboard analytics to support the gathering for evidence to understand whether or not we are achieving those goals. Onboard analytics refers to data that can be collected automatically through the operation of RadGrad and its use by users.
Note that onboard analytics can provide only part of the evidence necessary to understand our progress toward these goals. Other forms of evidence, such as evidence gathered through interviews and focus groups, is also necessary. This section does not cover those additional forms of evidence.
Our approach to onboard analytics involves the following design concepts:
Subject Population: Our subject population for each snapshot consists of all students enrolled in courses intended for majors during that semester. For example, this consists of 53 courses for the UH computer science department: two 100-level courses, six 200-level courses, eleven 300-level courses, and thirty four 400-level courses. Currently, approximately 150 students are enrolled in 100-level courses each semester, dropping to approximately 75 per semester in 400-level courses. The members of the subject population will change each semester as new students enter the program and old students either graduate or abandon the program. Note that every student in the subject population will be eligible for a RadGrad account.
Student Profile: In order to assess whether or not RadGrad improves retention and/or diversity among women and underrepresented minorities, we need to determine whether each student self-identifies as Female and/or Native Hawaiian. The Student Profile will provide this information. Our assumption is that profile data associated with a student does not change during their time in our program. (If it does, for example, if a student transitions from a male to a female during their time in our program, then that student will be excluded from our analytics because gender transition is a significant confounding variable.) We can access institutional records to determine this information, and/or allow students to specify this themselves. We will label female students as F+, and Native Hawaiian students as NH+ in the following discussion.
Semester Snapshot: We will build a set of data structures called "Semester Snapshots", which provides information about the subject population during a given Fall or Spring semester.
Grade Level: We will assign each student in a Semester Snapshot to one of the following Grade Levels based upon the number of semesters in which they have taken CS courses: Year1 (one or two semesters); Year2 (three or four semesters); Year3 (five or six semesters); Year4 (seven or eight semesters); or Year5 (nine or more semesters). For the UH computer science department, the number of students in each Grade Level generally parallel the course levels: approximately 150 Year1s, dropping to 75 Year4s and Year5s. Most students completing the degree program move through at least the first three Grade Levels.
RadGrad active: We will categorize each student in a semester snapshot as either an active user of RadGrad (RadGrad+) or not active (RadGrad-). To be classified as RadGrad+ during a semester snapshot, the student must have: (a) logged into RadGrad at least once, and (b) changed their Degree Experience Plan (for example, by updating their set of Interests, Career Goals, or Opportunities). All other students will be classified as RadGrad-, indicating no modification of their DEP during that semester. While we intend for all students in a department to receive RadGrad training and develop an initial Degree Experience Plan, any further use of DEP/RadGrad is still voluntary and optional.
CoP active: Each Opportunity will be evaluated by RadGrad administrators to determine if it provides participants with the three characteristics of a Community of Practice. If so, the opportunity will be tagged as a CoP. If a student participates in at least one Opportunity tagged as a CoP during a semester, then they are classified as CoP+, otherwise they are classified as CoP-. Preliminary analysis of the current RadGrad instance indicates that approximately half of the 70 Opportunities provide Communities of Practice.
DEP Evolution: RadGrad's instrumentation allows us to see how interests, career goals, planned activities, and ICE points all evolve over time.
The DEDM provides a straightforward way to measure RadGrad adoption: it is simply the percentage of RadGrad+ students. Depending upon our analytic needs, we can compute adoption on a per semester basis over the entire subject population, by GradeLevel, aggregated over one or more semesters, or based on a slice the population that is F+ and/or NH+ students.
While we expect adoption to be high based upon our pilot studies, we do not expect universal adoption. If the overall adoption rate per semester is approximately 75%, and the overall subject population per semester is at least 400, then there will be at least 100 RadGrad- students per semester that we can use as a proxy for a "control" group to compare against the RadGrad+ "treatment" group. (Section threats discusses limitations with this approach.)
According to McCormick, engagement measures the extent to which student are participating in educational practices that are strongly associated with high levels of learning and personal development. The North American National Survey of Student Engagement provides a generic instrument for assessing student engagement, though some researchers raise concerns with the application of these and other generic tools to computer science.
Based on the literature, we believe that ICE scores provide a valid, if not superior, alternative measure of engagement for undergraduates in computer science programs because they provide a fine-grained measure of verified student involvement with faculty-curated educational activities, both curricular and extracurricular. (See the Threats section for limitations with this approach.)
To measure the impact of DEP/RadGrad on engagement, we will compare the average number of ICE points per RadGrad+ student in a given GradeLevel over the first two semesters of the project period to the subsequent four semesters of the project period. We interpret ICE points during the first semesters as a form of "pre-test" measure of engagement, i.e. these scores measure student engagement before RadGrad became a part of the subject population's undergraduate experience. In the final four semesters, we will be able to measure engagement over time for students for whom RadGrad has been a part of their degree experience since their second semester. We exclude RadGrad- students in order to increase the internal validity of ICE as a measure of engagement. To compare student engagement before and after RadGrad, we can use a paired-samples t-test.
If DEP/RadGrad has a positive impact on engagement, then we predict that the average number of ICE points per RadGrad+ student per GradeLevel will be significantly higher in the final four semesters compared to the first two semesters. For example, we predict that the average number of ICE points earned by Year3 RadGrad+ students in Fall 2020 will be significantly more than the average number of ICE points earned by Year3 RadGrad+ students in Fall 2018. Note that impact will attenuate if more and more students achieve 100 points for each of the three categories, which DEP/RadGrad defines as the goal state for undergraduate preparation.
The literature suggests that increased disciplinary engagement leads to increased retention. So, if DEP/\allowbreak RadGrad increases engagement (as assessed above), then we can predict that DEP/RadGrad will make a positive impact on retention.
In this study, we measure retention as follows: for any given semester, a student is considered "retained" if they increase at least one GradeLevel or graduate within two semesters. We can use this definition to calculate an overall "retention rate" for the entire population on a semester-by-semester basis, which is the percentage of retained students in the subject population for that semester. We can also calculate retention rates for subpopulations, such as the retention rate for FirstYear students in Spr 18. Note that because we require data from the following two semesters to calculate retention for any given semester, we will only be able to calculate retention for the first four semesters of this six semester project.
Given our definition of retention rate, we predict that the retention rate for RadGrad+ students will be higher than for RadGrad- students in any given semester, regardless of GradeLevel. However, we also expect this difference to be greater at lower GradeLevels, since Year4 and Year5 students are close to graduation and highly motivated to finish, so there is less attrition at this level. To compare retention rates, we can employ an independent samples t-test.
It is possible that improved retention among RadGrad+ students will have a ripple effect onto RadGrad- students (i.e. a rising tide lifts all boats). To see if that phenomena occurs, we will compare retention rates among RadGrad- students for a given GradeLevel over time. A ripple effect will manifest itself by an improving retention rate over time: for example, the retention rate for Year1 RadGrad- students in Spring 2020 will be higher than the rate for Year1 RadGrad- students in Spring 2018.
The literature also suggests that involvement with Communities of Practice should increase retention. To assess this, we can test to see if CoP+ students exhibit higher retention rates than CoP- students.
If DEP/RadGrad has a positive impact on diversity, then if RadGrad is adopted by F+ and NH+, then we predict that the percentage of F+ and NH+ students in a semester snapshot at the end of the project period will be significantly higher than the percentage of F+ and NH+ students in a semester snapshot at the beginning of the project period. To examine this hypothesis, we can compare the two samples using an independent samples t-test.
Because diversity is related to engagement and retention, we can assess these measures for F+ and NH+ students by doing the analyses as described above, but restricting ourselves to these subsets of our subject population.
Based upon the literature, we expect to observe high levels of adoption, plus positive changes in engagement and retention for females and Native Hawaiians over the course of the project period.
One threat to this quasi-experimental design is extremely high or extremely low adoption. If adoption is so high that there are almost no RadGrad- students, then there is no group to compare to RadGrad+. Conversely, if adoption is so low that there are no RadGrad+ users, then the entire design falls apart. Based upon our pilot studies, in which students reacted quite positively to RadGrad, we are hopeful that adoption will be high but not total.
A second threat is the presence of a large number of engaged students who are RadGrad-. These are students who are academic high achievers and who participate in extracurricular disciplinary activities, but who do not want to represent their degree experience in RadGrad. Again, our pilot studies provide contrary evidence: the subset of students we approached who evidenced high engagement were among the most enthusiastic about using RadGrad.
A third and related threat is that our decision to use ICE scores to measure engagement means our engagement data cannot be compared to engagement data gathered using generic surveys such as NSSE. A supplemental outcome of this project will be a study in which we administer the NSSE survey to a random selection of students, then compare this data to their ICE scores to see if a correlation exists.
A fourth threat is self-selection. Our design does not randomly assign students to the RadGrad+ and RadGrad- or CoP+ and CoP- groups, which weakens the interpretation of our statistical tests. However, we believe our current design is best suited to obtaining initial evidence regarding the strengths and weaknesses of DEP/RadGrad, and can lay the groundwork for future studies for which stronger experimental controls might be appropriate.
Finally, the computer science department associated with this study is not static. New initiatives, teaching approaches, and faculty are likely to be introduced over the course of the project. Any of these could also have a significant positive impact on engagement, retention, and diversity (and, to be honest, we hope that they will.) As such changes appear, we will use interview data to gather evidence as to their potential impact, and they will be incorporated into the interpretation of the results from this project.