Contribution: This paper shows how significantly computer science instructors can disagree when grading code-writing questions on introductory computer science (CS1) exams. Certain solutions are considered 'very good' by some instructors and 'very bad' by others. The study identifies four factors as possible contributors to such differences between graders: 1) disagreement on the role of syntax in CS1; 2) the absence of objective measures for the seriousness of logic errors; 3) variance in how rubrics are constructed; and 4) the use of subjective judgement while grading.

Background: Code-writing questions are widely used in CS1 exams. Understanding differences in how graders approach such questions can help the CS education community become more informed about how to design valid assessments for summative and formative purposes.

Research Questions: Do graders differ significantly in how they score student programs? What are the main factors to which differences between graders could be attributed?

Methodology: A number of CS1 instructors and teaching assistants were asked to score pieces of code resembling student solutions to CS1 exam questions, and to answer open-ended questions about how they approach grading CS1 exams. Responses were then analyzed qualitatively and quantitatively.

Findings: There is no consensus on how code-writing questions in CS1 exams should be graded. Differences of opinion on how many points to award can be very significant, even for very simple programs and between graders at the same institution. Some graders assign grades that do not necessarily agree with how many points they believe the student solution is worth.
Keywords: manual grading, partial credit, student assessment