Performance and Learning in a Two-Stage Exam

Differential Performance Gains and Collaborative Dynamics in a Norwegian Geography Course

Authors

Gidske Andersen, Universitetet i Bergen

Keywords:

two-stage exam, collaborative assessment, group dynamics, motivation, geography

Abstract

Two-stage exams are an assessment format in which students first take a test individually and then retake it collaboratively in groups. The second stage connects testing and learning through social interaction, particularly peer dialogue, which provides immediate, internal feedback that supports self-regulation. Most published research has evaluated implementations in US and Canadian colleges, and no studies from Norway have been identified. This study evaluates the implementation of a two-stage exam in a geography course. Results are based on a comparison of grades between the two stages, complemented by questionnaire responses. Findings indicate that collaboration improves student performance, even to the extent that group results are better than the highest individual result. Low-performing students tend to benefit the most. Groups collaborated effectively, as indicated by consensus-finding, with the most pronounced effect in the cohort with greater prior collaborative experience. Questionnaire responses suggest that this prior experience fostered a safe and trusting environment for exchanging ideas during peer dialogue. Altogether, the findings suggest that collaboration was effective and that positive interdependence existed, promoting learning. For instructors considering two-stage exams, the findings suggest that the format offers pedagogical advantages and may be a timely alternative at a moment when AI challenges several other assessment forms that aim to align active learning and testing.

Published

2026-05-19 (updated on 2026-05-21)

How to Cite

Andersen, G. (2026). Performance and Learning in a Two-Stage Exam: Differential Performance Gains and Collaborative Dynamics in a Norwegian Geography Course. Norsk Tidsskrift for Scholarship of Teaching and Learning. Retrieved from https://boap.uib.no/index.php/norsotl/article/view/4843 (Original work published May 19, 2026)

Section

Inquiry Article