University of Victoria



Shiyu (Vivenne) Zeng

  • B.S.Eng. (University of Victoria, 2023)

Notice of the Final Oral Examination for the Degree of Master of Science

Topic

Automated Classification of Pull Requests in Scientific Software using LLMs

Department of Computer Science

Date & location

  • Monday, March 30, 2026

  • 3:00 P.M.

  • Engineering Computer Science Building

  • Room 555 and Virtual

Reviewers

Supervisory Committee

  • Dr. Neil Ernst, Department of Computer Science, UVic (Supervisor)

  • Dr. Daniel German, Department of Computer Science, UVic (Member) 

External Examiner

  • Dr. Italo Santos, Department of Information and Computer Science, University of Hawaii 

Chair of Oral Examination

  • Dr. Richard Marcy, School of Public Administration, UVic


Abstract

Scientific software relies on contributions that combine domain-specific expertise with software engineering skills, but identifying which contributions require deep scientific knowledge remains a persistent challenge in project maintenance. We analyzed 1,074 pull requests from three established scientific repositories (Trilinos, Mantid, and AMReX) and developed a binary classification framework that distinguishes contributions requiring scientific knowledge from those focused on software concerns. Our approach achieves near-human reliability, with DeepSeek-R1 reaching a Krippendorff's α of 0.789 through iterative prompt refinement and human validation. The analysis reveals distinct review characteristics for the two classes: scientific contributions require 67% longer review times, involve 64% more unique reviewers, generate twice as many discussion comments, and undergo over 300% more revision cycles than software-focused changes. These patterns persist after controlling for pull request size and repository effects. A validation study on 75 PlasmaPy issues achieves 89.33% accuracy, indicating that the framework generalizes to other contribution types. These findings establish that LLM-based classification can effectively support automated triage in interdisciplinary software teams, enabling more efficient allocation of scarce domain expertise, while empirically confirming that scientific contributions demand different review processes.
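The abstract reports agreement between the LLM and human annotators as a Krippendorff's α of 0.789. For readers unfamiliar with the metric, the sketch below shows how α is computed for the simplest case matching this setup: two coders, nominal (binary) labels, no missing data. This is a generic illustration of the standard formula, not the thesis's actual evaluation code; the label values are hypothetical.

```python
from collections import Counter

def krippendorff_alpha_nominal(coder_a, coder_b):
    """Krippendorff's alpha for two coders, nominal labels, complete data.

    alpha = 1 - D_o / D_e, where D_o is the observed disagreement and
    D_e is the disagreement expected by chance from the pooled labels.
    """
    assert len(coder_a) == len(coder_b)
    # Coincidence matrix: each unit contributes both ordered pairs (a, b) and (b, a).
    o = Counter()
    for a, b in zip(coder_a, coder_b):
        o[(a, b)] += 1
        o[(b, a)] += 1
    # Marginal totals per label value over all pairable values.
    n_v = Counter()
    for (v, _), c in o.items():
        n_v[v] += c
    n = sum(n_v.values())  # 2 * number of units
    d_o = sum(c for (v, w), c in o.items() if v != w) / n
    d_e = sum(n_v[v] * n_v[w] for v in n_v for w in n_v if v != w) / (n * (n - 1))
    return 1.0 if d_e == 0 else 1 - d_o / d_e
```

For example, labels ["sci", "sci", "sw", "sw"] versus ["sci", "sci", "sw", "sci"] (one disagreement in four units) yield α ≈ 0.533; identical label sequences yield α = 1.0. An α near 0.8, as reported, is conventionally read as acceptable agreement.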