Shiyu (Vivenne) Zeng
- B.S.Eng. (University of Victoria, 2023)
Topic
Automated Classification of Pull Requests in Scientific Software using LLMs
Department of Computer Science
Date & location
- Monday, March 30, 2026
- 3:00 P.M.
- Engineering Computer Science Building
- Room 555 and Virtual
Reviewers
Supervisory Committee
- Dr. Neil Ernst, Department of Computer Science, University of Victoria (Supervisor)
- Dr. Daniel German, Department of Computer Science, UVic (Member)
External Examiner
- Dr. Italo Santos, Department of Information and Computer Science, University of Hawaii
Chair of Oral Examination
- Dr. Richard Marcy, School of Public Administration, UVic
Abstract
Scientific software relies on contributions that combine domain-specific expertise with software engineering skills, but identifying which contributions require deep scientific knowledge remains a persistent challenge in project maintenance. We analyzed 1,074 pull requests from three established scientific repositories (Trilinos, Mantid, and AMReX) and developed a binary classification framework that distinguishes contributions requiring scientific knowledge from those focused on software concerns. Through iterative prompt refinement and human validation, our approach achieves near-human reliability, with DeepSeek-R1 reaching a Krippendorff’s α of 0.789. The analysis reveals distinct review characteristics: scientific contributions require 67% longer review times, involve 64% more unique reviewers, generate twice as many discussion comments, and undergo over 300% more revision cycles than software-focused changes. These patterns persist after controlling for pull request size and repository effects. A validation study on 75 PlasmaPy issues achieves 89.33% accuracy, suggesting that the framework applies to other contribution types. These findings indicate that LLM-based classification can effectively support automated triage in interdisciplinary software teams, enabling more efficient allocation of scarce domain expertise, and they empirically confirm that scientific contributions require different review processes.