Profile
My current role as a Research Computing Facilitator at Rochester Institute of Technology enables me to combine my passion for software engineering, security, research, and teaching to assist multi-disciplinary researchers in their quest to discover. Outside of work, I relax by reading (mostly non-fiction), building Lego (sets and custom designs), brewing beer (I’m not good at it yet), and studying the works of J.R.R. Tolkien.
Education
Advisor: Dr. Andrew Meneely
Committee: Dr. Daniel Krutz, Dr. Mehdi Mirakhorli, Dr. Emily Prud’hommeaux
Dissertation Title: Human Error Assessment in Software Engineering
Dissertation Abstract: Software engineers work under strict constraints, balancing a complex, multi-phase development process on top of user support and professional development. Despite their best efforts, software engineers experience human errors, which manifest as software defects. While some defects are simple bugs, others can be costly security vulnerabilities. Practices such as defect tracking and vulnerability disclosure help software engineers reflect on the outcomes of their human errors (i.e. software failures), and even the faults that led to those failures, but not the underlying human behaviors. While human error theory from psychology research has been studied and applied to medical, industrial, and aviation accidents, researchers are only beginning to systematically reflect on software engineers’ human errors. Some software engineering research has used human error theories from psychology to help developers identify and organize their human errors (mistakes) during requirements engineering activities, but developers need an improved and systematic way to reflect on their human errors during other phases of software development. The goal of this dissertation is to help software engineers confront and reflect on their human errors by creating a process to document, organize, and analyze human errors. To that end, our research comprises three phases: (1) systematization (i.e. identification and taxonomization) of software engineers’ human errors from literature and development artifacts into a Taxonomy of Human Errors in Software Engineering (T.H.E.S.E.), (2) evaluation and refinement of T.H.E.S.E. based on software engineers’ perceptions and natural language insights, and (3) creation of a human error informed micro post-mortem process and the Human Error Reflection Engine (H.E.R.E.), a proof-of-concept GitHub workflow facilitating human error reflection. In demonstrating the utility of T.H.E.S.E. and our micro post-mortem process, the software development community will be closer to inculcating the wisdom of historical developer human errors, enabling them to engineer higher quality and more secure software.
Dissertation Artifacts:
- Data:
- 88.6 Million Developer Comments from GitHub – A collection of software developer comments from GitHub issues, commits, and pull requests. We collected 88,640,237 developer comments from 17,378 repositories. In total, this dataset includes: 54,252,380 issue comments (from 13,458,208 issues), 979,642 commit comments (from 49,710,108 commits), and 33,408,215 pull request comments (from 12,680,373 pull requests).
- 1,237 Annotated Developer Apologies from GitHub – A collection of software developer comments from GitHub with automated apology annotations. This dataset also includes manual apology classifications from two annotators and resolutions for their disagreements.
- 200 Annotated Developer Human Errors from GitHub – A collection of software developer comments from GitHub with manual human error categorizations.
- 162 Human Error Descriptions – A collection of human error descriptions with T.H.E.S.E. categorizations and related discussion collected during our user study with software engineering students.
- Systematic Literature Review of Human Errors in Software Engineering – A systematic literature review of 68 research studies (identified from 284 total papers) which yielded 192 total human errors in software engineering.
- Tools:
- Automated Apology Classifier – A naïve keyword-based approach to automatically classifying apologies in natural language. This approach accounts for false positives and yields near-perfect recall (99%) and high F1 (87%).
- Taxonomy of Human Errors in Software Engineering (T.H.E.S.E.) – A taxonomy of 31 categories of human error in software engineering spanning slips, lapses, and mistakes.
- Human Error Informed Micro Post-Mortem Process – A human error reflection process that we designed to accompany T.H.E.S.E. This process makes it easy for software engineers to confront and reflect on their human errors.
- Human Error Reflection Engine (H.E.R.E.) – A semi-automated workflow that facilitates our human error informed micro post-mortem process on GitHub. This workflow serves as a proof-of-concept while also lowering the barrier to entry for software engineering teams who wish to adopt human error reflection with T.H.E.S.E.
- Code:
- Downloading Developer Comments & Apology Classification – Code used to collect software developer comments from GitHub and classify their apologies.
- Evaluating Sentence-BERT Models for Human Error Classification – Jupyter notebooks used to evaluate pretrained Sentence-BERT models for human error classification.
Notable Coursework: Regression Analysis, Nonparametric Statistics & Bootstrapping, Fundamentals of Computer Networking, Cyberinfrastructure Foundations, Neural Networks for Data Science, Fundamentals of Instructional Technology, Teaching Skills Workshop, Introduction to Geographic Information Systems
Concentration: Computational Linguistics
Notable Linguistics Coursework: Introduction to Language Science, Language & Linguistics, Evolving English Language, Psycholinguistics, Introduction to Natural Language Processing, Spoken Language Processing, Science & Analytics of Speech, Language & Culture, Language & Sexuality, Language Technology
Notable Software Engineering & Computer Science Coursework: Introduction to Computer Science Theory, Principles of Data Mining, Mathematical Models of Software, Engineering of Concurrent & Distributed Software Systems, Software Process & Project Management, Software Performance Engineering, Engineering of Software Subsystems, Personal Software Engineering, Engineering Secure Software, Discrete Mathematics for Computing, Linear Algebra, Applied Statistics
Research Publications
Presentations
Research Positions
Design, implementation, and analysis of research experiments in the areas of human factors in software engineering, notably applying human error theory, natural language processing, and machine learning to study software engineers’ behaviors, workplace interactions, and security posture.
Developed the concept of Geographic Information Capacity (GIC), a framework for measuring/analyzing the ability of communities (towns, cities, states, and countries) to understand, access, and utilize geographic information for disaster risk management, and identifying vulnerable communities for prioritizing disaster risk mitigation.
Applied natural language processing techniques to a dataset of 788,437 code reviews from the Chromium project to examine the discourse of software developers through analysis of inquisitiveness, sentiment analysis, politeness, formality, propositional density, uncertainty detection, and syntactic complexity.
Developed a set of distinct case study activities using genuine linguistic datasets to aid student learning and engagement in introductory linguistics classes. Enhanced the visualization capabilities of an existing web application, Linguine, that aided in the analysis of the case study data.
Adapted natural language processing techniques to a corpus of speech transcriptions collected from college-aged males with and without autism spectrum disorder. Examined the trajectories of linguistic development in autism through analysis of various syntactic-, semantic-, and discourse-based metrics.
Academic Positions
Serving as a coach for Software Engineering Senior Project teams.
Responsible for judging 23 K-12 student presentations/papers related to climate change.
Responsible for preparing coursework (slides, exams, quizzes, homework assignments, projects) and delivering lectures for two semesters of the Engineering Secure Software course. This course covers the principles and practices forming the foundation for developing secure software systems. Coverage ranges across the entire development lifecycle, including requirements, design, implementation, and testing. Emphasis is on practices and patterns that reduce or eliminate security breaches in software-intensive systems, and on testing systems to expose security weaknesses.
Responsible for grading homework and project assignments. Responsible for assisting during lectures.
Responsible for grading homework and project assignments. Responsible for assisting during lectures.
Responsible for grading homework and project assignments. Responsible for assisting during lectures.
Responsible for grading homework and project assignments. Responsible for assisting during lectures.
Technical Positions
Responsible for providing advanced computing support for RIT researchers. Assisting researchers with accessing Research Computing services, such as batch processing with Slurm, data storage on a Ceph cluster, software package management with Spack, and implementing high-performance computational workflows. Responsible for development of documentation and training resources for researchers. Responsible for planning and organizing professional and social events for researchers. Technologies include Spack, Slurm, Ceph, Ansible, Jekyll. Languages include Python, C, C++, Bash, HTML/CSS.
Responsible for providing advanced computing support for RIT researchers. Assisting researchers with accessing Research Computing services, such as batch processing with Slurm, data storage on a Ceph cluster, software package management with Spack, and implementing high-performance computational workflows. Responsible for development of documentation and training resources for researchers. Responsible for planning and organizing professional and social events for researchers. Technologies include Spack, Slurm, Ceph, Ansible, Jekyll. Languages include Python, C, C++, Bash, HTML/CSS.
Implementation of web browser history logging for Linux/Windows competition machines. Implementation of virtual machine backup scripts. Integration with Laforge, an in-house competition infrastructure management system. Technologies included Splunk, Veeam. Languages included Bash, PowerShell, Python.
Implementation of new features and improvements to existing features in PICS, the purchasing and inventory control system for KGCOE’s multidisciplinary senior design projects. Technologies included Active Directory, AngularJS, CLAWS, GitLab, MySQL, oVirt. Languages included HTML/CSS, JavaScript, PHP.
Implementation of improvements to Linguine, a pre-existing linguistics learning tool. Implementation of new linguistic analyses and visualizations. Technologies included MongoDB, NodeJS. Languages included HTML/CSS, JavaScript, Python.
Management of multiple linguistics websites. Design and implementation of Language Science Department website. Technologies included CLAWS, GitLab. Languages included HTML/CSS, JavaScript.
Implementation of ATLAS, a web application to facilitate management of IT service tickets within the Kate Gleason College of Engineering and self-service group management. Technologies included Active Directory, CLAWS, GitLab, MySQL. Languages included HTML/CSS, JavaScript, PHP.
Assisting with day-to-day server maintenance, user support, and other duties as assigned. Technologies included CLAWS, oVirt.
Management of a team of 5-10 student programmers. Design and implementation of autonomous and user-controlled robot code using a variety of hardware sensors, including potentiometers, speed controllers and ultrasonics. Languages included Java, LabVIEW.
Journal/Conference Reviewer Experience
Mentorship Experience
Training, advising, assigning projects, and validating student employees’ work in the areas of software package management (Spack), build systems (Setuptools, Poetry, CMake), and software development (Python, Bash, LDAP, Git).
Providing advice, reviewing and validating software development artifacts, and facilitating collaboration with external project sponsors.
Advising and reviewing graduate students’ work in the areas of software engineering, research methods, natural language processing, computer security, and technical writing.
Advising and assisting middle school students in the areas of software development, community service, and gracious professionalism.