Automation of error reporting processing based on stack trace analysis

The rapid growth of large-scale software systems has led to the adoption of automatic error reporting platforms collecting millions of crash reports from real users. Central to these reports is the stack trace, a record of function calls leading to failure, which serves as a crucial diagnostic resource. However, the sheer volume, diversity, and redundancy of reports create bottlenecks: developers are overwhelmed by duplicates and highly variable submissions from the same defect, impeding efficient issue resolution. Existing deduplication and triage solutions in industry and academia mainly rely on string-matching, information retrieval, or graph-based heuristics. While efficient, string and IR methods often miss semantic and contextual nuances; graph-based models lose detail about individual reports, reducing accuracy. These limitations cause missed linkages between related errors and fragmentation of bug databases. The lack of scalable algorithms, real-world benchmarks, and advanced learning methods further restricts current tools.

This dissertation advances automation of error report processing via stack trace analysis. It introduces (1) hybrid similarity metrics extending traditional techniques, (2) deep learning models for robust similarity estimation, (3) aggregation strategies leveraging group-level information, (4) scalable solutions for industrial use, (5) the first models for automated developer assignment in stack trace–centered triage, and (6) methods for interpreting and highlighting the most informative stack frames. The research is validated on multiple proprietary and open datasets, including new benchmarks released as part of this work. Together, these contributions provide a unified, reproducible foundation for scalable, accurate, and actionable error report deduplication, grouping, assignment, and tooling in real-world software engineering.

Metadata
Publishing Institution: IRC-Library, Information Resource Center der Constructor University
Granting Institution: Constructor Univ.
Author: Aleksandr Khvorov
Referee: Alexander Omelchenko, Dmitry Vetrov, Denis Stepanov
Advisor: Alexander Omelchenko
Persistent Identifier (URN): urn:nbn:de:gbv:579-opus-1013275
Document Type: PhD Thesis
Language: English
Date of Successful Oral Defense: 2025/06/25
Date of First Publication: 2025/11/05
Academic Department: Computer Science & Electrical Engineering
PhD Degree: Computer Science
Call No: 2025/14
