Graph-Based Modeling of Service Dependencies for Predicting Failure Propagation in Distributed Systems
Abstract
Distributed systems increasingly rely on microservice architectures whose runtime behavior is shaped by rapidly changing service-to-service dependencies. When a fault occurs, symptoms often spread non-locally across the dependency structure, obscuring the initiating component and delaying remediation. This paper presents a graph-based modeling approach that (i) constructs a dynamic service dependency graph from request traces and operational telemetry, (ii) learns a propagation-aware representation of services and interactions, and (iii) predicts the likelihood and extent of failure propagation under partial and noisy observations. The proposed design combines structural causality encoded in traces with temporal signals from service health indicators to produce actionable outputs: predicted blast radius, ranked impacted services, and attributed propagation paths. We further define evaluation protocols for propagation prediction, discuss deployment constraints (instrumentation, concept drift, and multi-tenancy), and outline a reproducible experimentation methodology aligned with modern AIOps practices.
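As a rough illustration of step (i) and the blast-radius output, the following minimal sketch builds a static dependency graph from caller/callee pairs and lists the services that transitively depend on a failed one. It is a deterministic reachability baseline, not the learned, propagation-aware model the paper proposes. The span format, the service names, and the functions `build_dependency_graph` and `blast_radius` are hypothetical, introduced here only for illustration.

```python
from collections import defaultdict, deque


def build_dependency_graph(spans):
    """Build a caller -> {callees} map from (caller, callee) pairs.

    Assumption: each trace span has already been reduced to a
    (caller, callee) pair; real traces carry far more detail.
    """
    graph = defaultdict(set)
    for caller, callee in spans:
        graph[caller].add(callee)
    return graph


def blast_radius(graph, failed):
    """Return services that transitively depend on `failed`.

    This is plain reverse reachability: every upstream caller of a
    failed service is treated as potentially impacted.
    """
    # Invert the edges so we can walk from a callee to its callers.
    reverse = defaultdict(set)
    for caller, callees in graph.items():
        for callee in callees:
            reverse[callee].add(caller)

    impacted, queue = set(), deque([failed])
    while queue:
        node = queue.popleft()
        for dependant in reverse[node]:
            if dependant != failed and dependant not in impacted:
                impacted.add(dependant)
                queue.append(dependant)
    return impacted


# Hypothetical spans for a small storefront system.
spans = [
    ("gateway", "checkout"),
    ("checkout", "payments"),
    ("checkout", "inventory"),
    ("gateway", "search"),
]
graph = build_dependency_graph(spans)
impacted = blast_radius(graph, "payments")
```

In this example a fault in `payments` marks `checkout` and `gateway` as potentially impacted, while `search` and `inventory` are not. A learned model would replace this all-or-nothing reachability with per-service propagation likelihoods that also account for temporal health signals.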
How to Cite This Article
Nilesh Mutyam (2024). Graph-Based Modeling of Service Dependencies for Predicting Failure Propagation in Distributed Systems. International Journal of Multidisciplinary Evolutionary Research (IJMER), 5(1), 113-116. DOI: https://doi.org/10.54660/IJMER.2024.5.1.113-116