ProvTalk: Towards Interpretable Multi-level Provenance Analysis in Networking Functions Virtualization (NFV)

Azadeh Tabiban (CIISE, Concordia University, Montreal, QC, Canada), Heyang Zhao (CIISE, Concordia University, Montreal, QC, Canada), Yosr Jarraya (Ericsson Security Research, Ericsson Canada, Montreal, QC, Canada), Makan Pourzandi (Ericsson Security Research, Ericsson Canada, Montreal, QC, Canada), Mengyuan Zhang (Department of Computing, The Hong Kong Polytechnic University, China), Lingyu Wang (CIISE, Concordia University, Montreal, QC, Canada)

Network functions virtualization (NFV) enables agile deployment of network services on top of clouds. However, because NFV involves multiple levels of abstraction representing the same components, pinpointing the root cause of a security incident can be challenging. For instance, an incident may be detected at a different level from the one where its root-cause operations were conducted, with no obvious link between the two. Moreover, given the inherent complexity of NFV, existing provenance analysis techniques may produce results that are impractically large for human analysts to interpret. In this paper, we propose ProvTalk, a provenance analysis system that handles the unique multi-level nature of NFV and helps analysts identify the root cause of security incidents. Specifically, we first define a multi-level provenance model to capture the dependencies between NFV levels. Next, we improve interpretability through three novel techniques: multi-level pruning, mining-based aggregation, and rule-based natural language translation. We implement ProvTalk on a Tacker-OpenStack NFV platform and validate its effectiveness using real-world security incidents. We demonstrate that ProvTalk captures management API calls issued to all NFV services and produces more interpretable results by significantly reducing the size of the provenance graphs (up to 3.6 times smaller than with the existing one-level pruning scheme, and two orders of magnitude smaller via the multi-level aggregation scheme). Our user study shows that ProvTalk facilitates the analysis task of real-world users by generating more interpretable results.