Fenghao Dong (CMU)
Network packet traces are critical for security tasks, including longitudinal traffic analysis, system testing, and future workload forecasting. However, storing these traces over extended periods is costly and subject to compliance constraints. Deep Generative Compression (DGC) offers a solution by generating inexact but structurally accurate synthetic traces that preserve essential features without storing the full sensitive data. This paper examines key research questions on the feasibility, cost-competitiveness, and scalability of DGC for large-scale, real-world network environments. We investigate the types of applications that benefit from DGC and design a framework that operates reliably for them. Our initial evaluation indicates that DGC can be an alternative to standard storage techniques (such as gzip or sampling) while meeting regulatory needs and resource limits. We further discuss open challenges and future directions, such as improving efficiency in streaming operations, optimizing model scalability, and addressing privacy risks in this setting.