Liwei Zhang (Beijing University of Posts and Telecommunications), Linghui Li (Beijing University of Posts and Telecommunications), Xiaotian Si (Beijing University of Posts and Telecommunications), Ziduo Guo (Beijing University of Posts and Telecommunications), Xingwu Wang (Beijing University of Posts and Telecommunications), Kaiguo Yuan (Beijing University of Posts and Telecommunications), Bingyu Li (School of Cyber Science and Technology, Beihang University)
Federated learning enables decentralized model training without exposing raw data, making it a promising paradigm for privacy-preserving machine learning. However, it remains vulnerable to membership inference attacks (MIAs), where adversaries infer whether a specific data point is included in the training set, posing serious privacy risks and compromising data locality. Existing defenses against MIAs suffer from significant limitations: some incur substantial performance degradation, while others fail to provide protection against both passive and active attack vectors. To address these challenges, in this paper, we propose a unified defense framework that simultaneously mitigates both passive and active MIAs in federated learning, while preserving the utility of the target model. First, we incorporate a modified entropy regularization during teacher model training to enhance uncertainty on member data, offering stronger resistance to inference attacks than standard regularization. Second, we utilize a Conditional Variational Autoencoder (CVAE) to generate class-conditional synthetic data for supervised student training, which avoids direct exposure of sensitive data and provides better utility than unlabeled alternatives. Finally, we design a contribution-aware aggregation strategy that adjusts the influence of local models based on their utility, mitigating the impact of malicious clients during model aggregation. Experimental results on four benchmark datasets show that the proposed method significantly reduces the success rate of various membership inference attacks, outperforming existing state-of-the-art defenses. Moreover, it consistently maintains high model accuracy, demonstrating its practicality for real-world federated learning deployments.