Dongliang Mu (Huazhong University of Science and Technology), Yuhang Wu (Pennsylvania State University), Yueqi Chen (Pennsylvania State University), Zhenpeng Lin (Pennsylvania State University), Chensheng Yu (George Washington University), Xinyu Xing (Pennsylvania State University), Gang Wang (University of Illinois at Urbana-Champaign)

In the past three years, the continuous fuzzing projects Syzkaller and Syzbot have achieved great success in detecting kernel vulnerabilities, finding more kernel bugs than those found in the past 20 years. However, a side effect of continuous fuzzing is that it generates an excessive number of
crash reports, many of which are “duplicated” reports caused by the same bug. While Syzbot uses a simple heuristic to group (deduplicate) reports, we find that it is often inaccurate. In this
paper, we empirically analyze the duplicated kernel bug reports to understand: (1) the prevalence of duplication; (2) the potential costs introduced by duplication; and (3) the key causes behind the duplication problem. We collected all of the fixed kernel bugs from September 2017 to November 2020, including 3.24 million crash reports grouped by Syzbot under 2,526 bug reports (identified by unique bug titles). We found the bug reports indeed had duplication: 47.1% of the 2,526 bug reports are duplicated with one or more other reports. By analyzing the metadata of these reports, we found undetected duplication introduced extra costs in terms of time and developer efforts. Then we organized Linux kernel experts to analyze a sample of duplicated bugs (375 bug reports, unique 120 bugs) and identified 6 key contributing factors to the duplication. Based on these empirical findings, we proposed and prototyped actionable strategies for bug deduplication. After confirming their effectiveness using a ground-truth dataset, we further applied our methods and identified previously unknown duplication cases among open bugs.

View More Papers

Speeding Dumbo: Pushing Asynchronous BFT Closer to Practice

Bingyong Guo (Institute of Software, Chinese Academy of Sciences), Yuan Lu (Institute of Software Chinese Academy of Sciences), Zhenliang Lu (The University of Sydney), Qiang Tang (The University of Sydney), jing xu (Institute of Software, Chinese Academy of Sciences), Zhenfeng Zhang (Institute of Software, Chinese Academy of Sciences)

Read More

Demo #15: Remote Adversarial Attack on Automated Lane Centering

Yulong Cao (University of Michigan), Yanan Guo (University of Pittsburgh), Takami Sato (UC Irvine), Qi Alfred Chen (UC Irvine), Z. Morley Mao (University of Michigan) and Yueqiang Cheng (NIO)

Read More

Demo #14: In-Vehicle Communication Using Named Data Networking

Zachariah Threet (Tennessee Tech), Christos Papadopoulos (University of Memphis), Proyash Poddar (Florida International University), Alex Afanasyev (Florida International University), William Lambert (Tennessee Tech), Haley Burnell (Tennessee Tech), Sheikh Ghafoor (Tennessee Tech) and Susmit Shannigrahi (Tennessee Tech)

Read More

Hazard Integrated: Understanding Security Risks in App Extensions to...

Mingming Zha (Indiana University Bloomington), Jice Wang (National Computer Network Intrusion Protection Center, University of Chinese Academy of Sciences), Yuhong Nan (Sun Yat-sen University), Xiaofeng Wang (Indiana Unversity Bloomington), Yuqing Zhang (National Computer Network Intrusion Protection Center, University of Chinese Academy of Sciences), Zelin Yang (National Computer Network Intrusion Protection Center, University of Chinese Academy…

Read More