Miaoqian Lin (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Kai Chen (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Yi Yang (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Jinghua Liu (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China)

Modern software often provides diverse APIs to facilitate development. Certain APIs, when used, can affect variables and require post-handling, such as error checks and resource releases. Developers should adhere to their usage specifications when using these APIs. Failure to do so can cause serious security threats, such as memory corruption and system crashes. Detecting such misuse depends on comprehensive API specifications, as violations of these specifications indicate API misuse. Previous studies have proposed extracting API specifications from various artifacts, including API documentation, usage patterns, and bug patches. However, these artifacts are frequently incomplete or unavailable for many APIs. As a result, the lack of specifications for uncovered APIs causes many false negatives in bug detection.

In this paper, we introduce the idea of API Specification Propagation, which suggests that API specifications propagate through hierarchical API call chains. In particular, modern software often adopts a hierarchical API design, where high-level APIs build on low-level ones. When high-level APIs wrap low-level ones, they may inherit the corresponding specifications. Based on this idea, we present APISpecGen, which uses known specifications as seeds and performs bidirectional propagation analysis to generate specifications for new APIs. Specifically, given the seed specifications, APISpecGen infers which APIs the specifications might propagate to or originate from. To further generate specifications for the inferred APIs, APISpecGen combines API usage and validates them using data-flow analysis based on the seed specifications. Besides, APISpecGen iteratively uses the generated specifications as new seeds to cover more APIs. For efficient and accurate analysis, APISpecGen focuses only on code relevant to the specifications, ignoring irrelevant semantics. We implemented APISpecGen and evaluated it for specification generation and API misuse detection. With 6 specifications as seeds, APISpecGen generated 7332 specifications. Most of the generated specifications could not be covered by state-of-the-art work due to the quality of their sources. With the generated specifications, APISpecGen detected 186 new bugs in the Linux kernel, 113 of them have been confirmed by the developers, with 8 CVEs assigned.

View More Papers

Oreo: Protecting ASLR Against Microarchitectural Attacks

Shixin Song (Massachusetts Institute of Technology), Joseph Zhang (Massachusetts Institute of Technology), Mengjia Yan (Massachusetts Institute of Technology)

Read More

Mysticeti: Reaching the Latency Limits with Uncertified DAGs

Kushal Babel (Cornell Tech & IC3), Andrey Chursin (Mysten Labs), George Danezis (Mysten Labs & University College London (UCL)), Anastasios Kichidis (Mysten Labs), Lefteris Kokoris-Kogias (Mysten Labs & IST Austria), Arun Koshy (Mysten Labs), Alberto Sonnino (Mysten Labs & University College London (UCL)), Mingwei Tian (Mysten Labs)

Read More

PropertyGPT: LLM-driven Formal Verification of Smart Contracts through Retrieval-Augmented...

Ye Liu (Singapore Management University), Yue Xue (MetaTrust Labs), Daoyuan Wu (The Hong Kong University of Science and Technology), Yuqiang Sun (Nanyang Technological University), Yi Li (Nanyang Technological University), Miaolei Shi (MetaTrust Labs), Yang Liu (Nanyang Technological University)

Read More

No Source Code? No Problem! Twenty Years of Research...

Jack W. Davidson, Professor of Computer Science in the School of Engineering and Applied Science, University of Virginia

Read More