Yi Yang (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Jinghua Liu (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Kai Chen (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Miaoqian Lin (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China)

Strictly following resource-management (RM) API constraints is the basis of secure resource management in software. To improve RM-API usage, researchers have found it effective to detect RM-API misuse in open-source software using RM-API constraints retrieved from documentation and code. However, current pattern-matching constraint retrieval methods have limitations: documentation-based methods miss many API constraints that are irregularly distributed or expressed with neutral sentiment; code-based methods report many false bugs caused by incorrect API usage, since not all high-frequency usages are correct.
Researchers have therefore proposed using Large Language Models (LLMs) for RM-API constraint retrieval, given their potential in text analysis and generation. However, using LLMs directly is limited by hallucinations. Without expertise, LLMs fabricate answers, leaving many RM APIs undiscovered; even with evidence, they generate incorrect answers, introducing incorrect RM-API constraints and false bugs.

In this paper, we propose ChatDetector, an LLM-empowered RM-API misuse detection solution that fully automates LLM-based documentation understanding to support RM-API constraint retrieval and RM-API misuse detection. To retrieve RM-API constraints correctly, ChatDetector is inspired by the ReAct framework, which is optimized based on Chain-of-Thought (CoT), and decomposes the complex task into allocation-API identification, RM-object (the object allocated/released by RM APIs) extraction, and RM-API pairing (RM APIs usually exist in pairs). It first uses LLMs to verify the semantics of allocation APIs based on the RM sentences retrieved from the API documentation.
Inspired by how LLMs perform under various prompting methods, ChatDetector adopts a two-dimensional prompting approach for cross-validation. At the same time, it confirms allocation APIs by checking for inconsistencies between the LLMs' output and their reasoning process, using an off-the-shelf Natural Language Processing (NLP) tool. To pair RM APIs accurately, ChatDetector decomposes the task again: it first identifies the RM-object type, then uses that type to pair the releasing APIs and construct the RM-API constraints for misuse detection. With hallucinations diminished, ChatDetector identifies 165 pairs of RM APIs with a precision of 98.21% compared with the state-of-the-art API detectors. Using the static detector CodeQL, we ethically reported to developers 115 security bugs in applications that integrate six popular libraries; these bugs may result in severe issues such as Denial-of-Service (DoS) and memory corruption. Compared with the end-to-end benchmark method, ChatDetector retrieves at least 47% more RM sentences and 80.85% more RM-API constraints. To the best of our knowledge, no prior work specifically uses LLMs for RM-API misuse detection; these encouraging results show that LLMs can help generate more constraints beyond expertise and can be used for bug detection. They also indicate that future research could shift from overcoming the bottlenecks of traditional NLP tools to creatively utilizing LLMs for security research.
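To make the notion of an RM-API constraint concrete, the sketch below checks a linear sequence of calls against a table of allocation/release pairs. It is only an illustration under stated assumptions: the pair table uses the familiar C pairs malloc/free and fopen/fclose rather than any pair reported in this work, the function name `find_misuses` and the trace format are hypothetical, and a flat trace stands in for the path-sensitive static analysis that a detector such as CodeQL performs.

```python
from typing import Dict, List, Tuple

# Illustrative allocation -> release pairs (assumed for this example,
# not taken from the paper's retrieved constraints).
RM_PAIRS: Dict[str, str] = {"malloc": "free", "fopen": "fclose"}


def find_misuses(trace: List[Tuple[str, str]],
                 pairs: Dict[str, str] = RM_PAIRS) -> List[str]:
    """Flag violations of paired RM-API constraints in a call trace.

    `trace` is a list of (api_name, object_name) tuples in call order.
    """
    release_apis = set(pairs.values())
    live: Dict[str, str] = {}  # object -> allocation API that produced it
    issues: List[str] = []

    for api, obj in trace:
        if api in pairs:
            # Allocation: the object is now live and owes a release.
            live[obj] = api
        elif api in release_apis:
            owner = live.pop(obj, None)
            if owner is None:
                issues.append(f"{api}({obj}): released without a live allocation")
            elif pairs[owner] != api:
                issues.append(f"{api}({obj}): wrong release for {owner}")

    # Anything still live at the end of the trace was never released.
    for obj, owner in live.items():
        issues.append(f"{owner}({obj}): never released with {pairs[owner]}")
    return issues
```

Calling `find_misuses([("malloc", "p"), ("fopen", "f"), ("free", "p"), ("free", "p")])` reports the second `free` on `p` and the missing `fclose` on `f`, the kinds of misuse that can lead to memory corruption or resource exhaustion.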
