Jie Lin (University of Central Florida), David Mohaisen (University of Central Florida)

Large Language Models (LLMs) have demonstrated strong potential in tasks such as code understanding and generation. This study evaluates several advanced LLMs, including LLaMA-2, CodeLLaMA, LLaMA-3, Mistral, Mixtral, Gemma, CodeGemma, Phi-2, Phi-3, and GPT-4, for vulnerability detection, primarily in Java, with additional tests in C/C++ to assess generalization. We move from basic detection on positive samples only to a more challenging task involving both positive and negative samples, and we evaluate the LLMs' ability to identify specific vulnerability types. We analyze runtime and detection accuracy in zero-shot and few-shot settings, using both custom and generic metrics. Key insights include the strong performance of models like Gemma and LLaMA-2 in identifying vulnerabilities, though this success varies, with some configurations performing no better than random guessing. Performance also fluctuates significantly across programming languages and learning modes (zero- vs. few-shot). We further investigate the impact of model parameters, quantization methods, context window (CW) sizes, and architectural choices on vulnerability detection. While a larger CW consistently enhances performance, the benefits from other parameters, such as quantization, are more limited. Overall, our findings underscore the potential of LLMs in automated vulnerability detection, the complex interplay of model parameters, and the current limitations across varied scenarios and configurations.
