Rei Yamagishi, Shinya Sasa, and Shota Fujii (Hitachi, Ltd.)

Code automatically generated by large language models is expected to be used in software development. A previous study verified the security of 21 types of code generated by ChatGPT and found that ChatGPT sometimes generates vulnerable code. However, although ChatGPT produces different output depending on the input language, the effect of the input language on the security of the generated code is not clear. Thus, there is concern that non-native English-speaking developers may generate insecure code or be forced to bear unnecessary burdens. To investigate the effect of language differences on code security, we instructed ChatGPT to generate code using English and Japanese prompts with the same content, and generated a total of 450 code samples under six different conditions. Our analysis showed that insecure code was generated in both English and Japanese, but in most cases whether the code was insecure did not depend on the input language. In addition, the results of validating the same content in different programming languages suggested that the security of the code tends to depend on the security and usability of the API provided by the output programming language.
