Martin Schwaighofer (Johannes Kepler University Linz), Martim Monis (INESC-ID and IST, University of Lisbon), Nuno Saavedra (INESC-ID and IST, University of Lisbon), Joao F. Ferreira (INESC-ID and Faculty of Engineering, University of Porto), Rene Mayrhofer (Johannes Kepler University Linz)
Implicit and floating dependencies, uncontrolled network access, and other impurities in build environments create opaque supply chains. Discovering the identity and chain of custody of every dependency that could have affected the output of a build process is an error-prone forensic and reverse-engineering task. Functional package management offers a principled alternative: hermetically isolated build steps, linked by hashes of their inputs and outputs, ensure that software depends only and exactly on the declared inputs listed in a reproducible build recipe. Generating build recipes that satisfy these additional constraints, however, is complex and time-consuming, which creates a barrier to adoption. We present Vibenix, an AI-powered assistant that automatically generates such recipes as Nix expressions. Vibenix employs an agentic architecture in which a large language model (LLM) iteratively refines a build recipe based on build outcomes, guided by deterministic rules and a structured feedback loop. We evaluated Vibenix on a dataset of 472 packaging tasks from the Nixpkgs repository. Vibenix successfully built 424 of the 472 tasks (89.8%). Manual validation of a subset of 48 packages showed that 45.8% of Vibenix-built packages are functionally correct. Vibenix’s post-build refinement process, which is based on the evaluator–optimizer pattern and uses a VM environment for runtime testing, yields an additional 37.5% increase in functionally correct packages, indicating that this refinement mechanism is a promising and important contribution to automated software packaging. Vibenix demonstrates how LLMs, despite being an unreliable building block, can be leveraged to generate rigorously defined, complete, and easy-to-audit dependency trees for real-world software projects.
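The iterative refinement described above can be illustrated with a minimal sketch. It is not Vibenix's actual implementation: the names `BuildResult`, `refine_recipe`, `fake_build`, and `fake_propose_fix` are hypothetical, the recipe is a plain string rather than a real Nix expression, and the LLM and the build sandbox are replaced by toy stand-ins so the loop can run on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BuildResult:
    """Outcome of one build attempt (hypothetical structure)."""
    success: bool
    log: str


def refine_recipe(
    initial: str,
    build: Callable[[str], BuildResult],
    propose_fix: Callable[[str, str], str],
    max_iterations: int = 5,
) -> Optional[str]:
    """Build, inspect the outcome, and ask for a revised recipe until the
    build succeeds or the iteration budget is exhausted."""
    recipe = initial
    for _ in range(max_iterations):
        result = build(recipe)
        if result.success:
            return recipe
        # In a real system this step would query an LLM with the build log.
        recipe = propose_fix(recipe, result.log)
    return None


# Toy stand-in for a hermetic build: it fails until cmake is declared.
def fake_build(recipe: str) -> BuildResult:
    if "cmake" in recipe:
        return BuildResult(True, "")
    return BuildResult(False, "error: cmake: command not found")


# Toy stand-in for the LLM: it reacts to one known error message.
def fake_propose_fix(recipe: str, log: str) -> str:
    if "cmake" in log:
        return recipe + " nativeBuildInputs=[cmake]"
    return recipe


final_recipe = refine_recipe("src=./.", fake_build, fake_propose_fix)
```

The bounded loop reflects one plausible design choice: because the model's suggestions are unreliable, the deterministic build result, not the model, decides when a recipe is accepted, and a fixed iteration budget prevents endless retries.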