ammmir 17 hours ago

Two months ago, I started exploring how LLMs can securely run arbitrary code. Since then, we've seen Manus and others build code inside sandboxes, and I believe there are some YC startups in this space, too! I wrote a blog post [1] about building a simplistic version of this using Jupyter Notebook, but since then I've built a fully open source sandboxing server with more ergonomic HTTP endpoints (MCP should be next, I guess?) and a half-decent UI for humans (see the demo video in the README).
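To make the "ergonomic HTTP endpoints" bit concrete, here's a rough sketch of what driving it from an agent could look like. Everything below (the base URL, routes, and request/response fields) is a hypothetical placeholder rather than the server's actual API; the README has the real endpoints.

    # Hypothetical client sketch: base URL, routes, and fields below are
    # placeholders, NOT the server's real API -- see the README for that.
    import requests

    BASE = "https://sandboxes.example.com"         # wherever the server runs
    HEADERS = {"Authorization": "Bearer <token>"}  # static token auth

    # Create a sandbox (hypothetical route)
    sb = requests.post(f"{BASE}/sandboxes", headers=HEADERS, json={}).json()

    # Run a command inside it (hypothetical route)
    out = requests.post(
        f"{BASE}/sandboxes/{sb['id']}/exec",
        headers=HEADERS,
        json={"cmd": ["python3", "-c", "print('hello from the sandbox')"]},
    ).json()

    # Fork it: an independent clone with all changes so far (hypothetical route)
    clone = requests.post(f"{BASE}/sandboxes/{sb['id']}/fork", headers=HEADERS).json()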

A novel concept that I haven't seen implemented properly yet, and which could be useful for AI coding agents, is that a sandbox can be forked at any point. Similar to how you can fork a PostgreSQL database, you can fork a sandbox, which creates an independent sandbox containing all of the changes made so far. Technically, I first tried to implement this with checkpoint/restore using CRIU, but ran into issues with nesting beyond 2 levels deep and with custom user namespaces for security. It was also difficult to get CRIU to work with Linux programs that use shared memory segments and other Unixy things, so I ended up switching to file system diffs and reflinks on XFS to get copy-on-write semantics.
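For the curious, here's roughly what the reflink trick looks like. This is a minimal sketch assuming each sandbox's root lives in a directory on an XFS filesystem with reflink support; the paths and function name are made up for the example, not the project's actual code.

    # Minimal sketch of a reflink-based fork -- directory layout and function
    # name are hypothetical, not the project's actual implementation.
    import subprocess

    def fork_sandbox_fs(src_root: str, dst_root: str) -> None:
        # cp --reflink=always asks the kernel to clone file extents (FICLONE),
        # so the copy is nearly instant and shares blocks until either side
        # writes. It fails loudly if the filesystem can't reflink (e.g. XFS
        # formatted without reflink=1, or copying across filesystems).
        subprocess.run(["cp", "-a", "--reflink=always", src_root, dst_root],
                       check=True)

    # e.g. fork_sandbox_fs("/sandboxes/parent/rootfs", "/sandboxes/child/rootfs")

The nice property is that a fork only costs metadata up front, and nesting forks deeper is just more reflink copies.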

Features:

* Automatic HTTPS with unique URL per sandbox (no need to deal with ingresses or exposing ports)

* Static token auth or GitHub app auth

* Built-in UI

* Multi-tenant ready: each user gets their own network

* List, download, and upload files into sandboxes

* Fork sandboxes to create arbitrary depths of clones

It's still in the early stages, but it should be usable. I'd love your feedback and ideas on where to take this :) Personally, I want to use this as a code execution backend for local AI agents.

[1] https://amirmalik.net/2025/03/07/code-sandboxes-for-llm-ai-a...