I feel like these background agents still aren't doing what I want from a developer experience perspective. Running in an inaccessible environment that pushes random things to branches that I then have to checkout locally doesn't feel great.
AI coding should be tightly in the inner dev loop! PRs are a bad way to review and iterate on code. They are a last line of defense, not the primary way to develop.
Give me an isolated environment that is one click hooked up to Cursor/VSCode Remote SSH. It should be the default. I can't think of a single time that Claude or any other AI tool nailed the request on the first try (other than trivial things). I always need to touch it up or at least navigate around and validate it in my IDE.
Right, that is closer to what I was hoping this announcement would be. I really just want a (mobile/web) companion to whatever CLI environment I have Claude Code running in. That would perfectly fill in the exact niche missing in my local dev server VM setup I remote into with any combination of SSH, VS Code Remote, or via Web (VS Code Tunnel from vscode.dev and a ttyd remote CLI session in the browser).
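For reference, a minimal sketch of that kind of setup (the tunnel name, tmux session name, and port are arbitrary; newer ttyd versions need -W to allow input):

    # Browser-based VS Code: start a tunnel on the VM, then attach from vscode.dev
    code tunnel --name dev-vm

    # Browser-based CLI: serve a shared tmux session over HTTP with ttyd
    ttyd -W -p 7681 tmux new -A -s main

Running Claude Code inside that tmux session means every client, SSH or browser, attaches to the same live state.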
It would be great to be able to check in on Claude on a walk or something to make sure it hasn't gone off the rails, or send it a quick "LGTM" to keep it moving down a large PLAN.md file without being tethered to a keyboard and monitor. I can SSH from my phone, but the CLI ergonomics are ... not great with an on-screen keyboard, when all it really needs is a simple threaded chat UI.
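One piece of this is scriptable today: Claude Code supports lifecycle hooks, so a Stop hook can push a notification to your phone through something like ntfy.sh (the topic name below is a placeholder; subscribe to it in the ntfy mobile app):

    # Run from a Claude Code Stop hook to ping your phone when a session finishes
    curl -d "Claude stopped - check the session" ntfy.sh/my-claude-sessions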
I've seen a couple of GitHub projects, and "Happy Coder" on a Show HN, that seem in the ballpark of what I want, though I haven't gotten around to setting them up yet. A first-party integration would always be cool.
I tried Happy Coder for a bit. It seemed like exactly what I was missing, but about half the time session notifications weren't coming through, and the developers seem busy pushing the tool off in other directions rather than making the core functionality bulletproof, so I gave up on it. Unfortunate. Hopefully something else pops up or Anthropic bakes it into their own tooling.
I agree, and I also think the problem is deeper than that. It's about not being able to do most code testing and debugging remotely. You can't really test anything: it's an ephemeral container without any of your data, just your repo. You can't have the model do npm run dev and browse the webpage, click around, etc. You can't compile or run anything heavy, and you can't persist data across sessions/days.
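As a sketch of the gap, even plain Docker gives you the two missing pieces, a persistent volume and a reachable dev server (the image, port, volume name, and repo path here are placeholders):

    # State that survives across sessions/days
    docker volume create agent-workspace

    # Publish the dev server so the agent (or you) can actually browse it
    docker run -it -v agent-workspace:/workspace -p 3000:3000 \
      node:22 bash -c 'cd /workspace/repo && npm install && npm run dev'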
I like the idea of background agents running in the cloud, but it has to be a more persistent environment. It also has to run a GUI so it can develop web applications, and run the programs we are developing properly: with the GUI, clicking around, typing things, etc. Computer use is what we need. But that would probably be too expensive to serve to the masses with the current models.
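The GUI half doesn't need full computer use to start; a virtual display is enough for an agent to drive a real browser, as a rough sketch (display number, resolution, and port are placeholders, and the browser binary name varies by distro):

    # Headless X server so GUI programs can run without a monitor
    Xvfb :99 -screen 0 1920x1080x24 &
    export DISPLAY=:99

    # VNC into the virtual display to watch or take over
    x11vnc -display :99 -forever -nopw &

    # The app under development, plus a browser pointed at it
    npm run dev &
    chromium --no-sandbox http://localhost:3000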
Definitely sounds cool. But the problem hasn't even been solved locally yet: distributed microservices, third-party dependencies, async callbacks, reasonable test data, unsatisfiable validations, etc. Every company has its own hacked-together local testing thing that mostly doesn't work.
That said, maybe this is the turning point where these companies work toward solving it in earnest, since it's a key differentiator of their larger PLATFORM and not just a cost. Heck, if they get something like that working well, I'd pay for it even without the AI!
Edit: that could end up being really slick, too, if it was able to learn from your teammates and offer guidance. Like when you're checking some e2e UI flows and you need a test item with some specific detail: maybe it saw how your teammate changed the value, or which item they used or created, and can copy that for you. "Hey, it looks like you're trying to test this flow. Here's how Chen did it. Want me to guide you through that?" They can't really do that with just a CLI, so the web interface could be a game changer if they take full advantage of it.
What you're describing feels like the next major evolution and is likely years away (and exciting!).
I'm mainly aiming for a good experience with what we have today. Welding an AI agent onto my IDE turned out to be great. The next incremental step feels like being able to parallelize that: I want four concurrent IDEs with AI welded onto each.
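One way to approximate that today is one git worktree per agent, so parallel sessions don't stomp on each other's checkouts (paths and branch names here are arbitrary):

    # Four sibling checkouts of the same repo, one per agent/IDE window
    for i in 1 2 3 4; do
      git worktree add "../myrepo-agent-$i" -b "agent-$i"
    done

Each worktree then gets its own IDE window with its own agent running inside.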
Exactly. I want to go to sleep knowing I have an AI working on a computer developing my project, then wake up to the finished website/program, fully tested top to bottom: backend, frontend, UI, etc.
idk, we've (humans) gotten this far with them. I don't think they are the right tool for AI-generated code and coding agents, though; those circles are being forced into these squares. imho it's time for an AI-native git or something.
PRs work well for what they are. Ship off some changes you're strongly confident about and have another human who has a lot of context read through it and double check you. It's for when you think you've finished your inner loop.
AI is more akin to pair programming with another person sitting next to you. I don't want to ship a PR or even a branch off to someone sitting next to me. I want to discuss and type together in real time.
Agree, each agent creating a PR and then coordinating merges is a pain.
I'd like:
- an agent to consolidate simple non-conflicting PRs (sketch below)
- faster previews and CI tests (Render currently)
- detection of and suggested fixes for merge conflicts
Codex web doesn't update the PR, which is also something to change (maybe a setting), but for web code agents I'd like the PR, once opened, to stay open.
Also, PRs need an overhaul in general: I create lots of speculative agents and merge the ones whose solutions I like, which leads to lots of PRs.
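The consolidation item is already scriptable with the GitHub CLI; a minimal sketch, assuming auto-merge is enabled on the repo (owner/repo is a placeholder):

    # Queue auto-merge for every open PR GitHub reports as cleanly mergeable
    gh pr list --repo owner/repo --json number,mergeable \
      --jq '.[] | select(.mergeable == "MERGEABLE") | .number' |
    while read -r pr; do
      gh pr merge "$pr" --repo owner/repo --squash --auto
    done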
Thank you. Every time these agentic cloud tools come out, I wonder whether I'm using them wrong or misunderstanding them versus, say, the local Cursor development paradigm.
Plus they generate so much noise with all the extra commits and comments, which go to everyone in Slack and email rather than just me.
I just run the agent directly on separate testing/dev servers via Remote-SSH in VS Code, so I have an IDE to sanity-check stuff. Far simpler than local dev and other nonsense.
This is a great point. The inner/outer loop distinction is big. I think AI pushing PRs is kind of like pushing drafts to the public on social media: I don't want folks seeing PRs until I feel good about them. It adds a lot of noise and increases build costs, unless your CI/CD treats agent PRs differently, which I don't know of anyone doing.
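Treating them differently doesn't have to be fancy; a guard at the top of the CI script is a start (the actor name and the PR_IS_DRAFT variable are placeholders for whatever your bot and pipeline expose; GITHUB_ACTOR is set by GitHub Actions):

    # Skip the expensive suite for agent-authored draft PRs
    if [ "$GITHUB_ACTOR" = "claude[bot]" ] && [ "$PR_IS_DRAFT" = "true" ]; then
      echo "agent draft PR: lint only"
      npm run lint
      exit 0
    fi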
Not quite. This doesn't (yet) have an option where you can connect your local IDE to their remote containers to edit files directly. It's more of a fire-and-forget thing where you can eventually suck the resulting code down to your local machine using "claude --teleport ..." - but then it's not running in the cloud any more.
CEO at Ona (formerly Gitpod) here. Every ephemeral environment Ona creates can directly connect to your Desktop IDE for easy handoff. Our team goes from prompt -> iterating in conversation -> VS Code Web -> VS Code Desktop/Cursor depending on task complexity and quality of the agent output. We call this progressive engagement and have written about it here https://ona.com/docs/ona/best-practices#progressive-engageme...
Thanks, I'll give it a shot. I wish your site would show me what it actually looks like. It's a lot of words and fancy marketing images and I have no feel for the product. It leaves me unsure if I should invest my time.
I'd love to see a short animation of what it would actually look like to do the core flow. Prompt -> environment creation -> iterating -> popping open VSCode Web -> Popping open Cursor desktop.
Also, a lot of the links on that page you linked me to are broken:
* "manual edits and Ona Agents is very powerful."
* "Ona’s automations.yaml extends it with tasks and services"
* "devcontainer.json describes the tools"
I signed up and tried it with Cursor. It is very close, but still has a lot of rough edges that make it hard to switch:
* Once in Cursor I can't click on modified files or lines and have my IDE jump to it. Very hard to review changes.
* I closed the Ona tab and couldn't figure out how to get it back so I could prompt it again.
* I can't pin the Ona tab to the right like Cursor does.
* Is there a way to select lines and add them to context?
* Is there a way I can pick a model?
Yes, but my point is that often I don't want to. Sometimes there are changes I can make in seconds; I don't want to wait 15+ seconds for an AI that might do it wrong or do too much.
Also, it isn't always about editing. It's about seeing the surrounding code, navigating around, and ensuring the AI did the right thing in all the right places.
Huge waste of time. You are being sold a bill of goods whose only purpose is to make you a dumb dev. Like whoa, an LLM can use CDP!! Who cares. Can't wait till people start waking up to this grift. These things are making most people dumber and a few people richer, that's it.
Hey, Kanjun from Imbue here! This is exactly why we built Sculptor (https://imbue.com/sculptor), a desktop UI for Claude Code.
Each agent has its own isolated container. With Pairing Mode, you can sync the agent's code and git state directly into your local Cursor/any IDE so you can instantly validate its work. The sync is bidirectional so your local changes flow back to the agent in realtime.
Happy to answer any questions - I think you'll really like the tight feedback loop :)