
Not necessarily responding to you directly, but I find this take to be interesting, and I see it every time an article like this makes the rounds.

Starting back in 2022/2023:

- (~2022) It can auto-complete one line, but it can't write a full function.

- (~2023) Ok, it can write a full function, but it can't write a full feature.

- (~2024) Ok, it can write a full feature, but it can't write a simple application.

- (~2025) Ok, it can write a simple application, but it can't create a full application that is actually a valuable product.

- (~2025+) Ok, it can write a full application that is actually a valuable product, but it can't create a long-lived complex codebase for a product that is extensible and scalable over the long term.

It's pretty clear to me where this is going. The only question is how long it takes to get there.


2025-12-30 https://x.com/mitchellh/status/2006114026191769924

"Slop drives me crazy and it feels like 95+% of bug reports, but man, AI code analysis is getting really good. There are users out there reporting bugs that don't know ANYTHING about our stack, but are great AI drivers and producing some high quality issue reports.

This person (linked below) was experiencing Ghostty crashes and took it upon themselves to use AI to write a python script that can decode our crash files, match them up with our dsym files, and analyze the codebase for attempting to find the root cause, and extracted that into an Agent Skill.

They then came into Discord, warned us they don't know Zig at all, don't know macOS dev at all, don't know terminals at all, and that they used AI, but that they thought critically about the issues and believed they were real and asked if we'd accept them. I took a look at one, was impressed, and said send them all.

This fixed 4 real crashing cases that I was able to manually verify and write a fix for from someone who -- on paper -- had no fucking clue what they were talking about. And yet, they drove an AI with expert skill.

I want to call out that in addition to driving AI with expert skill, they navigated the terrain with expert skill as well. They didn't just toss slop up on our repo. They came to Discord as a human, reached out as a human, and talked to other humans about what they've done. They were careful and thoughtful about the process.

People like this give me hope for what is possible. But it really, really depends on high quality people like this. Most today -- to continue the analogy -- are unfortunately driving like a teenager who has only driven toy go-karts."

"Examples: https://github.com/ghostty-org/ghostty/discussions?discussio... "


I'm a fan of this. My own projects on GitHub have an action[1] which autocloses and autolocks any opened issues until they have been reviewed and accepted by me, and I only consider feature requests from sponsors.

The real miss here is that there isn't a way on GitHub to only allow maintainers to create issues; instead we are left with these subpar workarounds.

[1]: https://github.com/LGUG2Z/komorebi/blob/master/.github/workf...


If you write correct Rust code it'll work; the borrowck is just that, a check. If the teacher doesn't check the homework where you wrote that 10 + 5 = 15, it's still correct. But if you write incorrect code that breaks Rust's borrowing rules, it'll have unbounded Undefined Behaviour: unlike actual Rust, where that'd be an error, this thing will just give you broken garbage, exactly like a C++ compiler.
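
To make the "it's just a check" point concrete, here's a toy sketch of my own (not from the comment above): the commented-out block breaks the borrowing rules, rustc rejects it at compile time, and a compiler that skipped the check would hand you a dangling reference, i.e. undefined behaviour.

    // Toy example: the borrow checker only rejects programs, it doesn't
    // change what correct programs do.
    fn main() {
        let s = String::from("hello");
        let r = &s;        // fine: `r` does not outlive `s`
        println!("{r}");

        // The block below breaks the borrowing rules: `r2` would outlive `s2`.
        // rustc refuses to compile it; a "check-free" Rust would happily emit
        // a dangling reference and unbounded undefined behaviour, like C++.
        //
        // let r2: &String;
        // {
        //     let s2 = String::from("oops");
        //     r2 = &s2;
        // } // `s2` dropped here
        // println!("{r2}");
    }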

Evidently millions of people want broken garbage. Herb Sutter even wrote a piece celebrating how many more C++ programmers and projects there were last year, churning out yet more broken garbage. It's a metaphor for 2025, I guess.


I built something like this at my previous startup, Pangea [1]. Overall I think looking back on our journey I'd sign up for it again, but it's not a panacea.

Here were the downsides we ran into

- Getting buy-in to do everything through the repo. We had our feature flags controlled via a yaml file in the repo as well, and pretty quickly people got mad at the time it took for us to update a feature flag (open MR -> merge MR -> have CI update the feature flag in our envs), and optimizing that took quite a while. It also made branch invariants harder to reason about (everything in the production branch is what is in our live environments, except for feature flags). So, we moved that out of the monorepo into an actual service.

- CI time and complexity. When we got to around 20 services that deployed independently, GitLab started choking on the size of our CI configuration and we'd see a spinner for about 5 minutes before our pipeline even launched. Couple that with special snowflakes like the feature flag system I mentioned above, and eventually it got to the point where only a few people knew exactly how rollout edge cases worked. The juice was not worth the squeeze at that point (the juice being "the repo is the source of truth for everything").

- Test times. We ran some e2e UI tests with Cypress that required a lot of beefy instances, and for safety we'd run them every single time. Couple that with flakiness, and you'd have a lot of red pipelines when the goal was 100% green all the time.

That being said, we got a ton of good stuff out of it too. I distinctly remember one day that I updated all but 2 of our services to run on ARM without involving service authors and our compute spend went down by 70% for that month because nobody was using the m8g spot instances, which had just been released.

[1]: https://pangea.cloud/


Followed by “where is the back button.”

Answer: sometimes apps let you swipe right from the left margin; sometimes there may be a left arrow in the upper left, but it may not be visible unless you enable tinted Liquid Glass; also look in the bottom left, where there may be a less-than sign; and sometimes you have to force-quit the app and restart (like with Libby books borrowed via Kindle…)


This year I've had to perform many hard resets on my MacBook, iPhone and even Apple Watch because they've locked up. And they're all relatively new devices. Apple needs to get its shit together. I already expect to move away from their mobile ecosystem when it comes time to upgrade.

> but what are you able to do about it?

On Macbooks with fans, I started tuning my fan curve with iStat Menus (https://bjango.com/help/istatmenus7/fans/#custom-fan-curve) because I noticed the default curve was lagging behind and thermal throttling kicked in before the fan even reached max speed.

For Apple Silicon specifically, I recently discovered that there is a "high power mode" (https://support.apple.com/en-us/101613) that allows the fans to run at higher speeds. It helped me a lot, so I don't use the custom fan curves anymore (but it does get quite noisy on a 14" M4 Max).

For a Macbook Air, not much you can do besides closing stuff, or elevating the macbook and pointing a fan at it or things like that... but yeah it's a bit desperate!


Under current prices buying hardware just to run local models is not worth it EVER, unless you already need the hardware for other reasons or you somehow value having no one else be able to possibly see your AI usage.

Let's be generous and assume you are able to get a RTX 5090 at MSRP ($2000) and ignore the rest of your hardware, then run a model that is the optimal size for the GPU. A 5090 has one of the best throughputs in AI inference for the price, which benefits the local AI cost-efficiency in our calculations. According to this reddit post it outputs Qwen2.5-Coder 32B at 30.6 tokens/s. https://www.reddit.com/r/LocalLLaMA/comments/1ir3rsl/inferen...

It's probably quantized, but let's again be generous and assume it's not quantized any more than models on OpenRouter. Also we assume you are able to keep this GPU busy with useful work 24/7 and ignore your electricity bill. At 30.6 tokens/s you're able to generate 993M output tokens in a year, which we can conveniently round up to a billion.

Currently the cheapest Qwen2.5-Coder 32B provider on OpenRouter that doesn't train on your input runs it at $0.06/M input and $0.15/M output tokens. So it would cost $150 to serve 1B tokens via API. Let's assume input costs are similar since providers have an incentive to price both input and output proportionately to cost, so $300 total to serve the same amount of tokens as a 5090 can produce in 1 year running constantly.

Conclusion: even with EVERY assumption in favor of the local GPU user, it still takes almost 7 years for running a local LLM to become worth it. (This doesn't take into account that API prices will most likely decrease over time, but also doesn't take into account that you can sell your GPU after the breakeven period. I think these two effects should mostly cancel out.)

In the real world in OP's case, you aren't running your model 24/7 on your MacBook; it's quantized and less accurate than the one on OpenRouter; a MacBook costs more and runs AI models a lot slower than a 5090; and you do need to pay electricity bills. If you only change one assumption and run the model only 1.5 hours a day instead of 24/7, then the breakeven period already goes up to more than 100 years instead of 7 years.
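
If you want to sanity-check that arithmetic, here's a rough sketch using only the numbers already assumed above ($2000 GPU, 30.6 tokens/s, roughly $0.30/M tokens once input is priced about the same as output):

    // Breakeven estimate from the comment's own assumptions.
    fn breakeven_years(hours_per_day: f64) -> f64 {
        let gpu_cost = 2000.0; // USD, RTX 5090 at MSRP
        let tokens_per_year = 30.6 * 3600.0 * hours_per_day * 365.0;
        let api_cost_per_year = tokens_per_year / 1e6 * 0.30; // USD via OpenRouter
        gpu_cost / api_cost_per_year
    }

    fn main() {
        println!("running 24/7:  {:.1} years", breakeven_years(24.0)); // ~6.9
        println!("1.5 hours/day: {:.0} years", breakeven_years(1.5));  // ~110
    }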

Basically, unless you absolutely NEED a laptop this expensive for other reasons, don't ever do this.


Reminds me of this:

"""On the first day of class, Jerry Uelsmann, a professor at the University of Florida, divided his film photography students into two groups.

Everyone on the left side of the classroom, he explained, would be in the “quantity” group. They would be graded solely on the amount of work they produced. On the final day of class, he would tally the number of photos submitted by each student. One hundred photos would rate an A, ninety photos a B, eighty photos a C, and so on.

Meanwhile, everyone on the right side of the room would be in the “quality” group. They would be graded only on the excellence of their work. They would only need to produce one photo during the semester, but to get an A, it had to be a nearly perfect image.

At the end of the term, he was surprised to find that all the best photos were produced by the quantity group. During the semester, these students were busy taking photos, experimenting with composition and lighting, testing out various methods in the darkroom, and learning from their mistakes. In the process of creating hundreds of photos, they honed their skills. Meanwhile, the quality group sat around speculating about perfection. In the end, they had little to show for their efforts other than unverified theories and one mediocre photo."""

from https://www.thehuntingphotographer.com/blog/qualityvsquantit...


I do write mostly async code, too.

There are several ~~problems~~ subtleties that hinder the use of Rust async, IMHO.

- BoxFuture. It's used almost everywhere. It means there are no chances for heap elision optimization.

- Verbosity. Look at this BoxFuture definition: `BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;`. It's awful. I do understand what the Pin and Future traits are, what Send is, lifetimes, and dynamic dispatch. But I *have to* know all of these non-obvious things just to work with coroutines in my (possibly single threaded!) program =( (see the sketch after this list)

- No async drop or async trait in the stdlib (the latter was fixed not so long ago)
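
As a concrete (toy) illustration of the ceremony, assuming the `futures` crate: a common pattern is a trait method returning a BoxFuture, so every call allocates a box, and the `Send` bound tags along even if you never spawn a thread.

    use futures::future::BoxFuture;
    use futures::FutureExt;

    trait Fetch {
        // The kind of signature you end up writing for a dyn-compatible
        // async method.
        fn fetch(&self, key: String) -> BoxFuture<'_, Option<String>>;
    }

    struct Memory;

    impl Fetch for Memory {
        fn fetch(&self, key: String) -> BoxFuture<'_, Option<String>> {
            // Every call boxes the future on the heap; `.boxed()` also
            // requires the future to be `Send`.
            async move { Some(format!("value for {key}")) }.boxed()
        }
    }

    fn main() {
        let got = futures::executor::block_on(Memory.fetch("k".to_string()));
        println!("{got:?}");
    }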

I am *not* a hater of Rust's async system. It's a little simpler and less tunable than in C++, but more complex than in Golang. I just cannot say Rust's async approach is a good enough trade-off, while a plethora of other design decisions in the language come close to being a silver bullet.


I used to be very against closed source products but changed my mind recently. One of the founders of Obsidian makes some great points here: https://forum.obsidian.md/t/open-sourcing-of-obsidian/1515/1...

You can watch Lattner's interview with Theprimeagen. It's a haphazardly designed language where pressure to ship from Apple as a whole overrides any design or development considerations.

That's why you end up with a compiler that barfs at even the simplest SwiftUI code because Swift's type system is overly complicated and undecidable. And makes the compiler dog slow.

That's why you end up with 200+ keywords [1] with more added each release.

That's how you end up with idiocy like `guard let self = self else { return }` (I think they "fixed" this with some syntax sugar) because making if statements understand nulls is beyond the capabilities of heroes apparently.

And this is just surface level that immediately came to mind.

[1] It's not a typo: https://x.com/jacobtechtavern/status/1841251621004538183


I love Emacs. My first intro to it was on the Braille Plus Mobile Manager back in like 2008 or so. That was a beautiful device that ran Linux and was developed for the blind. There's been nothing exactly like it since. The BT Speak is a poor imitation that runs on a Raspberry Pi 4 and is sluggish because Linux accessibility is hard and not optimized for such low-power devices.

Anyway, I began learning Emacs commands in the Emacs tutorial on that Braille Plus, and they made sense to me. Unfortunately, Emacspeak only really works well on Linux and Mac, not Windows where all the blind people are. Speechd-el only works on Linux, since it uses Speech-dispatcher. I got Speechd-el talking on Termux for Android last night though, although it was rather laggy between key press and speech. Emacspeak development has paused, though, and Speechd-el seemingly hasn't been updated in half a year. Emacs itself has a lot going on for a normal screen reader to interpret, which is why Emacs-specific speech interfaces are so useful.

A few examples:

* On Windows, with Windows Terminal and the NVDA screen reader, arrow keys read where the cursor is, but with C-n and C-p, C-f and C-b, and all that, NVDA doesn't say anything. This is with the -nw command line option because the GUI is inaccessible.

* Now, if I do M-x, it does say "minibuf help, M-x, Windows Powershell Terminal". From there, I can do list-packages and RET and use arrow keys to go through packages, but N and P don't speak even though I know they move between packages. So it seems like the echo area works.

* Programs like the calendar, though, really don't speak well with a screen reader. It just reads the line, not the focused date. Using left and right just says "1 2 3 4 5" etc. So custom interfaces don't work well. I shudder to think how it'd read Helm.

Lol maybe I can get AI to make a good speech server for Emacspeak for Windows.


In case anyone else was confused: the link/quote in this comment are from the previous "async cancellation issue" write-up, which describes a situation where you "drop" a future: the code in the async function stops running, and all the destructors on its local variables are executed.

The new write-up from OP is that you can "forget" a future (or just hold onto it longer than you meant to), in which case the code in the async function stops running but the destructors are NOT executed.
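
A toy demonstration of the difference (the names are mine, and it assumes the `futures` crate just for its no-op waker):

    use std::future::Future;
    use std::pin::pin;
    use std::task::Context;

    struct Guard(&'static str);
    impl Drop for Guard {
        fn drop(&mut self) {
            println!("destructor ran for the {} future", self.0);
        }
    }

    async fn task(name: &'static str) {
        let _g = Guard(name);
        std::future::pending::<()>().await; // suspends forever
        println!("{name}: finished");       // unreachable once cancelled
    }

    fn main() {
        let waker = futures::task::noop_waker();
        let mut cx = Context::from_waker(&waker);

        // Drop: poll once so `_g` exists, then let the future go out of scope.
        {
            let mut fut = pin!(task("dropped"));
            let _ = fut.as_mut().poll(&mut cx); // Poll::Pending
        } // prints "destructor ran for the dropped future"

        // Forget: poll once, then leak the future. No destructor, no further
        // code, ever -- and all of this is still "safe" Rust.
        let mut fut = Box::pin(task("forgotten"));
        let _ = fut.as_mut().poll(&mut cx);
        std::mem::forget(fut);
    }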

Both of these behaviors are allowed by Rust's fairly narrow definition of "safety" (which allows memory leaks, deadlocks, infinite loops, and, obviously, logic bugs), but I can see why you'd be disappointed if you bought into the broader philosophy of Rust making it easier to write correct software. Even the Rust team themselves aren't immune -- see the "leakpocalypse" from before 1.0.


Skimming through, this document feels thorough and transparent. Clearly, a hard lesson learned. The footnotes, in particular, caught my eye https://rfd.shared.oxide.computer/rfd/397#_external_referenc...

> Why does this situation suck? It’s clear that many of us haven’t been aware of cancellation safety and it seems likely there are many cancellation issues all over Omicron. It’s awfully stressful to find out while we’re working so hard to ship a product ASAP that we have some unknown number of arbitrarily bad bugs that we cannot easily even find. It’s also frustrating that this feels just like the memory safety issues in C that we adopted Rust to get away from: there’s some dynamic property that the programmer is responsible for guaranteeing, the compiler is unable to provide any help with it, the failure mode for getting it wrong is often undebuggable (by construction, the program has not done something it should have, so it’s not like there’s a log message or residual state you could see in a debugger or console), and the failure mode for getting it wrong can be arbitrarily damaging (crashes, hangs, data corruption, you name it). Add on that this behavior is apparently mostly undocumented outside of one macro in one (popular) crate in the async/await ecosystem and yeah, this is frustrating. This feels antithetical to what many of us understood to be a core principle of Rust, that we avoid such insidious runtime behavior by forcing the programmer to demonstrate at compile-time that the code is well-formed


And here I am, selling my Macbook M4 Pro to buy a Macbook Air and a dedicated gaming machine. I've tried gaming on the Macbook with Heroic, GPTK, Whiskey, RPCS3 emu and some native. When a game runs, the performance is stunning for a laptop - but there are always glitches, bugs and annoyances that take out the joy. Needless to mention the lack of support for any sort of online multiplayer, due to the lack of anticheat support.

I wish Apple would take gaming more seriously and make GPTK a first class citizen such as Proton on Linux.


Instead of Windows Backup (which relies on M$ OneDrive), you can enable (in Control Panel settings) and use Windows File History.

File History is a backup feature in Windows that automatically saves copies of your files from specific folders, like Documents and Pictures, to an external drive or network location. It allows you to restore previous versions of your files if they are lost or damaged.

To enable File History in Windows, connect an external drive or network location, then go to Settings > Update & Security > Backup, and select "Add a drive" to choose your backup location. Finally, turn on File History to start backing up your files automatically.


This is a great opportunity to get HN's take on these tools: systems to streamline the management of containerized services deployed on self-managed hardware.

We've been running both CapRover and Coolify for a couple years. We quite like renting real dedicated servers (Hetzner, OVH), it is so much cheaper than the cloud and a minor management burden. These tools make it easy to bridge the gap and treat these physical servers like PaaS.

We have dozens of apps and services deployed on a couple of large-ish servers with backups. Most modern back-ends do so little computationally that lots of containers can comfortably live together. 128GB of RAM and 64 cores go a long way and are surprisingly cheap at Hetzner, and having that fixed monthly cost removes a mental burden. It is cheap and simple, and availability issues are so much rarer than people expect: maybe a couple of mishaps a year that are easy to recover from and don't really have a meaningful impact for a startup.

Coolify feels more complete and mature, but frankly, after using both a lot, we now steer more towards the simplicity of CapRover. I see that Dokploy is also a major alternative to Coolify, don't know much about it.

How does /dev/push compare? Do you have any other recommendations in this vein? Or differing opinions on the tools I mentioned?


i am honestly glad i don't write rust anymore

One of the things I’ve noticed with senior people is that fine motor control tends to start to go.

Things like double-clicking a mouse become difficult: performing two very fast clicks without also moving the mouse.

Same with the iPhone: swiping without deviating, pressing TINY buttons, and even what constitutes a tap are difficult for the elderly. Yes, there’s zoom, but that only makes it 10% better, as I watch them.


We're processing tenders for the construction industry - this comes with a 'free' bucket sort from the start, namely that people practically always operate only on a single tender.

Still, that single tender can be on the order of a billion tokens. Even if an LLM supported that insane context window, that's roughly 4GB of data to move, and at current LLM prices, inference would cost thousands of dollars. I detailed this a bit more at https://www.tenderstrike.com/en/blog/billion-token-tender-ra...
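
For a rough sense of the scale involved, with an assumed ~4 bytes of raw text per token and an assumed input price of $2.50/M tokens (both purely illustrative):

    // Back-of-the-envelope: data volume and cost of one full-context pass
    // over a billion-token tender.
    fn main() {
        let tokens: f64 = 1.0e9;
        let bytes_per_token = 4.0;   // rough average for plain text
        let usd_per_m_input = 2.5;   // assumed price, purely illustrative

        let gigabytes = tokens * bytes_per_token / 1.0e9;
        let cost = tokens / 1.0e6 * usd_per_m_input;

        println!("~{gigabytes:.0} GB of text, ~${cost:.0} per full pass");
    }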

And that's just one (though granted, a very large) tender.

For the corpus of a larger company, you'd probably be looking at trillions of tokens.

While I agree that delivering tiny, chopped up parts of context to the LLM might not be a good strategy anymore, sending thousands of ultimately irrelevant pages isn't either, and embeddings definitely give you a much superior search experience compared to (only) classic BM25 text search.


Locked into iPhone.

I want to switch to Android, but I have all the following problems:

1. iMessage, unlike whatsapp etc, does not have an android app, and some of my family uses iMessage, so I would be kicked from various group chats

2. My grandma only knows how to use facetime, so I can't talk to her unless I have an iPhone

3. My apple books I purchased can't be read on android

4. Lose access to all my apps (android shares this one)

5. I have a friend who uses airdrop to share maps and files when we go hiking without signal, and apple refuses to open up the airdrop protocol so that I can receive those from android, or an airdrop app on android

6. ... I don't have a macbook, but if I did, the screen sharing, copy+paste sharing, and iMessage-on-macOS would all not work with android.

It's obvious that apple has locked in a ton of stuff. Like, all other messaging and file-sharing protocols except iMessage and airdrop work on android+iOS. Books I buy from google or amazon work on iOS or android.

Apple is unique here.


This is really sad, because the replacement, Swift Package Manager, is really crap: it lacks some useful features (an "outdated" command, meaningful command-line output, ...), is buggy as hell in Xcode (most of the time Xcode just crashes when you add/remove a dependency, error messages while fetching a repository are not understandable and often not even entirely visible, and many repositories have some old Package.swift that current developer tools won't read, ...), and worst of all, it stores the full repositories of all the dependencies with their full history on your machine and downloads them every time if you do CI properly, which often means GBs of data.

I think Go is going away. It occupies such a weird niche. People have said it's good for app backends, but you should really have exceptions (JS, Py, Java) for that sort of thing. For systems, just use Rust or worst case C++. For CLIs, it doesn't really matter. For things where portability matters like WASM, can't use Go. Bad syntax and type system on top of it.

What if Google spent all that time and money on something from the outside instead of inventing their own language? Like, Microsoft owns npm now.


I happen to prototype in python then move on to rust when things solidify... For those who don't know rust, its compiler is pretty slow. I mean, when you want to add a feature to an existing code base, each "run" you do to test the new feature implies relinking the whole project: that's very slow (or you have to split the project into subprojects, but I'm lazy). In that situation, prototyping in python makes sense (at least to me :-)

I have multiple system prompts that I use before getting to the actual specification.

1. I use the Socratic Coder[1] system prompt to have a back and forth conversation about the idea, which helps me hone the idea and improve it. This conversation forces me to think about several aspects of the idea and how to implement it.

2. I use the Brainstorm Specification[2] user prompt to turn that conversation into a specification.

3. I use the Brainstorm Critique[3] user prompt to critique that specification and find flaws in it which I might have missed.

4. I use a modified version of the Brainstorm Specification user prompt to refine the specification based on the critique and have a final version of the document, which I can either use on my own or feed to something like Claude Code for context.

Doing those things improved the quality of the code and work spit out by the LLMs I use by a significant amount, but more importantly, it helped me write much better code on my own because I now have something to guide me, while before I used to go in blind.

As a bonus, it also helped me decide if an idea was worth it or not; there are times I'm talking with the LLM and it asks me questions I don't feel like answering, which tells me I'm probably not into that idea as much as I initially thought; it was just my ADHD hyperfocusing on something.

[1]: https://github.com/jamesponddotco/llm-prompts/blob/trunk/dat...

[2]: https://github.com/jamesponddotco/llm-prompts/blob/trunk/dat...

[3]: https://github.com/jamesponddotco/llm-prompts/blob/trunk/dat...


The website design is so pleasing, props!

Excel has the benefit of being understandable and fixable by a lot of regular office workers.

It's a bit surprising that we don't have that feature as a requirement for most IT infrastructure. It would make it so much more usable.


Offset-based pagination will be a problem on big tables.
