
The fancy online models can produce links for you. They might get the summary wrong, but they've got a link, so you can follow it and check it out.

In this context they are more like conversational search engines. But that’s a pretty decent feature IMO.



If the output came from RAG (search) rather than the model itself, then a link is possible, but not if the model just generated the sequence of words by itself.

Note too that these models can, and do, make up references. If it predicts a reference is called for, then it'll generate one, and to the LLM it makes no difference whether that reference was something actually in the training data or just something statistically plausible it made up.
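
To make the distinction concrete, here's a rough sketch (search() and generate() are made-up stand-ins, not any vendor's actual API):

    # RAG-style: the link comes from an actual retrieval step
    def search(query):
        # stand-in for a real search call; returns documents with real URLs
        return [{"url": "https://example.com/source", "text": "retrieved snippet"}]

    def generate(prompt, context=None):
        # stand-in for an LLM call
        return "answer text"

    docs = search("some question")
    answer = generate("some question", context=docs)
    links = [d["url"] for d in docs]  # verifiable, because they were actually fetched

    # Bare generation: any "reference" is just more predicted tokens,
    # and may or may not correspond to a real document
    answer_with_refs = generate("some question, please cite your sources")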


They also search online and return links, though? And you can steer them, when they do that, to seek out more "authoritative" sources (e.g. news reports, publications by reputable organizations).

If you pay for it, ChatGPT can spend upwards of 5 minutes going out and finding you sources if you ask it to.

Those sources can then be separately verified, which is up to the user, of course.


Right, but now you are not talking about an LLM generating from its training data - you are talking about an agent that is doing web search, and hopefully not messing things up when it summarizes the results.


Yes, because most of the things that people talk about (ChatGPT, Google SERP AI summaries, etc.) currently use tools in their answers. We're a couple years past the "it just generates output from sampling given a prompt and training" era.


It depends - some queries will invoke tools such as search, some won't. A research agent will be using search, but then summarizing and reasoning over the results to synthesize a response, so you are back to LLM generation.

The net result is that some responses are going to be more reliable (or at least coherently derived from a single search source) than others. But at least to the casual user, maybe to most users, it's never quite clear what the "AI" is doing, and it's right enough, often enough, that they tend to trust it, even though that trust is only justified some of the time.
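
A rough sketch of that behaviour (decide_tool(), web_search() and generate() are invented stand-ins, not any real API):

    def decide_tool(query):
        # the model, or a router in front of it, decides whether this query needs retrieval
        return "search" if "latest" in query else None

    def web_search(query):
        return [{"url": "https://example.com/news", "text": "retrieved snippet"}]

    def generate(prompt, context=None):
        return "synthesized answer"

    def answer(query):
        if decide_tool(query) == "search":
            results = web_search(query)
            # grounded in retrieved text, but the summary is still LLM generation
            return generate(query, context=results), [r["url"] for r in results]
        # no tool call: pure generation from the weights, nothing to follow up on
        return generate(query), []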


The models listed in the quote have this capability, though, so they must be doing RAG or something.


RAG is a horrible term for agentic search. Please stop using it.

And don't argue with me about terms. It literally stands for retrieval (not store or delete or update) augmented generation. And since generation is implied with LLMs, it really just means augmenting with retrieval.

But if you think about it, the agent could be augmented with stores or updates as well as gets, which is why the term isn't useful. Plus, nobody I've seen drawing RAG diagrams EVER shows it as an agent tool. It's always something the system DOES to the agent, not the agent doing it to the data.

So yeah, stop using it. Please.


What if you just read it as Retrieval AGent? It isn't the conventionally accepted definition, but it fits and it might make you happier.


If a plain LLM, not an agent, invokes a tool, then that can still be considered RAG. You seem to be thinking of the case where an agent retrieves some data and then passes it to an LLM.


A year ago there were links to things that didn't exist. Has that changed?


I'm sure it is possible to get a model to produce a fake URL, but it seems like ChatGPT has some agentic feature where it actually runs a query through a search engine or something, and then gives you the URLs that it found.



