There's a massive reduction in the whale song of the blue whales. Almost halved. They are presumably starving.
That something ginormous can be so elegant, beautiful and sleek is hard to conceive till one meets a blue whale. Let's let them thrive on the blue planet.
The Blue Whale population has actually increased since the 70s. When they were critically endangered, their population numbered roughly 1,000-2,000, but estimates today put the number at roughly ten times that. The 1966 worldwide ban on hunting them has been incredibly successful, and we’ve also seen recoveries in Humpback and Grey Whales.
People go all dopey-eyed about "frequency space"; that's a red herring. The takeaway should be that a problem-centric coordinate system is enormously helpful.
After all, what Copernicus showed is that the mind-bogglingly complicated motion of the planets becomes a whole lot simpler if you change the coordinate system.
The Ptolemaic model of epicycles was an ad hoc form of Fourier analysis: decomposing periodic motions into circles riding on circles.
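To make that concrete, here is a toy sketch (numpy, made-up orbit, nothing astronomical about the numbers): a periodic planar path, treated as a complex signal, decomposes via the FFT into a stack of uniformly rotating circles, and keeping the few largest circles is a truncated epicycle model.

    # Sketch: epicycles as a truncated Fourier series (illustrative only).
    # A periodic planar orbit, viewed as a complex signal z(t), is decomposed by the
    # FFT into rotating circles; keeping the K biggest circles reconstructs it.
    import numpy as np

    N = 256
    t = np.linspace(0, 2 * np.pi, N, endpoint=False)
    # Toy "planet as seen from Earth": a loopy curve made of three circles.
    z = 3.0 * np.exp(1j * t) + 0.7 * np.exp(-2j * t) + 0.2 * np.exp(5j * t)

    coeffs = np.fft.fft(z) / N                 # one complex coefficient per circle
    freqs = np.fft.fftfreq(N, d=1.0 / N)       # signed integer turns per period
    order = np.argsort(-np.abs(coeffs))        # biggest circles first
    K = 3                                      # number of epicycles to keep

    z_approx = sum(coeffs[k] * np.exp(1j * freqs[k] * t) for k in order[:K])
    print("max reconstruction error with", K, "circles:", np.max(np.abs(z - z_approx)))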
Back to frequencies: there is nothing obviously frequency-like in real-space Laplace transforms*. The real insight is that differentiation and integration become simple if the coordinates used are exponential functions, because exponential functions remain (scaled) exponentials when passed through those operations.
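Written out, these are just the standard identities that paragraph is pointing at:

    \frac{d}{dt} e^{st} = s\, e^{st},
    \qquad
    \mathcal{L}\{f'\}(s) = s\, F(s) - f(0),

so in a coordinate system built from exponentials, differentiation becomes multiplication by s and integration becomes division by s.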
For digital signals, what helps is the Walsh-Hadamard basis. They are not like frequencies, nor are they simply the square-wave analogue of sinusoidal waves. People call it sequency space, a well-justified pun.
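A minimal sketch of what that basis looks like, assuming the standard Sylvester construction of the Hadamard matrix (the variable names here are mine); "sequency" is the zero-crossing count that plays the role frequency plays for sines:

    # Sketch: Walsh-Hadamard basis via the Sylvester construction, rows ordered by
    # "sequency" (number of sign changes), the +/-1 analogue of frequency.
    import numpy as np

    def hadamard(n):
        """Sylvester-recursive Hadamard matrix of size n (n a power of two)."""
        H = np.array([[1]])
        while H.shape[0] < n:
            H = np.block([[H, H], [H, -H]])
        return H

    n = 8
    H = hadamard(n)
    sequency = (np.diff(H, axis=1) != 0).sum(axis=1)   # sign changes per row
    W = H[np.argsort(sequency)]                        # Walsh (sequency) ordering

    x = np.random.default_rng(0).standard_normal(n)
    X = W @ x / n                    # Walsh-Hadamard coefficients (rows have squared norm n)
    print(np.allclose(W.T @ X, x))   # exact reconstruction: the rows form an orthogonal basis
    print(np.sort(sequency))         # zero-crossing counts 0, 1, ..., n-1: hence "sequency"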
My suspicion is that we are in a Ptolemaic state as far as GPT-like models are concerned. We will eventually understand them better once we figure out what the better coordinate system is for thinking about their dynamics.
* There is a connection, though, through the exponential form of complex numbers, or, more prosaically, the fact that when you multiply rotation matrices the angles combine additively. So angles and logarithms share a certain character.
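That footnote, spelled out (standard identities):

    e^{i\theta_1}\, e^{i\theta_2} = e^{i(\theta_1 + \theta_2)},
    \qquad
    R(\theta_1)\, R(\theta_2) = R(\theta_1 + \theta_2),

and taking logarithms turns either product into a sum of angles, which is the sense in which angles behave like logarithms.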
All these transforms are switching to an eigenbasis of some differential operator (one that usually corresponds to a differential equation of interest): spherical harmonics; Bessel and Hankel functions, which are the radial analogues of sines/cosines and complex exponentials, respectively; and so on.
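Two standard instances of that statement:

    -\frac{d^2}{dx^2}\, e^{ikx} = k^2\, e^{ikx},
    \qquad
    -\Delta_{S^2}\, Y_{\ell m} = \ell(\ell+1)\, Y_{\ell m},

i.e. complex exponentials diagonalize the 1-D Laplacian, and spherical harmonics diagonalize the Laplacian on the sphere.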
The next big jumps were to collections of functions not parameterized by subsets of R^n. Wavelets use a tree-shaped parameter space.
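A hand-rolled Haar decomposition makes the tree shape visible (a toy sketch, assuming signal lengths that are powers of two; the function names are mine): coefficients are indexed by a (level, position) pair, a binary tree, rather than by a single frequency axis.

    # Sketch: one level of a Haar wavelet split, applied recursively.
    import numpy as np

    def haar(signal):
        """Return ({level: detail coefficients}, final coarse coefficient)."""
        x = np.asarray(signal, dtype=float)
        tree = {}
        level = 0
        while len(x) > 1:
            avg = (x[0::2] + x[1::2]) / np.sqrt(2)   # coarse approximation
            det = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail at this scale
            tree[level] = det                        # len(det) positions at this level
            x, level = avg, level + 1
        return tree, x[0]

    tree, coarse = haar([4, 6, 10, 12, 8, 6, 5, 5])
    for level, det in tree.items():
        print(f"level {level}: {len(det)} positions ->", np.round(det, 3))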
There’s a whole interesting area of overcomplete basis sets that I have been meaning to look into, where you give up orthogonality and all those nice properties in exchange for having multiple options for adapting better to different signal characteristics.
I don’t think these transforms are going to be relevant to understanding neural nets, though. Neural nets are, by their nature, doing something with nonlinear structures in high dimensions that are not smoothly extended across their domain, which is the opposite of the problem all our current approaches to functional analysis deal with.
You may well be right about neural networks. Sometimes models that seem nonlinear turn linear if the nonlinearities are pushed into the basis functions, so one can still hope.
For GPT-like models, I see sentences as trajectories in the embedding space. These trajectories look quite complicated, with nothing obvious about them from a geometric standpoint. My hope is that if we get the coordinate system right, we may see something more intelligible going on.
This is just a hope, a mental bias. I do not have any solid argument for why it should be as I describe.
> Sometimes models that seem nonlinear turn linear if the nonlinearities are pushed into the basis functions, so one can still hope.
That idea was pushed to its limit by Koopman operator theory. The argument sounds quite good at first, but unfortunately it can’t really work for all cases in its current formulation [1].
We know that under benign conditions an infinite-dimensional basis must exist, but finding it from finite samples is very non-trivial; we don't know how to do it in the general case.
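The finite-sample workhorse here is dynamic mode decomposition (and its extended/EDMD variants): fit the best linear operator to snapshot pairs of hand-picked observables. A toy sketch, with a made-up map and observables chosen so that the linear closure happens to be exact, which is precisely the part you don't get for free in general:

    # Sketch: EDMD-style fit of a linear operator A with g(x_{k+1}) ~ A g(x_k).
    import numpy as np

    rng = np.random.default_rng(0)

    def step(x):                                 # a toy nonlinear map (made up)
        return np.array([0.9 * x[0], 0.8 * x[1] + 0.2 * x[0] ** 2])

    def lift(X):                                 # hand-picked observables: x1, x2, x1^2
        return np.vstack([X[0], X[1], X[0] ** 2])

    # Collect snapshot pairs from several short trajectories.
    before, after = [], []
    for _ in range(50):
        x = rng.standard_normal(2)
        for _ in range(10):
            before.append(x)
            x = step(x)
            after.append(x)
    Y0 = lift(np.array(before).T)
    Y1 = lift(np.array(after).T)

    # Best linear operator on the lifted observables, by least squares.
    A = Y1 @ np.linalg.pinv(Y0)
    print(np.round(A, 3))          # closure is exact here: [[0.9,0,0],[0,0.8,0.2],[0,0,0.81]]
    print("eigenvalues:", np.round(np.linalg.eigvals(A), 3))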
I’m not sure what you mean by a change of basis making a nonlinear system linear. A linear system is one where solutions add as elements of a vector space. That’s true no matter what basis you express it in.
For example, if you parameterize the x, y coordinates of a planar circular trajectory in terms of the angle theta, it's a nonlinear function of theta.
However, if you parameterize a point in terms of the tuple (cos \theta, sin \theta), it comes out as a scaled sum. Here we have pushed the nonlinear functions cos and sin inside the basis functions.
A conic section is a nonlinear curve (not a line) when considered in the variables x and y. However, in the basis x^2, xy, y^2, x, y it's linear (well, technically affine).
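A toy sketch of that lifting (numpy, made-up noisy ellipse data; the fit is a one-line homogeneous least squares via the SVD):

    # Sketch: a conic a*x^2 + b*xy + c*y^2 + d*x + e*y + f = 0 is nonlinear in (x, y)
    # but linear in the lifted features (x^2, xy, y^2, x, y, 1), so fitting it becomes
    # a linear-algebra problem.
    import numpy as np

    rng = np.random.default_rng(1)
    t = rng.uniform(0, 2 * np.pi, 100)
    x = 3.0 * np.cos(t) + 0.01 * rng.standard_normal(100)
    y = 1.5 * np.sin(t) + 0.01 * rng.standard_normal(100)

    # Lifted design matrix: each row is (x^2, xy, y^2, x, y, 1) for one point.
    Phi = np.column_stack([x**2, x*y, y**2, x, y, np.ones_like(x)])

    # The conic coefficients are the near-null direction of Phi: the smallest
    # right singular vector. This is the step the lifting buys us.
    _, _, Vt = np.linalg.svd(Phi)
    coeffs = Vt[-1]
    print(np.round(coeffs / coeffs[0], 3))   # approx [1, 0, 4, 0, 0, -9] for x^2 + 4y^2 = 9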
Consider the Naive Bayes classifier. It looks nonlinear until one parameterizes it in log p; then it's linear in the log-probabilities and log-odds.
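For the binary case with conditionally independent features, that's just:

    \log \frac{P(y=1 \mid x)}{P(y=0 \mid x)}
      = \log \frac{P(y=1)}{P(y=0)}
        + \sum_i \log \frac{P(x_i \mid y=1)}{P(x_i \mid y=0)},

an affine function of the per-feature log-likelihood ratios, which is why the decision boundary is linear in that coordinate system.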
If one is ok with an infinite-dimensional basis, this linearisation idea can be pushed much further. Take a look at this if you are interested
From the abstract and from skimming a few sections of the first paper, imho it is not really the same. The paper moves the loss gradient to the tangent dual space where the weights reside, for better performance in gradient descent, but as far as I understand neither the loss function nor the neural net is analyzed in a new way.
The Fourier and wavelet transforms are different: they diagonalize self-adjoint operators (which is why they yield an orthogonal basis) on the space of functions (not on a finite-dimensional vector space of weights that parametrizes a net), and they simplify some usually hard operators, such as derivatives and integrals, by reducing them to multiplications and divisions or to a sparse algebra.
So in a certain sense these methods are looking at projections, which are unhelpful when thinking about NN weights, since the weights are all mixed with each other in a very non-linear way.
Thanks a bunch for the references. Reading the abstracts, these use a different idea compared to what Fourier analysis is about, but they should nonetheless be a very interesting read.
> My suspicion is that we are in a Ptolemaic state as far as GPT-like models are concerned. We will eventually understand them better once we figure out what the better coordinate system is for thinking about their dynamics.
Most deep learning systems are learned matrices that are multiplied by "problem-instance" data matrices to produce a prediction matrix. The time to do said matrix-multiplication is data-independent (assuming that the time to do multiply-adds is data-independent).
If you multiply both sides by the inverse of the learned matrix, you get an equation where finding the prediction matrix is a solving problem, and the time to solve is data-dependent.
Interestingly enough, that time is sort-of proportional to the difficulty of the problem for said data.
Perhaps more interesting is that the inverse matrix seems to have row artifacts that look like things in the training data.
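A toy sketch of the first part of that observation, assuming a made-up square, symmetric positive-definite W so a hand-rolled conjugate-gradient solver applies (nothing here is shaped like a real trained net): the multiply is fixed cost, while recovering the same output from W^{-1} y = x takes a number of iterations that depends on the input.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    W = Q @ np.diag(rng.uniform(1.0, 10.0, n)) @ Q.T    # toy SPD "learned weights"
    A = np.linalg.inv(W)                                # system matrix of the solve view

    def cg(A, b, tol=1e-8, max_iter=1000):
        """Plain conjugate gradients; returns the solution and the iteration count."""
        x = np.zeros_like(b)
        r = b - A @ x
        p = r.copy()
        rs = r @ r
        for k in range(1, max_iter + 1):
            Ap = A @ p
            alpha = rs / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol * np.linalg.norm(b):
                return x, k
            p = r + (rs_new / rs) * p
            rs = rs_new
        return x, max_iter

    for name, x_in in [("eigenvector input", Q[:, 0]),   # aligned with one mode of W
                       ("random input", rng.standard_normal(n))]:
        y_mult = W @ x_in                 # data-independent cost: one matmul
        y_solve, iters = cg(A, x_in)      # data-dependent cost: solve W^{-1} y = x
        print(f"{name}: CG iterations = {iters}, "
              f"agrees with matmul: {np.allclose(y_solve, y_mult, atol=1e-5)}")

The iteration count differs by input; whether that tracks "difficulty" in the parent's sense is the interesting, unproven part.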
I’d argue that most if not all of the math that I learned in school could be distilled down to analyzing problems in the correct coordinate system or domain! The actual manipulation isn’t that esoteric once you get in the right paradigm. And those professors never explained things at that kind of higher theoretical level, all I remember was the nitty gritty of implementation. What a shame. I’m sure there’s higher levels of mathematics that go beyond my simplistic understanding, but I’d argue it’s enough to get one through the full sequence of undergraduate level (electrical) engineering, physics, and calculus.
It’s kind of intriguing that predicting the future state of any quantum system becomes almost trivial—assuming you can diagonalize the Hamiltonian. But good luck with that in general. (In other words, a “simple” reference frame always exists via unitary conjugation, but finding it is very difficult.)
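A small numerical illustration of that (toy random Hermitian H, hbar = 1, names are mine): once you have the eigendecomposition, evolving any state is just attaching phases to its eigencomponents.

    # Sketch: with H = V diag(E) V^dagger known, time evolution is just phases:
    # |psi(t)> = V exp(-i E t) V^dagger |psi(0)>. The eigendecomposition itself is
    # the hard part for generic many-body Hamiltonians.
    import numpy as np

    rng = np.random.default_rng(0)
    d = 6
    M = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    H = (M + M.conj().T) / 2                  # toy Hermitian "Hamiltonian"

    E, V = np.linalg.eigh(H)                  # the step that is hard in general
    psi0 = np.zeros(d, dtype=complex)
    psi0[0] = 1.0                             # start in basis state |0>

    def evolve(psi, t):
        """Apply exp(-iHt): rotate into the eigenbasis, attach phases, rotate back."""
        return V @ (np.exp(-1j * E * t) * (V.conj().T @ psi))

    psi_t = evolve(psi0, 2.5)
    # Cross-check against the full propagator built from the same eigendecomposition,
    # and confirm the evolution preserves the norm (unitarity).
    U = V @ np.diag(np.exp(-1j * E * 2.5)) @ V.conj().T
    print(np.allclose(psi_t, U @ psi0), np.isclose(np.linalg.norm(psi_t), 1.0))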
It's not easy to separate cause and effect from direct and strong correlations that we experience.
The job of a scientist is not to give up on a hunch with a flippant "correlation is not causation" but to pursue such hunches and settle them one way or the other (that is, prove or disprove them). It's human to lean a certain way about what could be true.
There's also this notion of holding themselves to their own standards.
They, Newton included, would often feel that their work was not good enough, that it was not completed and perfected yet and therefore would be ammunition for conflict and ridicule.
Gauss did not publicize his work on complex numbers because he thought he would be attacked for it. To us that may seem weird, but there is no dearth of examples of people who were attacked for their mostly correct ideas.
Deadly or life changing attacks notwithstanding, I can certainly sympathize. There's not in figuring things out, but the process of communicating that can be full of tediousness and drama that one maybe tempted to do without.
Weird typo in what I wrote. It's past the edit window. This is what I had meant to type:
There's joy in figuring things out, but the process of communicating what has been so figured can be tedious and full of drama -- the kind of drama that one may be tempted to do without.