
The M1 Ultra is fabricated as a single chip. The 12900K is fabricated as a single chip and is still a quarter the size of the M1 Ultra. Zen 3 puts eight cores on a CCX instead of four because DDR memory controllers don't have infinite bandwidth (contrary to AMD's wishful "Infinity Fabric" nomenclature) and make for shitty interconnects between banks of L3.

Chiplets are a valid strategy that will get used out of necessity in the future, but CPU makers still have more tricks up their sleeves first. They're nowhere near their limits.



M1 ultra is two chips with an interconnect between them I thought? Or is the interconnect already on die with them?

(Edit: sounds like it is two: "Apple said fusing the two M1 processors together required a custom-built package that uses a silicon interposer to make the connection between chips. " https://www.protocol.com/bulletins/apple-m1-ultra-chip )


> M1 ultra is two chips with an interconnect between them I thought? Or is the interconnect already on die with them?

It's both, depending on how you look at it. The active components of the interconnect are on the two M1 dies, but the interconnect itself also runs through the interposer.


> The M1 Ultra is fabricated as a single chip.

I'm curious how much the M1 Ultra costs to make. It's such a massive single piece of silicon I'd guess $1,200+. If that's the case, it doesn't make sense to compare the M1 Ultra to $500 CPUs from Intel and AMD.


Dunno, the M1 Ultra includes a decent GPU, which the $500 CPUs from Intel and AMD do not. Seems roughly comparable to a $700 GPU (like an RTX 3070, if you can find one), depending on what you're running. Sadly, Metal-native games are rare; many use a Metal wrapper and/or Rosetta emulation.

Seems pretty fair to compare an Intel Alder Lake or higher-end AMD Ryzen plus a GPU (RTX 3070 or Radeon 6800) to the M1 Ultra, assuming you don't care about power, heat, or space.


Has anyone managed to reach the actual advertised 21 FP32 TFLOPS? I'm curious. Even with BLAS or pure custom matmul code? How much of that is actually attainable? I can almost saturate and sustain an NVIDIA A40 or A4000 at their peak performance, so I'm wondering whether anyone has written something comparable for this.
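For reference, here's a minimal sketch of how you'd measure sustained matmul throughput with NumPy (my own illustration, not something from this thread). On an M1 Mac with Accelerate as the BLAS backend this exercises the CPU side, not the GPU; hitting the advertised GPU TFLOPS would need a Metal compute kernel.

    # Rough FP32 matmul throughput probe via NumPy's BLAS backend.
    import time
    import numpy as np

    n = 4096
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)
    a @ b  # warm-up

    reps = 10
    t0 = time.perf_counter()
    for _ in range(reps):
        a @ b
    dt = (time.perf_counter() - t0) / reps

    # A dense n x n matmul performs roughly 2*n^3 floating-point ops.
    print(f"sustained: {2 * n**3 / dt / 1e12:.2f} TFLOPS")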


What are you using to make the comparison to a 3070? Not close in any benchmarks I've come across.


I estimate that Apple's internal "price" for the M1 Ultra is around $2,000. Since most of the chip is GPU, it should really be compared to a combo like 5950X + 6800 XT or 12900K + 3080.


It wouldn't surprise me. M1 Ultra has 114 billion transistors and a total area of ~860 square mm. For comparison, an RTX 3090 has 28 billion transistors and a total area of 628 square mm.
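Back-of-envelope density from those numbers (transistor counts and areas are from the comment above; the per-mm^2 math is just arithmetic):

    # Millions of transistors per square mm.
    m1_ultra = 114e9 / 860   # TSMC N5
    rtx_3090 = 28e9 / 628    # Samsung 8nm GA102
    print(f"M1 Ultra: {m1_ultra / 1e6:.0f} MTr/mm^2")  # ~133
    print(f"RTX 3090: {rtx_3090 / 1e6:.0f} MTr/mm^2")  # ~45

So roughly 3x the density, which is about what you'd expect going from Samsung 8nm to TSMC N5.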


Wouldn't the price be based primarily on capital investment and not so much on the unit itself? After all, it's essentially a printout on a crystal made with reeeeeally expensive printers. AFAIK Apple's relationship with TSMC is more than a typical customer relationship.


In a parallel universe where Intel builds and sells this CPU, what's the price? Single chip, die size of 860 square mm, 114 billion transistors, on-package memory.
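Here's a rough wafer-cost sketch. Everything in it is an assumption on my part (the rumored ~$17k N5 wafer price, the defect density, the Poisson yield model); only the 860 mm^2 die size comes from this thread, and it treats the part as the monolithic die this hypothetical describes:

    import math

    wafer_cost = 17_000     # rumored TSMC N5 wafer price, USD (assumption)
    wafer_diameter = 300    # mm
    die_area = 860          # mm^2
    defect_density = 0.10   # defects per cm^2 (assumption)

    # Dies per wafer: usable area / die area, minus an edge-loss term.
    wafer_area = math.pi * (wafer_diameter / 2) ** 2
    dies = wafer_area / die_area - math.pi * wafer_diameter / math.sqrt(2 * die_area)

    # Poisson yield: fraction of dies with zero defects.
    yield_rate = math.exp(-defect_density * die_area / 100)

    good = dies * yield_rate
    print(f"~{dies:.0f} candidate dies, ~{good:.0f} defect-free")
    print(f"silicon cost per good die: ~${wafer_cost / good:,.0f}")

That lands somewhere in the hundreds of dollars per defect-free die before packaging, and salvage binning (selling dies with defective cores disabled, as discussed below) raises the effective yield well above the defect-free number.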

It just got me thinking, since all of these benchmarks pit it against $500-$1,000 CPUs and it doesn't seem to fall in that price range at all. Look at this thing:

https://cdn.wccftech.com/wp-content/uploads/2022/03/2022-03-...


That's the whole package though, together with the RAM and everything. The actual die is about the size of the thermal paste stain in that picture.


If there's a defective M1 Ultra, they can cut it in half and sell the halves as two low-end M1 Max chips.


Wouldn't they only get, at most, one low-end M1 Max if there is a defect?


They sell cheaper models with some cores disabled; that's what I meant by low-end. Ever wondered what the deal is with the cheapest "7-core GPU" M1?


If the defect is in the right place, Apple apparently sells M1 Max chips with some GPU cores disabled.
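A toy Monte Carlo of that salvage-binning idea (the per-core defect probability is a made-up illustrative number; only the 8-vs-7 GPU core split of the base M1 comes from the thread):

    import random

    gpu_cores = 8          # base M1; the cheap bin ships with 7 enabled
    p_core_defect = 0.03   # assumed chance a given GPU core is defective
    trials = 100_000

    full = salvaged = scrapped = 0
    for _ in range(trials):
        bad = sum(random.random() < p_core_defect for _ in range(gpu_cores))
        if bad == 0:
            full += 1       # sell as the 8-core GPU part
        elif bad == 1:
            salvaged += 1   # disable the bad core, sell as the 7-core part
        else:
            scrapped += 1   # too many defects for this bin

    print(f"8-core: {full/trials:.1%}  7-core: {salvaged/trials:.1%}  "
          f"scrap: {scrapped/trials:.1%}")

Even a small per-core defect rate turns a meaningful fraction of dies into sellable 7-core parts instead of scrap, which is the whole point of binning.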


Also all the other shit that's on the chip: RAM, etc.


It is commonly said that on the new M1 Macs the RAM is on the chip; it is not. It is on the same substrate, but it's just normal (fast) DRAM chips soldered on nearby.


"chip" is ambiguous the way you're using it.

The M1 Ultra is two dies in one package. The package is what goes on the motherboard.

You can count the memory modules as packages too. It's not incorrect to say that the M1 Ultra has a bunch of LPDDR5 packages on it, and each LPDDR5 package may have multiple dies in it; the whole LPDDR5 package may also be referred to as a stack ("Ultra has 16 stacks of memory").

But depending on context, it also wouldn't be incorrect to call the whole M1 Ultra package a "chip", even if it's got more packages on it. From a motherboard maker's perspective, the CPU BGA unit is the "package".

Anyway, no, the Ultra isn't a monolithic die in the sense you mean. It's two dies that are joined; Apple just uses a ridiculously fat pipe to do it (far beyond what AMD uses for Ryzen), such that it basically appears to be a single die.

The same is true for AMD: Rome/Milan are notionally NUMA. Running in NPS4 mode can squeeze out some extra performance in extreme situations if applications are aware of it, and there are some weird oddities caused by memory locality in "unbalanced" configurations where the quadrants don't all have the same number of channels. It just doesn't feel like NUMA because AMD has done a very good job of hiding it.

However, you're also right that we haven't reached the end of monolithic chips either. Splitting a chip into modules imposes a power penalty for data movement: it's much more expensive to move data off-chiplet than on-chiplet, and that limits how finely you can split your chiplets (say you did 64 tiny chiplets on a package; it would burn a huge amount of power moving data around, since everything would be off-chip). Technologies like copper-to-copper bonding and EMIB will hopefully lower that power cost in the future, but for now it's there.
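To put a rough number on that penalty (the pJ/bit figures below are ballpark values from interconnect literature, not measurements of any specific product):

    # Power = energy per bit * bits per second.
    ON_DIE_PJ = 0.1       # on-die wires, assumed pJ/bit
    OFF_CHIPLET_PJ = 2.0  # substrate-level chiplet link, assumed pJ/bit

    traffic = 100e9 * 8   # 100 GB/s of data movement, in bits/s

    for name, pj in [("on-die", ON_DIE_PJ), ("off-chiplet", OFF_CHIPLET_PJ)]:
        print(f"{name}: {pj * 1e-12 * traffic:.1f} W")

With those assumptions, the same 100 GB/s of traffic costs about 0.1 W on-die versus 1.6 W across a chiplet link; multiply that by every chiplet-to-chiplet hop and the budget adds up fast.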

And that's why even AMD uses monolithic chips for their laptop parts. If any cores are running, the IO die has to be powered up, running its memory and Infinity Fabric links, and at least one CCD has to be powered up, even if it's just to run "hello world". This seems to come to around 15-20 W, which is significant in the context of a home or office PC.

It's worth noting that Ryzen is not really a desktop-first architecture. It's server-first, and AMD has found a clever way to pump up their volumes by using it for enthusiast hardware. Servers don't generally sit 100% idle; they're either loaded or turned off entirely and rebooted when needed. If you can't stand the extra 20 W at idle, AMD would probably tell you to buy an APU instead.



