Monolithic 3D is also sometimes called "sequential 3D". In essence, current 3D integration fabricates several 2D chips, thins them, adds TSVs, aligns them, and finally bonds them; monolithic 3D instead builds the 3D chip layer by layer.
The advantages are that it can have extremely dense chip-to-chip connections, as much as ~10,000X as dense as TSVs (literally), and no ESD diode capacitance (which can be as much as ~50fF!!).
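To make the density figure concrete: connections per unit area scale with the inverse square of the pitch, so a 100X finer pitch gives 10,000X the density. A minimal sketch, where both pitch values are hypothetical numbers picked only to illustrate the scaling, not measured figures for any process:

```python
def density_ratio(coarse_pitch_nm: float, fine_pitch_nm: float) -> float:
    """Ratio of connections per unit area for two square-grid pitches."""
    # On a square grid, density is 1 / pitch^2, so the ratio is (coarse / fine)^2.
    return (coarse_pitch_nm / fine_pitch_nm) ** 2


# Hypothetical pitches, assumed for illustration only.
tsv_pitch_nm = 10_000        # assumed TSV pitch (10 um)
interlayer_via_pitch_nm = 100  # assumed monolithic inter-layer via pitch

ratio = density_ratio(tsv_pitch_nm, interlayer_via_pitch_nm)
print(f"density ratio: {ratio:,.0f}x")
```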
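The disadvantages are that copper interconnects degrade above ~400 degrees Celsius (they don't literally melt at that temperature, but the metal and the surrounding dielectrics can't take it), and we need around ~1200 degrees Celsius to make transistors. There are a few ways around this but they usually result in crummier transistors.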
From what I understand, another disadvantage of monolithic 3D is the yield implications.
When you fabricate several 2D chips and then integrate them, it allows you to test those 2D chips for errors separately before the integration, which should give you better yields overall.
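A rough sketch of that yield argument, under a very simplified model that I'm assuming here: every layer or die fails independently with the same yield, bad dies are caught by testing before stacking, and each bond step has its own yield. The function names and the example numbers are hypothetical.

```python
def monolithic_yield(layer_yield: float, layers: int) -> float:
    """Fraction of sequentially built stacks in which every layer works."""
    # No layer can be tested and swapped out, so all layers must be good at once.
    return layer_yield ** layers


def stacked_yield(die_yield: float, layers: int, bond_yield: float) -> float:
    """Fraction of fabricated dies that end up inside a working stack."""
    # Bad dies are discarded after test; the good ones then have to
    # survive (layers - 1) bonding steps.
    return die_yield * bond_yield ** (layers - 1)


# Hypothetical numbers, assumed for illustration only.
mono = monolithic_yield(layer_yield=0.9, layers=4)
stacked = stacked_yield(die_yield=0.9, layers=4, bond_yield=0.99)
print(f"monolithic: {mono:.3f}, pre-tested stack: {stacked:.3f}")
```

In this model the monolithic yield drops geometrically with the number of layers, while the pre-tested stack only pays for bonding losses. Real defect statistics are not this tidy.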
Yes, although this can be an extra challenge in monolithic 3D because it often involves oxide layers and thin silicon between transistor layers, which makes the heat conduction much worse.
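To see why the oxide hurts so much, here is a minimal sketch of 1D series heat conduction, where each layer contributes thickness / conductivity per unit area. The conductivities are approximate bulk values (silicon dioxide around 1.4 W/(m*K), silicon around 150 W/(m*K)); the layer thicknesses are hypothetical, and real thin silicon films conduct worse than bulk, so this understates the problem.

```python
K_SIO2 = 1.4   # W/(m*K), approximate bulk silicon dioxide
K_SI = 150.0   # W/(m*K), approximate bulk silicon; thin films conduct worse


def area_resistance(thickness_m: float, conductivity: float) -> float:
    """Thermal resistance per unit area of one layer, in m^2*K/W."""
    return thickness_m / conductivity


def stack_resistance(layers) -> float:
    """Series resistance per unit area of (thickness_m, conductivity) layers."""
    return sum(area_resistance(t, k) for t, k in layers)


# Hypothetical inter-layer stack: 100 nm oxide plus 50 nm silicon.
stack = [(100e-9, K_SIO2), (50e-9, K_SI)]
total_r = stack_resistance(stack)
oxide_r = area_resistance(100e-9, K_SIO2)
# Same total thickness made of bulk silicon only, for comparison.
silicon_only_r = area_resistance(150e-9, K_SI)

print(f"oxide share of resistance: {oxide_r / total_r:.1%}")
print(f"vs. all-silicon of same thickness: {total_r / silicon_only_r:.0f}x worse")
```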
I don't think anyone's tried, but there can be all sorts of "fun" issues, e.g. directly putting many high-k oxides onto transistors results in Fermi level pinning.
EDIT: I did look at this idea some time back, and I believe AMD even has a patent on it (though I might be mistaken). It's possible, but might cause unexpected issues.
So the issue is that stuff like this (e.g. Fermi level pinning induced by HfO2 directly on Si without a metal gate) tends to be industry "secret" knowledge that isn't in textbooks. You'll usually pick this info up from talking to process engineers and reading papers.
From the various articles in the thread it seems that these are single die computers, think something like bringing the cache on die from back in the day.