Question related to 3D mesh models in general: has any significant work been done on models oriented towards photogrammetry?
Case in point, I have a series of photos (48) that capture a small statue. The photos are high quality; the object was on a rotating platform. Lighting is consistent. The background is solid black.
These normally are ideal variables for photogrammetry but none of the various common applications and websites do a very good job creating a mesh out of it that isn't super low poly and/or full of holes.
I've been casually scanning huggingface for relevant models to try out but haven't really found anything.
COLMAP + CloudCompare with a good CUDA GPU (more VRAM is better) will give reasonable results for large textured objects like buildings. Glass/water/mirror/glossy surfaces will need to be coated to scan; dry-spray Dr. Scholl's foot deodorant seems to work fine for our object scans.
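Roughly, the pipeline I mean looks like this. This is a minimal sketch using the pycolmap bindings with placeholder paths; the dense patch-match step needs a CUDA-enabled COLMAP build, and CloudCompare (or COLMAP's own Poisson mesher) takes the fused cloud from there.

    # Minimal COLMAP pipeline driven from Python via pycolmap (paths are placeholders).
    # Sparse SfM is cheap; the dense patch-match step is the GPU/VRAM-hungry part.
    import os
    import pycolmap

    database = "colmap.db"
    images = "images"        # folder with the 48 statue photos
    sparse = "sparse"
    dense = "dense"
    os.makedirs(sparse, exist_ok=True)
    os.makedirs(dense, exist_ok=True)

    # 1. Features, exhaustive matching, incremental SfM (camera poses + sparse cloud).
    pycolmap.extract_features(database, images)
    pycolmap.match_exhaustive(database)
    maps = pycolmap.incremental_mapping(database, images, sparse)
    maps[0].write(sparse)

    # 2. Dense depth estimation and fusion (requires a CUDA build of COLMAP).
    pycolmap.undistort_images(dense, sparse, images)
    pycolmap.patch_match_stereo(dense)
    pycolmap.stereo_fusion(dense + "/fused.ply", dense)

    # 3. Quick mesh from the fused cloud; clean up / retopologize in CloudCompare.
    pycolmap.poisson_meshing(dense + "/fused.ply", dense + "/mesh.ply")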
There are now more advanced options than Gaussian splatting, and these can achieve normal playback speeds rather than hours of filtering. I'll drop a citation if I recall the recent paper and example code. However, note this style of 3D scene recovery tends to be heavily 3D location dependent.
You can never be sure what someone's real intent is. They might mean "something mesh-like". Personally I usually reply by asking for more info (I always have the XY Problem in mind), but that is time-consuming and some people assume you're being pedantic (I am, however, correct more often than not: people have posed the wrong question or left out critical parts of the context).
Yeah, I am explicitly asking about meshes, which is why I said that and also referenced photogrammetry. Sometimes people know what they're asking for help with.
Thanks for the links. Going to check them out this morning.
Just to be clear I wasn't singling you out. I don't know anything about you.
And furthermore I also often post questions that lack sufficient context.
My point was that it's always okay to ask for clarification, to assume there might be some broader context, or to offer a suggestion that doesn't follow the most literal interpretation of the question as asked.
Yeah some very impressive stuff with splats going on. But I haven't seen much about going from splats to high quality 3D meshes. I've tried one or two with pretty poor results.
Splats to meshes is an isosurface extraction problem, fundamentally. It's one of the great unsolved problems of computer graphics and having a good general algorithm would have massive ripple effects for any problem involving meshes.
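To make that concrete: here's a toy sketch of the naive approach, evaluating a few isotropic Gaussians on a grid and pulling out a level set with marching cubes. Real splats are anisotropic, have per-splat opacity, and are orders of magnitude denser, which is exactly where this falls apart.

    # Toy isosurface extraction: sum of isotropic Gaussians -> density grid -> marching cubes.
    # Only the shape of the problem, not a usable splat-to-mesh converter.
    import numpy as np
    from skimage import measure

    # A handful of fake "splats": (center xyz, sigma, weight)
    splats = [((0.4, 0.5, 0.5), 0.08, 1.0),
              ((0.6, 0.5, 0.5), 0.10, 0.8),
              ((0.5, 0.65, 0.5), 0.06, 0.6)]

    n = 96
    xs = np.linspace(0.0, 1.0, n)
    X, Y, Z = np.meshgrid(xs, xs, xs, indexing="ij")
    density = np.zeros((n, n, n))
    for (cx, cy, cz), sigma, w in splats:
        r2 = (X - cx) ** 2 + (Y - cy) ** 2 + (Z - cz) ** 2
        density += w * np.exp(-r2 / (2.0 * sigma ** 2))

    # Pick an iso-level and contour it; choosing that level robustly is the hard part.
    verts, faces, normals, values = measure.marching_cubes(density, level=0.5)
    print(len(verts), "vertices,", len(faces), "triangles")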
It's a rabbit hole, and I only really understood it when I realized that the minimum time between any GitHub committer's hobby example and an implementation of the 2003 state of the art is ~4 years.
Fingers crossed that Gaussian Splatting makes the rewards high enough that resources get poured on this.
For this exact use case I used instant-ngp[0] recently and was really pleased with the results. There's an article[1] explaining how to prepare your data.
On the geometry side, from a theoretical point of view, you can repair meshes [1] by inferring a signed or unsigned distance field from your existing mesh and then contouring that distance field.
If you like the distance field approach, there is also research work [2] on estimating neural unsigned distance fields directly (somewhat in the same spirit as Gaussian splats).
[1] https://github.com/nzfeng/signed-heat-3d [it works, but it's research code, so it's buggy, not user friendly, and mostly demonstrated on toy problems, because complexity explodes very quickly: on a grid the number of cells grows as n^3, and then they solve a sparse linear system on top (so total complexity bounded by n^6). Tolerating approximations and writing things properly, the practical complexity should be on par with methods like the finite element method in Computational Fluid Dynamics.]
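To illustrate the contouring idea at toy scale (nothing like the quality of [1]), here's a crude unsigned-distance-field version: build a grid of distances to the nearest sample of a hypothetical hole-riddled point cloud, then extract a thin shell around it. A signed field, as in [1], gives you a single clean surface instead of a two-sided shell.

    # Crude unsigned-distance-field contouring (a stand-in for the idea, not for [1]).
    # "partial_scan.npy" is a hypothetical (N, 3) point cloud with holes.
    import numpy as np
    from scipy.spatial import cKDTree
    from skimage import measure

    points = np.load("partial_scan.npy")
    n = 128                                   # grid resolution; cost grows as n^3
    lo, hi = points.min(0) - 0.05, points.max(0) + 0.05
    axes = [np.linspace(lo[i], hi[i], n) for i in range(3)]
    X, Y, Z = np.meshgrid(*axes, indexing="ij")
    grid = np.stack([X, Y, Z], axis=-1).reshape(-1, 3)

    # Unsigned distance from every cell centre to the nearest input point.
    dist, _ = cKDTree(points).query(grid)
    udf = dist.reshape(n, n, n)

    # Contour at ~1.5 cells' worth of distance: a thin shell around the samples.
    spacing = tuple((hi - lo) / (n - 1))
    verts, faces, _, _ = measure.marching_cubes(udf, level=1.5 * max(spacing), spacing=spacing)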
No. For small objects, it is typical to use a turntable to rotate the object; there are a number of commercial and DIY turntables with an automated motion system that can trigger the shutter after a specified degree of rotation.
The OC mentioned "static lighting". If the lighting was static while the platform was spinning, then the lighting on the object would be inconsistent, because the light would fall on it differently in each photo. You would have to fix the lighting to the platform so it spins with the object while taking the pictures to get consistent lighting.
I think you just nailed why I have been having a hard time with my photo set. It's the lighting. Well crap, because I don't have access to the statue or studio again. Thanks for the tip.
You could try generating per-view depth maps, going to a point cloud and meshing from there. (I suspect splats may reduce your accuracy as an intermediate.)
I’m not aware of a fully-baked workflow for that — though it may exist. The first step has gotten really good: the recent single-shot AI models for depth are pretty visually impressive (I don’t know about metric accuracy).
The ones I’m aware of are DUST3R and the newer MAST3R.
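For the middle step, the back-projection itself is only a few lines once you have intrinsics. A sketch assuming a plain pinhole model; fx/fy/cx/cy and the depth file are placeholders, and single-shot depth models usually give relative rather than metric depth, so expect to scale/align per view.

    # Back-project a single depth map to a point cloud (pinhole camera model).
    # "depth.npy" is a hypothetical (H, W) depth map for one view; intrinsics are made up.
    import numpy as np

    depth = np.load("depth.npy")
    fx = fy = 1200.0
    cx, cy = depth.shape[1] / 2.0, depth.shape[0] / 2.0

    v, u = np.mgrid[0:depth.shape[0], 0:depth.shape[1]]   # pixel row/column indices
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points_cam = np.stack([x, y, z], axis=-1).reshape(-1, 3)

    # To merge views you still need each camera's pose (R, t) to move points into a
    # common world frame, then run Poisson or ball-pivoting meshing on the fused cloud.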
Photogrammetry generally assumes a fully static scene. If there are static parts of the scene which the camera can see and also rotating parts, the algorithm may struggle to properly match features between images.
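One workaround, if you're stuck with a setup like that, is to mask everything but the object before feature detection, so the matcher never sees the static background. A rough sketch with OpenCV; the threshold is a guess for a black backdrop. If I remember right, COLMAP can also consume per-image masks directly (ImageReader.mask_path), which is usually easier than a custom pipeline.

    # Keep only features on the (moving) object by masking out the dark static background
    # before detection. Threshold value is a guess for a black backdrop; tune per dataset.
    import cv2

    img = cv2.imread("frame_000.jpg")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Everything brighter than the backdrop counts as "object".
    _, mask = cv2.threshold(gray, 20, 255, cv2.THRESH_BINARY)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE,
                            cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15)))

    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray, mask)  # mask restricts detection
    print(len(keypoints), "keypoints on the object")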
Kiri engine is pretty easy to use and just released a good update for their 3DGS pipeline, and they have one of the better 3DGS to mesh options.
https://kiri-innovation.github.io/3DGStoMesh2/
>These normally are ideal variables for photogrammetry
Actually no; my friend learned this the hard way during a photogrammetry project. He rented a photo studio, made sure the background was perfectly black, and took the photos, but the photogrammetry program (Meshroom, I think) struggled to reconstruct the mesh. I did some research and learned that it uses features in the background to help position the cameras when building the mesh. So he redid his tests outside with "messy" backgrounds and it worked much, much better.
This was a few years ago so I don't know if things are different now.
I’m not an expert and have only dabbled in photogrammetry, but it seems to me that the crux of the problem is identifying common pixels across images in order to, in effect, triangulate a point in 3D space. It doesn’t sound like something an LLM would be good at.
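Right, the classical core is feature matching plus triangulation rather than anything language-model-shaped. A rough sketch of that two-view step with OpenCV; P1/P2 are placeholder 3x4 projection matrices that would normally come out of calibration or SfM.

    # Two-view triangulation: match pixels between images, then intersect the rays.
    import cv2
    import numpy as np

    img1 = cv2.imread("view_00.jpg", cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread("view_01.jpg", cv2.IMREAD_GRAYSCALE)

    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des1, des2)

    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches]).T   # 2xN pixel coordinates
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches]).T

    # Placeholder 3x4 projection matrices; in practice these come from calibration / SfM.
    P1 = np.loadtxt("P1.txt")
    P2 = np.loadtxt("P2.txt")

    pts4d = cv2.triangulatePoints(P1, P2, pts1, pts2)   # homogeneous 4xN
    pts3d = (pts4d[:3] / pts4d[3]).T                     # Nx3 points in world space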