I'm curious how you get the same object at each angle. If I write "chair front view", "chair side view", "chair back view", etc. in SD, it gives me an entirely different chair in each image I generate. So how does this system generate a chair that looks the same in each reference image from different angles?
The underlying 3D representation (NeRF) constrains the system to be view-consistent. It is true that at any single iteration, the guidance provided by the 2D diffusion model is fairly random. But over many iterations and viewpoints, those conflicting signals are merged and resolved in the shared 3D body. The ultimate goal of the 2D diffusion model is just to make things look realistic, and toward that goal it will play along with whatever rendering is presented to it.
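In case it helps, here is a minimal sketch of that loop in PyTorch: render the shared 3D scene from a random viewpoint, add noise, ask a frozen 2D diffusion model for its noise estimate, and push the residual back into the NeRF parameters only. The names (`ToyNeRF`, `toy_diffusion_eps`, `sds_step`) and the simplified noise schedule are illustrative stand-ins, not the actual code in this repo or the real Stable Diffusion API.

```python
import torch
import torch.nn as nn

class ToyNeRF(nn.Module):
    """Stand-in for a NeRF: maps a camera pose to a rendered RGB image."""
    def __init__(self, image_size=64):
        super().__init__()
        self.image_size = image_size
        self.net = nn.Sequential(
            nn.Linear(3, 256), nn.ReLU(),
            nn.Linear(256, 3 * image_size * image_size), nn.Sigmoid(),
        )

    def forward(self, camera_pose):
        flat = self.net(camera_pose)
        return flat.view(-1, 3, self.image_size, self.image_size)

def toy_diffusion_eps(noisy_image, t, text_embedding):
    """Stand-in for a frozen 2D diffusion model's noise prediction eps(x_t, t, y)."""
    return torch.randn_like(noisy_image)  # a real pretrained model would be called here

def sds_step(nerf, optimizer, text_embedding):
    # 1. Sample a random camera viewpoint and render the current 3D scene.
    camera_pose = torch.randn(1, 3)
    rendered = nerf(camera_pose)

    # 2. Perturb the rendering with noise at a random diffusion timestep
    #    (highly simplified linear schedule for illustration).
    t = torch.randint(20, 980, (1,))
    noise = torch.randn_like(rendered)
    alpha = 1.0 - t.float() / 1000.0
    noisy = alpha.sqrt() * rendered + (1.0 - alpha).sqrt() * noise

    # 3. Ask the frozen 2D diffusion model what noise it "sees" in this image.
    with torch.no_grad():
        eps_pred = toy_diffusion_eps(noisy, t, text_embedding)

    # 4. Treat the residual (eps_pred - noise) as a gradient on the rendered
    #    pixels and backpropagate it into the NeRF parameters only. Each step
    #    is noisy, but repeated over many viewpoints the 3D scene converges.
    grad = (eps_pred - noise).detach()
    loss = (grad * rendered).sum()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

nerf = ToyNeRF()
optimizer = torch.optim.Adam(nerf.parameters(), lr=1e-3)
text_embedding = torch.randn(1, 768)  # placeholder for the embedding of "a chair"
for _ in range(5):
    sds_step(nerf, optimizer, text_embedding)
```

The key point is that the diffusion model is never asked to produce a consistent chair across views; it only nudges each rendering toward realism, and the single 3D representation is what forces those nudges to agree.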
The analogy I like most is SDEdit by Meng et al. (I provided links and a discussion on the project website). It shows how cooperative and intervention-friendly diffusion models can be.