About Multi-scale VAE #9

hizening · 2024-12-12T13:36:27Z

Hi author, may I ask whether the multi-scale VAE was retrained in your work, or were the VAR parameters loaded directly?

lxa9867 · 2024-12-12T14:31:14Z

Hi,

Thanks for your interest. We train the ImageFolder tokenizer from scratch. Compared to the VAR tokenizer, it achieves:

📈 Generation quality (~300M params), gFID: 3.30 (VAR) vs. 2.60 (ours)
📈 Reconstruction quality, rFID: 0.9 (VAR) vs. 0.8 (ours)
📈 Token length: 680 (VAR) vs. 286 (ours)
📈 Linear probing: 11.3 (VAR) vs. 58.0 (ours)
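
For readers wondering where a token length such as 680 comes from: in a multi-scale tokenizer the total is the sum of the squared side lengths of the token maps. The sketch below assumes VAR's default 256x256 scale schedule of (1, 2, 3, 4, 5, 6, 8, 10, 13, 16); the names `VAR_SCALES` and `token_length` are illustrative and not taken from either codebase.

```python
# Assumed VAR default scale schedule for 256x256 images:
# each entry is the side length of one square token map.
VAR_SCALES = (1, 2, 3, 4, 5, 6, 8, 10, 13, 16)


def token_length(scales):
    """Total tokens when every scale contributes a side x side token map."""
    return sum(side * side for side in scales)


print(token_length(VAR_SCALES))  # 680
```

Under that assumption the sum matches the 680 quoted for VAR, so a shorter or coarser schedule is one way to get a smaller total.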
