What's the difference between the base and DAVE model files, and between 1-shot and 3-shot? Can I provide 5 shots or more? #5

lisenjie757 · 2024-06-15T07:14:18Z

Also, the demo only provides 3-shot inference. How can I run zero-shot inference on my own image? Could you provide a demo_zero? Thank you!

jerpelhan · 2024-06-16T09:47:40Z

The provided models are each optimized for a specific number of exemplars. I believe the method should still work with more exemplars, but I am unsure whether it yields better results. If you test this, let me know your findings.
I will see whether I have time to add a zero-shot demo in the future.

GioFic95 · 2024-06-27T12:01:27Z

I would also appreciate a clarification of the difference between the models base_0_shot.pth and DAVE_0_shot.pth, base_3_shot.pth and DAVE_3_shot.pth.

Moreover, I'm trying to implement a zero-shot demo, but I have some doubts:

From the zero-shot test script it looks like the num_objects parameter should be 3 (ref), in fact, zero-shot models (i.e. base_0_shot.pth and DAVE_0_shot.pth) have objectness shape equal to (3, 9, 256), where the first dimension is self.num_objects (ref). Could you explain why?
Running your demo with --num_objects 3 --model_name DAVE_0_shot --zero_shot and changing the model path to os.path.join(args.model_path, 'DAVE_0_shot.pth') it works (i.e. no errors), but the model still seems to be taking exemplars into account, since it produces different results when provided with different exemplars, both with and without --use_query_pos_emb. Is this expected?

So, could you please provide some hints or code references on how to implement a zero-shot demo?

Thank you!

jerpelhan · 2024-06-28T13:32:07Z

> I would also appreciate a clarification of the difference between the models base_0_shot.pth and DAVE_0_shot.pth, base_3_shot.pth and DAVE_3_shot.pth.

base_3_shot.pth and base_0_shot.pth are LOCA weights, so they cover only the density-map prediction part and do not include the weights for bounding-box prediction.

> From the zero-shot test script it looks like the num_objects parameter should be 3 (ref), in fact, zero-shot models (i.e. base_0_shot.pth and DAVE_0_shot.pth) have objectness shape equal to (3, 9, 256), where the first dimension is self.num_objects (ref). Could you explain why?

This is also part of the LOCA method. In the few-shot setup, it pools each exemplar into a 3x3 prototype; flattened, that gives 9 (the second dimension). Zero-shot keeps this layout unchanged, but uses trainable parameters instead of RoI pooling from exemplars in the image.
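
As a rough illustration of where the 9 comes from, here is a minimal single-channel sketch in plain Python. The names `pool_to_grid`, `flatten`, and `learned_shape` are hypothetical, and simple average pooling stands in for the RoI pooling that LOCA applies to multi-channel feature maps, so treat it as a shape walkthrough rather than the repository's code.

```python
def pool_to_grid(region, out_h=3, out_w=3):
    """Average-pool a 2D list into an out_h x out_w grid (adaptive-pooling style bins)."""
    in_h, in_w = len(region), len(region[0])
    grid = []
    for i in range(out_h):
        r0 = (i * in_h) // out_h            # floor of the bin start
        r1 = -(-((i + 1) * in_h) // out_h)  # ceil of the bin end
        row = []
        for j in range(out_w):
            c0 = (j * in_w) // out_w
            c1 = -(-((j + 1) * in_w) // out_w)
            cells = [region[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        grid.append(row)
    return grid


def flatten(grid):
    """Flatten the 3x3 grid row by row into a list of 9 values."""
    return [v for row in grid for v in row]


# Few-shot: a toy 6x6 single-channel exemplar crop is pooled to 3x3, then flattened.
exemplar = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
prototype = flatten(pool_to_grid(exemplar))

# Zero-shot: no crop exists, so a learned table with the same layout takes its place.
# (3 object slots, 3*3 prototype cells, 256 channels), as reported in this thread.
num_objects, channels = 3, 256
learned_shape = (num_objects, 3 * 3, channels)
```

With one such 9-element prototype per channel and per object slot, the few-shot and zero-shot paths end up with tensors of the same shape, which is why the checkpoint stores (3, 9, 256).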

> Running your demo with --num_objects 3 --model_name DAVE_0_shot --zero_shot and changing the model path to os.path.join(args.model_path, 'DAVE_0_shot.pth') it works (i.e. no errors), but the model still seems to be taking exemplars into account, since it produces different results when provided with different exemplars, both with and without --use_query_pos_emb. Is this expected?

Did you run demo.py with all parameters unchanged except for the model name and the addition of --zero_shot? If not, please try running it this way and let us know if you encounter any issues. I will check this soon and post a demo for the zero-shot setup as soon as possible.

jerpelhan · 2024-06-28T14:03:20Z

I just saw what the issue probably is: in a few-shot setup, the image is resized based on the exemplar size. Since demo.py was created for few-shot counting, it includes this resizing on line 74. Try adapting it: modify the function resize in demo.py to return on line 24, before it resizes the image based on the bounding boxes.
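
To make the suggestion concrete, here is a hedged sketch of the idea of skipping the exemplar-based resize in zero-shot mode. It does not reproduce demo.py: the function `exemplar_scale`, its `zero_shot` argument, and the `target` side length of 80 pixels are all hypothetical placeholders chosen for illustration.

```python
def exemplar_scale(bboxes, zero_shot, target=80.0):
    """Return an image scale factor derived from exemplar boxes.

    bboxes: list of (x1, y1, x2, y2) tuples.
    target: assumed desired exemplar side length in pixels (placeholder value).
    """
    # Zero-shot (or no boxes at all): return early and leave the image size alone,
    # so the result cannot depend on any exemplars that happen to be passed in.
    if zero_shot or not bboxes:
        return 1.0
    # Few-shot: scale so the mean exemplar side length matches the target.
    mean_side = sum(((x2 - x1) + (y2 - y1)) / 2 for x1, y1, x2, y2 in bboxes) / len(bboxes)
    return target / mean_side


few_shot_factor = exemplar_scale([(0, 0, 40, 40)], zero_shot=False)
zero_shot_factor = exemplar_scale([(0, 0, 40, 40)], zero_shot=True)
```

The early return is the relevant part: once the scale no longer depends on the boxes, different exemplars can no longer change the input the model sees.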

Additionally, note that DAVE in zero-shot performs two passes. In the first pass, it estimates the size of objects, based on which it resizes the image and performs a second pass, which improves the results (see main.py).
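
The two-pass flow can be sketched as follows. This is only an outline under stated assumptions: `two_pass_count`, the `predict` and `rescale` callables, and the `target_size` of 64 are hypothetical, and the toy stand-ins below exist just so the control flow can run; see main.py for what the repository actually does.

```python
def two_pass_count(image, predict, rescale, target_size=64.0):
    """Run a size-estimating first pass, resize, then count on the resized image.

    predict(image) -> (count, estimated_object_size)   # hypothetical interface
    rescale(image, factor) -> resized image            # hypothetical interface
    """
    # Pass 1: only the estimated object size is used.
    _, est_size = predict(image)
    factor = target_size / est_size
    # Pass 2: count again on the image resized to the assumed target object size.
    resized = rescale(image, factor)
    count, _ = predict(resized)
    return count, factor


# Toy stand-ins: an "image" is a dict that carries its own ground truth.
def fake_predict(image):
    return image["count"], image["object_size"]


def fake_rescale(image, factor):
    return {"count": image["count"], "object_size": image["object_size"] * factor}


count, factor = two_pass_count({"count": 7, "object_size": 32.0}, fake_predict, fake_rescale)
```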

GioFic95 · 2024-07-02T12:07:50Z

Hi @jerpelhan, thank you for your reply.

> Did you run demo.py with all parameters unchanged except for the model name and the addition of --zero_shot?

I tried running demo.py as you suggested. It doesn't raise any error, but it still requires the exemplars. When I provide them, they appear to be taken into account even with the --zero_shot parameter, since the results change with different exemplars. This may be due to the different resizing that is applied.

> Modify the function resize in demo.py to return in line 24, before it resizes the image based on the bounding boxes.

I finally found out why the code still failed without exemplars, even after removing the resize function's dependency on the bounding boxes: in COTR.forward(), the line self.num_objects = bboxes.shape[1] caused a RuntimeError: shape '[1, 0, 3, 3, -1]' is invalid for input of size 6912 when no bbox was provided.
After commenting out this line, the code seems to work.
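
The numbers in that error can be checked by hand. The sketch below assumes the 6912 elements are the (3, 9, 256) objectness parameters mentioned earlier in this thread (3 * 9 * 256 = 6912); `infer_last_dim` is a hypothetical helper that mimics how a trailing -1 dimension gets inferred during a reshape, not a PyTorch function.

```python
def infer_last_dim(numel, known_dims):
    """Infer the size of a trailing -1 dimension, or raise if no size can fit."""
    known = 1
    for d in known_dims:
        known *= d
    # A zero-sized known dimension is treated as invalid here: it can never
    # account for a non-empty tensor, whatever the -1 dimension becomes.
    if known == 0 or numel % known != 0:
        raise ValueError(
            f"shape {list(known_dims) + [-1]} is invalid for input of size {numel}"
        )
    return numel // known


numel = 3 * 9 * 256  # element count of a (3, 9, 256) tensor

# num_objects stays 3: [1, 3, 3, 3, -1] leaves 256 for the last dimension.
ok = infer_last_dim(numel, [1, 3, 3, 3])

# num_objects overwritten with bboxes.shape[1] == 0: [1, 0, 3, 3, -1] cannot fit.
try:
    infer_last_dim(numel, [1, 0, 3, 3])
    failed = False
except ValueError:
    failed = True
```

This is consistent with the fix described above: if num_objects is not overwritten with the (empty) number of boxes, it keeps its configured value of 3 and the reshape has a valid solution.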

(image from the Video Object Counting Dataset)

> Additionally, note that DAVE in zero-shot performs two passes. In the first pass, it estimates the size of objects, based on which it resizes the image and performs a second pass, which improves the results (see main.py).

I'll try to add this two-step approach to the zero-shot demo as well. Thank you very much.

jerpelhan · 2025-01-07T10:16:32Z

We wanted to share an update regarding this repository. We've developed a novel method, GeCo, which outperforms the older approach in this repo by a large margin. You can check out the code and an easy-to-run demo here: https://github.com/jerpelhan/GeCo.

GioFic95 mentioned this issue on Jul 3, 2024: Zero-shot demo #9 (Merged)
