jina-embeddings-v3 model #1072

Open
2 tasks done
yrayegan opened this issue Dec 4, 2024 · 2 comments
Labels
new model Request a new model

Comments

@yrayegan

yrayegan commented Dec 4, 2024

Model description

Support for jina-embeddings-v3

Prerequisites

  • The model is supported in Transformers (i.e., listed here)
  • The model can be exported to ONNX with Optimum (i.e., listed here)

Additional information

No response

Your contribution

None

@yrayegan yrayegan added the new model Request a new model label Dec 4, 2024
@nemphys
Contributor

nemphys commented Jan 2, 2025

+1

@nemphys
Contributor

nemphys commented Jan 5, 2025

@xenova I managed to run the ONNX model from Hugging Face with a few minor tweaks, but I don't know how you would like to handle them:

  1. The model config.json needs the "model_type": "xlm-roberta" property, otherwise its type is not properly detected by transformers.js
  2. The model_inputs during model execution need a 3rd parameter which is the task_id index from the lora_adaptations list ("retrieval.query", "retrieval.passage", "separation", "classification", "text-matching")
  3. The output results property is "text_embeds", which needs to be added to the relevant inspection list.
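
To make tweaks 2 and 3 concrete, here is a minimal TypeScript sketch. It is only an illustration under assumptions: `LORA_ADAPTATIONS`, `taskIdFor` and `pickEmbeddingOutput` are hypothetical names and not part of transformers.js, and the task order is simply the one listed in item 2. How the resulting index is wrapped into a tensor and passed alongside the other inputs is left out, since that depends on how the library decides to expose it.

```typescript
// Task order as listed in item 2 (the model's lora_adaptations list).
const LORA_ADAPTATIONS = [
  "retrieval.query",
  "retrieval.passage",
  "separation",
  "classification",
  "text-matching",
] as const;

type JinaTask = (typeof LORA_ADAPTATIONS)[number];

// Hypothetical helper: resolve a task name to the index passed as task_id.
function taskIdFor(task: JinaTask): number {
  const index = LORA_ADAPTATIONS.indexOf(task);
  if (index === -1) {
    throw new Error(`Unknown task: ${task}`);
  }
  return index;
}

// Hypothetical helper: return the first known embedding output that is
// present in the model result. "text_embeds" is the name reported in item 3.
function pickEmbeddingOutput<T>(
  outputs: Record<string, T>,
  candidates: string[] = ["text_embeds"],
): T {
  for (const name of candidates) {
    const value = outputs[name];
    if (value !== undefined) {
      return value;
    }
  }
  throw new Error(`None of the expected outputs found: ${candidates.join(", ")}`);
}
```

For example, `taskIdFor("separation")` gives the third entry's index, and `pickEmbeddingOutput(result)` throws if the model result has no `text_embeds` key.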

Items 1 and 3 are easy, but for item 2 I suppose we need to create special classes or parameters.

Also, there is the Matryoshka embeddings feature: does it just mean truncating the output to fewer dimensions, or is there more to it that needs special handling?
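
For context, the usual way to consume Matryoshka-style embeddings is to keep the leading dimensions and re-normalize the shortened vector. The sketch below shows that in TypeScript; `truncateEmbedding` is a hypothetical helper, and whether jina-embeddings-v3 needs anything beyond this is an assumption not confirmed in this thread.

```typescript
// Hypothetical helper: keep the first `dim` values of an embedding and
// rescale the result to unit L2 length so cosine/dot-product scores stay comparable.
function truncateEmbedding(embedding: number[], dim: number): number[] {
  if (!Number.isInteger(dim) || dim <= 0 || dim > embedding.length) {
    throw new RangeError(`dim must be an integer between 1 and ${embedding.length}`);
  }
  const head = embedding.slice(0, dim);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // An all-zero vector cannot be normalized; return it unchanged.
  return norm === 0 ? head : head.map((x) => x / norm);
}
```

If this is all that is required, it could be applied as a post-processing step on the pooled output without any model-specific class.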


2 participants