Type Safety for Tokenizer and Processor inputs #1131

Open
tsekiguchi opened this issue Jan 2, 2025 · 1 comment
Labels
enhancement New feature or request

Comments

@tsekiguchi

tsekiguchi commented Jan 2, 2025

Feature request

Currently, each class is constructed so that Transformers.js uses `...args: any[]` when passing arguments to the tokenizer and processor. This causes a lot of developer confusion, because each model requires different inputs. There is very little documentation beyond the single example provided for most models, which has led me to spend many hours figuring out what I can pass to the function in order to make a particular model happy.
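To illustrate the difference (this is a hypothetical sketch, not the actual Transformers.js API): a `...args: any[]` signature compiles no matter what is passed, whereas a typed options interface surfaces mistakes at compile time. The `TokenizerOptions` shape and toy `tokenize` implementation below are assumptions for demonstration only.

```typescript
// A loosely typed signature like this accepts anything at compile time,
// so typos and wrong argument shapes are only discovered at runtime.
type LooseTokenizer = (...args: any[]) => number[];

// A typed options interface lets the compiler catch mistakes immediately.
interface TokenizerOptions {
  padding?: boolean;
  truncation?: boolean;
  max_length?: number;
}

function tokenize(text: string | string[], options: TokenizerOptions = {}): number[] {
  // Toy implementation: one fake "token id" (the word length) per word.
  const texts = Array.isArray(text) ? text : [text];
  const ids = texts.flatMap((t) => t.split(/\s+/).map((w) => w.length));
  const max = options.truncation ? options.max_length ?? ids.length : ids.length;
  return ids.slice(0, max);
}
```

With this shape, `tokenize("hi", { truncatoin: true })` is a compile error rather than a silently ignored option.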

The broad approach of passing a model name string for the tokenizer / processor / model feels unstable. While it mirrors the Python transformers counterpart, it is a no-guardrails approach for such an important library.

Ideally, this would require a bit of a restructure, introducing an alternative Model class so as not to break existing code bases. This class could be instantiated from a list of currently supported models (with an option for custom models, should users want to convert specific models to ONNX on their own), and the class itself would contain the processor, tokenizer, and model with all the proper parameters and type definitions, allowing users to pass inputs accurately for encoding and to know what to expect upon decoding.

A broad example:

// Create the specific class for this model
const clipModel = await baseModel.initialize(ApprovedModels.OpenCLIP, {
     dtype: 'f16',
     device: 'webgpu'
})

// All model-specific methods are on this class
const tokens = clipModel.tokenizer(text)

const inputs = clipModel.processInput(imageInput, { padding: true, truncation: true })
// ...or some more specific methods
const imageInputs = clipModel.processorImageOnly(imageInput, { padding: true, truncation: true })

// Use descriptive names for methods to give a better idea of the output
const embeddings: ClipModelEmbeddings = clipModel.generateEmbeddings(inputs)
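A minimal, self-contained sketch of how such a class could be typed. All names here (`ApprovedModels`, `InitOptions`, `ClipModel`) are assumptions taken from the example above, not existing Transformers.js APIs; the bodies are stubs standing in for real weight loading and tokenization.

```typescript
// Hypothetical union of supported models, so invalid ids fail at compile time.
enum ApprovedModels {
  OpenCLIP = "openclip",
}

// Typed initialization options: invalid dtype/device values are compile errors.
interface InitOptions {
  dtype: "f16" | "f32" | "q8";
  device: "webgpu" | "wasm" | "cpu";
}

class ClipModel {
  private constructor(readonly options: InitOptions) {}

  static async initialize(model: ApprovedModels, options: InitOptions): Promise<ClipModel> {
    // Real code would download and load weights here; this stub just records options.
    return new ClipModel(options);
  }

  tokenizer(text: string): number[] {
    // Toy token ids: one per whitespace-separated word.
    return text.split(/\s+/).map((w) => w.length);
  }
}
```

Because `initialize` only accepts `ApprovedModels` and `InitOptions`, the editor can autocomplete every valid model id, dtype, and device, which is exactly the guardrail the string-based approach lacks.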

Motivation

It would greatly improve DX to provide a class specific to each model, with all the proper options and type definitions that devs are used to in TypeScript. I want to give devs feedback and guardrails as they implement Transformers.js, making it a seamless, delightful experience.

Your contribution

Absolutely! I'd love to help out. It will take a lot of work, but I want to make sure that I have the support of Xenova before heading down this road.

Thank you!

@tsekiguchi tsekiguchi added the enhancement New feature or request label Jan 2, 2025
@tsekiguchi (Author)

I'll also add that this would be an opportunity to provide better error handling across the board. There is a lack of error handling and error messages when something goes wrong, often causing the entire application to crash without logging anything. Many of these crashes could be prevented by generous use of try / catch (which is largely absent right now), along with system-check guardrails when loading models (for example, to avoid OOM crashes when loading large models).
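A hedged sketch of the kind of guardrail described above. `loadModelSafely`, `ModelLoadError`, and the byte-budget check are illustrative assumptions, not existing Transformers.js APIs; the idea is to fail fast with a descriptive error instead of letting an OOM crash take down the application.

```typescript
// A descriptive error type so callers can catch load failures specifically.
class ModelLoadError extends Error {
  constructor(modelId: string, cause: unknown) {
    super(`Failed to load model "${modelId}": ${String(cause)}`);
    this.name = "ModelLoadError";
  }
}

async function loadModelSafely(
  modelId: string,
  loadWeights: (id: string) => Promise<unknown>, // injected loader (hypothetical)
  estimatedBytes: number,
  availableBytes: number,
): Promise<unknown> {
  // System-check guardrail: refuse up front rather than OOM-crashing mid-load.
  if (estimatedBytes > availableBytes) {
    throw new ModelLoadError(
      modelId,
      `model needs ~${estimatedBytes} bytes but only ${availableBytes} are available`,
    );
  }
  try {
    return await loadWeights(modelId);
  } catch (err) {
    // Surface a descriptive error instead of crashing without a log.
    throw new ModelLoadError(modelId, err);
  }
}
```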
