
Links to pre-trained ed and eae models are broken #64

Open
lovering810 opened this issue Dec 16, 2024 · 5 comments

Comments

@lovering810

When trying to run the test, I repeatedly hit a FileNotFoundError for a config file, which I eventually traced back to the actual problem: there are no files at these URLs. What gets downloaded is an HTML page saying the link does not exist. That HTML page can't be unzipped, of course, so the process errors out long before completing.

MODEL_NAMES = {
    "s2s-mt5-ed": "https://cloud.tsinghua.edu.cn/f/cdc4b333aff143ff870e/?dl=1",
    "s2s-mt5-eae": "https://cloud.tsinghua.edu.cn/f/f4ac92ac8f2c4e769282/?dl=1",
}
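For anyone hitting the same thing: a minimal stdlib-only sketch of a safer download step that validates the file is actually a zip archive before extracting, so a dead share link (which returns an HTML error page) fails with a clear message instead of an opaque unzip error. The function name is my own, not from this repo.

```python
# Sketch (my own helper, not from this repo): validate that a downloaded file
# really is a zip archive before extracting it. A broken cloud share link
# typically yields an HTML "not found" page, which zipfile will reject.
import zipfile

def extract_if_zip(archive_path: str, dest_dir: str) -> None:
    if not zipfile.is_zipfile(archive_path):
        with open(archive_path, "rb") as f:
            head = f.read(64)  # show the first bytes to make HTML pages obvious
        raise RuntimeError(
            f"{archive_path} is not a zip archive (starts with {head!r}); "
            "the download link is probably broken."
        )
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest_dir)
```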

Neither of these links is valid, meaning that the single test in the repo cannot run and the library cannot be used out of the box to perform tasks like the one in the README. Where are the fully pretrained models now? Have they been taken down? If so, I'd submit that the README and related documentation should be updated to reflect that there is no out-of-the-box functionality. If not, what should these links be updated to?

My poor baby machine cannot train its own model, I need y'all's.

@h-peng17
Member

The link is updated here: #61

@lovering810
Author

lovering810 commented Dec 17, 2024

Thanks for the response! When I replace the old links with the new ones, I get the error

 The state dictionary of the model you are trying to load is corrupted. Are you sure it was properly saved?

Is there anything additional that I need in order to use these files? As far as I can tell, they download correctly: the directory unzips successfully and contains the following files:

added_tokens.json       pytorch_model.bin       rng_state_3.pth         rng_state_7.pth         tokenizer_config.json
args.yaml               rng_state_0.pth         rng_state_4.pth         special_tokens_map.json trainer_state.json
config.json             rng_state_1.pth         rng_state_5.pth         spiece.model            training_args.bin
latest                  rng_state_2.pth         rng_state_6.pth         tokenizer.json          zero_to_fp32.py

Do I need to run zero_to_fp32.py?
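A hedged guess at what's going on: `latest` plus the `rng_state_*.pth` shards are the layout DeepSpeed writes for a ZeRO checkpoint, and `zero_to_fp32.py` is DeepSpeed's bundled script for consolidating those shards into a single fp32 state dict (usual invocation: `python zero_to_fp32.py <checkpoint_dir> <output_file>`). If the shipped `pytorch_model.bin` is a ZeRO placeholder, that could explain the "corrupted" state dict. A small sketch (the helper function is mine, not from this repo):

```python
# Sketch (my own helper): build the DeepSpeed consolidation command for a
# checkpoint directory like the one listed above. The `latest` tag file is
# how DeepSpeed marks its most recent ZeRO checkpoint.
import os

def zero_to_fp32_command(checkpoint_dir: str) -> list:
    """Build the command to consolidate a ZeRO checkpoint into one fp32 file."""
    if not os.path.exists(os.path.join(checkpoint_dir, "latest")):
        raise ValueError(f"{checkpoint_dir} does not look like a ZeRO checkpoint")
    return [
        "python",
        os.path.join(checkpoint_dir, "zero_to_fp32.py"),
        checkpoint_dir,                                     # where the shards live
        os.path.join(checkpoint_dir, "pytorch_model.bin"),  # consolidated output
    ]
```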

@h-peng17
Member

As a stopgap, perhaps you can try this model (https://huggingface.co/THU-KEG/ADELIE-DPO-1.5B). It's more general and more powerful for information extraction, and its scale is similar to the previous one. I will recheck the issues with the old version of the model soon.

@h-peng17
Member

Have you tried to load the model using

def get_model(model_args, model_name_or_path):

@lovering810
Author

Thanks for your quick responses! In order:

HuggingFace with LangChain

Trying the Hugging Face model ran into some problems (though admittedly, I am stacking LangChain on top of it): when invoking the model (whether set up with HuggingFaceEndpoint or HuggingFacePipeline), it throws a 404 and hangs. Here's a snippet from the HuggingFacePipeline approach:

llm = HuggingFacePipeline.from_model_id(
    model_id="THU-KEG/ADELIE-DPO-1.5B",
    task="text-generation",
    pipeline_kwargs=dict(
        max_new_tokens=512,
        do_sample=False,
        repetition_penalty=1.03,
    ),
)
.......
2024-12-18 16:53:03,221 : DEBUG : urllib3.connectionpool : https://huggingface.co:443 "HEAD /THU-KEG/ADELIE-DPO-1.5B/resolve/main/model.safetensors HTTP/1.1" 404 0

In the case of the Endpoint approach, the 404 happens during invocation:


llm = HuggingFaceEndpoint(
    repo_id="THU-KEG/ADELIE-DPO-1.5B",
    task="text-generation",
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.03,
)

chat_model = ChatHuggingFace(llm=llm)
from langchain_core.messages import (
    HumanMessage,
    SystemMessage,
)
messages = [
    SystemMessage(content="You're a business-minded language expert who can identify important events and entities in text"),
    HumanMessage(
        content=supercorpus[1].descriptions[0].value
    ),
]
ai_msg = chat_model.with_structured_output(EventsOfInterest).invoke(messages)
print(ai_msg)

In either case, I can't seem to get it to respond to the input.

HuggingFace Alone

It did occur to me that maybe this is a LangChain issue, so I bopped around with Hugging Face directly, following the example in the Llama model card, since there is no invocation example on the ADELIE-DPO-1.5B card.
The code:

import transformers
import torch

model_id = "THU-KEG/ADELIE-DPO-1.5B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)
pipeline("Hey how are you doing today?")

The error:

ValueError: Could not load model THU-KEG/ADELIE-DPO-1.5B with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>, <class 'transformers.models.qwen2.modeling_qwen2.Qwen2ForCausalLM'>). See the original errors:
...
ValueError: The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the weights in this format.

(I do have safetensors installed)
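For what it's worth, the second half of that ValueError suggests its own workaround: when `device_map="auto"` spills weights to disk, transformers wants an `offload_folder`. A minimal sketch of the changed kwargs (the `"offload"` path is an arbitrary choice of mine, and the pipeline call is commented out because it downloads the model):

```python
# Sketch of the remedy the ValueError itself suggests: pass an offload_folder
# through model_kwargs so disk-offloaded weights have somewhere to go.
# "offload" is an arbitrary scratch-directory name.
model_kwargs = {
    "torch_dtype": "bfloat16",    # from_pretrained also accepts torch.bfloat16
    "offload_folder": "offload",  # scratch dir for weights that don't fit in memory
}
# pipeline = transformers.pipeline(
#     "text-generation", model=model_id,
#     model_kwargs=model_kwargs, device_map="auto",
# )
```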

Using infer.py

Yup, that's exactly what I tried (inasmuch as it's what's called by the test). I also added comments and logging to try to see what was going on and what might be causing the problem, but struck out. I'm happy to switch to the HF model, I just need help with how, exactly, to leverage it.

Again, big thanks for the help!
