Knowledge has an incompatible new v3 file format #1253
Comments
This is part of #160

The changes here originated from aakankshaduggal@5baf6df

There are two major changes here.

- When parsing a `qna.yaml` file from a taxonomy tree, adjust for the new schema for knowledge. There is no attempt to maintain compatibility with prior versions of the schema (v1, v2).
- Change how we translate the taxonomy data into the dataset sent into the pipeline as input. Instead of implementing a sliding window approach of 3 sample qna pairs at a time over all chunks of the document, we now create a row per seed_example (context and associated qna pairs) for each chunk of knowledge docs.

Co-authored-by: abhi1092 <[email protected]>
Co-authored-by: shiv <[email protected]>
Co-authored-by: Aakanksha Duggal <[email protected]>
Signed-off-by: Russell Bryant <[email protected]>
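The second change described in the commit message (one row per seed example for each document chunk) can be illustrated with a minimal sketch. This is not the code from the referenced commit: the function name `build_rows`, the row keys `document`, `context`, and `qna_pairs`, and the input key `questions_and_answers` are all hypothetical names chosen for illustration.

```python
from typing import Any, Dict, List


def build_rows(
    seed_examples: List[Dict[str, Any]],
    chunks: List[str],
) -> List[Dict[str, Any]]:
    """Pair every document chunk with every seed example.

    Hypothetical sketch: each seed example is assumed to be a dict with a
    "context" string and a "questions_and_answers" list. Each output row
    carries one chunk plus the context and qna pairs of one seed example,
    so the result has len(chunks) * len(seed_examples) rows.
    """
    rows = []
    for chunk in chunks:
        for example in seed_examples:
            rows.append(
                {
                    "document": chunk,  # hypothetical key name
                    "context": example["context"],
                    "qna_pairs": list(example["questions_and_answers"]),
                }
            )
    return rows


# Illustrative data only; not taken from a real taxonomy.
seeds = [
    {"context": "ctx 1", "questions_and_answers": [{"question": "q1", "answer": "a1"}]},
    {"context": "ctx 2", "questions_and_answers": [{"question": "q2", "answer": "a2"}]},
]
chunks = ["chunk A", "chunk B", "chunk C"]
rows = build_rows(seeds, chunks)
```

Unlike a sliding window over qna pairs, the row count here depends only on the number of chunks and seed examples, and each row keeps a seed example's context together with its own qna pairs.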
We have existing v1 knowledge in the main branch which needs to be fixed or removed.
xref #1260
v3 example: #1255
From @juliadenham 👍
We now support V3 knowledge, yay!
See instructlab/sdg#160
A new v3 knowledge format has been added to InstructLab, with no backwards compatibility for v1 or v2 contributions. This will be released in InstructLab v0.18.0.
Existing knowledge contributions need to be updated, along with any documentation on creating knowledge contributions.
https://github.com/instructlab/instructlab/blob/main/scripts/test-data/e2e-qna-knowledge.yaml is an example of the new format