-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement experimental ai-powered search #398
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your great contribution!
I have provided a quick initial review for now, will continue once current changes are solved.
test/utils/client.dart
Outdated
String? get openAiKey { | ||
return Platform.environment['OPEN_AI_API_KEY']; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will not work if the tests are run on web
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest also adding documentation in CONTRIBUTING.md for users who want to run tests to set this env variable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the code to cover flutter web, and updated the documentation in CONTRIBUTING.md file
required this.source, | ||
}); | ||
|
||
Map<String, Object?> toMap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since source
is shared between all embedders, this can be changed to:
Map<String, Object?> toMap(); | |
Map<String, Object?> toMap() { | |
'source': source, | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the code, now it is mapped only once in the base class as you described
|
||
@override | ||
Map<String, Object?> toMap() => { | ||
'source': source, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for all embedders: this can now be changed to
'source': source, | |
...super.toMap(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
lib/src/settings/embedder.dart
Outdated
'userProvided' => UserProvidedEmbedder.fromMap(map), | ||
'rest' => RestEmbedder.fromMap(map), | ||
'ollama' => OllamaEmbedder.fromMap(map), | ||
_ => throw Exception('unexpected embedder source'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be a breaking change if meilisearch adds new embedder versions, so I suggest adding an UnkownEmbedder
class that stores the map as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this makes more sense, however, I guess I'll need to update the structure a bit. It seems that I need to either remove source
field in the base class or make it nullable as unknown embedder's source
field might not be defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the code, now we return unknown embedder with the original data when the source is not handled.
lib/src/settings/embedder.dart
Outdated
this.url, | ||
this.documentTemplateMaxBytes, | ||
this.binaryQuantized, | ||
}) : super(source: 'openAi'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for all embedders: we are typing the source twice, which might lead to typos, once here and once in Embedder.fromMap
, we should probably unify this in a static const sourceValue = 'openAi';
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the code, now it is typed only once
lib/src/settings/distribution.dart
Outdated
Map<String, Object?> toMap() => { | ||
'mean': mean, | ||
'sigma': sigma, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this is used in query parameters, it should automatically exclude null values (as per meilisearch conventions), the logic for this is already present in Queryable
class, so it can be uesd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Distribution isn't used in the query parameters; it is a field of some embedders, and the fields in this class aren't actually nullable.
Hi, thanks for the good feedback. I'll be updating the code based on your comments soon. |
d3380ce
to
a3d03d3
Compare
@ahmednfwela any update here? |
I will do another review now, thanks for your contribution! @memishood |
@memishood since the fork belongs to an organization I can't directly push changes to your PR, so can you please merge main into this ? |
test/search_test.dart
Outdated
@@ -590,6 +591,60 @@ void main() { | |||
// }); | |||
// }); | |||
|
|||
group('Embedders', () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests need to be disabled if the openAI API key was not present, so that CI can pass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, i have updated it
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #398 +/- ##
==========================================
- Coverage 75.83% 75.19% -0.64%
==========================================
Files 55 60 +5
Lines 1320 1524 +204
==========================================
+ Hits 1001 1146 +145
- Misses 319 378 +59 ☔ View full report in Codecov by Sentry. |
I have started a PR of my own suggested edits here: Phosum#1 |
improvements to "implement experimental ai-powered search meilisearch#398"
one last change to fix the test and we can merge 👍 |
Co-authored-by: Ahmed Fwela <[email protected]>
bors merge |
398: implement experimental ai-powered search r=ahmednfwela a=memishood # Pull Request ## Related issue Fixes #369 ## What does this PR do? I've implemented all the required steps mentioned in the issue above. I made every effort to respect the existing project conventions. I hope this pull request will be helpful and add value to Meilisearch 💜 **PR:** I didn't uncomment vector search tests in search_tests.dart as I'm focused on embedders capability. ### Some technical points that I might have considered wrong in the pull-req - Ensure `semanticRatio` is between 0.0 and 1.0 - Throw exception when embedder source is not one of them: `openAi`, `huggingFace`, `userProvided`, `rest`, `ollama` **Any comment & feedback is appreciated!** ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: Emre Memis <[email protected]> Co-authored-by: ahmednfwela <[email protected]> Co-authored-by: Emre <[email protected]>
This PR was included in a batch that successfully built, but then failed to merge into main. It will not be retried. Additional information: {"message":"1 review requesting changes by reviewers with write access.","documentation_url":"https://docs.github.com/articles/about-protected-branches","status":"422"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
bors merge |
Build succeeded: |
407: fix: rename current_mean and current_sigma r=curquiza a=ahmednfwela # Pull Request ## Related issue as discussed on discord, the openapi file incorrectly contains `current_sigma` and `current_mean` when they should just be `sigma` and `mean` ## What does this PR do? fixes #398 ## PR checklist Please check if your PR fulfills the following requirements: - [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [ ] Have you read the contributing guidelines? - [ ] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: ahmednfwela <[email protected]>
Pull Request
Related issue
Fixes #369
What does this PR do?
I've implemented all the required steps mentioned in the issue above. I made every effort to respect the existing project conventions. I hope this pull request will be helpful and add value to Meilisearch 💜
PR: I didn't uncomment vector search tests in search_tests.dart as I'm focused on embedders capability.
Some technical points that I might have considered wrong in the pull-req
semanticRatio
is between 0.0 and 1.0openAi
,huggingFace
,userProvided
,rest
,ollama
Any comment & feedback is appreciated!
PR checklist
Please check if your PR fulfills the following requirements:
Thank you so much for contributing to Meilisearch!