refactor -> select_model(functional) #468
base: main
Conversation
PR Summary
This PR refactors the model selection and batch handling system to improve multiprocessing capabilities and support for new models like nomic-embed-text-v1.5.
- Replaced direct model instantiation with factory functions in /libs/infinity_emb/infinity_emb/inference/batch_handler.py for better multiprocessing support
- Added a CallableReturningBaseTypeHint Protocol in /libs/infinity_emb/infinity_emb/transformer/abstract.py to improve type safety
- Simplified select_model() in /libs/infinity_emb/infinity_emb/inference/select_model.py to return callable engine functions instead of tuples with timing info
- Added tiktoken as a required dependency in /libs/infinity_emb/pyproject.toml for nomic-embed-text-v1.5 support
- Modified telemetry in /libs/infinity_emb/infinity_emb/infinity_server.py to use empty dicts instead of engine capabilities
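The factory-function change can be illustrated with a minimal sketch (DummyModel and make_model_factory are hypothetical names, not the repo's code): instead of pickling an already-loaded model into worker processes, each worker receives a small picklable callable and builds the model itself.

```python
import functools
import pickle


class DummyModel:
    """Hypothetical stand-in for a heavyweight engine."""

    def __init__(self, name: str, device: str):
        self.name, self.device = name, device

    def encode(self, text: str) -> int:
        return len(text)


def make_model_factory(name: str, device: str):
    # functools.partial over a top-level class is picklable, so it can be
    # shipped to a spawned worker process; the heavy model is only built
    # once the worker calls the factory.
    return functools.partial(DummyModel, name, device)


factory = make_model_factory("nomic-embed-text-v1.5", "cpu")
payload = pickle.dumps(factory)  # tiny payload: no model weights inside
model = pickle.loads(payload)()  # worker side: build the model locally
```

The same idea extends to a process pool: the factories are mapped across workers, and each worker materializes its own replica.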
9 file(s) reviewed, 8 comment(s)
Review by Greptile.
# TODO: Can be parallelized
for device_map in engine_args._loading_strategy.device_mapping:  # type: ignore
style: the type: ignore on the device_mapping access should be replaced with a proper type annotation
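A sketch of the suggested fix (LoadingStrategy and the _loading_strategy field are assumptions about the repo's types, not taken from it): narrow the Optional once behind a typed accessor, so call sites no longer need type: ignore.

```python
from dataclasses import dataclass, field


@dataclass
class LoadingStrategy:  # hypothetical stand-in for the real strategy type
    device_mapping: list[str] = field(default_factory=lambda: ["cpu"])


@dataclass
class EngineArgs:  # hypothetical minimal stand-in
    _loading_strategy: "LoadingStrategy | None" = None

    @property
    def loading_strategy(self) -> LoadingStrategy:
        # Narrow the Optional in exactly one place; type checkers then
        # see a non-Optional LoadingStrategy at every call site.
        if self._loading_strategy is None:
            raise ValueError("loading strategy was never resolved")
        return self._loading_strategy


args = EngineArgs(_loading_strategy=LoadingStrategy())
devices = list(args.loading_strategy.device_mapping)  # no type: ignore needed
```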
assert len(engine_replicas) > 0, "No engine replicas were loaded"

- return engine_replicas, min_inference_t, max_inference_t
+ return engine_replicas  # type: ignore
style: type: ignore on return is unnecessary since return type matches annotation
@@ -57,55 +61,51 @@ def get_engine_type_from_config(
    return EmbedderEngine.from_inference_engine(engine_args.engine)


def _get_engine_replica(unloaded_engine, engine_args, device_map) -> "BaseTypeHint":
style: function lacks type hints for unloaded_engine and device_map parameters
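A possible fully annotated shape for the function (a sketch under stated assumptions: the stand-in classes below are hypothetical, and the real parameter types in the repo may differ), also showing how the CallableReturningBaseTypeHint Protocol from this PR could be declared:

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


class BaseTypeHint:
    """Stand-in for the engine base class (assumption)."""


@dataclass
class EngineArgs:  # hypothetical minimal stand-in
    model_name_or_path: str = "dummy"


@runtime_checkable
class CallableReturningBaseTypeHint(Protocol):
    """Zero-argument factory returning a loaded engine replica."""

    def __call__(self) -> BaseTypeHint: ...


def _get_engine_replica(
    unloaded_engine: type[BaseTypeHint],
    engine_args: EngineArgs,
    device_map: str,
) -> BaseTypeHint:
    # The real implementation would load weights onto the given device;
    # here we only instantiate the stand-in class.
    return unloaded_engine()


replica = _get_engine_replica(BaseTypeHint, EngineArgs(), "cpu")
```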
)
telemetry_log_info()
app.engine_array = AsyncEngineArray.from_args(engine_args_list)  # type: ignore
th = threading.Thread(
    target=send_telemetry_start,
-   args=(engine_args_list, [e.capabilities for e in app.engine_array]),  # type: ignore
+   args=(engine_args_list, [{} for e in app.engine_array]),  # type: ignore
logic: Passing empty dictionaries instead of actual engine capabilities will result in loss of telemetry data about model capabilities
@@ -56,6 +56,7 @@ diskcache = {version = "*", optional=true}
onnxruntime-gpu = {version = "1.19.*", optional=true}
tensorrt = {version = "^10.6.0", optional=true}
soundfile = {version="^0.12.1", optional=true}
+ tiktoken = "^0.8.0"
logic: tiktoken should be marked as optional since it's only needed for specific models. Add optional=true to the dependency.
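Following the reviewer's suggestion, the entry could mirror the neighboring optional dependencies; a sketch (how this pyproject.toml wires optional dependencies into extras groups is an assumption, not taken from the file):

```toml
tiktoken = {version = "^0.8.0", optional=true}
```

With Poetry, an optional dependency must also be listed in an extras group before users can pull it in via pip's extras syntax.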
@@ -16,3 +16,4 @@ def test_engine(engine):
        model_warmup=False,
    )
)
+ [model_func() for model_func in model_funcs]
style: Consider catching potential exceptions when calling model functions - initialization could fail for various reasons
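One way to follow this suggestion (load_all is a hypothetical helper, not part of the PR): collect failures instead of letting the bare list comprehension raise anonymously on the first one.

```python
def load_all(model_funcs):
    """Call every model factory, reporting which ones failed and why."""
    models, errors = [], []
    for i, model_func in enumerate(model_funcs):
        try:
            models.append(model_func())
        except Exception as exc:  # model loading can fail for many reasons
            errors.append((i, exc))
    if errors:
        raise RuntimeError(f"{len(errors)} model(s) failed to load: {errors}")
    return models
```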
Codecov Report
Attention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files:

@@            Coverage Diff             @@
##             main     #468      +/-   ##
==========================================
+ Coverage   79.51%   79.56%   +0.05%
==========================================
  Files          41       41
  Lines        3417     3441      +24
==========================================
+ Hits         2717     2738      +21
- Misses        700      703       +3

☔ View full report in Codecov by Sentry.
Related Issue
Checklist
Additional Notes