Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add profile to use ollama + bump quarkus and langchain4j versions #46

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

mariofusco
Copy link

I added a maven profile to optionally run all steps using a local ollama server instead of connecting to OpenAI. I believe this could be a good alternative for users not having an OpenAI account. If required the same thing could be done also integrating Jlama for instance.

If you don't want to change all steps, another option could be adding one more step at the end of the workshop, also showing how to integrate a different server. Let me know which of the 2 possibilities you prefer and I will also update the documentation accordingly.

/cc @kdubois

@@ -22,6 +23,7 @@ public String onOpen() {
}

@OnTextMessage
@ActivateRequestContext
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This annotation now seems to be necessary both to execute queries not marked with @Transactional and to integrate the guardrail. I guess this is a consequence of the Quarkus version bump, but I'm not sure if this is now expected or a bug/regression that needs to be investigated. /cc @geoand

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is weird, I am not aware of a change that should not require this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm further digging into this and found that this is something that changed between 3.16.4 and 3.17.0 but I still don't know which specific commit caused this. I will keep investigating since at this point I believe that this is a regression or at least an unwanted side effect. I will appreciate any help or hint on this, anyway running this example, without that annotation, on 3.17.x produces the following exception:

2025-01-03 09:34:43,209 ERROR [dev.lan.qua.wor.CustomerSupportAgentWebSocket] (vert.x-worker-thread-1) Error calling the LLM: RequestScoped context was not active when trying to obtain a bean instance for a client proxy of CLASS bean [class=dev.langchain4j.quarkus.workshop.PromptInjectionDetectionService$$QuarkusImpl, id=l6z9IEVbH0Wg6OZgMapyH0p7Bck]
        - you can activate the request context for a specific method using the @ActivateRequestContext interceptor binding

Exception in CustomerSupportAgentWebSocket.java:29
          27      public String onTextMessage(String message) {
          28          try {
        → 29              return customerSupportAgent.chat(message);
          30          } catch (GuardrailException e) {
          31              Log.errorf(e, "Error calling the LLM: %s", e.getMessage());



Exception in PromptInjectionGuard.java:19
          17      @Override
          18      public InputGuardrailResult validate(UserMessage userMessage) {
        → 19          double result = service.isInjection(userMessage.singleText());
          20          if (result > 0.7) {
          21              return failure("Prompt injection detected");



Exception in CustomerSupportAgentWebSocket.java:29
          27      public String onTextMessage(String message) {
          28          try {
        → 29              return customerSupportAgent.chat(message);
          30          } catch (GuardrailException e) {
          31              Log.errorf(e, "Error calling the LLM: %s", e.getMessage());

: io.quarkiverse.langchain4j.runtime.aiservice.GuardrailException: RequestScoped context was not active when trying to obtain a bean instance for a client proxy of CLASS bean [class=dev.langchain4j.quarkus.workshop.PromptInjectionDetectionService$$QuarkusImpl, id=l6z9IEVbH0Wg6OZgMapyH0p7Bck]
        - you can activate the request context for a specific method using the @ActivateRequestContext interceptor binding
        at io.quarkiverse.langchain4j.runtime.aiservice.GuardrailsSupport.invokeInputGuardrails(GuardrailsSupport.java:47)
        at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport.doImplement(AiServiceMethodImplementationSupport.java:239)
        at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport.implement(AiServiceMethodImplementationSupport.java:131)
        at dev.langchain4j.quarkus.workshop.CustomerSupportAgent$$QuarkusImpl.chat(Unknown Source)
        at dev.langchain4j.quarkus.workshop.CustomerSupportAgent$$QuarkusImpl_ClientProxy.chat(Unknown Source)
        at dev.langchain4j.quarkus.workshop.CustomerSupportAgentWebSocket.onTextMessage(CustomerSupportAgentWebSocket.java:29)
        at dev.langchain4j.quarkus.workshop.CustomerSupportAgentWebSocket_WebSocketServerEndpoint.doOnTextMessage(Unknown Source)
        at io.quarkus.websockets.next.runtime.WebSocketEndpointBase$4.call(WebSocketEndpointBase.java:158)
        at io.quarkus.websockets.next.runtime.WebSocketEndpointBase$4.call(WebSocketEndpointBase.java:152)
        at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$4(ContextImpl.java:192)
        at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:270)
        at io.vertx.core.impl.ContextImpl$1.execute(ContextImpl.java:221)
        at io.vertx.core.impl.WorkerTask.run(WorkerTask.java:56)
        at org.jboss.threads.ContextHandler$1.runWith(ContextHandler.java:18)
        at org.jboss.threads.EnhancedQueueExecutor$Task.doRunWith(EnhancedQueueExecutor.java:2675)
        at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2654)
        at org.jboss.threads.EnhancedQueueExecutor.runThreadBody(EnhancedQueueExecutor.java:1627)
        at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1594)
        at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:11)
        at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:11)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: jakarta.enterprise.context.ContextNotActiveException: RequestScoped context was not active when trying to obtain a bean instance for a client proxy of CLASS bean [class=dev.langchain4j.quarkus.workshop.PromptInjectionDetectionService$$QuarkusImpl, id=l6z9IEVbH0Wg6OZgMapyH0p7Bck]
        - you can activate the request context for a specific method using the @ActivateRequestContext interceptor binding
        at io.quarkus.arc.impl.ClientProxies.notActive(ClientProxies.java:70)
        at io.quarkus.arc.impl.ClientProxies.getSingleContextDelegate(ClientProxies.java:30)
        at dev.langchain4j.quarkus.workshop.PromptInjectionDetectionService$$QuarkusImpl_ClientProxy.arc$delegate(Unknown Source)
        at dev.langchain4j.quarkus.workshop.PromptInjectionDetectionService$$QuarkusImpl_ClientProxy.isInjection(Unknown Source)
        at dev.langchain4j.quarkus.workshop.PromptInjectionGuard.validate(PromptInjectionGuard.java:19)
        at io.quarkiverse.langchain4j.guardrails.InputGuardrail.validate(InputGuardrail.java:40)
        at io.quarkiverse.langchain4j.guardrails.InputGuardrail.validate(InputGuardrail.java:14)
        at dev.langchain4j.quarkus.workshop.PromptInjectionGuard_ClientProxy.validate(Unknown Source)
        at io.quarkiverse.langchain4j.runtime.aiservice.GuardrailsSupport.guardrailResult(GuardrailsSupport.java:188)
        at io.quarkiverse.langchain4j.runtime.aiservice.GuardrailsSupport.invokeInputGuardRails(GuardrailsSupport.java:181)
        at io.quarkiverse.langchain4j.runtime.aiservice.GuardrailsSupport.invokeInputGuardrails(GuardrailsSupport.java:43)
        ... 21 more

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkouba It seems you worked a lot in that area, do you know something about this issue?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, the request context is always activated during invocation of an @OnTextMessage callback. Is it executed on the same thread? I mean the PromptInjectionGuard bean...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the callback is executed on the same thread, so this is not the problem. Also the 2 versions of Quarkus that I'm testing (3.16.4 and 3.17.5) use the same version of vertx (4.5.11), so the problem is not even there. However when I run the example with Quarkus 3.16 the vertx context contains the following

result = {ConcurrentHashMap@21699}  size = 6
 "io.quarkus.websockets.next.runtime.WebSocketConnectionBase" -> {WebSocketConnectionImpl@21711} "WebSocket connection [endpointId=dev.langchain4j.quarkus.workshop.CustomerSupportAgentWebSocket, path=/customer-support-agent, id=aeef3fe3-ddb8-4893-9dcf-23a31a07c353]"
 "aiservice.methodname" -> "chat"
 {Object@21714}  -> {Boolean@21715} true
 "io.quarkus.vertx.cdi-current-contextjakarta.enterprise.context.SessionScoped" -> {WebSocketSessionContext$SessionContextState@21717} 
 "aiservice.classname" -> "dev.langchain4j.quarkus.workshop.CustomerSupportAgent"
 "io.quarkus.vertx.cdi-current-contextjakarta.enterprise.context.RequestScoped" -> {RequestContext$RequestContextState@21720} 

while with 3.17 the last entry in this map, the one requested in the PromptInjectionGuard case, is missing thus causing the problem. I still don't know why, just brainstorming my findings here while keep investigating on this.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, found it: quarkusio/quarkus#44323

See also https://quarkus.io/guides/websockets-next-reference#request-context

It might be that in your case a request scoped dependency is looked up programatically (i.e. no @Inject). In this case, you can set quarkus.websockets-next.server.activate-request-context to always.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkouba Thanks a lot for clarification. Properly injecting PromptInjectionDetectionService fixed the problem.

@cescoffier I added a second commit also implementing the fix suggested by Martin.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the original error message was:

Caused by: jakarta.enterprise.context.ContextNotActiveException: RequestScoped context was not active when trying to obtain a bean instance for a client proxy of CLASS bean 
[class=dev.langchain4j.quarkus.workshop.PromptInjectionDetectionService$$QuarkusImpl, id=l6z9IEVbH0Wg6OZgMapyH0p7Bck]
        - you can activate the request context for a specific method using the @ActivateRequestContext interceptor binding

PromptInjectionDetectionService$$QuarkusImpl is a bean class generated by quarkus-langchain4j. And the @RequestScoped was added automatically.

I looked at the second commit and the fix should not make any difference because simplified constructor injection is still a regular injection.

In other words, it's more likely the @ApplicationScoped added to the PromptInjectionDetectionService that fixed the problem - quarkus-langchain4j probably honors the scope annotation when generating the
PromptInjectionDetectionService$$QuarkusImpl class.

Back to the original problem - WS next analyzes the endpoint and its dependency tree and if no request scoped bean is found the request context is not activated. This means that PromptInjectionDetectionService$$QuarkusImpl is not detected as a part of the dependency tree.

I believe that this is the problematic part:
https://github.com/quarkiverse/quarkus-langchain4j/blob/main/core/runtime/src/main/java/io/quarkiverse/langchain4j/runtime/aiservice/GuardrailsSupport.java#L188

this.service = service;
}
@Inject
PromptInjectionDetectionService service;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, actually this should not make any difference because simplified constructor injection is still a regular injection 🤔

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you're right and what does make the difference is having added the @ApplicationScoped annotation on the PromptInjectionDetectionService, correct?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you're right and what does make the difference is having added the @ApplicationScoped annotation on the PromptInjectionDetectionService, correct?

Yes, I think so ;-)

#46 (comment)

@kdubois
Copy link
Contributor

kdubois commented Jan 7, 2025

I added a maven profile to optionally run all steps using a local ollama server instead of connecting to OpenAI. I believe this could be a good alternative for users not having an OpenAI account. If required the same thing could be done also integrating Jlama for instance.

If you don't want to change all steps, another option could be adding one more step at the end of the workshop, also showing how to integrate a different server. Let me know which of the 2 possibilities you prefer and I will also update the documentation accordingly.

/cc @kdubois

After discussing with @mariofusco I think the best option would be to add ollama (or Podman AI Lab) as an optional step at the end... perhaps something we can also just demo. Otherwise the attendees will have the added hurdle of needing to install an external tool (ollama) and then download a model which isn't ideal on conference wifi.

@mariofusco
Copy link
Author

I added a maven profile to optionally run all steps using a local ollama server instead of connecting to OpenAI. I believe this could be a good alternative for users not having an OpenAI account. If required the same thing could be done also integrating Jlama for instance.
If you don't want to change all steps, another option could be adding one more step at the end of the workshop, also showing how to integrate a different server. Let me know which of the 2 possibilities you prefer and I will also update the documentation accordingly.
/cc @kdubois

After discussing with @mariofusco I think the best option would be to add ollama (or Podman AI Lab) as an optional step at the end... perhaps something we can also just demo. Otherwise the attendees will have the added hurdle of needing to install an external tool (ollama) and then download a model which isn't ideal on conference wifi.

+1 I will rework this pull request to have ollama configuration as the last (and optional) step.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants