Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: prompt generation strategy #6010

Open
wants to merge 9 commits into
base: feat/apex-oas
Choose a base branch
from

Conversation

mingxuanzhangsfdx
Copy link
Member

@mingxuanzhangsfdx mingxuanzhangsfdx commented Jan 16, 2025

What does this PR do?

Implement a bidder for prompt generation based on different strategies.

What issues does this PR fix or reference?

@W-17587267@

Functionality Before

The prompt generation is hard-coded with little flexibility.

Functionality After

Prompt generation will pick a strategy based on the effectiveness and pre-set preference.

@@ -0,0 +1,128 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this file out of the the commands folder. Maybe to src/oas, or something like that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I would like to see the two strategies exposed as distinct classes, with a proper interface defined for them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public interface GenerationStratey {
  bid (..): GenerationStrategyBid;
  generate: (..): GenerationResult;
  getBid: (): GenerationStrategyBid;
  getPromptTokenCount(): number;
}

Copy link
Contributor

@peternhale peternhale Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public class FooStrategy implements GenerationStrategy {
  public static bid (..): GenerationStrategyBid {
    ...
    return this;
  }  
  public generate: (..): GenerationResult {
    ...
    return fooResults;
  }  
  public getBid: (): GenerationStrategyBid {
    return this.bid
  } 
  getPromptTokenCount(): number {
    return return Math.floor(this.bid.prompt.length / 4);
  }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const bidders = [
  KitchenSinkStrategy.bid,
  VeryPickyStrategy.bid,
...
  FooStrategy.bid
];

...
const bids = await Promise.all(bidders.map(bidMethod => bidMethod()));

BAD_CLASS
}

export const PROMPT_TOKEN_MAX_LIMIT = 14 * 1024;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use 95% of these two values as the limits we will impose. The 95% size is the limit being imposed by callLLM.
Don't forget the sum of prompt and response tokens must be < 16K * 95%

const END_OF_PROMPT_TAG = '<|endofprompt|>';
const USER_TAG = '<|user|>';
const ASSISTANT_TAG = '<|assistant|>';
const systemPrompt = `
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think about how we can tune these prompts without having to rebuild.

return input;
}

public getTokenCount(prompt: string): number {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public getTokenCount(prompt: string): number {
public getPromptTokenCount(prompt: string): number {

return [GenerationStrategy.KITCHEN_SINK, 1];
} else {
let count = 0;
for (const method of context.methods) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe context.methods will include all methods in the class, this should be methods enumerated by eligibility check.

}
}

public generatePrompt(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be deligated to a strategy class

const promptKitchenSink = this.generatePrompt(GenerationStrategy.KITCHEN_SINK, metadata, context);
const tokenCountKitchenSink = this.getTokenCount(promptKitchenSink);
if (tokenCountKitchenSink <= PROMPT_TOKEN_MAX_LIMIT) {
return [GenerationStrategy.KITCHEN_SINK, 1];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we formalize the return type into its own type?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 would recommend creating interface GenerationStrategy and concrete classes responsible to the generation logic inside them. So we can keep the Orchestrator simple and add as many strategies as we need.

Ensure no sensitive details are included. Decline requests unrelated to OpenAPI v3 specs or asking for sensitive information.`;

export class promptGenerationOrchestrator {
public pickGenerationStrategy(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's imagine there are 50 strategies and how to handle finding the best bid.
The most direct approach would be to call each one, collect the bids and then determine which of those wins.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you thinking something like factory pattern or something a little more involved where bidding happens in parallel?

const END_OF_PROMPT_TAG = '<|endofprompt|>';
const USER_TAG = '<|user|>';
const ASSISTANT_TAG = '<|assistant|>';
const systemPrompt = `
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be coming from a config file of some sort?

console.log('document text = ' + documentText);

const userPrompt = // to be adjusted for rest resource class only
'Generate an OpenAPI v3 specification for the following Apex class. The OpenAPI v3 specification should be a YAML file. The paths should be in the format of /{ClassName}/{MethodName} for all the @AuraEnabled methods specified. For every `type: object`, generate a `#/components/schemas` entry for that object. The method should have a $ref entry pointing to the generated `#/components/schemas` entry. Only include methods that have the @AuraEnabled annotation in the paths of the OpenAPI v3 specification. I do not want AUTHOR_PLACEHOLDER in the result.';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect these might evolve quicker than the code itself, may be good to keep them in some config file or a constant.

} from '../openApiUtilities/schemas';

enum GenerationStrategy {
KITCHEN_SINK,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we go ahead and call these: WHOLE_CLASS, METHOD_BY_METHOD and INVALID or something like that?

Ensure no sensitive details are included. Decline requests unrelated to OpenAPI v3 specs or asking for sensitive information.`;

export class promptGenerationOrchestrator {
public pickGenerationStrategy(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you thinking something like factory pattern or something a little more involved where bidding happens in parallel?

@@ -0,0 +1,7 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please combine this with whole class json into a single json file keyed by strategy name

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the plan to expose these to the user?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will help us maintain and tweak the prompt. We declared the prompt info in typescript in local scope only before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also hesitating to introduce json files since it might introduce extra maintenance concern considering that the string format is a bit different from that in a typescript file, and json file is not that native as a javascript file during esbuild bundling. But I think it does not hurt for now before we clarify the next step.

What kinds of options are you considering to use to support various templates? Like environment variables, or VSCode configurations... Is the option of various templates only for our exploration to find the best prompt and will we keep the best prompt template as the default one if we are going to have a "template library" when it comes to TDX?

`${generalPrompt.SYSTEM_TAG}\n${generalPrompt.systemPrompt}\n${generalPrompt.END_OF_PROMPT_TAG}\n${generalPrompt.USER_TAG}\n` +
wholeClassPrompt.USER_PROMPT +
'\nThis is the Apex class the OpenAPI v3 specification should be generated for:\n```\n' +
documentText +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should account for the ability to apply additional context.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree and what do you think of supporting the flexibility in another story?

@@ -97,3 +97,13 @@ export enum ApexOASResource {
// method: ApexMethodOASFilter;
// property: ApexPropertyOASFilter;
// };

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we no longer need the comment code above, please remove it.

eslint.config.mjs Show resolved Hide resolved
@mingxuanzhangsfdx mingxuanzhangsfdx marked this pull request as ready for review January 17, 2025 18:35
@mingxuanzhangsfdx mingxuanzhangsfdx requested a review from a team as a code owner January 17, 2025 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants