StreamCompletionEnumerableAsync doesn't work with audio #398

Open
vbandi opened this issue Jan 10, 2025 · 0 comments · May be fixed by #399
Labels: enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed)

vbandi commented Jan 10, 2025

Bug Report

Overview

The gpt-4o-audio-preview-2024-12-17 model supports audio as an output modality, but streaming its responses with StreamCompletionEnumerableAsync doesn't seem to work.

To Reproduce

Steps to reproduce the behavior:

Run the code below. None of the streamed chunks contains a Delta or any audio data.

async Task GPTSpeech()
{
    var client = new OpenAIClient();
    var speaker = new SpeakerOutput();  // not used in this snippet

    // Request audio output by passing an AudioConfig together with the audio-preview model.
    var chatRequest = new ChatRequest([new Message(Role.System, "Count from 1 to 10. Whisper please")],
        audioConfig: new AudioConfig(Voice.Nova), model: "gpt-4o-audio-preview-2024-12-17");  // Doesn't seem to work... OpenAI Lib issue??

    await foreach (var chunk in client.ChatEndpoint.StreamCompletionEnumerableAsync(chatRequest))
    {
        // Text delta, if the chunk carries one.
        if (chunk.FirstChoice.Delta is not null)
            Console.Write(chunk.FirstChoice.Delta.Content);

        // Audio payload, if the chunk carries one.
        if (chunk.FirstChoice.Message?.AudioOutput is not null)
            Console.WriteLine(chunk.FirstChoice.Message.AudioOutput.Data.Length);
    }

    Console.WriteLine("Done.");
    Console.ReadKey();
}

However, when the audio config and the audio model are left out of the ChatRequest, streaming still works:

    var chatRequest = new ChatRequest([new Message(Role.System, "Count from 1 to 10. Whisper please")]);

Expected behavior

Chunks should contain text and/or audio content when the model is generating audio.
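For reference, here is a minimal sketch of what "audio content in the chunks" could look like at the wire level. It assumes, based on my reading of the OpenAI chat completions streaming format and not verified against this library, that audio arrives under choices[0].delta.audio with a base64 "data" field and a "transcript" field, rather than in delta.content. The class name AudioDeltaSketch and its Accumulate method are hypothetical helpers made up for illustration; they are not part of OpenAI-DotNet.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;

// Hypothetical helper, not part of any library.
public static class AudioDeltaSketch
{
    // Collects audio bytes and transcript text from raw streamed chunk payloads.
    // ASSUMPTION: each payload is the JSON body of one streamed chunk, and audio
    // lives at choices[0].delta.audio.{data,transcript}.
    public static (byte[] Audio, string Transcript) Accumulate(IEnumerable<string> chunkJsonPayloads)
    {
        using var audio = new MemoryStream();
        var transcript = new StringBuilder();

        foreach (var json in chunkJsonPayloads)
        {
            using var doc = JsonDocument.Parse(json);
            var root = doc.RootElement;

            // Skip chunks that carry no choices.
            if (root.ValueKind != JsonValueKind.Object ||
                !root.TryGetProperty("choices", out var choices) ||
                choices.ValueKind != JsonValueKind.Array ||
                choices.GetArrayLength() == 0)
                continue;

            var choice = choices[0];
            if (choice.ValueKind != JsonValueKind.Object ||
                !choice.TryGetProperty("delta", out var delta) ||
                delta.ValueKind != JsonValueKind.Object)
                continue;

            // Skip text-only deltas; only the assumed "audio" object is of interest here.
            if (!delta.TryGetProperty("audio", out var audioDelta) ||
                audioDelta.ValueKind != JsonValueKind.Object)
                continue;

            if (audioDelta.TryGetProperty("transcript", out var text) &&
                text.ValueKind == JsonValueKind.String)
                transcript.Append(text.GetString());

            if (audioDelta.TryGetProperty("data", out var data) &&
                data.ValueKind == JsonValueKind.String)
            {
                // Audio bytes are assumed to be base64-encoded in the payload.
                var bytes = data.GetBytesFromBase64();
                audio.Write(bytes, 0, bytes.Length);
            }
        }

        return (audio.ToArray(), transcript.ToString());
    }
}
```

If that assumption holds, a fix in the library would need to surface the delta's audio object on each chunk, so that a consumer could append the decoded bytes to a player buffer as they arrive instead of waiting for a final message.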

@vbandi added the bug label Jan 10, 2025
@StephenHodgson added the help wanted, good first issue, and enhancement labels and removed the bug label Jan 10, 2025
@StephenHodgson self-assigned this Jan 10, 2025
@StephenHodgson linked a pull request Jan 11, 2025 that will close this issue