How to combine ChatGPT and the Charactr API to get a voice response?

The main idea is to send a request to ChatGPT, get a text response, and pass that text as input to Charactr TTS. This takes a few small steps:

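  1. First, install the charactr-api-sdk library to use the gemelo.ai TTS API, and the openai library to use the OpenAI API for Whisper and ChatGPT. The code below uses the openai.ChatCompletion interface, which was removed in version 1.0 of the openai library, so pin an earlier version.
pip install charactr-api-sdk "openai<1"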

Also install the other required libraries.

pip install requests ipython
  1. Load all required libraries.
import json
from typing import Dict, List

import IPython.display
import openai
import requests
from charactr_api import CharactrAPISDK, Credentials
  1. Set your OpenAI and gemelo.ai API keys so that you can connect to both services with your accounts.
openai_api_key = 'xxxx'
charactr_client_key = 'yyyy'
charactr_api_key = 'zzzz'

openai.api_key = openai_api_key
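Hard-coded keys are easy to leak when a notebook is shared. As a minimal sketch, the keys could instead be read from environment variables. The helper `load_key` and the variable names in the comments are illustrative assumptions, not part of either SDK.

```python
import os


def load_key(name: str) -> str:
    """Return the value of environment variable `name`, or fail with a clear error."""
    value = os.environ.get(name, '').strip()
    if not value:
        raise RuntimeError(f'Environment variable {name} is not set')
    return value


# Hypothetical variable names; use whatever fits your setup.
# openai_api_key = load_key('OPENAI_API_KEY')
# charactr_client_key = load_key('CHARACTR_CLIENT_KEY')
# charactr_api_key = load_key('CHARACTR_API_KEY')
```

Failing early with a named variable makes a missing key easier to diagnose than an authentication error from the API later on.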
  1. Create a CharactrAPISDK instance using your gemelo.ai keys. It gives you access to TTS and to the list of voices.
credentials = Credentials(client_key=charactr_client_key, api_key=charactr_api_key)
charactr_api = CharactrAPISDK(credentials)
  1. Check the list of available voices to choose one of them.
charactr_api.tts.get_voices()

The output is the list of available voices. Choose one and set voice_id; this voice will be used to generate the voice response.

voice_id = 136
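Picking an id by eye gets tedious when the list is long. The sketch below looks a voice up by name instead. It assumes each entry is dict-like with `id` and `name` keys; check the actual output of `get_voices()` before relying on it. The helper `find_voice_id` and the sample data are made up for illustration.

```python
from typing import Any, Dict, List, Optional


def find_voice_id(voices: List[Dict[str, Any]], name: str) -> Optional[int]:
    """Return the id of the first voice whose name matches `name` (case-insensitive)."""
    wanted = name.strip().lower()
    for voice in voices:
        # Assumed keys: 'name' and 'id'.
        if str(voice.get('name', '')).lower() == wanted:
            return voice.get('id')
    return None


# Made-up sample data standing in for the output of charactr_api.tts.get_voices().
sample_voices = [
    {'id': 1, 'name': 'Alice'},
    {'id': 2, 'name': 'Bob'},
]
print(find_voice_id(sample_voices, 'bob'))  # prints 2
```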
  1. Set the model and the other parameters for ChatGPT. You can find the full list of parameters and their explanations here: https://platform.openai.com/docs/api-reference/chat/create
model = 'gpt-3.5-turbo'
parameters = {
    'temperature': 0.8,
    'max_tokens': 150,
    'top_p': 1,
    'presence_penalty': 0,
    'frequency_penalty': 0,
    'stop': None
}

Define a function that generates a text response with ChatGPT using the parameters above.

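def generate(request: str) -> str:
    """Generate a text response with ChatGPT."""
    messages = [{'role': 'user', 'content': request}]
    # openai.ChatCompletion is the pre-1.0 interface of the openai library.
    result = openai.ChatCompletion.create(model=model,
                                          messages=messages,
                                          **parameters)
    try:
        response = result['choices'][0]['message']['content'].strip()
    except (KeyError, IndexError) as e:
        # Fail with a clear message if the reply has an unexpected shape.
        raise RuntimeError('Unexpected ChatGPT response format') from e
    return response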
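The function above sends a single user message, so ChatGPT has no memory of earlier requests. As a hedged sketch, the `messages` list could be built from an optional system prompt and earlier turns. The helpers `build_messages` and `remember` are hypothetical names. The roles `system`, `user`, and `assistant` are the ones the chat API accepts.

```python
from typing import Dict, List, Optional


def build_messages(request: str,
                   history: Optional[List[Dict[str, str]]] = None,
                   system_prompt: Optional[str] = None) -> List[Dict[str, str]]:
    """Build the `messages` list for a chat completion request."""
    messages: List[Dict[str, str]] = []
    if system_prompt:
        messages.append({'role': 'system', 'content': system_prompt})
    # Earlier turns go between the system prompt and the new request.
    messages.extend(history or [])
    messages.append({'role': 'user', 'content': request})
    return messages


def remember(history: List[Dict[str, str]], request: str, response: str) -> None:
    """Append one finished exchange to the history in place."""
    history.append({'role': 'user', 'content': request})
    history.append({'role': 'assistant', 'content': response})
```

A system prompt such as 'Keep answers short.' can be useful here, because shorter text also means a shorter audio clip.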
  1. Type your request to ChatGPT.
text = 'Tell me a joke'
  1. Generate a voice response with ChatGPT and gemelo.ai API.
response = generate(text)
tts_result = charactr_api.tts.convert(voice_id, response)

It returns an Audio object with the fields data, type, duration_ms, and size_bytes. To listen to the voice response in a notebook, run:

IPython.display.Audio(tts_result['data'])
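The text and speech steps can be wrapped in one function. The sketch below takes both steps as arguments, so it can be tried with stand-in functions before spending API credits. The name `voice_response` and its parameters are illustrative assumptions.

```python
from typing import Any, Callable


def voice_response(request: str,
                   generate_fn: Callable[[str], str],
                   convert_fn: Callable[[str], Any]) -> Any:
    """Generate a text reply, then pass it to a text-to-speech function."""
    response = generate_fn(request)
    if not response:
        # An empty reply would only waste a TTS call.
        raise ValueError('The text response is empty; nothing to convert')
    return convert_fn(response)


# With the objects defined in this guide, the call might look like:
# tts_result = voice_response(
#     text, generate, lambda reply: charactr_api.tts.convert(voice_id, reply))
```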

You can also save the output voice response as a file.

with open('output.wav', 'wb') as f:
    f.write(tts_result['data'])
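The other fields of the result are useful when saving. As a minimal sketch, the helper below writes the audio bytes and returns a short summary. It assumes the result behaves like a dict, as the `tts_result['data']` access above suggests. `save_audio` is a hypothetical name.

```python
from pathlib import Path
from typing import Any, Mapping


def save_audio(audio: Mapping[str, Any], path: str) -> str:
    """Write audio['data'] to `path` and return a one-line summary."""
    data = audio['data']
    Path(path).write_bytes(data)
    # duration_ms and type are treated as optional in this sketch.
    seconds = audio.get('duration_ms', 0) / 1000
    audio_type = audio.get('type', 'unknown')
    return f'{path}: {len(data)} bytes, {seconds:.1f} s, type {audio_type}'


# Possible usage with the result from the step above:
# print(save_audio(tts_result, 'output.wav'))
```

It may be worth checking the reported type before choosing the file extension, since a .wav name does not guarantee WAV content.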