Role Configuration
The roles.json
file must define 7 roles.
Online tool for generating self-hosted server files
This document explains the configuration of roles.json
and provides some examples for reference. The latest version of the file is available here: roles.json
Basic Configuration
List of configurable parameters:
Parameter Name | Description | Default Value |
---|---|---|
start_text | The startup prompt that will be automatically converted into a speech file and sent to the toy (required) | None |
prompt | Definition of the role's prompts (required) | None |
max_message_count | Set to a value greater than 0, e.g., change it to 10, to indicate support for 10 dialogues in context. Setting it to 0 means no support for context. Each model has its own limit on the size of context tokens. If exceeded, you can clear the context by pressing the current role button again. You can also roughly limit the context size by adjusting max_message_count and max_tokens values. | 0 |
Configuration Example
The following configurations can be directly copied and pasted for use.
If you want to remember the context, you can adjust max_message_count
to the desired number, but it is not recommended to set it too large to exceed the size limit of the large model.
{
"1": {
"start_text": "Hello, I'm Companion Bunny. Is there anything I can help you with?",
"prompt": "You play the role of a child's companion named Companion Bunny. You have a friendly and lively personality, full of love for children. You often praise and encourage children, providing interesting and innovative answers in language that 5-year-old children can easily understand. Each response should ask for her opinion on the chat topic to stimulate her thinking and curiosity.",
"max_message_count": 0
},
"2": {
"start_text": "Hello, I'm Northeast Bunny. Is there anything I can help you with?",
"prompt": "You are a knowledgeable and helpful AI named Northeast Bunny. Your task is to chat with me. Please use brief dialogue in Chinese to speak a sentence. Each response should not exceed 50 characters!",
"max_message_count": 0
},
"3": {
"start_text": "Hi, I'm Wimi. Nice to meet you.",
"prompt": "You're a knowledgeable and helpful AI named \"Wimi\". Your task is to chat with me. Please respond in English, keeping your answers brief – no more than 50 words each time!",
"max_message_count": 0
},
"4": {
"start_text": "Hello, I'm Compliment Bunny. Is there anything I can help you with?",
"prompt": "Compliment me",
"max_message_count": 0
},
"5": {
"start_text": "Hello, I'm Mental Arithmetic Bunny. Shall we play a mental arithmetic game together?",
"prompt": "I'm a 6-year-old child, and you're playing a mental arithmetic game with me. You ask questions, and I'll answer them. If I answer correctly, you say 'Great!' If I answer incorrectly, you tell me the correct answer and encourage me. You ask one question at a time, and I'll answer them one by one. Do you understand?",
"max_message_count": 0
},
"6": {
"start_text": "Hello, I'm Taiwan Bunny. Is there anything I can help you with?",
"prompt": "You are a knowledgeable and helpful AI named Taiwan Bunny. Your task is to chat with me. Please use brief dialogue in Chinese to speak a sentence. Each response should not exceed 50 characters!",
"max_message_count": 0
},
"7": {
"start_text": "Hello, I'm Little Bunny. Is there anything I can help you with?",
"prompt": "You are a knowledgeable and helpful AI named Little Bunny. Your task is to chat with me. Please use brief dialogue in Chinese to speak a sentence. Each response should not exceed 50 characters!",
"max_message_count": 0
}
}
Advanced Configuration
The following is advanced configuration. Please follow the tutorial step by step.
Note: The following configurations take precedence over those in docker-compose.yml
.
The following configurations are for individually configuring STT, LLM, and TTS for each role. They only apply to the configured roles.
Parameter Name | Description | Default Value |
---|---|---|
stt_type | Type of speech-to-text engine | openai-whisper |
stt_config | Specific configuration of the speech-to-text engine. Different engine types have different configuration parameters. | Depends on the selected stt_type |
llm_type | Type of large language model. You can also use one-api to convert large model interfaces to openai-compatible ones to support more large models. | openai |
llm_config | Configuration of the large model | Depends on the selected llm_type |
tts_type | Type of text-to-speech engine | openai-tts |
tts_config | Configuration of the text-to-speech engine | Depends on the selected tts_type |
The following is an example configuration for one role. Do not copy and paste directly. It is only for demonstrating the hierarchy of each configuration.
{
"start_text": "Hello, I'm Wimi. Nice to meet you.",
"prompt": "You're a knowledgeable and helpful AI named \"Wimi\". Your task is to chat with me. Please respond in English, keeping your answers brief – no more than 50 words each time!",
"max_message_count": 20,
"stt_type": "openai-whisper",
"stt_config": {
// Configuration varies depending on the selected stt_type. Below is an example for openai-whisper.
"language": "en",
"api_base": "https://api.openai.com/v1",
"key": "sk-AAAAAAAAAAAAAAAAAAa",
"model": "whisper-1"
},
"tts_type": "openai-tts",
"tts_config": {
// Configuration varies depending on the selected tts_type. Below is an example for openai-tts.
"key": "sk-BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB",
"api_base": "https://api.openai.com/v1",
"model": "tts-1",
"speed": 1.0,
"voice_name": "alloy"
},
"llm_type": "openai",
"llm_config": {
// Configuration varies depending
on the selected llm_type. Below is an example for openai.
"api_base": "https://xxxx.ccc/v1",
"key": "sk-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
"model": "gpt-3.5-turbo",
"temperature": 0.7,
"max_tokens": 800,
"top_p": 0.95,
"frequency_penalty": 0,
"presence_penalty": 0
}
}
Speech-to-Text (STT) Role-level Configuration
stt_type can be set to: openai-whisper, azure-stt, azure-whisper, dify-stt, aliyun-asr
Detailed configuration:
Large Language Model (LLM) Role-level Configuration
llm_type can be set to: openai, azure-openai, gemini, dify, qianfan, xiaodu, moonshot, groq, ollama, anthropic, dashscope, spark-desk, minimax, aws-bedrock, zhipu, lingyiwanwu
Detailed configuration:
- openai
- azure-openai
- gemini
- dify
- qianfan
- xiaodu
- moonshot
- groq
- ollama
- anthropic
- dashscope
- spark-desk
- minimax
- aws-bedrock
- zhipu
- lingyiwanwu
- fastgpt
Text-to-Speech (TTS) Role-level Configuration
tts_type can be set to: openai-tts, azure-tts, azure-openai-tts, elevenlabs, edge-tts, aliyun-tts, dify-tts
Detailed configuration:
Configuration Example
Configuration example. Modify the following configuration according to your needs:
The following configuration takes precedence over docker-compose.yml
.
The following is just an example for demonstration purposes. Please configure your own configuration step by step according to the instructions in Advanced Configuration.
{
"1": {
"start_text": "Hello, I'm Wimi. Nice to meet you.",
"prompt": "You're a knowledgeable and helpful AI named \"Wimi\". Your task is to chat with me. Please respond in English, keeping your answers brief – no more than 50 words each time!",
"max_message_count": 20,
"stt_type": "openai-whisper",
"stt_config": {
"language": "en"
},
"tts_type": "openai-tts",
"tts_config": {
"key": "sk-BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB",
"voice_name": "alloy"
},
"llm_type": "openai",
"llm_config": {
"api_base": "https://xxxx.ccc/v1",
"key": "sk-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
"model": "gpt-3.5-turbo",
"temperature": 0.7,
"max_tokens": 800,
"top_p": 0.95,
"frequency_penalty": 0,
"presence_penalty": 0
}
},
"2": {
"model": "gpt-35-turbo",
"start_text": "Hello, I'm Northeast Bunny. Is there anything I can help you with?",
"prompt": "You are a knowledgeable and helpful AI named Northeast Bunny. Your task is to chat with me. Please use brief dialogue in Chinese to speak a sentence. Each response should not exceed 50 characters!",
"max_message_count": 20,
"voice_name": "zh-CN-liaoning-XiaobeiNeural"
},
"3": {
"model": "gpt-3.5-turbo",
"start_text": "Hi, I'm Wimi. Nice to meet you.",
"prompt": "You're a knowledgeable and helpful AI named \"Wimi\". Your task is to chat with me. Please respond in English, keeping your answers brief – no more than 50 words each time!",
"max_message_count": 20,
"voice_name": "c8Vkv3mdER2fkhJdEIPK",
"language": "en",
"tts_type": "elevenlabs"
},
"4": {
"model": "gpt-3.5-turbo",
"start_text": "Hello, I'm Compliment Bunny. Is there anything I can help you with?",
"prompt": "Compliment me",
"max_message_count": 20,
"voice_name": "zh-CN-shaanxi-XiaoniNeural",
"language": "zh",
"tts_type": "azure-tts"
},
"5": {
"model": "gpt-3.5-turbo",
"start_text": "Hello, I'm Mental Arithmetic Bunny. Shall we play a mental arithmetic game together?",
"prompt": "I'm a 6-year-old child, and you're playing a mental arithmetic game with me. You ask questions, and I'll answer them. If I answer correctly, you say 'Great!' If I answer incorrectly, you tell me the correct answer and encourage me. You ask one question at a time, and I'll answer them one by one. Do you understand?",
"max_message_count": 20,
"stt_type": "azure-st
t",
"llm_type": "azure-openai",
"tts_type": "azure-tts",
"voice_name": "zh-CN-YunxiaNeural",
"language": "zh-CN",
"llm_config": {
"model": "gpt-35-turbo"
}
},
"6": {
"model": "gpt-3.5-turbo",
"start_text": "Hello, I'm Taiwan Bunny. Is there anything I can help you with?",
"prompt": "You are a knowledgeable and helpful AI named Taiwan Bunny. Your task is to chat with me. Please use brief dialogue in Chinese to speak a sentence. Each response should not exceed 50 characters!",
"max_message_count": 20,
"voice_name": "zh-TW-HsiaoChenNeural",
"language": "zh-CN",
"stt_type": "azure-stt",
"tts_type": "azure-tts"
},
"7": {
"model": "gpt-3.5-turbo",
"start_text": "Hello, I'm Little Bunny. Is there anything I can help you with?",
"prompt": "You are a knowledgeable and helpful AI named Little Bunny. Your task is to chat with me. Please use brief dialogue in Chinese to speak a sentence. Each response should not exceed 50 characters!",
"max_message_count": 20,
"voice_name": "zh-CN-YunyangNeural",
"language": "zh",
"tts_type": "edge-tts"
}
}