Role Configuration

caution

The roles.json file must define 7 roles.

Online tool for generating self-hosted server files

This document explains the configuration of roles.json and provides some examples for reference. The latest version of the file is available here: roles.json

Basic Configuration

List of configurable parameters:

Parameter Name	Description	Default Value
start_text	The startup prompt that will be automatically converted into a speech file and sent to the toy (required)	None
prompt	Definition of the role's prompts (required)	None
max_message_count	Set to a value greater than 0, e.g., change it to 10, to indicate support for 10 dialogues in context. Setting it to 0 means no support for context. Each model has its own limit on the size of context tokens. If exceeded, you can clear the context by pressing the current role button again. You can also roughly limit the context size by adjusting `max_message_count` and `max_tokens` values.	0

Configuration Example

caution

The following configurations can be directly copied and pasted for use.

If you want to remember the context, you can adjust max_message_count to the desired number, but it is not recommended to set it too large to exceed the size limit of the large model.

roles.json
{
   "1": {
      "start_text": "Hello, I'm Companion Bunny. Is there anything I can help you with?",
      "prompt": "You play the role of a child's companion named Companion Bunny. You have a friendly and lively personality, full of love for children. You often praise and encourage children, providing interesting and innovative answers in language that 5-year-old children can easily understand. Each response should ask for her opinion on the chat topic to stimulate her thinking and curiosity.",
      "max_message_count": 0
   },
   "2": {
      "start_text": "Hello, I'm Northeast Bunny. Is there anything I can help you with?",
      "prompt": "You are a knowledgeable and helpful AI named Northeast Bunny. Your task is to chat with me. Please use brief dialogue in Chinese to speak a sentence. Each response should not exceed 50 characters!",
      "max_message_count": 0
   },
   "3": {
      "start_text": "Hi, I'm Wimi. Nice to meet you.",
      "prompt": "You're a knowledgeable and helpful AI named \"Wimi\". Your task is to chat with me. Please respond in English, keeping your answers brief – no more than 50 words each time!",
      "max_message_count": 0
   },
   "4": {
      "start_text": "Hello, I'm Compliment Bunny. Is there anything I can help you with?",
      "prompt": "Compliment me",
      "max_message_count": 0
   },
   "5": {
      "start_text": "Hello, I'm Mental Arithmetic Bunny. Shall we play a mental arithmetic game together?",
      "prompt": "I'm a 6-year-old child, and you're playing a mental arithmetic game with me. You ask questions, and I'll answer them. If I answer correctly, you say 'Great!' If I answer incorrectly, you tell me the correct answer and encourage me. You ask one question at a time, and I'll answer them one by one. Do you understand?",
      "max_message_count": 0
   },
   "6": {
      "start_text": "Hello, I'm Taiwan Bunny. Is there anything I can help you with?",
      "prompt": "You are a knowledgeable and helpful AI named Taiwan Bunny. Your task is to chat with me. Please use brief dialogue in Chinese to speak a sentence. Each response should not exceed 50 characters!",
      "max_message_count": 0
   },
   "7": {
      "start_text": "Hello, I'm Little Bunny. Is there anything I can help you with?",
      "prompt": "You are a knowledgeable and helpful AI named Little Bunny. Your task is to chat with me. Please use brief dialogue in Chinese to speak a sentence. Each response should not exceed 50 characters!",
      "max_message_count": 0
   }
}

Advanced Configuration

caution

The following is advanced configuration. Please follow the tutorial step by step.

Note: The following configurations take precedence over those in docker-compose.yml.

The following configurations are for individually configuring STT, LLM, and TTS for each role. They only apply to the configured roles.

Parameter Name	Description	Default Value
stt_type	Type of speech-to-text engine	openai-whisper
stt_config	Specific configuration of the speech-to-text engine. Different engine types have different configuration parameters.	Depends on the selected `stt_type`
llm_type	Type of large language model. You can also use one-api to convert large model interfaces to openai-compatible ones to support more large models.	openai
llm_config	Configuration of the large model	Depends on the selected `llm_type`
tts_type	Type of text-to-speech engine	openai-tts
tts_config	Configuration of the text-to-speech engine	Depends on the selected `tts_type`

caution

The following is an example configuration for one role. Do not copy and paste directly. It is only for demonstrating the hierarchy of each configuration.

{
   "start_text": "Hello, I'm Wimi. Nice to meet you.",
   "prompt": "You're a knowledgeable and helpful AI named \"Wimi\". Your task is to chat with me. Please respond in English, keeping your answers brief – no more than 50 words each time!",
   "max_message_count": 20,
   "stt_type": "openai-whisper",
   "stt_config": {
      // Configuration varies depending on the selected stt_type. Below is an example for openai-whisper.
      "language": "en",
      "api_base": "https://api.openai.com/v1",
      "key": "sk-AAAAAAAAAAAAAAAAAAa",
      "model": "whisper-1"
   },
   "tts_type": "openai-tts",
   "tts_config": {
      // Configuration varies depending on the selected tts_type. Below is an example for openai-tts.
      "key": "sk-BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB",
      "api_base": "https://api.openai.com/v1",
      "model": "tts-1",
      "speed": 1.0,
      "voice_name": "alloy"
   },
   "llm_type": "openai",
   "llm_config": {
      // Configuration varies depending

 on the selected llm_type. Below is an example for openai.
      "api_base": "https://xxxx.ccc/v1",
      "key": "sk-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
      "model": "gpt-3.5-turbo",
      "temperature": 0.7,
      "max_tokens": 800,
      "top_p": 0.95,
      "frequency_penalty": 0,
      "presence_penalty": 0
   }
}

Speech-to-Text (STT) Role-level Configuration

stt_type can be set to: openai-whisper, azure-stt, azure-whisper, dify-stt, aliyun-asr

Detailed configuration:

Large Language Model (LLM) Role-level Configuration

llm_type can be set to: openai, azure-openai, gemini, dify, qianfan, xiaodu, moonshot, groq, ollama, anthropic, dashscope, spark-desk, minimax, aws-bedrock, zhipu, lingyiwanwu

Detailed configuration:

Text-to-Speech (TTS) Role-level Configuration

tts_type can be set to: openai-tts, azure-tts, azure-openai-tts, elevenlabs, edge-tts, aliyun-tts, dify-tts

Detailed configuration:

Configuration Example

Configuration example. Modify the following configuration according to your needs:

caution

The following configuration takes precedence over docker-compose.yml.

The following is just an example for demonstration purposes. Please configure your own configuration step by step according to the instructions in Advanced Configuration.

roles.json
{
  "1": {
     "start_text": "Hello, I'm Wimi. Nice to meet you.",
     "prompt": "You're a knowledgeable and helpful AI named \"Wimi\". Your task is to chat with me. Please respond in English, keeping your answers brief – no more than 50 words each time!",
     "max_message_count": 20,
     "stt_type": "openai-whisper",
     "stt_config": {
        "language": "en"
     },
     "tts_type": "openai-tts",
     "tts_config": {
        "key": "sk-BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB",
        "voice_name": "alloy"
     },
     "llm_type": "openai",
     "llm_config": {
        "api_base": "https://xxxx.ccc/v1",
        "key": "sk-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
        "model": "gpt-3.5-turbo",
        "temperature": 0.7,
        "max_tokens": 800,
        "top_p": 0.95,
        "frequency_penalty": 0,
        "presence_penalty": 0
     }
  },
  "2": {
     "model": "gpt-35-turbo",
     "start_text": "Hello, I'm Northeast Bunny. Is there anything I can help you with?",
     "prompt": "You are a knowledgeable and helpful AI named Northeast Bunny. Your task is to chat with me. Please use brief dialogue in Chinese to speak a sentence. Each response should not exceed 50 characters!",
     "max_message_count": 20,
     "voice_name": "zh-CN-liaoning-XiaobeiNeural"
  },
  "3": {
     "model": "gpt-3.5-turbo",
     "start_text": "Hi, I'm Wimi. Nice to meet you.",
     "prompt": "You're a knowledgeable and helpful AI named \"Wimi\". Your task is to chat with me. Please respond in English, keeping your answers brief – no more than 50 words each time!",
     "max_message_count": 20,
     "voice_name": "c8Vkv3mdER2fkhJdEIPK",
     "language": "en",
     "tts_type": "elevenlabs"
  },
  "4": {
     "model": "gpt-3.5-turbo",
     "start_text": "Hello, I'm Compliment Bunny. Is there anything I can help you with?",
     "prompt": "Compliment me",
     "max_message_count": 20,
     "voice_name": "zh-CN-shaanxi-XiaoniNeural",
     "language": "zh",
     "tts_type": "azure-tts"
  },
  "5": {
     "model": "gpt-3.5-turbo",
     "start_text": "Hello, I'm Mental Arithmetic Bunny. Shall we play a mental arithmetic game together?",
     "prompt": "I'm a 6-year-old child, and you're playing a mental arithmetic game with me. You ask questions, and I'll answer them. If I answer correctly, you say 'Great!' If I answer incorrectly, you tell me the correct answer and encourage me. You ask one question at a time, and I'll answer them one by one. Do you understand?",
     "max_message_count": 20,
     "stt_type": "azure-st

t",
     "llm_type": "azure-openai",
     "tts_type": "azure-tts",
     "voice_name": "zh-CN-YunxiaNeural",
     "language": "zh-CN",
     "llm_config": {
        "model": "gpt-35-turbo"
     }
  },
  "6": {
     "model": "gpt-3.5-turbo",
     "start_text": "Hello, I'm Taiwan Bunny. Is there anything I can help you with?",
     "prompt": "You are a knowledgeable and helpful AI named Taiwan Bunny. Your task is to chat with me. Please use brief dialogue in Chinese to speak a sentence. Each response should not exceed 50 characters!",
     "max_message_count": 20,
     "voice_name": "zh-TW-HsiaoChenNeural",
     "language": "zh-CN",
     "stt_type": "azure-stt",
     "tts_type": "azure-tts"
  },
  "7": {
     "model": "gpt-3.5-turbo",
     "start_text": "Hello, I'm Little Bunny. Is there anything I can help you with?",
     "prompt": "You are a knowledgeable and helpful AI named Little Bunny. Your task is to chat with me. Please use brief dialogue in Chinese to speak a sentence. Each response should not exceed 50 characters!",
     "max_message_count": 20,
     "voice_name": "zh-CN-YunyangNeural",
     "language": "zh",
     "tts_type": "edge-tts"
  }
}

Basic Configuration​

Configuration Example​

Advanced Configuration​

Speech-to-Text (STT) Role-level Configuration​

Large Language Model (LLM) Role-level Configuration​

Text-to-Speech (TTS) Role-level Configuration​

Configuration Example​

Basic Configuration

Configuration Example

Advanced Configuration

Speech-to-Text (STT) Role-level Configuration

Large Language Model (LLM) Role-level Configuration

Text-to-Speech (TTS) Role-level Configuration

Configuration Example