api
Exposes interactive bot functionality through FastAPI
The backend consists of modules that expose the functionality needed for an interactive multi-modal chatbot. This module provides an api to that functionality that can be served to any frontend application.
The api provides methods for various forms of synchronous and asynchronous communication.
Synchronous vs Asynchronous
- Functions that start with 'get' are synchronous.
- Functions that start with 'run' are asynchronous, and the results are returned through functions that start with 'stream'.
- Functions that start with 'add_to' are semi-synchronous: they synchronously add to a queue, which is then processed asynchronously by the face/bot.
The api also allows communication between two different clients. For example, the WoZ interface can send a gesture to the api, and the api forwards that gesture to the robot.
Typical usage
As documented in https://fastapi.tiangolo.com/deployment/manually/, the api can be run from another module or it can be run directly.
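A minimal sketch of both launch styles, following the linked FastAPI guide. The import string `backend.app.api:app`, the host, and the port are assumptions inferred from the source path shown on this page, not confirmed values; adjust them to the actual project layout.

```python
# Sketch only: the import string, host, and port below are assumptions.
APP_IMPORT_STRING = "backend.app.api:app"  # hypothetical "<module>:<FastAPI instance>"


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Start the api from another module by calling uvicorn programmatically."""
    # uvicorn is a third-party ASGI server; it is imported lazily so that
    # this file can be loaded even where uvicorn is not installed.
    import uvicorn

    uvicorn.run(APP_IMPORT_STRING, host=host, port=port)


# To run the api directly from a shell instead (same assumptions):
#   uvicorn backend.app.api:app --host 127.0.0.1 --port 8000
```

Calling `serve()` blocks until the server is stopped, so it is usually the last statement of a launcher script.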
add_to_face(text, update_type)
Sends desired expression, behavior, or viseme to the face
Face presets are sent from the WoZ to this API, and are then loaded to the queue for the face to read.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | What command to send. | required |
| update_type | str | What type of update to send: expression, behavior, or viseme. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| PlainTextResponse | PlainTextResponse | Returns the requested text. |
Source code in backend/app/api.py
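The queueing behavior described for add_to_face can be modeled with the standard library alone. This is a sketch, not the code in backend/app/api.py: the queue object, the message shape, the function name `add_to_face_sketch`, and the rejection of unknown update types are all assumptions made for illustration.

```python
from queue import Queue

# The three update types named in the parameter table above.
FACE_UPDATE_TYPES = ("expression", "behavior", "viseme")

# Hypothetical in-memory queue that the face would read from.
face_queue: Queue = Queue()


def add_to_face_sketch(text: str, update_type: str) -> str:
    """Queue a face command and echo the requested text back to the caller."""
    if update_type not in FACE_UPDATE_TYPES:
        # Assumption: unknown types are rejected; the real endpoint may differ.
        raise ValueError(f"unknown update_type: {update_type!r}")
    # Assumption: each queued message records its type and its payload.
    face_queue.put({"event": update_type, "data": text})
    return text
```

The call returns immediately after the item is queued, which is the semi-synchronous behavior described for the 'add_to' functions.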
add_to_gesture(text)
Add desired gesture to queue for the robot
This function provides a passthrough from the WoZ to the robot bridge.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Name of the gesture. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| PlainTextResponse | PlainTextResponse | Returns the specified gesture. |
Source code in backend/app/api.py
get_response(mode, query)
Returns text presets based on WoZ input
Mostly useful for controlling the robot during study interactions. Does not use the text_stream, as the response is usually output directly.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mode | str | What mode the facilitator is in. Possibilities include role_model (the robot is leading a session as a role model), director (the robot is leading a session as a director), and facilitator (the robot is not in either condition yet). | required |
| query | str | Key for looking up matching text response. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| PlainTextResponse | PlainTextResponse | Text for the facilitator to say. |
Source code in backend/app/api.py
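The lookup described for get_response, a mode plus a key selecting a preset text, might be sketched as a nested dictionary. The three mode names come from the parameter table; the query keys, the sentences, the name `lookup_preset`, and the empty-string fallback are invented placeholders, not the project's actual presets or behavior.

```python
# Hypothetical preset table: mode -> query key -> text. Only the mode names
# come from the documentation; keys and sentences are placeholders.
PRESETS = {
    "facilitator": {"greeting": "Hello, thanks for joining today."},
    "role_model": {"greeting": "Hi everyone, I will start by sharing."},
    "director": {"greeting": "Hi everyone, who would like to share first?"},
}


def lookup_preset(mode: str, query: str) -> str:
    """Return the preset text for (mode, query), or "" when nothing matches."""
    # Assumption: a missing mode or key yields an empty string, not an error.
    return PRESETS.get(mode, {}).get(query, "")
```

Because the lookup is a plain dictionary access, the result is available immediately, which fits the synchronous 'get' naming.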
get_speech(text, speaker_id='')
Synthesizes wav bytes from text, with a given speaker ID
This text will be spoken immediately after it is generated, so the bot is updated with the knowledge that the facilitator is actually saying this text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Text to be synthesized. | required |
| speaker_id | str | ID of the voice to be used. Defaults to "". | '' |
Returns:
| Name | Type | Description |
|---|---|---|
| StreamingResponse | StreamingResponse | Audio stream of the voice saying the text. |
Source code in backend/app/api.py
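To illustrate what "wav bytes" returned as a stream can look like, the sketch below builds a short WAV in memory and yields it in chunks. Silence stands in for the synthesized speech, and both function names are hypothetical; the actual synthesizer and chunking used by get_speech are not shown on this page.

```python
import io
import wave
from typing import Iterator


def silent_wav_bytes(seconds: float = 0.1, sample_rate: int = 16000) -> bytes:
    """Build a mono 16-bit WAV of silence as a stand-in for synthesized speech."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    # The wave module finalizes the header on close but leaves the buffer open.
    return buffer.getvalue()


def iter_chunks(data: bytes, chunk_size: int = 1024) -> Iterator[bytes]:
    """Yield the audio in fixed-size pieces, as a streaming response body would."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]
```

In FastAPI, an iterator of bytes like `iter_chunks(...)` can be passed to `StreamingResponse` with `media_type="audio/wav"`; whether the project does exactly this is an assumption.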
return_gesture()
Get next gesture from the queue.
This exposes an endpoint for the robot to regularly ping in order to fetch the next gesture it should do.
Returns:
| Name | Type | Description |
|---|---|---|
| PlainTextResponse | PlainTextResponse | Desired gesture. |
Source code in backend/app/api.py
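The passthrough between add_to_gesture and return_gesture, with the robot polling for work, can be sketched as follows. This is a standard-library model under stated assumptions: the queue, the helper names, and the empty string returned for an empty queue are all hypothetical, and a real robot would fetch over HTTP rather than call a local function.

```python
import time
from queue import Empty, Queue
from typing import Callable, List

# Hypothetical stand-in for the api's gesture queue.
gesture_queue: Queue = Queue()


def add_gesture(name: str) -> str:
    """WoZ side: enqueue a gesture and echo its name (mirrors add_to_gesture)."""
    gesture_queue.put(name)
    return name


def next_gesture() -> str:
    """Robot side: pop the oldest gesture (mirrors return_gesture)."""
    try:
        return gesture_queue.get_nowait()
    except Empty:
        return ""  # assumption: an empty queue yields an empty string


def poll_gestures(fetch: Callable[[], str], polls: int, interval: float = 0.0) -> List[str]:
    """Call fetch() a fixed number of times and collect the non-empty gestures."""
    performed = []
    for _ in range(polls):
        gesture = fetch()
        if gesture:
            performed.append(gesture)
        time.sleep(interval)
    return performed
```

Gestures come back in the order they were queued, so a robot that polls regularly performs them in the order the WoZ operator sent them.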
run_generate_response(text, speaker, reset_conversation, director_condition)
Takes input text and generates possible bot responses
Possible bot responses include generative responses from the chatbot as well as controlled responses from the facilitator. The classifications used for generating the facilitator response are added as well. Additionally, the emotion found in the text is mirrored by the robot's expression if it is in the subset of possible expressions (joy, sad, surprise).
warning
All of this text is returned asynchronously through the text_stream. The default response is set to the bot response.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Input said by a human. | required |
| speaker | str | Identity of the speaker. | required |
| reset_conversation | bool | Whether or not to restart the conversation. | required |
| director_condition | bool | Flag for controlling what type of facilitator is used. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| PlainTextResponse | PlainTextResponse | Defaults to generative bot response. |
Source code in backend/app/api.py
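The emotion mirroring rule described above reduces to a membership check. The sketch below assumes the detected emotion arrives as a plain string label; the function name and the case-insensitive matching are illustrative choices, not taken from the source code.

```python
from typing import Optional

# Expressions the documentation lists as available on the robot's face.
MIRRORABLE_EMOTIONS = frozenset({"joy", "sad", "surprise"})


def expression_to_mirror(detected_emotion: str) -> Optional[str]:
    """Return the expression to show, or None when the emotion cannot be mirrored."""
    # Assumption: labels are compared case-insensitively, ignoring whitespace.
    emotion = detected_emotion.strip().lower()
    return emotion if emotion in MIRRORABLE_EMOTIONS else None
```

An emotion outside the subset, such as anger, produces `None`, meaning the face is left unchanged.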
run_transcribe_audio(uploaded_file)
async
Perform speech to text on audio file
Transcribes audio from the file and adds the transcribed text to human_speech in the text queue.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| uploaded_file | UploadFile | Recorded audio. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | Name of the saved audio file. |
Source code in backend/app/api.py
stream_face(request)
async
Publishes face control messages to a subscriber
Publishes messages as soon as they are added to the queue. The message event argument specifies the type of face control message being sent, either expression or behavior.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request | Request | Request for event generator when API is called. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| EventSourceResponse | EventSourceResponse | Server Sent Event (sse) source. |
Yields:
| Type | Description |
|---|---|
| EventSourceResponse | Iterator[EventSourceResponse]: Strings for a face behavior or expression. |
Source code in backend/app/api.py
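On the wire, a Server Sent Event is a few `field: value` lines followed by a blank line, with the `event` field carrying the message type. The sketch below formats and parses such messages using a simplified subset of the SSE format; the behavior name "blink" is a made-up example, and a production client would normally use an existing SSE library or the browser's EventSource.

```python
from typing import Dict, Iterable, Iterator, List


def format_sse(event: str, data: str) -> str:
    """Serialize one message the way an SSE endpoint writes it."""
    return f"event: {event}\ndata: {data}\n\n"


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one {"event": ..., "data": ...} dict per blank-line-terminated message."""
    event = "message"  # default SSE event type when no event field is sent
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current message.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        # Other fields (id, retry) and comment lines are ignored in this sketch.
```

A face client can then branch on the `event` value to decide whether the payload names an expression or a behavior.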
stream_text(request)
async
Publishes text messages to a subscriber
Publishes messages as soon as they are added to the queue.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request | Request | Request for event generator when API is called. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| EventSourceResponse | | Server Sent Event (sse) source. |
Yields:
| Type | Description |
|---|---|
| | Iterator[EventSourceResponse]: Strings of text. |
Source code in backend/app/api.py
stream_viseme(request)
async
Publishes visemes to a subscriber
Publishes the viseme after the corresponding viseme delay. Only publishes when there are visemes in the queue. If no subscribers are listening it will not send any messages and the queue will continue to grow.
warning
Only publishes each viseme once. If you have multiple subscribers (e.g. multiple tabs or multiple devices with the face open), each subscriber will receive only a subset of the visemes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request | Request | Request for event generator when API is called. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| EventSourceResponse | | Server Sent Event (sse) source. |
Yields:
| Type | Description |
|---|---|
| | Iterator[EventSourceResponse]: Strings for a viseme (the desired shape of the mouth). |
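The warning about multiple subscribers follows from how a shared queue behaves: each item is handed to exactly one consumer. The sketch below reproduces that effect with `asyncio.Queue` and two consumers. The subscriber names, the viseme labels, and the `None` stop sentinel are illustrative assumptions, and the real endpoint also waits for each viseme's delay, which is omitted here.

```python
import asyncio
from typing import Dict, List


async def subscriber(name: str, queue: asyncio.Queue, received: Dict[str, List[str]]) -> None:
    """Consume visemes from the shared queue until a None sentinel arrives."""
    while True:
        viseme = await queue.get()
        if viseme is None:
            return
        received[name].append(viseme)
        await asyncio.sleep(0)  # yield control so the other subscriber gets a turn


async def demo(visemes: List[str]) -> Dict[str, List[str]]:
    """Feed visemes to two subscribers that share a single queue."""
    queue: asyncio.Queue = asyncio.Queue()
    received: Dict[str, List[str]] = {"tab_a": [], "tab_b": []}
    for viseme in visemes:
        queue.put_nowait(viseme)
    for _ in received:
        queue.put_nowait(None)  # one stop sentinel per subscriber
    await asyncio.gather(*(subscriber(name, queue, received) for name in received))
    return received
```

Running `asyncio.run(demo([...]))` shows every viseme landing in exactly one of the two lists, so neither subscriber sees the full mouth animation. Keeping a single face client connected avoids the split.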