Listening...
Done listening
Finished transcribing in 1.21 seconds.
Finished generating response in 0.72 seconds.
Finished generating audio in 1.85 seconds.
Speaking...