
Reference:
https://azure.microsoft.com/en-gb/services/cognitive-services/speech-to-text/#features
Speech recognition means Speech to Text. In the above example, as a person speaks, the words are converted into text in the same language. Hence Speech to Text, also called speech recognition, is the right answer.
Speech recognition - the ability to detect and interpret spoken input.
Speech synthesis - the ability to generate spoken output.
https://docs.microsoft.com/en-us/learn/modules/recognize-synthesize-speech/1-introduction