This module extracts emotions from audio. The input is either an audio/video file uploaded to the server or a URL. The output is the predicted emotion, expressed as Arousal and Valence values, in JSON-LD format.
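
To illustrate how a client might consume the output, here is a minimal Python sketch that reads Arousal and Valence from a JSON-LD string. The module's actual schema is not given here, so the sample document, the `arousal` and `valence` key names, the `@context` URL, and the `parse_emotion` helper are all hypothetical and would need to be adapted to the real response.

```python
import json
from typing import NamedTuple


class Emotion(NamedTuple):
    """Dimensional emotion prediction."""
    arousal: float
    valence: float


# Hypothetical response: the real JSON-LD schema and key names are assumptions.
SAMPLE_RESPONSE = """
{
  "@context": {"@vocab": "http://example.org/emotion#"},
  "@type": "EmotionPrediction",
  "arousal": 0.62,
  "valence": -0.15
}
"""


def parse_emotion(payload: str) -> Emotion:
    """Read the assumed top-level arousal/valence keys from a JSON-LD string."""
    doc = json.loads(payload)
    return Emotion(arousal=float(doc["arousal"]), valence=float(doc["valence"]))


if __name__ == "__main__":
    emotion = parse_emotion(SAMPLE_RESPONSE)
    print(f"arousal={emotion.arousal}, valence={emotion.valence}")
```

If the real output nests the values (for example under a list of detected emotions), only the lookup inside `parse_emotion` should need to change.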