text2image, image2image, text2video, image2video, text-to-speech, speech-to-text, large language models