The sentences (e.g., ___ is a dog.) are generated randomly from an array via a function.
By "generated" you mean "picked"?
Do you pick a complete sentence from a random element of the array? Or do you generate the sentence from separate random words?
It's hard to give you a solution without seeing your project.
You can probably expand your array by adding another "column" that stores the name of the audio file for each sentence.
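Here is a rough sketch of that idea in Python rather than Construct events, just to show the data layout. It assumes you pick whole sentences at random. The file names, `SENTENCES` and `pick_sentence` are made up for illustration, since I haven't seen your project.

```python
import random

# Hypothetical data: each row holds a sentence in the first "column"
# and the name of its audio file in the second "column".
SENTENCES = [
    ("___ is a dog.", "dog.ogg"),
    ("___ is a cat.", "cat.ogg"),
    ("___ is a bird.", "bird.ogg"),
]


def pick_sentence(rows, rng=random):
    """Pick one random row and return (sentence, audio_file) together."""
    index = rng.randrange(len(rows))  # one random index for both columns
    sentence, audio_file = rows[index]
    return sentence, audio_file


sentence, audio_file = pick_sentence(SENTENCES)
print(sentence, "->", audio_file)
```

The point is that one random index selects both the sentence and its audio file, so the two can never get out of sync.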
Could you share your capx file?