This is a combination of two phonetic effects. First, final /t/ and initial /d/ often merge into a single tap [ɾ] when the second word is unstressed. You probably know this sound from the General American pronunciation of “city” [sɪɾi].
Second, because of the weak vowel merger in many accents, unstressed “is” and “does” can have the same vowel, varying freely in pronunciation between schwa /ə/ and short I /ɪ/.
The net effect is that “what does” can be pronounced [wʌɾɪz], exactly like “what is”, so the two can be told apart only by context. Luckily, the two phrases are used differently, so in practice it’s easy to know what was meant.