I always thought script writing was for people who couldn't draw, and that most script writers were hacks who got saved by the real (visual) artists.
So I was making stories by creating pictures first. The issue was that I couldn't pivot on a whim: even three drawings in, a huge number of half-baked decisions had been grandfathered in, and I had to carry them around until I collapsed under their weight. I went so far as to look into learning TouchDesigner so I could trigger images with rubber pads.
And of course the real solution was script writing. But the way I do it is modular. For example:
key visual (actual art)
image description: A cow on a field
voice description: "moo"
This method lets me swap the items being slammed together extremely quickly, and I'm able to feel the reaction they cause with minimal effort.