I liked how the full room was revealed after the first couple of interactions, creating a sort of interactive intro sequence. Nice.
As heltenjon said, the story flow is somewhat odd.
Spoiler
We needed the barista's permission to snatch a donut, but we had no issue getting the bus driver to choke on the gravel (that's what I assume happened), stealing his keys and hijacking the bus? Are we playing a psychopath?

[close]
If you don't have animations, you could perhaps consider using a first-person perspective, maybe with the protagonist's portrait sewn into the GUI somewhere if you want the player to know what he looks like. For a training game it probably doesn't matter much, but if you're planning to continue using static graphics in your projects, this may be a good way to work around the lack of animations without making the games feel like a work in progress.