I'd worry about the number of people excluded: it's over 60%. That's not indicative of dodgy dealing or anything, more that it's hard to get good data, but it's still not great when you've excluded more than half of your original sample.
I didn't really see any discussion of the skewness of the data, though I admit I did skim-read a bit. I'd say it's much more concerning if a few people have long delays than if a lot of people have short ones, as the latter scenario fits better with the "stress" hypothesis.
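To illustrate why the skew matters, here's a rough sketch with made-up numbers (not the study's data, and the variable names are just mine). Both hypothetical groups have the same mean delay, so an average alone can't tell the two scenarios apart, but the median and skewness can.

```python
from statistics import mean, median

def skewness(xs):
    """Population skewness: third central moment divided by the SD cubed."""
    n = len(xs)
    m = mean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n  # variance
    m3 = sum((x - m) ** 3 for x in xs) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical delays (arbitrary units), invented purely for illustration.
many_short = [1, 2, 3, 1, 2, 3, 1, 2, 3, 2]   # lots of people, short delays
few_long = [0, 0, 0, 0, 0, 0, 0, 0, 0, 20]    # one person, one long delay

for name, xs in [("many short", many_short), ("few long", few_long)]:
    print(f"{name}: mean={mean(xs)}, median={median(xs)}, "
          f"skewness={skewness(xs):.2f}")
```

In the first group the skewness is zero and the median matches the mean; in the second the skewness is strongly positive and the median sits well below the mean, which is the pattern I'd want the paper to rule in or out.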
Will try to look properly later (not on a phone), when I should be able to see the graphs better and check whether anything shows the skewness.