A couple of us had just been recruited to work on an ASP.NET application that had been orphaned after the departure of the previous development team.
Several instances of this app were running in PROD without issue, but every so often something went wrong and, in the absence of a dev team, the fix had been a turn-it-off-and-on-again and a bit of downtime. It’s not a high-traffic app, so that wasn’t a massive cost. But now there’s a dev team again, and we get notification of a catastrophic failure. It’s time to investigate…
Nothing works. The system drive on the web server has run out of space. We check the event logs and see thousands upon thousands of errors generated within the space of 5 minutes.
It turns out there’s a ‘handle all unhandled exceptions’ policy, which does the following:
- Write an entry to the event log
- Write the message to a log file on the share \\fileserver\ErrorLog
The cause: the app pool account’s password had expired. As the app pool was already running, this would be no issue for the web services until the next recycle, the following day. But when a run-of-the-mill validation exception came along, what happened? It triggered the policy, which created an entry in the event log and then tried to write the message to the log on the file share. Due to the account’s expired password, that write failed with a SecurityException, which wasn’t handled, and so re-triggered the exception handler. Each pass added another entry to the event log. Repeat. Ad infinitum.
This happened in my first job out of university; some details are a bit hazy. Looking back, I’m not sure how this actually happened – I’m not sure the built-in ASP.NET global handler could behave like this. I suspect there was some 3rd-party module involved.*
Lessons learned: Think about what might fail in your exception handling code. Sometimes an empty catch block is OK (but add a comment explaining why).
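To make that lesson concrete, here is a minimal sketch of a handler that can't re-trigger itself. It is not the original app's code (that was ASP.NET). It's a language-neutral illustration in Python, and every name in it (`handle_unhandled`, `write_to_share`, `ShareUnavailable`, the list standing in for the event log) is made up for the example.

```python
class ShareUnavailable(Exception):
    """Hypothetical stand-in for the SecurityException thrown by the file share."""


def write_to_share(message):
    # Assumption for the sketch: the share always rejects the write,
    # as it did once the app pool account's password had expired.
    raise ShareUnavailable("access denied to log share")


def handle_unhandled(exc, event_log, share_writer=write_to_share):
    """Global 'handle all unhandled exceptions' policy, made safe to fail."""
    message = f"Unhandled: {exc!r}"

    # Step 1: write an entry to the event log (a plain list in this sketch).
    event_log.append(message)

    # Step 2: write to the file share, but never let a failure here escape,
    # otherwise it becomes a new unhandled exception and the policy
    # calls itself again.
    try:
        share_writer(message)
    except Exception as log_failure:
        # Deliberately not re-raised: a broken log sink must not take the
        # app down. Note the failure once and move on.
        event_log.append(f"Log share write failed: {log_failure!r}")


events = []
handle_unhandled(ValueError("bad input"), events)
```

With the share unavailable, one validation error leaves two entries in `events`: the original error and a single note that the share write failed. Nothing loops.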
* Update 2-June-2019: After thinking about this again for far too long, I think I’ve figured it out: There was a separate service for error logging (using log4net, maybe), which the app was configured to call from the global exception handler. And then, that same configuration setting was applied to the error logging service itself.