During my time as a web developer I have often dealt with transferring and transforming data to make information available in web pages. In the process of handling data I have on a (fortunately small) number of occasions brought one or another server down.
The first time this happened I was working at Port Talbot Hot Mill, and one of my timer-based VB applications had started hogging the connection to the Data Warehouse server by reconnecting several times without releasing the connections it had already opened. Tim Bolton from Hot Mill Systems had a good laugh when I told him what had happened, and said “you’ve finally become an IT developer now!”, implying that unless you really stretch the envelope, you’re not going to break anything, but you may not achieve anything worthwhile either.
Shortly afterwards I got into the habit of writing my applications so that they are kicked off by the computer’s Windows Scheduler, which itself cuts off the application if it runs for too long. Later on I also converted all my VB applications to .NET applications, which are better at handling connections and ensure that they are closed whenever they should be.
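The idea of making sure a connection is always released can be sketched in a few lines. This is only an illustration in Python, not the VB or .NET code I actually used: it assumes an in-memory SQLite database as a stand-in for the Data Warehouse, and the table name `readings` and the function `fetch_row_count` are made up for the example.

```python
import sqlite3
from contextlib import closing


def fetch_row_count(db_path: str) -> int:
    """Open a connection, run one query, and always close the connection."""
    # closing() calls conn.close() when the block exits, even if the
    # query raises, so a timer-driven job cannot pile up open connections.
    with closing(sqlite3.connect(db_path)) as conn:
        with closing(conn.cursor()) as cur:
            # Hypothetical table, created here only so the sketch runs.
            cur.execute("CREATE TABLE IF NOT EXISTS readings (value REAL)")
            cur.execute("SELECT COUNT(*) FROM readings")
            return cur.fetchone()[0]


if __name__ == "__main__":
    print(fetch_row_count(":memory:"))
```

The point is that the release happens in one place and does not depend on the rest of the job finishing cleanly.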
On another occasion, I brought the Oracle RDB server down by creating a web page that had a loop accessing RDB inside another loop that made a connection to the same server. By hogging all the connections, the page made access to the server impossible for all other users. Process Control solved the problem by giving me my own access account with a limited number of connections, so that if I filled all of them up it would only affect my own pages and not anyone else’s work.
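In hindsight, the safer pattern is to open one connection outside the loops and reuse it for every lookup. Below is a minimal sketch of that pattern in Python, again with SQLite standing in for the real server; the `items` table, its columns and both function names are hypothetical and chosen purely for illustration.

```python
import sqlite3
from contextlib import closing


def totals_per_group(conn, group_ids):
    """Look up a total for each group using one shared connection."""
    totals = {}
    for group_id in group_ids:
        # No new connection is opened inside the loop; every lookup
        # reuses the connection passed in by the caller.
        row = conn.execute(
            "SELECT SUM(weight) FROM items WHERE group_id = ?", (group_id,)
        ).fetchone()
        totals[group_id] = row[0] or 0  # SUM over no rows is NULL
    return totals


def demo():
    # One connection for the whole page, closed when the work is done.
    with closing(sqlite3.connect(":memory:")) as conn:
        conn.execute("CREATE TABLE items (group_id INTEGER, weight REAL)")
        conn.executemany(
            "INSERT INTO items VALUES (?, ?)",
            [(1, 10.0), (1, 5.0), (2, 7.5)],
        )
        return totals_per_group(conn, [1, 2, 3])


if __name__ == "__main__":
    print(demo())
```

However many rows the loops walk through, the server only ever sees a single connection from the page.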
At a later stage a VB application which had to calculate charge weights was forced to do exactly the same thing, and although it only affected this application, it was noted that at times it seemed to hang. Again, a rewrite in .NET ensured that connections, once closed, had indeed been released on the server.
Then there was the instance where, for as long as an application was running (twice a day for about a quarter of an hour), the weighbridge, which was using the same server, could not perform any operations. A business-critical operation should not share its resources with other applications in the first place, but we also made some amendments to the SQL query in my application so that it took up fewer resources.
The last time I brought anything down was when I had a page which checked the connection to a variety of servers, including two email (SMTP) servers (one in Teesside and another in IJmuiden). The problem was that I had made this page auto-refresh. As long as I was the one using the page this was not an issue, since I didn’t keep it open for much longer than a minute each time.
The trouble started when I was on holiday, and someone started to use the page full time whilst investigating why one of my applications did not complete its job (this had nothing to do with the application itself, but with the corruption of one of its source tables). However, keeping the page open for hours on end meant that it kept pinging the SMTP servers every minute or so, and at some point one of them decided that it would cut off connections from this source.
Once I was back I soon rewrote the page so that it no longer did an auto-refresh, and I ensured that checking the connection to the SMTP server required a manual click on a button, which then disappeared once it had been clicked. The trouble is that I had known that this page could become a problem, but had postponed doing anything about it because (1) I was the only one using the page; and (2) I had other, more urgent, work to do.
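A different way of protecting a server from an over-eager status page, and not the fix I actually applied, is to throttle the check so that the server is probed at most once per interval and the previous result is reused in between. The sketch below is in Python; the class `ThrottledCheck`, the `probe` callable and the interval are all assumptions made for the example.

```python
import time
from typing import Callable, Optional


class ThrottledCheck:
    """Run a probe at most once per min_interval_s; otherwise reuse the last result."""

    def __init__(
        self,
        probe: Callable[[], bool],
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._probe = probe                    # hypothetical: returns True if the server answers
        self._min_interval_s = min_interval_s
        self._clock = clock                    # injectable so the sketch can be tested
        self._last_time: Optional[float] = None
        self._last_result: Optional[bool] = None

    def check(self) -> Optional[bool]:
        now = self._clock()
        due = (
            self._last_time is None
            or now - self._last_time >= self._min_interval_s
        )
        if due:
            # Only here does the real server get contacted.
            self._last_result = self._probe()
            self._last_time = now
        return self._last_result
```

With something like this behind the page, leaving it open for hours would still only touch the mail server once per interval, no matter how often the page itself refreshed.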
Still, four times in a period of 15 years is not too bad, I suppose. Not sure how this compares with other data developers, but I console myself with the fact that (a) the problem was resolved pretty quickly; and (b) the benefits of my work far outweighed the occasional blip. After all, if you have a cowboy reputation, you may as well uphold it from time to time.