Tag Archives: Database

Excel Is Not a Database (Part 1)

In the first year after the Operational Research team was formed, there was this person (if you’re reading this, you know who you are, Mike) who was the owner of a vast and “sophisticated” Excel spreadsheet, from where various items to do with raw material usage at the blast furnaces were controlled. Everybody apart its loving owner agreed that we should replace it with a system that was more transparent and not depending on one person overseeing this unsightly mess – although we didn’t tell it to his face, we weren’t that cruel.

Still, in the final year prior to his retirement it became a running joke that at every monthly meeting, when we reported on our highlights from the past month, he declared how his system was now even more automated than before, and only required one click instead of three before the change – in short everyone knew when a turd was being polished. Even then, he had the audacity to refer to his spreadsheet as his “database” !

Fortunately he was not around to see how soon after his retirement his sophisticated system fell out of favour, and was dropped. The main reason ? It took the person who attempted to look after the system several days a week to even produce some useful feedback – in the end the effort could not be maintained, and another, more simple means of controlling raw materials feed for the blast furnaces was adopted.

Unfortunately he was not the only one who produced this type of sophisticated spreadsheet: I suppose it gave the person in charge great satisfaction to have all its functionality at his fingertips. The problem was just that: usually there was only the one person in charge of the spreadsheet who had a full understanding of how everything fit together, and although useful, the amount of time wasted in getting the spreadsheet to crank out its desired output had to be seen to be believed. I used to quip that every graph or report handed into senior management for their use should show the number of man hours it took to produce it. Maybe then the BMT would not be so cavalier about time spent serving sometimes frivolous reporting needs if they saw how much time otherwise would be available doing more productive tasks.

But to return to the concept of Excel as a database: there is no such thing !!!

A database may hold data in a format suitable to facilitate reporting, but unlike spreadsheets the data layer and the display layer are not inextricably linked. Meaning that if you want to create a variety of reports based on the same set of data, you don’t have to start creating new worksheets and linking cells, and basically rearrange your data layer to produce a new report. But for some reason or another, people accustomed to spreadsheets fail to appreciate this glaringly obvious discrepancy. In fact I’d say that those who have grown up with spreadsheets have to unlearn everything they know about them and acquire a new look-out on how a set of reports fed from the same database, but not physically linked with it, reduces maintenance times to a fraction of that required for a spreadsheet (after the initial development phase, obviously).

The problem with spreadsheets is that they’re easy to use and it doesn’t take long to become quite proficient getting results from them. What people often fail to notice is that it’s also a limited tool, because in the end “it does the job”. But the cost is one of those creeping poisons: initially it’s something that you can handle, but as the requirements change and new functionality needs to be added (thereby creating the sophisticated monster some people were so proud of) it becomes an all-consuming entity that fills people’s time and engulfs their PCs with gigabytes of historical data.

The more I started to think about it, the more I felt that I needed to highlight the unacknowledged evils of maintaining business systems using Excel spreadsheets. First I created a wiki article in the Strip UK wiki entitled “Excel is not a database”, and when I got wind that the BICC intended to investigate Power Pivot as an alternative to proper databases, I created a presentation entitled “Why Excel should not be used for business reporting”. But that will be part of further blog entries.

Bringing the Server Down

During my time as a web developer I have often dealt with transferring and transforming data to make information available in web pages. In the process of handling data I have on a (fortunately small) number of occasions brought one or another server down.

The first time this happened I was working at Port Talbot Hot Mill, and one of my timer-based VB applications had started hogging the connection to the Data Warehouse server by reconnecting several times and not releasing already opened connections. Tim Bolton from Hot Mill Systems had a good laugh when I told him what had happened, and said “you’ve finally become a IT developer now !”, implying that unless you’re really stretch the envelope, you’re not going to break anything but maybe also not achieve anything worthwhile.

Shortly afterwards I started to get into the habit to write my applications so that they get kicked off by the computer’s Windows Scheduler, where the scheduler itself cuts off the application if it runs for too long. Later on I also converted all my VB applications to .NET applications, which better at handling connections and ensures that they are closed whenever they should be.

On another occasion, I brought the Oracle RDB server down by creating a web page that had a loop accessing RDB inside another loop making a connection to the same server. By hogging all the connections it made access to the server impossible for all other users – a problem solved by Process Control by giving me my own access account with a limited number of connections, which, if I filled all of them up, would only affect my own pages, and not someone else’s working.

At a later stage a VB application which had to calculate charge weights was forced to do exactly the same thing, and although it only affected this application, it was at times noted that it seemed to be hanging. Again a rewrite in .NET ensured that closed connections had indeed relinquished all connections to the server.

Then there was the instance where for the duration that an application was running (twice a day for about a quarter of an hour), the weighbridge, who were using the same server, could not perform any operations. Apart from the fact that a business-critical operation should not share its resources with other applications, we made some amendments to make sure that the SQL query on my application was optimised so that it took up fewer resources.

The last time I brought anything down was when I had a page which checked the connection to a variety of servers, including two email (SMTP) servers (one in Teesside and another in IJmuiden). The problem was that I had made this page into an auto-refresher, but as long as I used the page this was not an issue, since I didn’t keep the page open for much longer than a minute each time.

The trouble started when I was on holiday, and someone started to use the page full time whilst investigating why one of my application did not complete its job (this had nothing to do with the application, but the corruption of one of its source tables). However, keeping the page open for hours on end kept pinging the SMTP servers every minute or so, and at some point this server decided that it would cut the connection of this source.

Once I was back I soon rewrote the page so that it no longer did an auto-refresh, and I ensured that checking the connection to the SMTP server was a manual click on a button which then disappeared once it had been clicked. The trouble is that I had known that this page could have been a potential problem, but had postponed doing anything about it because of (1) I was the only one using the page; and (2) I had other, more urgent, work to do.

Still, four times in a period of 15 years is not too bad, I suppose. Not sure how this compares with other data developers, but I console myself with the fact that (a) the problem was resolved pretty quickly; and (b) the benefits of my work far outweighed the occasional blip. After all, if you have a cowboy reputation, you may as well uphold it from time to time.