Tag Archives: Excel

Updating a Customer List


You could almost see this as an addendum to the “Excel is not a database” and “Why Excel should not be used for business reporting” blogs. It is also a real life example of how updating a customer is automatic and requires no manual intervention when using a database, but is virtually impossible, and definitely very time consuming if done using an Excel spreadsheet.

I used to have a system for recording complaints by extracting data from the CADS mainframe system. In parallel I also made an extraction of the current customer list from a different mainframe system. A daily automated extract was all that was needed to make sure that the customers from the complaints database matched the list of current customers.

In addition we also had a manually maintained customer list SQL table which contained more extensive customer-related data. If necessary this table could be kept up-to-date through a web-based entry screen, or block changes could also be made behind the scenes using customised SQL commands.

Compare this with the system in place for maintaining a customer list in support of Supply Chain Transformation (a separate customer list was deemed necessary since the customer groupings were specific to the needs of this project): you received an email letting you know that a number of unspecified changes had been made to the Excel spreadsheet which resided on a Sharepoint server.

Now how I had handled the initial SCT customer list was to extract the contents of the initial Excel spreadsheet into an equivalent SQL table and used this table in combination with my complaints table to create a listing of complaints with their specific SCT category. However, since the changes to the customer spreadsheet were not specified I had made up my mind that I was not going to waste any time on figuring them out, and let users flag up any gaps that might appear as a result of mismatches between my complaints database and the now outdated SCT customer list.

I could now make the required changes necessary to fill the gap in the customer list. It works, after a fashion, but it can hardly be considered an elegant solution – and besides, it always requires someone to step in and make the manual alteration. Now imagine if the SCT customer list had been held in a database: at worst I would have to perform a daily extract from Oracle to SQL, but since this is an automated process, the SCT customer list would always be up-to-date without the need for any manual intervention apart from the first one needed to alter the records in the Oracle table.

There couldn’t be a better example of how spreadsheets always need to be shepherded along, whereas a database runs automatically once set in motion. Being essentially a lazy person, I prefer the automated option.

Advertisements

Why Excel Should Not Be Used for Business Reporting


This is the distillation of a thought process that has come to fruition over the years, when the realisation started to sink in that not only is there no plan to move away from Excel spreadsheets for business reporting, the great white hope at the time appeared to rest on Power Pivots to enable everyone to create reports for themselves, in the mistaken belief that this will address all the things that are wrong with current attempts to use Excel spreadsheets for business reporting. A copy of the presentation can be found here, as well as on my LinkedIn account.

Why Excel Should Not Be Used for Business Reporting – 7 March 2014

Hidden Waste

  • Maintenance of spreadsheets is time-consuming
  • How many people do you know who spend at least one day a week, every week, updating spreadsheets for reports ?
  • Too much time spent on recording data leaves little time to do do something about the findings
  • There is never a moment that a spreadsheet can be fully automated and does not require someone to carry it along
  • Each spreadsheet report should show how many manhours it took to compile, in order to reveal the hidden cost

Private Copy

  • By its very nature, every spreadsheet is a localised copy
  • Version control is problematic, especially when individuals “improve” their copy
  • Sharing can be done through email attachments, SharePoint or Lotus Notes databases, but remains problematic
  • Collaborative work, where different people need to add their portion to a common spreadsheet, is clumsy

Data Trail

  • A data trail is virtually impossible to establish, mainly because of frequent manual interventions
  • In a worst case scenario data get transferred from database to spreadsheet and back several times before it ends up in the final report version
  • Because the data layer is inextricably linked to the display layer, new reporting requirements results in rearranging of existing data into a new format to serve the needs of the new report
  • Ultimately this can lead to the nightmare of the “sophisticated” spreadsheet, where only the owner or a few initiated people know how it all links up

Bad Habits

  • Manual input without error checking leaves spreadsheet prone to mistakes
  • Encourages the bad habit of copy-and-pasting as standard
  • Worst of all, when solving problems, it narrows people’s mindset to what and how Excel can solve things (“If all you have is a hammer, everything looks like a nail” — Abraham Maslow)

The Cost of Spreadsheets and Formal BI

The cost of centralised BI and spreadsheets compared

Low lifetime of report or analysis:

  • centralised BI high start-up costs, lead times sometimes months, many needs unfulfilled
  • spreadsheets offer same day turnaround on emerging needs

High lifetime of report or analysis:

  • ongoing spreadsheet maintenance and updates – labour intensive, error prone, not secure
  • low maintenance costs – autorefreshed, secure, robust

Power Pivot – Will It Help ?

  • Hidden waste – doubtful
  • Private copy – doubtful
  • Collaboration – doubtful
  • Version control – doubtful
  • Data trail – probably
  • Data / display layer – possibly
  • Error-prone manual entry – probably
  • Copy-and-paste – no
  • Sophisticated spreadsheet – makes it worse
  • Narrowed mindset – no

Main Drawbacks of Power Pivot

  • Only full loads are allowed, “Append” to an existing data set is not an option
  • Power Pivot is not a data cleanser, meaning that the data set needs to be prepared prior to loading
  • Since Power Pivot runs in memory, you will need huge RAM to run against all your enterprise data
  • Power Pivot for SharePoint requires Enterprise license and SQL Server 2008R2 EE license; the cost (which is very high) may affect the SharePoint implementations of Power Pivot

Excel Is Not a Database (Part 2)


Below is the content of the wiki article entitled “Excel is not a database” as it was at the time I left Tata Steel. There may be some overlap with the contents of the previous blog, but emphasizes the various bullet points where Excel falls in my opinion short of being a satisfactory solution for business reporting. Some items are repeated under different headings, meaning that their bad influence is multi-pronged.

Excel spreadsheets are easy to use and are an ideal tool for ad hoc reporting and investigation of data. However, the fact that it has so many tools, including macros, at its disposal and can hold large amounts of data has misled some people to think of Excel as a database. This is a fatal error, because it leads people to think that it’s OK to employ Excel (and any other type, for that matter) spreadsheets for business reporting.

Below is a shortlist of various reasons why those using Excel as a database are badly mistaken:

Localised Copy

  • Unlike a web application, Excel is a one-user-at-a-time application – it’s not sharable unless you email it (Modern Databases Are Naturally Multi-User)
  • Large Excel spreadsheets use up RAM in a way that interrogating a database doesn’t (Databases Can Work with Database Files Much Larger Than Available RAM)
  • Information silos caused by localised nature and need for in-depth knowledge of the spreadsheet (super-spreadsheets need super-spreadsheet experts)
  • Encourages silo mentality and data hiding

Accessibility

  • Unlike a web application, Excel is a one-user-at-a-time application – it’s not sharable unless you email it (Modern Databases Are Naturally Multi-User)
  • Large Excel spreadsheets use up RAM in a way that interrogating a database doesn’t (Databases Can Work with Database Files Much Larger Than Available RAM)
  • Encourages silo mentality and data hiding

Data Integrity

  • There’s no centrally controlled master copy with the officially sanctioned data, and anyone can adjust their copy to their liking
  • Data errors are not corrected at source
  • Correction of identical information that exists in multiple places
  • Multiplication of spreadsheets for multiple reporting purposes
  • Manual entry without error checking
  • Data trail from raw data to final report becomes confused
  • Arguments over what constitutes the “real” figures
  • Multiple hits against data sources for individual reports that have large amount of overlap

“Sophisticated” Spreadsheets

  • Fancy Excel spreadsheets require you to write macros in VBA – you may as well spend that time learning how to program properly on a database
  • “Sophisticated” spreadsheets become impenetrable to the uninitiated
  • Information silos caused by localised nature and need for in-depth knowledge of the spreadsheet (super-spreadsheets need super-spreadsheet experts)

Static Copies

  • The data layer and the presentation layer are inextricably linked – if you want an additional column for some calculation, you need to physically create that column
  • Excel doesn’t have inherent back-up/restore capabilities
  • Lack of drilldown facilities, so that in meetings the reported figures can’t be interpreted
  • Summarising daily figures into weekly figures into monthly figures etc. without manual intervention
  • Snapshots miss subsequent alterations to the feeder data

Manual Effort

  • Spreadsheets always need someone to look after them
  • Multiple hits against data sources for individual reports that have large amount of overlap
  • Summarising daily figures into weekly figures into monthly figures etc. without manual intervention
  • Updating of super-spreadsheets can take a long time – reports may take days to be available to decision makers or otherwise massive manual effort needs to be made to speed up the reporting phase
  • Different contributors often need to be chased to update the spreadsheet
  • Data specialists should spend their time analysing and understanding data rather than creating and fine-tuning spreadsheets (= opportunity cost)

And there’s a lot more where this came from under “Other”, but I suppose you get the gist …

Excel Is Not a Database (Part 1)


In the first year after the Operational Research team was formed, there was this person (if you’re reading this, you know who you are, Mike) who was the owner of a vast and “sophisticated” Excel spreadsheet, from where various items to do with raw material usage at the blast furnaces were controlled. Everybody apart its loving owner agreed that we should replace it with a system that was more transparent and not depending on one person overseeing this unsightly mess – although we didn’t tell it to his face, we weren’t that cruel.

Still, in the final year prior to his retirement it became a running joke that at every monthly meeting, when we reported on our highlights from the past month, he declared how his system was now even more automated than before, and only required one click instead of three before the change – in short everyone knew when a turd was being polished. Even then, he had the audacity to refer to his spreadsheet as his “database” !

Fortunately he was not around to see how soon after his retirement his sophisticated system fell out of favour, and was dropped. The main reason ? It took the person who attempted to look after the system several days a week to even produce some useful feedback – in the end the effort could not be maintained, and another, more simple means of controlling raw materials feed for the blast furnaces was adopted.

Unfortunately he was not the only one who produced this type of sophisticated spreadsheet: I suppose it gave the person in charge great satisfaction to have all its functionality at his fingertips. The problem was just that: usually there was only the one person in charge of the spreadsheet who had a full understanding of how everything fit together, and although useful, the amount of time wasted in getting the spreadsheet to crank out its desired output had to be seen to be believed. I used to quip that every graph or report handed into senior management for their use should show the number of man hours it took to produce it. Maybe then the BMT would not be so cavalier about time spent serving sometimes frivolous reporting needs if they saw how much time otherwise would be available doing more productive tasks.

But to return to the concept of Excel as a database: there is no such thing !!!

A database may hold data in a format suitable to facilitate reporting, but unlike spreadsheets the data layer and the display layer are not inextricably linked. Meaning that if you want to create a variety of reports based on the same set of data, you don’t have to start creating new worksheets and linking cells, and basically rearrange your data layer to produce a new report. But for some reason or another, people accustomed to spreadsheets fail to appreciate this glaringly obvious discrepancy. In fact I’d say that those who have grown up with spreadsheets have to unlearn everything they know about them and acquire a new look-out on how a set of reports fed from the same database, but not physically linked with it, reduces maintenance times to a fraction of that required for a spreadsheet (after the initial development phase, obviously).

The problem with spreadsheets is that they’re easy to use and it doesn’t take long to become quite proficient getting results from them. What people often fail to notice is that it’s also a limited tool, because in the end “it does the job”. But the cost is one of those creeping poisons: initially it’s something that you can handle, but as the requirements change and new functionality needs to be added (thereby creating the sophisticated monster some people were so proud of) it becomes an all-consuming entity that fills people’s time and engulfs their PCs with gigabytes of historical data.

The more I started to think about it, the more I felt that I needed to highlight the unacknowledged evils of maintaining business systems using Excel spreadsheets. First I created a wiki article in the Strip UK wiki entitled “Excel is not a database”, and when I got wind that the BICC intended to investigate Power Pivot as an alternative to proper databases, I created a presentation entitled “Why Excel should not be used for business reporting”. But that will be part of further blog entries.