The Burden of Technical Debt

Chris Evans, Opinion

As the issues at TSB continue to unfold, I look on at an IT problem that will likely never be fully explained and eventually disappear into the archives as yet another “computer glitch”.  Why does it keep happening, what’s wrong with the industry and how can we do better next time?

Experience

I can already hear people asking who I am to pass judgement on the ability of my peers.  Over the past 30+ years, I have worked at Lloyds TSB (I integrated the storage standards of TSB into those of Lloyds), JP Morgan Chase, Rabobank International, Lehman Brothers, ING and EftPos UK.  The last on that list was a project to create a unified back-end reconciliation system for all the banks and building societies in the UK.  I also worked for a small company that developed the online authorisation software originally used for payment processing at the till by Tesco and BHS.  These are only the major pieces of work for financial companies that I can bring to mind immediately.  I’ve also been involved at Barclays, Nationwide and SWIFT.  There’s well over a decade of experience here alone.

Although I mostly worked in the infrastructure teams, I spent much of my time liaising with the lines of business supporting various aspects of these organisations.  Most people didn’t do that, but I’ve always felt that any work was being done to support only one thing – the customer.

The Issue

What do we know so far?  TSB was acquired by Lloyds in 1995.  However, as part of the bailouts during the 2008 financial crisis, the European Commission demanded the newly formed Lloyds Banking Group divest some of its business by November 2013.  So, TSB was separated into an independent business and eventually acquired by Spain’s Banco Sabadell.  The migration of customer records (around 5 million) from the previous Lloyds systems was undertaken last weekend, moving to a new platform built by Sabadell.  Clearly that migration didn’t work correctly and customers have had issues ever since.

What’s worrying in this scenario is that the issues haven’t just been about the inability to access data.  Customers have been able to access the accounts of others and have had more money in their accounts than they were expecting.  Leaving the IT issues aside, there are lots of questions around governance and audit that will need to be answered.

Politics

In my experience, I was amazed that financial organisations managed to implement anything at all.  So much of the time was spent trying to undermine other teams or bypass the in-place processes for delivering technology.  To be fair, many of those processes were completely broken, despite the amount of money being spent on IT.  I’ve seen decisions to change vendors made on a whim, because a senior exec from one company cancelled a meeting.  One organisation spent more time ensuring IT staff wore their jackets when away from their desks than they did on picking the right technology.  Some IT organisations were positively Machiavellian in nature, with everyone scheming against each other.

Problems

So what are the main problems for large IT organisations, including financial ones?

  • Politics – I’ve already mentioned this separately, but this is a major problem.  Even excluding internal politics, I’ve seen technology choices dictated by the desire to settle legal rulings and by personal preference.
  • Shiny – why support the core “legacy” stuff when new shiny stuff is more interesting?  Too many times, the best people get dragged off onto projects looking to deploy the latest technology, while leaving the core business platforms to stagnate.  You can see the technical debt being acquired by the day.
  • Outsourcing – I’ve never seen outsourcing work well.  That’s a strong statement, but I’ve joined organisations that have had outsourcing and I’ve been “outsourced”.  In both scenarios, the aims and goals of both parties have been in conflict, with the service provider only focused on doing the job for the lowest cost and least effort.
  • Lack of Skills – many people simply didn’t have the understanding of how IT worked.  There were plenty of projects in my career where the teams had no understanding of fundamentals like multi-threading.  Most of the time, businesses wanted to employ technical people (like me) for the lowest possible wage.  Contracting rates today are less than half what I could have earned 10 years ago.
  • Lack of Vision – most senior managers were not long-termers, but looking to make a name for themselves or expand their own careers with new skills.  They would come in, make big changes without considering the long-term impact, then leave for the next job.

The last point is pretty key.  If the people in charge are only looking 1-2 years out, there’s no chance of ever rectifying that technical debt problem.

TSB

What could TSB have been doing wrong?  The first question to ask is why they went for a “big bang” approach in moving all the customer records in one hit.  Two possibilities come to mind for this.  First, that Lloyds had set some deadline to be off the old systems and that was imminent.  Second, that the connected systems like Internet and mobile banking couldn’t cope with customer records across two systems.  Either scenario (or others) could have forced TSB’s hand into a single migration process.

Next question – what was the back-out plan?  With any migration, there are always checkpoints at which an assessment is made on whether to continue moving forward.  Once data is in the new system and starting to be modified, it becomes much harder to move back.  However, it’s clear that not enough checks were done along the way.  I’ve been in these migration situations and had to call a halt or decide to move on.  I know it isn’t easy, but planning is key.
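To make the idea of checkpoints and a back-out point concrete, here is a minimal sketch of a batched migration.  It is purely illustrative and says nothing about how TSB or Sabadell actually ran theirs; the names (Batch, reconcile, migrate) and the account-to-balance record shape are my own assumptions for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Batch:
    name: str
    records: Dict[str, int]  # hypothetical shape: account id -> balance in pence


def reconcile(source: Dict[str, int], target: Dict[str, int]) -> bool:
    """Checkpoint: every source record must be present in the target, unchanged."""
    return all(target.get(key) == value for key, value in source.items())


def migrate(batches: List[Batch], target: Dict[str, int],
            copy: Callable[[Batch, Dict[str, int]], None]) -> List[str]:
    """Move batches one at a time.  If a batch fails its checkpoint, restore
    the pre-batch state and halt.  Returns the names of batches that went live."""
    migrated: List[str] = []
    for batch in batches:
        snapshot = dict(target)      # back-out point, taken before any writes
        copy(batch, target)
        if not reconcile(batch.records, target):
            target.clear()
            target.update(snapshot)  # back out the failed batch only
            break                    # no-go decision: stop rather than press on
        migrated.append(batch.name)
    return migrated
```

The point of the sketch is that the go/no-go decision is taken per batch, while backing out is still cheap.  A “big bang” is the degenerate case of a single batch, where the only checkpoint comes after everything has already moved.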

The Architect’s View™

It’s more than likely that we’ll never know the real reason for TSB’s migration failure.  The company won’t want to air its dirty laundry in public.  Instead, we’ll be told it’s simply an “IT issue”.  This is a scenario that’s been played out many times before.  The UK government, to take only one example, is expert at wasting billions of pounds on failed IT projects.  It’s easy to think that the complexity of IT means these issues are inevitable, but I beg to differ.  There are many complex industries out there.  Look at aerospace or construction.  Huge projects like Crossrail are undertaken every day, with some incredible feats of engineering.  Why do these projects succeed, but IT ones don’t?

Part of the problem here is culture.  Buy a laptop and you can start programming in minutes.  While this is great from a barriers-to-entry perspective – anyone can get involved – there are few qualification standards compared to engineering, medicine or law.  The age of coming through the ranks as an operator has gone.  I’m not suggesting we revert to the old days, but we do need to question exactly how our industry works and how it can be improved.

Copyright (c) 2007-2021 – Post #2CE3 – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.