Big Data


Written by Jonathan Kettleborough on 1 April 2014 in Features

In the first of three articles on Big Data, Jonathan Kettleborough provides an introduction to BD and its role within L&D

It’s difficult to pick up any professional publication or read related online materials without being reminded of Big Data. We’re seeing the BD label being applied to more and more areas – each time claiming that it will bring new meaning, insights and understanding to everything it touches. Perhaps, then, it’s worth a deeper look.

But what about L&D? Can BD help us in the way it claims? Can BD transform what has traditionally been a bit of a ‘touchy-feely’ profession and, if it can, are there any consequences of which we should be wary?

Over the next three months, I will begin to immerse you in the issues of Big Data and its application within L&D. In this article, I will give you an introduction to BD and its potential application within L&D. Next month, I will highlight some of the pitfalls with handling BD and explore a number of reasons why L&D has historically struggled to get the best out of the information it has, and, in the final article, I will work through some of the ramifications of making decisions based on BD, which may have deeper legal consequences in the future.

It’s shaping up to be an interesting journey!

What is Big Data?

Wikipedia has a rather detailed definition of BD, although I believe that a far clearer – and more popular – definition was provided in a report by Gartner analyst Doug Laney in 2001, in which he identified three characteristics of the complex, often large, data sets that were emerging from e-commerce [1]. Those “three Vs” have become shorthand for describing BD.

The “three Vs” are:

  • volume, which refers to the sheer amount of data collected. It is not about having a few records but perhaps millions or hundreds of millions of records
  • velocity, which refers to the speed at which the data is processed and analysed. While data warehousing techniques will keep a wide range of data for when – or if – it is required, with BD the information is analysed almost immediately
  • variety, which reflects the fact that data may not just be numbers or, if it is, it may be from more than one source. The data can also be words, pictures, tweets and videos from both inside and outside an organisation.

I’d like to add two more Vs:

  • validity, which is how useful the data is in predicting the things you want to measure. Sales data, for example, is very useful in measuring the effectiveness of a sales person and, for this reason, has a high level of validity. The type of car a sales person drives may not be as valid in predicting success
  • value, which is what the data is actually worth to you. As the line often attributed to Einstein goes, “everything that can be counted does not necessarily count”. Just because you can capture data, measure it and assess its validity, it does not mean it has great value for you.

Whether there are three Vs or five, there’s no doubt that the next corporate bandwagon is up and rolling and, this time, the wheels have been attached to BD. The media is telling us that we’ve got to concentrate on BD; we’ve got to learn about it and we’ve got to embrace it. Indeed, according to the website Quartz, “big data” was one of the most overused corporate buzzwords of 2013 [2].

Big Data is all around us

Let’s not get too worried, though; we’ve been looking at BD for a while now. If you’re in the UK and have a store loyalty card – which, according to the BBC, 85 per cent of UK households do [3] – you’ll definitely be adding to the massive pool of data that retailers already hold about you, as an article in The Guardian explains [4]. It’s been suggested that BD allows retailers to know when you’re planning to start a family. BD has been helping credit card companies for years in their efforts to detect fraud based upon spending patterns and, if you think the recent revelations surrounding the PRISM ‘scandal’ [5] are scary, think again – security services have been using BD for years in their efforts to keep us safe; only recently it was suggested that the US National Security Agency was collecting 200m texts per day in its battle against terrorism [6].

According to one infographic, we are also an integral part of the BD phenomenon [7]. Our shopping habits are tracked via store cards, our online purchases are tracked and we are contributing to the 144bn emails sent every day, the 300m photographs added to Facebook or the 175m tweets sent each day – all of which is adding to the wealth of BD and much of which is publicly accessible. Although the world of social media moves apace, it’s well accepted that there are more than 1bn Facebook accounts and more than 10bn tweets have been sent. That’s definitely Big Data!

Making use of Big Data

Although an increasing number of industries are now making use of BD, retailers have been using it for decades; indeed it was Wal-Mart that famously drove the adoption of the barcode so it could track items, thereby improving its sales and logistics information. And it is retailers – as a main example – that have worked their magic with the vast amounts of data available to them.

They now use BD to predict who should be customers for their products – rather than waiting for a customer to buy. The classic BD story that illustrates this is of the US retailer Target, which famously marketed maternity products to a teenage girl before her father knew she was pregnant! Using data, Target was able to market products to a pregnant teenager based solely on her buying behaviour – and it was right!

BD is now being used for a wide variety of applications including retail, medical, insurance, engineering and others.

But are employees really data?

Let’s be honest, the L&D profession has often acted on ‘gut feel’ or has followed well-worn approaches. Using a video after lunch to try to combat the ‘graveyard shift’ and the unswerving belief in the ‘happy sheet’ are two over-used but real examples. But what about when it comes to people – can we really use BD to find the best talent for our organisation? It would be of real value if we could. Virtually every organisation cares about talent – they talk of their ‘talent’, ‘top talent’ and ‘high potentials’ – but where do you begin to find future talent for your organisation and can BD really help us predict behaviour and performance?

I’d like to say that there’s no magic formula but the book Moneyball: The Art of Winning an Unfair Game, by Michael Lewis, changed my mind on this. Lewis’s book tells the story of the Oakland Athletics baseball team and its general manager, Billy Beane. As the book unfolds, Beane ditches the decades-old approach of scouting for talent and begins to build a cost-effective team using analytics and evidence-based statistics – Big Data. It’s a gripping read and is now a film starring Brad Pitt.

It’s not only baseball – BD is also transforming football on the UK side of the Atlantic, with managers such as Sam Allardyce using, and benefiting from, detailed performance analytics [9]. The Moneyball approach is quietly being adopted – it doesn’t matter what a player looks like or where he comes from but what he can do. Performance matters!

But what, you may wonder, do baseball, pregnant women and terrorism have in common with L&D? Well, it’s simple, it’s all about being able to identify and develop the right talent at the right time – to get the very best from the people we want and the people we have.

L&D’s Big Data

As we’ve seen, BD can be really BIG – massive, in fact, and probably more massive than most of us will ever come across. To be honest, many L&D professionals’ data sources are somewhat smaller, but nonetheless of great use. Returning to the five Vs mentioned earlier, the implications of BD for L&D are as follows:

Volume: As we’ve seen, retailers can capture enormous quantities of employee-linked sales data on virtually a per-second basis. For many organisations, L&D BD is smaller but still of use. An organisation with, say, 1,000 employees will have records not only for its current employees but for past employees as well, and also those who turned down an offer and those who made it to second interview and those who… well, I’m sure you get the picture.

Velocity: Amazon, Netflix, Google and a host of other companies are monitoring vast amounts of data in real time – true velocity. The average L&D department tends to run at a different speed because of the nature and frequency of the data collected. Appraisal scores, course completions, average time to fill vacancies, feedback on courses etc tend to be collected on an annual, or at best quarterly, basis. Sales data tends to be one of the few, if not the only, data sets that will be analysed with any regularity or speed.

Variety: Here’s where L&D BD really comes into its own. Most organisations have an exceptionally rich variety of data, which may include such items as:

  • recruitment: number of vacancies and average time to fill them, data from CVs, qualifications and experience of applicants, sources of candidates
  • succession planning: roles, current incumbents, possibilities and gaps
  • certification: who’s ‘in ticket’, who’s not, average time to reach competence, backlogs for essential or mandatory training
  • qualifications: what qualifications, what level, where from and when
  • employee engagement: which leaders are good and not so good, what the issues are, where the gaps are
  • performance review: which employees, which managers, which locations, scores and performance
  • absenteeism: who, when and how often per department/region/day
  • staff turnover: by department/location/manager
  • training: who, what, where, when, feedback, test scores etc
  • sales or other business performance: who has sold what and when, whose customers come back and whose don’t.

And that’s just for starters!

Validity: On the surface, it would seem that all people-related BD is valid but this isn’t always as straightforward as it may seem. Good sales people are those who sell more than others – simple! But what if 50 per cent of what a particular person sells is returned, or results in a complaint or a failure to purchase other products? Validity is harder to predict than we may first think – and this is something we’ll explore in the next article.
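To make the returns example concrete, here is a minimal sketch in Python. The names and figures are invented purely for illustration, and it assumes each sales record holds both the gross value sold and the value later returned. It simply shows how a ranking can flip once returns are taken into account.

```python
# Hypothetical figures: gross sales and the value of goods later returned.
sales = {
    "Alex": {"gross": 100_000, "returned": 50_000},  # half of sales come back
    "Sam": {"gross": 80_000, "returned": 4_000},
}


def net_sales(record):
    """Sales that actually stuck: gross value minus returns."""
    return record["gross"] - record["returned"]


def rank_by(metric):
    """Names ordered from best to worst on the given metric."""
    return sorted(sales, key=lambda name: metric(sales[name]), reverse=True)


by_gross = rank_by(lambda record: record["gross"])
by_net = rank_by(net_sales)

print("Ranked by gross sales:", by_gross)
print("Ranked by net sales:", by_net)
```

On gross sales Alex looks like the star; on net sales Sam does. Which measure is the more valid depends on what the organisation is really trying to predict.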

Value: Just because you can capture data, measure it and assess its validity, it does not mean it has to have great value for you. One quirky BD fact that emerged from the insurance industry was that orange cars were the most reliable. A great fact – and true – but of little real value. Disagree? Think what difference it would make to your car if you painted it orange. That’s right – none!

Implications for L&D

In the past, we’ve typically studied data in isolation – we’ve looked at people’s reaction to a course, how well they perform in a role, or what their qualifications are – but BD forces us to rethink our approaches and consider the relationship between the data and the desired performance. Historically, within L&D, we’d look at turnover rates for staff but wouldn’t necessarily connect them with training records and performance reviews and who their manager was and which location they worked at and how long they’d spent in the organisation and the date of their last promotion – phew!

But that’s where we’ve got to go. We know that we can no longer hope that someone will remain with our organisation because they once went on a ‘nice course’. Taking that approach is about as useful as predicting survival rates on the Titanic based on who was wearing a lifebelt – there are just so many other factors to consider. Instead, L&D professionals will need to use BD approaches to link a whole range of applicable data to look for trends.
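As a rough sketch of what linking data might look like in practice, the Python below joins three invented extracts by employee ID and then calculates turnover by manager. Every name, field and figure here is an assumption made for illustration; a real HR system will have its own structure and far more records.

```python
from collections import defaultdict

# Hypothetical extracts from three separate systems, keyed by employee ID.
hr_records = {
    1: {"manager": "Patel", "left": True},
    2: {"manager": "Patel", "left": False},
    3: {"manager": "Jones", "left": False},
    4: {"manager": "Jones", "left": False},
}
training_days = {1: 0, 2: 3, 3: 5, 4: 2}
review_scores = {1: 2, 2: 4, 3: 4, 4: 3}


def link_records():
    """Join the three sources into one row per employee."""
    return [
        {
            "id": emp_id,
            **hr,
            "training_days": training_days.get(emp_id),
            "review_score": review_scores.get(emp_id),
        }
        for emp_id, hr in hr_records.items()
    ]


def turnover_by(rows, field):
    """Share of employees who left, grouped by the chosen field."""
    totals, leavers = defaultdict(int), defaultdict(int)
    for row in rows:
        totals[row[field]] += 1
        leavers[row[field]] += row["left"]  # True counts as 1
    return {key: leavers[key] / totals[key] for key in totals}


rows = link_records()
turnover = turnover_by(rows, "manager")
print(turnover)
```

Once the data sits in one row per person, the same `turnover_by` function can be pointed at any other field that has been linked in, such as location or training days.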

It all sounds rather complex but it can be done and it can deliver results. Web giant eBay uses BD to fight talent turnover and attrition. “You can really drill down to where it’s happening,” says Elizabeth Axelrod, senior VP of HR [10]. In addition to seeing departmental or managerial hotspots for leaving, eBay can identify other, less obvious, factors. “If somebody has been in a role for three years, hasn’t been promoted, and hasn’t changed roles, there’s a far higher probability of attrition than someone who doesn’t have those circumstances,” Axelrod says.
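The pattern Axelrod describes can be written as a very crude rule. The sketch below is not eBay’s model, which the article does not describe; it is a hypothetical Python illustration that assumes we hold each person’s years in their current role and years since their last promotion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Employee:
    name: str
    years_in_role: float
    years_since_promotion: Optional[float]  # None means never promoted


def attrition_flag(emp, threshold=3.0):
    """Flag someone who has stayed in one role, unpromoted, for `threshold` years.

    The three-year threshold comes from the quote; everything else is assumed.
    """
    stuck_in_role = emp.years_in_role >= threshold
    no_recent_promotion = (
        emp.years_since_promotion is None
        or emp.years_since_promotion >= threshold
    )
    return stuck_in_role and no_recent_promotion


# Invented staff list for illustration only.
staff = [
    Employee("Asha", 3.5, None),  # long in role, never promoted
    Employee("Ben", 4.0, 1.0),    # long in role but promoted last year
    Employee("Cara", 1.0, None),  # new in role
]

flagged = [emp.name for emp in staff if attrition_flag(emp)]
print(flagged)
```

A flag like this only says where to look. It would take real leaver data to show how much more likely the flagged group actually is to go.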

Wow! That’s almost the Holy Grail – the ability to predict who’s likely to leave before they do. And there’s more...

One financial services organisation has for years – understandably – operated under a belief that employees with good grades, who come from highly ranked colleges, will make good performers. So guess what – it selects, recruits and promotes people with great grades from great colleges. The problem is that, some years ago, one of its analysts performed a statistical analysis of sales productivity and turnover. He looked at sales performance during an employee’s first two years and correlated total performance and retention rates against a variety of demographic factors. What he found was astounding [11].

The factors that drove sales performance were:

  • an accurate, grammatically-correct resume (It’s a US company)
  • having completed some education from beginning to end
  • having successful sales experience in high-priced items
  • demonstrated success in some prior job
  • ability to work under unstructured conditions.

What did not affect sales performance:

  • where the candidate went to school
  • what GPA (grade point average) he had
  • the quality of his references.
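For readers wondering what “correlated” means in practice, here is a minimal Python sketch using a standard Pearson correlation. The six hires and all their figures are invented; this is not the analyst’s data or method, only an illustration of how a factor that matters and one that doesn’t would show up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data for six hires (all values invented for illustration).
sales = [120, 150, 90, 160, 100, 140]     # first two years' sales, in thousands
prior_success = [1, 1, 0, 1, 0, 1]        # 1 = proven sales record in a prior job
gpa = [3.2, 3.8, 3.6, 3.2, 3.4, 3.2]      # grade point average

r_prior = pearson(prior_success, sales)
r_gpa = pearson(gpa, sales)

print(f"prior success vs sales: {r_prior:+.2f}")
print(f"GPA vs sales:           {r_gpa:+.2f}")
```

In this made-up sample, prior sales success tracks performance closely while GPA barely tracks it at all. Six people prove nothing, of course, and a correlation is not a cause; the point is only that the check itself is simple once the data has been linked.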

Just think about this for a moment – the entire modus operandi for recruiting great people was wrong! All that time and effort that went into selecting the best academically-gifted talent was just heading in the wrong direction.

So, BD is pointing towards some major changes for L&D. In the past we’ve had to master TNA, evaluation, e-learning and social media – to name but a few – and now it’s time to add BD to that list. L&D is going to need a new set of skills – and fast! These issues were wonderfully illustrated in Eddie Short’s article in the February edition of Training Journal [12].

As you would expect, Google has made the very best of its available data. It learned to use a combination of seven different factors to predict which employees were most likely to leave (in some cases, before the employee actually realised it himself). In other cases, Sprint used analytics to predict which new hires were likely to quit and Cisco once used BD to identify which struggling new hires were likely to succeed over the long term.

The challenges for L&D

In business, it’s not always so simple. We can measure the impact of sales people based on how much they sell and the loyalty of their customers but how would we really measure a great HR manager or a great administrator or a great learning designer? Of course we have competency frameworks that we can apply but where is the real evidence? Where are the facts? For the most part, the best we can do is judge how someone performs once he is inside the business. How on earth would you look for someone with the right capabilities who is outside the business, where key performance information may not be so visible? Ideally we need to understand the recipe for success for each of our key roles. The theory is easy – but reality is tougher, as I shall explore in the next two articles.


L&D can’t ignore BD for much longer. The landscape is changing and data is driving decisions more than ever before. It’s no longer about ‘things’ or ‘widgets’. BD is now about people and their performance. There are new tools and techniques we will need to master and new problems to overcome. Future decisions will be more evidence-based and may reveal trends and insights that may shake our industry to the core.

In the next article, I’ll explore the problems that often occur when analysing BD, investigate why L&D professionals generally don’t handle information well and look at some of the pitfalls to be overcome.

A fully-referenced version of this article is available on request.

About the author

Jonathan Kettleborough is a consultant, author, blogger and lecturer. He has worked in L&D and business for more than 25 years. He can be contacted via Twitter @JKettleborough.

