It’s hard to crack open a business magazine or click on a human resource blog these days without coming across the phrase ‘Big Data’ in the headline.
Most definitions of ‘Big Data’ center on words that begin with the letter ‘V’: volume, velocity, and variety.
Volume means big, not surprisingly.
Where a megabyte once meant ‘big’ and a gigabyte means ‘big’ now, Big Data is measured in terabytes, petabytes, and zettabytes (a zettabyte is a trillion gigabytes).
Yet when it comes to predicting job performance, we seldom have had more than a few megabytes of data.
Even the largest meta-analyses, which combine data from hundreds of studies on performers' mental ability, personality, bio-data, training courses, work experience, and credit history in order to predict important performance outcomes such as supervisory performance ratings, attendance, customer satisfaction scores, and production and wastage data, would still have a hard time exceeding a single gigabyte.
Adding social media and video streams certainly boosts the volume and also the variety and velocity of the data.
Variety means different types of data: some data come from numerical scores on tests, training courses, or supervisory performance ratings.
Other types include rankings, as in sales success rankings.
Still other types include free-text answers, from simple word associations to complex sentences, paragraphs, and whole books.
Or the data could be still images, audio streams or video streams.
Streaming data types involve velocity, meaning they change rapidly.
Regardless of the volume, variety, and velocity of the data, it has to fall into two other important categories:
- predictor data
- criterion data
Predictor data captures qualities of the performer that tell us who will be highly valuable vs. less valuable in the future.
Criterion data tells us who the more and less valuable people are, based on their job performance.
Predicting who will generate the most value before they spend 3-18 months on the job is the whole point of candidate selection programs.
When Big Data Makes Little Sense
Confusing Matched Data As Big Data
But in order to know which predictors work, we need to have people for whom we have both predictor and criterion data (matched cases).
And that’s often a big problem.
Some people confuse Big Data with Matched Data.
Merely having zillions of test scores or credit histories, or bio-data scores or social media interactions will do nothing to discover how those scores can be combined to predict which candidates will do well on the job.
Yet, many retail companies collect millions of predictor records and don’t match them up with performance data.
They take the value of their assessment process on faith.
Alternatively, many professional sports organizations collect endless statistics on player and team performance, year after year.
Yet few measure the mental abilities, bio-data histories, and personality characteristics of their athletes.
Without matched cases, the data can’t be analyzed by powerful statistical methods that quantify how well a given set of measures predicts future job performance.
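To make the point concrete, here is a minimal sketch, using made-up scores rather than any real dataset, of what matched cases buy you: once each person has both a predictor score and a criterion score, a validity coefficient (a plain Pearson correlation) can be computed directly.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical matched cases: each employee has BOTH a predictor score
# (say, a cognitive test) and a criterion score (say, a supervisory
# performance rating). Without the pairing, no validity can be computed.
matched = [
    (72, 3.1), (85, 4.2), (60, 2.8), (90, 4.5),
    (78, 3.9), (65, 2.5), (88, 4.0), (70, 3.3),
]
predictor = [p for p, _ in matched]
criterion = [c for _, c in matched]

validity = pearson_r(predictor, criterion)  # the "validity coefficient"
print(round(validity, 2))
```

A pile of predictor scores alone, however large, never gets you this number; the pairing is what makes the statistic possible.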
Measuring Finicky Conditions
Big Data also makes little sense when either the predictor or criterion measures aren’t reliable– that is, a person who scores high on one occasion scores low on the next measurement occasion with no systematic change to explain the difference.
Asking people to record their mood on a scale of one to ten each day, and then matching that to their daily sales numbers in an inbound call center would most likely result in a low predictor-criterion relationship.
While a few people are consistently in a bad mood and a few others consistently in a good mood, most people’s mood varies with the day, making mood on any one day a poor predictor of performance.
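A small simulation, with entirely invented numbers, shows the attenuation at work: even when the stable trait genuinely drives sales, one day's noisy mood report predicts them weakly.

```python
import random
from statistics import mean

def corr(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

random.seed(42)

# Each person has a stable trait mood, but a single day's self-report
# swings widely around it (low reliability of the daily measure).
trait = [random.gauss(5, 1) for _ in range(500)]
daily_mood = [t + random.gauss(0, 3) for t in trait]           # one noisy report
daily_sales = [20 + 4 * t + random.gauss(0, 4) for t in trait]

print(round(corr(trait, daily_sales), 2))       # stable trait: solid predictor
print(round(corr(daily_mood, daily_sales), 2))  # one day's mood: weak predictor
```

The unreliable daily measure dilutes a real relationship; no volume of such data fixes the measurement problem.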
Tons of History And No Indicators for Performance
Finally, Big Data makes little sense when there is no clear, efficient way to score the data that relates to future job performance.
Owners of large resume databases face this problem, combined with the problem of having little criterion data against which to confirm the value of the resume scores they do concoct.
Total number of resume characters, number of sentences, characters per sentence, words per sentence, characters per word– all these scores can be computed quickly and cheaply, but don’t likely relate to job performance.
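Those surface counts really are trivial to compute, which is exactly the problem; a quick sketch on an invented snippet of resume text:

```python
# Hypothetical resume text; the point is that these surface counts are
# cheap to compute yet say little about future job performance.
resume = (
    "Managed a five-person support team. Cut average ticket "
    "resolution time by 30 percent. Trained new hires on CRM tools."
)

sentences = [s for s in resume.split(".") if s.strip()]
words = resume.split()

stats = {
    "total_chars": len(resume),
    "sentences": len(sentences),
    "words_per_sentence": len(words) / len(sentences),
    "chars_per_word": sum(len(w) for w in words) / len(words),
}
print(stats)
```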
Boolean character searches can return resumes that contain exact matches to key search terms the greatest number of times.
Recently, more sophisticated search algorithms contain proxy words for the primary search terms, so that candidates using terms that mean similar things will be included in the returned set of resumes.
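A minimal sketch of that idea, assuming a hand-built synonym map (real search products learn these proxy terms statistically rather than hard-coding them):

```python
# Hypothetical proxy-term map: candidates who write "supervised" or
# "led" should still match a search for "manage".
synonyms = {
    "manage": {"manage", "managed", "supervised", "led", "directed"},
    "python": {"python"},
}

def match_count(resume_text, search_term):
    """Count tokens matching the search term or any of its proxies."""
    terms = synonyms.get(search_term, {search_term})
    tokens = resume_text.lower().split()
    return sum(tok.strip(".,") in terms for tok in tokens)

resumes = [
    "Led a team of six engineers. Supervised deployments.",
    "Wrote Python scripts. Managed release schedules.",
]
# Rank resumes by how often they match "manage" or its proxies.
ranked = sorted(resumes, key=lambda r: match_count(r, "manage"), reverse=True)
print([match_count(r, "manage") for r in resumes])  # → [2, 1]
```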
Still, none of these purely algorithmic solutions deliver candidate slates whose performance rises substantially above average.
When Big Data Brings Big Benefits
Complex Issues Requiring Multi-Level Measurements
Big Data makes the most sense when the work itself is complex, requiring
- people with both mental and interpersonal skills and
- people who perform well in very specific environmental conditions
Subtle factors, such as environmental ones, can be teased out of these interactions.
Big Data takes many matched data cases to produce stable prediction equations that involve non-linear terms to capture these subtle interactions.
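The role of those non-linear terms can be sketched with a toy fit on invented numbers (plain gradient descent; no particular vendor's method is implied). A model with only linear terms cannot capture value that emerges from the skill-by-environment combination, while adding a product term can.

```python
# Each matched case: (interpersonal skill, environment fit) -> performance.
# Value emerges from the COMBINATION, not from either factor alone.
data = [
    ((0.0, 0.0), 1.0),
    ((1.0, 0.0), 1.2),
    ((0.0, 1.0), 1.1),
    ((1.0, 1.0), 3.0),
] * 5  # repeated to mimic many matched cases

def fit(featurize, steps=5000, lr=0.05):
    """Fit weights by batch gradient descent; return mean squared error."""
    n = len(featurize(0.0, 0.0))
    w = [0.0] * n
    for _ in range(steps):
        grad = [0.0] * n
        for (s, e), y in data:
            x = featurize(s, e)
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j in range(n):
                grad[j] += err * x[j]
        for j in range(n):
            w[j] -= lr * grad[j] / len(data)
    return sum(
        (sum(wi * xi for wi, xi in zip(w, featurize(s, e))) - y) ** 2
        for (s, e), y in data
    ) / len(data)

mse_linear = fit(lambda s, e: [1.0, s, e])            # linear terms only
mse_interact = fit(lambda s, e: [1.0, s, e, s * e])   # adds interaction term
print(mse_linear > mse_interact)  # the interaction term fits far better
```

Estimating such product terms stably in real data, with many predictors rather than two, is precisely what demands the large number of matched cases.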
Measuring Present Behavior for Future Performance
Big data is not needed if a mental ability test and a measure of social potency capture most of the predictive variance for a routine job.
In one example, more than 7,000 store clerks for a national chain of cosmetic stores were administered items from a test that had been validated many times in retail settings.
The linear scale scores based on the items correlated poorly with a criterion based on a recorded termination code.
Reasons for termination included:
- return to school
- accepted a job elsewhere
- spousal relocation
- terminated for theft
- terminated for poor performance
- voluntary quit
Given the large dataset, it was possible to analyze each item against the proportion of positive vs. negative termination codes and develop a new predictor scale that correlated negatively with the old scales but positively with the criterion.
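That item-level re-keying can be sketched roughly like this, with invented responses and only three items standing in for a full test:

```python
# Each record: (item responses as 0/1, termination code was positive?).
# Made-up data; the real analysis used thousands of clerks.
records = [
    ([1, 0, 1], True),  ([1, 0, 0], True),  ([1, 1, 1], True),
    ([0, 1, 0], False), ([0, 1, 1], False), ([0, 1, 0], False),
]

n_items = 3
good = [resp for resp, ok in records if ok]
bad = [resp for resp, ok in records if not ok]

def endorsement_rate(group, item):
    """Proportion of the group endorsing a given item."""
    return sum(resp[item] for resp in group) / len(group)

# Key each item by the gap in endorsement between positive and
# negative leavers; items that separate the groups get large weights.
item_keys = [
    endorsement_rate(good, i) - endorsement_rate(bad, i)
    for i in range(n_items)
]

def new_scale(responses):
    """Score a clerk with the empirically keyed items."""
    return sum(k * r for k, r in zip(item_keys, responses))

print(item_keys)
print(new_scale([1, 0, 1]))
```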
And keep in mind that 7000 cases is still small data to those who define Big Data in terms of terabytes and petabytes.
We will need Big Statistics to get to work on the Big Data being collected by LinkedIn, YouTube, Facebook, and other social media.
Companies like Burning Glass and Content Analyst have developed statistical methods for matching essay type answers marked as exemplars from top performers and poor performers in order to identify the performance level of answers from candidates responding to the same questions.
While useful for scoring SAT essays, this technology could also reliably score digitized answers to behavioral interview questions collected from candidates online via audio or video recorded interviews.
The Big Benefits of Big Data require that we have both Big Predictor and Big Criterion data on matched cases.
We will need more work on Big Statistics for scoring the variety of high velocity data types (sensor, audio, and video streams) already captured, and on the way.
At PeopleAssessments.com, we invite interest from clients committed to collecting and scoring both Big Predictor and Big Criterion data, to unleash the true power of Big Data and document a practical, step-change boost to the value of human performance.
Tom Janz is a seasoned, published thought leader in talent sourcing, assessment, and development.
He applies the following strengths to creating human asset value:
- advanced analytical skills
- emotional intelligence
- realistic optimism
- a passion for achievement, and
- the courage to be new
Tom's specialties include behavioral interviewing, business impact analysis, strategic performance modeling, testing, performance management, corporate culture assessment and development.
You can reach Tom via social media, email, or by leaving a comment below...