Why AI can’t do hiring

2023-05-16 11:14:36

The recent exciting and somewhat horrifying inflection point in AI capability, which many of us got to experience firsthand when playing with OpenAI’s ChatGPT, tipped me into finally writing this post.

I’m the founder of a mock interview platform and eng hiring marketplace. Engineers use us for mock interviews, and we use the data from those interviews to surface top performers in a much fairer and more predictive way than a resume. If you’re a top performer on our platform, we fast-track you at the world’s best companies.

We’re venture backed and have raised 4 rounds of funding in the last 7 years, totaling over $15M, which means that I’ve done a lot of VC pitches. I don’t know how many exactly (and a lady should never tell), but it’s in the hundreds. When you’ve done that many pitches, you start to hear the same feedback over and over. It ranges from questions about whether the eng hiring market is big enough (it is), to objections that human-on-human interviews don’t scale (if 2 humans doing a thing together didn’t scale, our species would be extinct), to polite suggestions that we’d be a much more attractive investment if we used ML/AI to match candidates to companies.

I’ve heard the latter a LOT over the years, but despite the well-intentioned advice, I’m convinced that building an AI matcher is a fool’s errand. My argument is this: it’s not that AI doing hiring is technically impossible (ChatGPT has shown us that the ceiling on what’s possible is higher than many of us had ever imagined) but that it’s impossible because you don’t have the data. In other words, the hard part about hiring isn’t the tech. It’s having the data to make good hiring decisions in the first place.

For the purposes of this piece, I define “hiring” as being able to find candidates for specific roles and fill those roles. I’m NOT referring to automating tasks like resume parsing, writing sourcing emails, or scheduling, i.e., tasks that human recruiters, sourcers, and coordinators do as part of their job. Surely these can be automated. The more interesting question is whether an AI can do the job of a recruiter better than a human. In other words, can it take a list of candidates and a list of job descriptions, and then use publicly available (NOT proprietary) data to match them up successfully and fill roles? After all, the reality is that neither recruiters nor burgeoning AI recruiting startups have a proprietary data set to work with. They usually have job descriptions, LinkedIn Recruiter (whose granular search functionality isn’t publicly accessible, though candidates’ LinkedIn profiles are), and whatever else they can find on the internet.

To wit, this post isn’t about how AI can’t be used for hiring if you have all the data.
Rather, it’s about how you can’t get access to all the data you’d need to do hiring, thereby making the training of an AI impossible.

A few caveats: the case for Microsoft and the question of bias

At this point, you’re probably thinking, “Well, surely Microsoft can do this, given that they own both LinkedIn and GitHub.” In this post, you’ll see why LinkedIn and GitHub are not enough. Perhaps if Microsoft chose to buy a bunch of applicant tracking systems (ATSs) in order to get access to interview performance data, coupled with data from LinkedIn and GitHub, they’d have a fighting chance, but honestly that’s probably not enough either.

Moreover, the tenuous Microsoft edge aside, the reality is that most of us do NOT have access to the kind of training data we’d need, yet we still see startup after startup claiming to do AI hiring in their marketing materials.

Finally, before we get into why AI can’t do hiring, I want to call out the important question of bias that results from training AI on hiring data where decisions were previously made by humans. To keep this (already long) post on task, I will not touch on the subject of bias. To be sure, it’s a real problem, and there’s already a lot of good writing on the subject.

That caveat aside, if we’re trying to build a solution that takes candidates and jobs as inputs and produces a set of matches as output, let’s start by considering what that matcher does and how it’s trained.

How do we train an AI matcher?

Let’s pretend for a moment that we’ve built the platonic ideal of an AI matcher. It takes 2 inputs, a list of candidates and a list of companies, and out comes a sorted list of company/role matches.
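The input/output contract of such a matcher can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the type aliases, the `rank_matches` helper, and the toy scores are all hypothetical, and the scoring function is passed in precisely because it is the part nobody knows how to build.

```python
from typing import Callable, List, Tuple

# Hypothetical type aliases; the post does not define a data model.
Candidate = str
Company = str
ScoreFn = Callable[[Candidate, Company], float]

def rank_matches(
    candidates: List[Candidate],
    companies: List[Company],
    score: ScoreFn,
) -> List[Tuple[Candidate, Company, float]]:
    """Score every candidate/company pair and sort best-first."""
    scored = [(score(c, co), c, co) for c in candidates for co in companies]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(c, co, s) for s, c, co in scored]

# Toy scores, invented purely to exercise the function.
toy_scores = {
    ("ana", "acme"): 0.9,
    ("ana", "zeta"): 0.2,
    ("bo", "acme"): 0.4,
    ("bo", "zeta"): 0.7,
}
ranked = rank_matches(
    ["ana", "bo"], ["acme", "zeta"], lambda c, co: toy_scores[(c, co)]
)
for candidate, company, s in ranked:
    print(candidate, company, s)
```

The ranking itself is trivial. Everything interesting lives inside `score`, and that is where the data problem shows up.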

To train this matcher, we need 3 distinct pieces of data:

  1. Publicly available job descriptions from a bunch of tech companies
  2. Publicly available data about engineers, i.e., LinkedIn profiles, GitHub profiles & contributions, and engineers’ social graph across a bunch of different platforms
  3. A list of successful company/candidate matches, taken from the public domain, e.g., from scraping LinkedIn to see where people worked/for how long, and cross-referencing that with (1) and (2).

You may notice that in all 3 data requirements above, I called out that they’re publicly available. That might seem odd at first. After all, if you’re starting a company that’s building this matcher, your secret sauce might be your proprietary data about candidates or companies or both, and you might first run a different business model, completely divorced from AI, to collect this data, at which point, boom, you flip a switch, and all of a sudden you’re an AI company.

That’s a good strategy, but it’s actually really hard to acquire detailed, proprietary data about candidates, companies, or how well people do at companies once they’re hired, let alone all 3 at once. Most startups that try to build an AI matcher don’t start with a bunch of proprietary data. Rather, they start with the public domain. The thesis of this piece is that getting the data is the hard part, not the AI, so to reason through it, let’s assume that we have the AI already but that the only data we have is publicly available.

So now that we have all this training data, let’s get to work. We’ll go through the set of successful matches and then find more data about those candidates and those companies and see which traits carry the most signal for a good match.

We now have a trained and working matcher. So far so good. But wait, not so fast!

What is hiring, really?

Let’s switch gears and forget about our matcher for a moment. Broadly speaking, regardless of how we get there, what has to be true for someone to be a good fit for a job? There are three things:

  1. Good: They’re good enough to do the job
  2. Looking: They’re open to taking a new job
  3. Interested: They’re interested in the job/company

Let’s succinctly call these “good, looking, and interested”. These three criteria are necessary to make a hire.

The first two items are largely independent of the company. The third is about candidate/company fit, and we’ll come back to it when we talk about matching. Before we do that, though, can we actually deduce which engineers are good and looking?

I would argue that we can’t. No matter how the matcher ended up getting trained or what patterns or artifacts it detected and assigned value to, the data to tell whether someone is a good engineer (even if the definition is elastic, depending on a given company’s “bar”) simply doesn’t exist in the public domain. **An AI is very good at finding patterns in existing data. It isn’t good at magicking data out of thin air. That means that before we even get to the question of matching, we’re dead in the water.** Let me try to convince you.


Generally, you have 3 pieces of public data available to you for a given engineer:

  1. Their public-facing LinkedIn
  2. Their public-facing GitHub
  3. Their public-facing social graph

Surely you can tell if someone is a good engineer from some combo of these 3?


Let’s look at each of the three data sources above, starting with LinkedIn. What data is available on engineers’ public-facing LinkedIns? Given that a LinkedIn profile is a glorified resume, it’s usually these 3 things:

  1. Where they’ve previously worked
  2. Where they went to school
  3. Any certifications/endorsements/skills that they have

I run a hiring marketplace, and having any kind of edge in predicting which of our users are good is material to our business, so we’ve spent a good amount of time and effort trying to tie these attributes to how good an engineer is. As it turns out, an engineer’s employment history carries some signal, school carries very little to none, and LinkedIn certifications and endorsements carry a negative signal.

Knowing which programming languages or frameworks an engineer knows can be useful, but most eng roles are either language/framework agnostic OR the language/framework isn’t the most important bit when determining whether an engineer is a good fit. Knowing the language is usually a nice-to-have, but it won’t get you hired if you’re not a good coder, first and foremost.

Often, in the absence of being able to search for whether an engineer is good, recruiters will search for programming languages as a proxy for fit, but that’s all it is: a weak proxy.

Basically, if you’re relying on a combo of school, work history, and specific skills, you’re doing exactly the same work that armies of recruiters have been doing for decades, with very limited success. It’s very easy to search LinkedIn for past employers, schools, and skills/endorsements. They’ve built a whole business on it. However, humans are terrible at predicting engineer quality from resumes: only about as good as a coin flip. And it’s not that humans are bad and an AI would do better. It’s that there’s minimal signal in a resume in the first place, and try hard as you might, extracting signal in this case is like squeezing water from a stone. Having a robot hand will not save you.


GitHub is an interesting one. Surely if you have access to a bunch of code someone has written, you can suss out if they’re a good engineer. The GitHub approach is also appealing because it’s much more meritocratic than a resume: your good code can stand on its own, regardless of who you are or where you come from. Wouldn’t it be great if GitHub could help you surface the odd diamond in the rough or an upstart with no job experience?

There is, of course, the ethical question of whether GitHub should be used for hiring in the first place and whether it’s fair to create a bias toward hobbyist engineers and/or open-source contributors. For the purposes of this piece, I’ll put aside that question and focus just on whether GitHub can be used for hiring.

The short answer is that it can’t because most engineers don’t have public commits. Senior engineers at large tech companies don’t work on open-source projects for the most part. This is why programmers on Reddit laugh at the idea of screening out candidates with unused GitHub accounts. Bjarne Stroustrup, the inventor of C++, would look unimpressive to an algorithm obsessed with GitHub activity.

The same message is clear from data on GitHub’s users. The site reported 100 million active developers in early 2023, but data on all commits in 2022 shows that only about 11 million users had any public code commits and only 8 million had more than two commits. That means that only 8-10% of engineers even have publicly available GitHub data that our AI could use, in the best case, and this counts millions of repositories with names like “hello world” and “test.”
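For readers who want to check the arithmetic, here is a minimal sketch using only the rounded figures quoted above (in millions). The variable names are illustrative, and because the inputs are rounded, the upper bound comes out at about 11%, close to the 8-10% range cited.

```python
# Rounded figures from the paragraph above, in millions of accounts.
active_developers = 100      # reported by GitHub in early 2023
any_public_commit = 11       # users with any public commit in 2022
more_than_two_commits = 8    # users with more than two public commits in 2022

def share(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

upper = share(any_public_commit, active_developers)
lower = share(more_than_two_commits, active_developers)
print(f"Accounts with usable public commits: {lower:.0f}% to {upper:.0f}%")
```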

Ben Frederickson did some of his own digging into the utility of GitHub in hiring and published a stellar, highly detailed report in 2018. According to Frederickson, only 1.4% of GitHub users pushed more than 100 times, and only 0.15% of GitHub users pushed more than 500 times. Frederickson’s findings from 2018 roughly corroborate ours from late 2022, and in both cases there’s a clear power law: most of the commits are being made by a tiny fraction of GitHub’s users.

A paper called “The Promises and Perils of Mining GitHub”, published in 2014, looked at GitHub activity as well, through the lens of projects rather than users. The authors found a similar power-law relationship: the most active 2.5% of projects account for the same number of commits as the remaining 97.5% of projects. Moreover, the paper found that about 37% of projects on GitHub were not being used for software development at all but for other purposes (e.g., storage).
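To make that kind of concentration concrete, here is a small sketch that computes what share of all commits comes from the most active slice of projects. The `top_share` helper and the toy commit counts are invented for illustration. They are not data from the paper, just numbers chosen so that 2.5% of projects hold half the commits.

```python
from typing import List

def top_share(commit_counts: List[int], top_fraction: float) -> float:
    """Fraction of all commits made in the most active `top_fraction` of projects."""
    ranked = sorted(commit_counts, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0

# Toy data: 40 projects, one very busy and 39 nearly idle.
toy_counts = [390] + [10] * 39
concentration = top_share(toy_counts, 0.025)
print(f"Top 2.5% of projects hold {concentration:.0%} of commits")
```

Run against real commit data, a value near 0.5 for the top 2.5% would match the distribution the paper describes.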

Given these limitations, the reality is that the portion of GitHub accounts that could actually be useful for hiring is likely below 1%.

Social Graph

Finally, we have the social graph. The theory here is that great engineers follow other great engineers on platforms like GitHub and Twitter, so if we can identify a set of great engineers somehow, and dig deeply enough through the tangled web of whom they follow and who follows them, we’ll be able to create a reliable talent map.

Let’s assume for a moment that we can seed this approach with enough great engineers (e.g., by scraping the GitHub profiles that do have enough code to extract signal). What happens next?

To get a feel for the “following” behavior among software engineers and whether they tend to follow the best people they work with, we surveyed our user base.

Based on almost 1000 responses from our engineers (average years of experience = 8, median = 7), the social graph approach doesn’t hold water. First, engineers that we surveyed only rarely followed their most impressive colleagues on GitHub or Twitter. The average engineer reported following just one of their top 5 coworkers on these sites. The social graph was just not that related to people’s perceptions of talent.

What about LinkedIn? **Although many of our users did indeed follow the best engineers they’ve ever worked with, they also followed everyone else: the majority of engineers we surveyed reported that their connections had no rhyme or reason, or that they just connected with anyone who tried to connect with them.**

These findings muddle the graph-based approaches. Twitter and GitHub follows are still too uncommon to be a reliable signal. And LinkedIn connections capture a haphazard mix of networking approaches, the most dominant of which is either random or just driven by workplace, which any recruiter would have known in the first place.


Let’s assume for a moment we can figure out who the good engineers are (or rate them on some kind of scale, at least). Next, we have to figure out which of these engineers are on the market right now.

This is arguably an easier problem to solve than whether an engineer is good because publicly available cues do exist. That said, a number of startups have tried to tackle this problem (most notably Entelo, with their Sonar product), and so far none have been particularly successful. The kinds of inputs that typically go into trying to figure out if an engineer is actively looking are split between candidate-level attributes (e.g., how recently they updated their LinkedIn) and company-level attributes (e.g., how long since the company’s last round of funding). Here’s what that list might look like:

Candidate-level attributes:

  • When did they last update their LinkedIn/have they recently started being more active on LinkedIn?
  • How long have they been in their current role? And at their current company?
  • Have they recently started being more active on social media (e.g., tweeting about engineering topics)?
  • Have they recently started a blog?
  • Have they recently started contributing to open source?

Company-level attributes:

  • How long has it been since the company last raised money (if not public)?
  • How has the stock price been doing (if public)?
  • Have other people been leaving the company, especially management?

Some of these attributes are easier to pull than others (e.g., to track LinkedIn updates, one has to be logged in and has to repeatedly cache candidate activity, which likely violates LinkedIn’s terms of service), but I imagine that there’s enough publicly accessible data to make some guesses about who’s moving. Of course, these guesses will be fairly primitive for candidates who don’t do stuff publicly and loudly. Going just off of stock price and/or fundraising history gives you a very crude first pass, but it’s not enough.
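To see how crude such a first pass is, consider a minimal scoring sketch over the attributes listed above. Everything here is an assumption made for illustration: the field names, the thresholds (2 years), and the weights are invented, not derived from any data or from any vendor’s product.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LookingSignals:
    # Candidate-level attributes (all hypothetical field names)
    updated_linkedin_recently: bool
    years_in_current_role: float
    newly_active_on_social: bool
    started_blog_recently: bool
    started_open_source_recently: bool
    # Company-level attributes
    years_since_last_raise: Optional[float]  # None if the company is public
    stock_down: Optional[bool]               # None if the company is private
    management_departures: bool

def looking_score(s: LookingSignals) -> int:
    """Sum made-up weights for each public cue that fires (0 to 10)."""
    score = 0
    if s.updated_linkedin_recently:
        score += 2
    if s.years_in_current_role >= 2:
        score += 1
    if s.newly_active_on_social:
        score += 1
    if s.started_blog_recently:
        score += 1
    if s.started_open_source_recently:
        score += 1
    if s.years_since_last_raise is not None and s.years_since_last_raise >= 2:
        score += 1
    if s.stock_down:
        score += 1
    if s.management_departures:
        score += 2
    return score

loud = LookingSignals(True, 3.0, True, True, True, 3.0, None, True)
quiet = LookingSignals(False, 0.5, False, False, False, 0.5, None, False)
print(looking_score(loud), looking_score(quiet))
```

The quiet candidate scores zero even if they happen to be interviewing at five companies this week, which is the weakness of relying on public cues.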

When asked about their favorite social media platforms, half of engineers in a Stack Overflow survey reported Reddit or YouTube (which are hard to imagine ever being leveraged for predictive algorithms) or not using social media at all.

As such, even if this data is easier to get than candidate quality data, figuring out who’s looking is still a data problem and not an AI tech problem.

Matching: Figuring out what engineers want and whether companies have it

Let’s say that we’re somehow able to surmise from public-facing candidate data whether an engineer is both “good” and “looking”. Now we have to figure out whether they’re actually going to be interested in a given company.

To do that, we need to have a list of company attributes that engineers might care about, and then we need to figure out 1) how each company stacks up against this list and 2) which attributes a given engineer cares about. I’ve been a recruiter for about a decade, and below is a decently representative list (in no particular order):

  • Compensation
  • Company mission/whether the company is mission-driven
  • Vertical
  • Size of total team and the eng team
  • Tech stack
  • Young vs. established
  • Prestige of the brand/social proof
  • What problems are being solved/what’s coming up on the roadmap
  • Chemistry with manager and with the immediate team
  • The company’s culture and values

Now that we have this list, how do we figure out how each company stacks up? Without a proprietary data set, the main resource you have is a company’s job descriptions. How much of this information can you pull from those descriptions?

Figuring out what companies have to offer

Compensation is increasingly easy to get, especially given recent legislation in some states that makes it mandatory for employers to include salary ranges in their job descriptions. Of course, these ranges tend to be wide, and many aren’t super useful, but it’s something. Sites that aggregate comp ranges from job descriptions in NY and California, where companies are legally required to disclose them, show just how wide these ranges can be.

Take a specific company and role: Dropbox’s open Mobile Software Engineer role in the US. The ranges are pretty wide (a 66K spread for SF, NYC, and Seattle, for instance). In my mind, like many of these ranges, all this tells you is “you’re going to get paid market for the location that you’re in”. Of course, if a company is paying below market, that’s something you need to know, but that’s the exception rather than the rule.

Some attributes, like vertical, tech stack, total size, and brand prestige, are more consistent than compensation and are possible to figure out from publicly available data.

Whether the company is mission-driven can sometimes be deduced from the vertical, the company’s B-corp status, and the company’s job descriptions. However, given every company’s penchant for sounding mission-driven, even when they do something as dryly mercenary as ad-serving infrastructure, this is a bit tricky. But I’m sure the tech is there to figure this one out.

Properties like eng team size, upcoming projects and roadmap, and the company’s culture/values are much harder. Sussing these out isn’t a technological problem but rather a data one, just like determining whether engineers are good or not. Don’t believe me? Read any job description. How do you begin to figure out from this wall of mush what it’s actually like to work at a company? Maybe some great job descriptions exist out there, and maybe you can get somewhere by scraping sites like Glassdoor, but given how low-signal both tend to be, AI will likely be just as hamstrung as the humans trying to parse these documents.

If you don’t buy that, try to look up the stated values of literally any company. You’ll find that every company seems to have pulled randomly from the same grab bag of 20 or so lofty-sounding values that tell you nothing about what actually happens at that company day to day. Instead, you find yourself smack in the middle of a vague virtue-signaling arms race.

So, some of these attributes you can figure out from the public domain, and some you can’t.

Now we have to tackle the second part of the matching problem: how do we figure out what each engineer on our list is looking for? For instance, which engineers in our database care about compensation, and what are their salary requirements? And what types of problems are they most interested in? And so on…

The reality is that there isn’t a good way to intuit most of these things from publicly available data. If you want to know what engineers care about, you have to ask them. And even when you ask them, they’ll very likely, albeit through no fault of their own, be lying.

Figuring out what engineers value and what they’re looking for

Because, to the best of my knowledge, there isn’t a dataset out there that compares and contrasts what engineers said they were looking for versus what jobs they ended up taking, I’ll fall back on an anecdote from earlier in my recruiting career. TL;DR: Dr. House was right; everybody lies.

A few years ago, I was interim head of talent at Udacity. Many of my candidates told me that it was their dream to work in ed-tech, but one in particular stood out. In addition to extolling his passion for education, he told me that one of his deal-breakers was working in advertising and that he’d never do it. He did great in our first-round technical screen, and I set him up with an onsite interview. Then, while he was in town, he interviewed at an advertising startup where one of his friends was working. That’s where he ended up going… because he really hit it off with the team. Though this particular example was the most stark (the one-letter difference between “ad-tech” and “ed-tech” belies the vast gulf between these verticals), situations like this, where a candidate claimed to strongly want one thing but then ended up choosing something completely different after meeting the team, are the rule rather than the exception.

The truth is, people will tell you all manner of lies about where they want to work, what they want to work on, and what’s important to them. But, of course, they’re not lying deliberately. It likely means you’re not asking the right questions, but sometimes knowing what to ask is really hard. For many of the questions you can think of, people will have all sorts of rehearsed answers about what they want to do, but those answers are framed for the specific audience and may not reflect reality at all. Or, a lot of the time, people simply don’t know what they want until they see it.

It’s hard enough sussing this stuff out when you’re talking to candidates 1:1. Imagine trying to gather this kind of nuanced information from what they say on social media or in their public blogs.

Perhaps the most important thing I’ve learned is that, at the end of the day, one of the biggest parts of a recruiter’s job is to get the right two people in a room together. Regardless of industry, or domain, or stack, or money (within reason, of course), chemistry is king. Get the right two people to have the right conversation, and everything else goes out the window. Everybody lies. It’s not malicious. It’s just that chemistry is the thing that matters most, and all the rest of the attributes above are poor proxies for the magic that sometimes happens when two good people have a good conversation.

How do you predict chemistry between people? Can an AI do it? Possibly, if that AI has access to a ton of data about candidates and companies, i.e., everything we’ve discussed in this post so far… AND past candidate/company interactions and their outcomes.

Even if you can’t get all the candidate and company data you’d need, you CAN get a history of candidate/company interactions and their outcomes from an Applicant Tracking System (ATS). But ATS data isn’t public. It’s the opposite: for ATSs, their data is their moat, which is what drives retention, and ATS switching costs are painful and often prohibitive.

In the absence of rich candidate and company data and the interactions between them, an AI predicting chemistry is impossible. Hell, humans can’t do it either.

But what if you do have the data?

If you have proprietary data, then you don’t need an AI. A simple non-AI program (e.g., a regression) or a human can do the job well enough. In fact, Arvind Narayanan from Princeton gave an excellent talk called “How to recognize AI snake oil”, whose crux is that, for complex questions where you need to predict social outcomes (e.g., recidivism, job performance), no matter how much data you have, “AI is not substantially better than manual scoring using just a few features”.
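To give a sense of how little machinery “a regression” needs, here is a minimal one-variable least-squares fit in plain Python. It is a sketch under stated assumptions: the `fit_line` helper is illustrative, and the interview scores and performance ratings are made-up numbers, not data from our platform.

```python
from typing import List, Tuple

def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up proprietary data: interview score (1-4) vs. later performance rating.
interview_scores = [1.0, 2.0, 3.0, 4.0]
performance = [2.0, 2.5, 3.5, 4.0]

slope, intercept = fit_line(interview_scores, performance)
predicted = slope * 3.0 + intercept  # expected rating for an interview score of 3
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

The model is a dozen lines. The hard part is owning the two columns of data it is fit on.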

Arguably, if you do have the data, you could still build out AI hiring to increase efficiency, but remember that it’s your ownership of proprietary data that made an AI approach viable.


So what does this all mean for the future of recruiting? As I said at the beginning, AI is really well-suited to automating a bunch of recruiting tasks that humans do now. For instance, an AI can take the pain out of stuff like this:

  • Interview scheduling
  • Composing first-draft sourcing emails (though you’d need a human to really make them sing)
  • Enriching candidate profiles with “first-order” data (years of experience, location, figuring out what programming languages they’ve used in some cases, etc.)
  • Tracking candidate progress through a funnel
  • Some amount of assessment
  • Creating beautiful dashboards that track key recruiting metrics (time to hire, cost per hire, etc.)

As AI gets more and more sophisticated, the list above will get longer and longer, and given that most recruiters aren’t particularly good at their jobs, over time, AI will take over more and more, and there will be progressively less for human recruiters to do.

However, the stuff above isn’t what makes recruiting hard; these are the trappings of recruiting, not the essence. The hard thing about recruiting is figuring out who’s good and who’s looking right now, and bulldozing the way for those candidates to have as many conversations with companies as they have appetite for, in order to see if they have chemistry with that company and potentially their future team.

Until we have access to all the data that reliably predicts whether someone is a good engineer, whether they’re looking right now, what a company offers, and whether they’re interested in that thing, having an AI will not be enough. At the end of the day, you can’t use AI for hiring if you don’t have the data. And if you have the data, then you don’t strictly need AI.

Huge thanks to Maxim Massenkoff, who did the data analysis for this post. Also thank you to everyone who read drafts of this behemoth and gave their feedback.

