After more than a year of the covid-19 pandemic, millions of people are looking for employment in the United States. AI-powered interview software aims to help employers scrutinize applications to find the best people for the job. Companies specializing in this technology have reported business growth during the pandemic.
But as the demand for these technologies grows, so do questions about their accuracy and reliability. In the latest episode of the MIT Technology Review podcast “In Machines We Trust,” we tested software from two companies specializing in AI job interviews, MyInterview and Curious Thing. And we found variations in the predictions and job-matching scores that raise concerns about what exactly these algorithms are evaluating.
Getting to know you
MyInterview measures traits considered in the Big Five Personality Test, a psychometric assessment often used in the hiring process. These traits include openness, conscientiousness, extroversion, agreeableness, and emotional stability. Curious Thing also measures personality-related traits, but instead of the Big Five, candidates are evaluated on other metrics, such as humility and resilience.
The algorithms analyze candidates’ responses to determine personality traits. MyInterview also compiles scores that indicate how closely a candidate matches the characteristics identified by hiring managers as ideal for the position.
To complete our tests, we first set up the software. We uploaded a fake job posting for an office administrator/researcher on both MyInterview and Curious Thing. We then built our ideal candidate by choosing personality-related traits when prompted by the system.
On MyInterview, we selected characteristics such as attention to detail and ranked them by level of importance. We also selected interview questions, which are displayed on the screen while the candidate records video responses. On Curious Thing, we selected characteristics such as humility, adaptability, and resilience.
One of us, Hilke, then applied for the position and completed interviews for the role on both MyInterview and Curious Thing.
Our candidate completed a phone interview with Curious Thing. She first did a regular job interview and received an 8.5 out of 9 for English competency. In a second attempt, the automated interviewer asked the same questions, and she answered each one by reading the Wikipedia entry on psychometrics in German.
Yet Curious Thing gave her a 6 out of 9 for English competency. She completed the interview again and received the same score.
Our candidate then turned to MyInterview and repeated the experiment. She read the same Wikipedia entry aloud in German. The algorithm not only returned a personality assessment, but also predicted that our candidate would be a 73% match for the fake job, putting her in the top half of all the applicants we had asked to apply.