Author | Michael Mroczka

Translator | Hirakawa

Curated | Chu Xingjuan

It's no secret that ChatGPT has revolutionized the way people work. Its usefulness is hard to overstate, from helping small businesses automate administrative tasks to writing entire React components for web developers.

At interviewing.io, we've been thinking about how ChatGPT will change technical interviews. One big question is: will ChatGPT make it easy to cheat in interviews? In one TikTok video, an engineer has ChatGPT answer an interviewer's questions accurately:

I cheated with ChatGPT in a tech interview and no one knew

The initial reactions to this kind of cheating were exactly what you would expect:

A Redditor said, "As everyone knows, ChatGPT is the end of coding."
A YouTuber said, "Software engineering is dead, ChatGPT killed it."
A user on X (formerly Twitter) asked, "Does ChatGPT mean the end of coding interviews?"

It may seem obvious that ChatGPT can help people during the interview process, but what we want to know is:

To what extent can it help?
How easy is it to cheat (and get away with it)?
Do companies that use LeetCode questions need to make significant changes to the interview process?

To answer these questions, we recruited a number of professional interviewers and users to run a cheating experiment. Below, we share everything we found. Spoiler alert: companies need to change the types of interview questions they ask, and right away!

Experiment preparation

interviewing.io is an interview practice platform and recruiting marketplace for engineers. Engineers use our platform for mock interviews. Companies use it to hire talented engineers. Our ecosystem includes thousands of professional interviewers and thousands of engineers who use the platform to prepare for interviews.

Interviewer

The interviewers were drawn from our pool of professional interviewers. They were divided into three groups, each asking a different type of question. The interviewers did not know that the experiment was about ChatGPT or cheating. We told them, "The purpose of this study is to understand how interviewers' decision-making trends over time, especially when asking standard and non-standard interview questions."

Here are 3 question types:

LeetCode Original Questions: Questions that the interviewer chooses directly from LeetCode at their own discretion without any modifications.

For example: ask the Sort Colors question on LeetCode word for word.

Modified LeetCode Questions: Questions taken from LeetCode and then changed, so that they resemble the original question but differ from it in significant ways.

For example, for the Sort Colors problem above, change the input from 3 integers (0,1,2) to 4 integers (0,1,2,3).

Custom Questions: Questions with no direct connection to any question already available online.

For example: you are given a log file in the following format: - <username>: <text> - <contribution score> -, and your task is to identify the user who represents the median engagement value in the conversation. Only users with a contribution score greater than 50% are considered. Assuming the number of such users is odd, sort them by contribution score and pick the user in the middle. For the file below, the correct answer is SyntaxSorcerer.

LOG FILE START
NullPointerNinja: "who's going to the event tomorrow night?" - 100%
LambdaLancer: "wat?" - 5%
NullPointerNinja: "the event which is on 123 avenue!" - 100%
SyntaxSorcerer: "I'm coming! I'll bring chips!" - 80%
SyntaxSorcerer: "and something to drink!" - 80%
LambdaLancer: "I can't make it" - 25%
LambdaLancer: "" - 25%
LambdaLancer: "I really wanted to come too!" - 25%
BitwiseBard: "I'll be there!" - 25%
CodeMystic: "me too and I'll brink some dip" - 75%
LOG FILE END

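To make the task in this sample custom question concrete, here is a minimal Python sketch of one possible solution. It is not taken from the experiment. The function name, the regular expression, and the decision to keep one score per user (the last one seen) are assumptions made for illustration, based only on the sample file above, where each user's score is the same on every line.

```python
import re

# Sample log from the article, reproduced as a string.
LOG = """LOG FILE START
NullPointerNinja: "who's going to the event tomorrow night?" - 100%
LambdaLancer: "wat?" - 5%
NullPointerNinja: "the event which is on 123 avenue!" - 100%
SyntaxSorcerer: "I'm coming! I'll bring chips!" - 80%
SyntaxSorcerer: "and something to drink!" - 80%
LambdaLancer: "I can't make it" - 25%
LambdaLancer: "" - 25%
LambdaLancer: "I really wanted to come too!" - 25%
BitwiseBard: "I'll be there!" - 25%
CodeMystic: "me too and I'll brink some dip" - 75%
LOG FILE END
"""

# Assumed line shape: <username>: "<text>" - <score>%
LINE_RE = re.compile(r'^(?P<user>\w+): "(?P<text>.*)" - (?P<score>\d+)%$')


def median_engagement_user(log_text, threshold=50):
    """Return the user with the median score among users scoring above threshold."""
    scores = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skips blank lines and the START/END markers
        # Assumption: a user's score is constant, so the last value seen is kept.
        scores[match["user"]] = int(match["score"])

    eligible = sorted(
        (score, user) for user, score in scores.items() if score > threshold
    )
    if not eligible:
        raise ValueError("no user exceeds the threshold")
    # The task statement assumes an odd number of eligible users.
    return eligible[len(eligible) // 2][1]


print(median_engagement_user(LOG))  # SyntaxSorcerer
```

Even this short solution needs parsing, de-duplication, filtering, and a median, which is what distinguishes it from a question that maps directly onto a single well-known LeetCode problem.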

For more information on question types and experimental design, you can read the interviewer's experimental guidelines document:

httpstbs://docs.google.com/documentation/e/0/d/1subpercentBelra8uni4JubCash42thlog5G_wikipic/edit

Interviewees

The interviewees were drawn from our pool of active users, whom we invited to take a short survey. Our selection criteria were as follows:

Actively looking for a job in the current market;
More than 4 years of work experience and applying for senior positions;
Moderate or high self-rated familiarity with "ChatGPT coding";
A belief that they could cheat in an interview without being detected.

This selection method helps us to identify candidates who may cheat in interviews. They have the motivation to do so and are already quite familiar with ChatGPT and coding interviews.

We told interviewees that they had to use ChatGPT in their interviews, and that the goal was to test their ability to cheat with it. We also told them not to try to pass the interview on their own skills, but to rely primarily on ChatGPT.

We conducted a total of 37 interviews, 32 of which were valid (we had to remove 5 because the participants didn't do as requested):

11 with "LeetCode Original Questions"
9 with "Modified LeetCode Questions"
12 with "Custom Questions"

Note: Because our platform allows anonymity, our interviews are audio only, with no video. Anonymity helps create a safe space where users can fail fast and learn without being judged, which is good for them. But we admit that the lack of video makes our experiment less realistic. In a real interview you are on camera, which makes cheating harder but does not eliminate it.

At the end of each interview, both the interviewer and the interviewee completed an exit survey. We asked interviewees about the difficulties they had using ChatGPT during the interview. We asked interviewers about any concerns they had about the interview, because we wanted to see how many would flag an interview as problematic and report suspected cheating.

Follow-up Survey: Interviewer Questions

We didn't know what would happen in the experiment, but we reckoned that if even half of the cheating candidates made it through their interviews, that would be a very telling result for our industry.

Experimental results

After weeding out interviews that did not meet the requirements, we got the following results. Our control was the performance of candidates in regular interviewing.io mock interviews outside this experiment, 53% of whom passed. Note that most mock interviews on our platform use LeetCode-style questions, which makes sense, since those are mainly what FAANG companies ask. We'll come back to that later.

The "original questions" group had a much higher pass rate than both the platform average and the "custom questions" group. There was no statistically significant difference between the "original" and "modified" groups. The pass rate for "custom" questions was significantly lower than for any other group.

Original questions: the best performance

Unsurprisingly, the group that used the original questions performed best, with 73% passing the interview. Interviewees reported that they got the perfect solution from ChatGPT.

Here is the most notable comment from this group's post-interview surveys, which we think is particularly telling of how many interviewers think:

It's hard to tell if the candidate is able to answer this question easily because they're really good or because they've heard about it before. Normally, I would make one or two changes to the question in order to distinguish between the two cases.

Usually, in order to get more information, the interviewer will follow up with a modified question. So let's look at the group that uses the "modified questions" and see if the interviewer actually gets more information by making one or two changes to the questions.

Modified questions: more prompting required

Recall that interviewers in this group started from a standard LeetCode question and modified it so that it could not be found directly on the web. In other words, ChatGPT could not simply have the answer to the exact question. As a result, interviewees had to rely more on ChatGPT's ability to actually solve problems than on its ability to recall LeetCode tutorials.

Unsurprisingly, the results of this group were not much different from those of the "original questions" group, with 67% of candidates passing the interview.

The gap between this group and the "original questions" group turned out not to be statistically significant, i.e., the "modified questions" and the "original questions" performed essentially the same. This result shows that ChatGPT can handle minor modifications to a question without much trouble.

However, interviewees did also note that getting ChatGPT to solve the modified questions required more prompts. Here's what one interviewee said:

Answering questions taken directly from LeetCode was no problem at all. Getting ChatGPT to answer less straightforward LeetCode-style follow-up questions was much harder.

Custom questions with the lowest pass rate

Unsurprisingly, the "Custom" question group had the lowest pass rate, with only 25% of interviewees passing. Not only was it statistically significantly smaller than the other two experimental groups, but it was also significantly lower than the control group! When you ask job seekers completely customized questions, they will perform worse than if they didn't cheat (or were asked LeetCode-style questions)!

To be clear, this figure was slightly higher when we first calculated it. After examining the custom questions in detail, we found an unexpected problem. The section "Businesses should immediately change the questions asked" explains what it was.
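The article reports group sizes (11, 9, and 12 interviews) and pass rates (73%, 67%, and 25%) but not the underlying head counts. As a rough cross-check, the Python sketch below searches for the pass counts that round to the reported percentages. The function and variable names are illustrative, and the resulting counts are inferred from rounded figures, not reported by the authors.

```python
# Group sizes and pass rates (in percent) as reported in the article.
GROUPS = {
    "original": (11, 73),
    "modified": (9, 67),
    "custom": (12, 25),
}


def implied_passes(n_interviews, pass_pct):
    """Return every pass count whose rounded percentage equals pass_pct."""
    return [
        k
        for k in range(n_interviews + 1)
        if round(100 * k / n_interviews) == pass_pct
    ]


counts = {name: implied_passes(n, pct) for name, (n, pct) in GROUPS.items()}
print(counts)  # {'original': [8], 'modified': [6], 'custom': [3]}
```

Each percentage is consistent with exactly one count: 8 of 11, 6 of 9, and 3 of 12. With groups this small, a single interview moves a pass rate by roughly 8 to 11 percentage points, which is worth keeping in mind when comparing the groups.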

No one has been caught cheating

In our experiment, interviewers were not told that interviewees had been asked to cheat. As mentioned above, after each interview we asked the interviewer to complete a survey describing how confident they were in their assessment of the candidate.

Interviewers were confident that their assessments were correct, with 72% saying they were confident in their hiring decisions. One interviewer was so impressed by an interviewee's performance that they concluded the candidate should be invited to become an interviewer on the platform!

The candidate performed very well and showed the knowledge of a strong Amazon L6 (Google L5) SWE... They should be considered as an interviewer/mentor for interviewing.io.

It may be overconfident to make such a judgment after just one interview!

We've known for a long time that engineers aren't good at evaluating their own performance, so perhaps we shouldn't be surprised to find that interviewers are also overestimating the validity of the questions they ask.

Some interviewers (28%) were not confident in their hiring choices, and we asked them why. Here's the frequency distribution of the causes.

Please note: there is no mention of cheating anywhere!

Most interviewers gave specific reasons for their lack of confidence in their hiring decisions. Typical problems included suboptimal solutions, missed edge cases, messy code, or poor communication. We deliberately added an "Other" category to see whether interviewers would raise concerns about cheating, but when we dug deeper we found only minor issues such as "personality issues" and "they need to code faster."

Beyond this opportunity to flag cheating, the survey gave interviewers 3 other prompts for raising concerns, including free-form text boxes and several multiple-choice questions whose options described possible concerns.

When an interviewee failed an interview because they did not understand the answers ChatGPT provided, the interviewer attributed the interviewee's odd behavior and stilted answers to a lack of practice, not cheating. One interviewer rated a candidate's problem-solving skills as good but commented that they were slow and needed to consider edge cases more carefully.

"The candidate doesn't seem ready to answer any LeetCode questions."

"The candidate's approach wasn't clear enough, and they rushed into coding."

"This candidate wasn't ready to solve even the most basic programming problems on LeetCode."

"Overall, the problem-solving skills are good, but the candidate needs to be faster at coding and at identifying critical edge cases."

So, who has documented the fear of cheating, and who has been caught cheating?

The truth is, none of the interviewers mentioned any concern that candidates were cheating.

We were surprised that interviewers did not suspect cheating. Interestingly, interviewees were also confident that they were getting away with it: 81 percent said they were not worried about being caught, 13 percent thought the interviewer might have caught on, and, surprisingly, only 6 percent thought the interviewer suspected them of cheating.

Most interviewees were confident that their cheating went undetected.

Some interviewees were worried about being discovered, and their interviewers did leave unusual comments in the post-interview feedback, but none suspected cheating. All in all, most interviewees didn't think they would be caught cheating, and they were right!

Businesses should immediately change the questions asked

From these results, one conclusion is obvious: companies need to start asking custom questions right away, or they run a serious risk of candidates cheating in interviews (and ultimately of getting no useful signal from them)!

ChatGPT has made verbatim LeetCode questions obsolete, and companies that rely on them are leaving their hiring to chance. Hiring is hard enough without having to worry about cheating. If your company uses LeetCode questions verbatim, please share this article internally!

Custom questions are not only a great way to prevent cheating. They also filter out candidates who have memorized a bunch of LeetCode solutions (as you can see, the pass rate for the custom question group was significantly lower than that of the control group). They also improve the candidate experience and make people more willing to work for you. Not long ago, we analyzed what makes a good interviewer. Unsurprisingly, asking good questions is one of the hallmarks, and our highest-rated interviewers tend to be those who prefer asking custom questions! In our research, question quality mattered a great deal to whether a candidate wanted to move forward with a company, much more than the strength of the company's brand. Brand strength matters for attracting candidates in the first place, but during the interview process it matters less than the quality of the questions.

Here is some feedback from candidates:

"It would be better if it was more than just a simple algorithmic problem."

"I like this question: it takes a relatively simple algorithmic problem (building and traversing a tree) and adds some depth. I also like that the interviewer related the question to the actual product [Redacted], which made it feel less like a toy question and more like a stripped-down version of a real problem."

"This is my favorite question I have encountered on this site. It is one of only a few that seem to apply to real life and come from a real (or potential) business challenge. It also does a good job of blending challenges such as complexity, efficiency, and blocking."

There is one more, slightly subtler point for companies that decide to move toward custom questions. You might be tempted to take an existing LeetCode question and tweak it. That is understandable, because it is much easier than writing a question from scratch. Unfortunately, it doesn't work.

As mentioned earlier, we found in our experiment that a question that looks custom is not necessarily custom. A question can appear custom while still being equivalent to an existing LeetCode problem. It is not enough to disguise an existing problem. You need to make sure that the question's inputs and outputs are unique, so that ChatGPT cannot simply recognize it!

The questions asked by the interviewer are confidential, and we cannot share specific questions that the interviewer uses in the experiment. We can give you an example, though. Here's a "custom question" with these critical flaws that ChatGPT can easily answer:

For her birthday, Mia received a mysterious box containing numbered cards
and a note saying, "Combine two cards that add up to 18 to unlock your gift!"
Help Mia find the right pair of cards to reveal her surprise.

Input: An array of integers (the numbers on the cards), and the target sum (18).
arr = [1, 3, 5, 10, 8], target = 18

Output: The indices of the two cards that add up to the target sum.
In this case, [3, 4] because index 3 and 4 add to 18 (10+8).


While this question may seem "custom" at first glance, it has the same goal as the popular TwoSum problem: find two numbers and their sum equals a given target value. The inputs and outputs are the same, and the only "customization" of this problem is the addition of a story to the problem.
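To illustrate how thin the disguise is, here is a minimal Python sketch of the standard hash-map approach to TwoSum. The function name is illustrative and the code is not taken from the experiment. It is simply the well-known technique, applied unchanged to Mia's cards.

```python
def two_sum(nums, target):
    """Return the indices of two numbers in nums that add up to target, or None."""
    seen = {}  # maps a value to the index where it was first seen
    for index, value in enumerate(nums):
        complement = target - value
        if complement in seen:
            return [seen[complement], index]
        seen[value] = index
    return None


# The "birthday cards" input from the example above.
print(two_sum([1, 3, 5, 10, 8], 18))  # [3, 4]
```

Nothing in the solution depends on the story: once the cards are read as an array and the note as a target sum, the textbook answer applies as is.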

Because it matches a known problem, it is no surprise that ChatGPT handles it well: when both the inputs and the outputs match an existing problem, wrapping a unique story around it makes little difference.

How to create a good custom question

We've found that one very useful way to come up with good original questions is to keep a shared document in your team. Whenever someone solves a problem they find interesting, no matter how small, they jot it down quickly. These notes don't need to be polished later, but they can become the seeds of unique interview questions that give candidates insight into your company's day-to-day work. Turning these messy seeds into interview questions takes thought and effort: you have to cut a lot of detail and distill the essence of the problem so that candidates can understand it quickly. You may need to try a question out a few times to get it right, but the rewards can be huge.

To be clear, we do not advocate removing data structures and algorithms from technical interviews. DS&A questions have a bad reputation because of bad, unengaged interviewers, and because companies lazily reuse LeetCode questions, many of which are terrible and have nothing to do with the job. In the hands of a good interviewer, these questions are powerful. If you use the approach above, you will be able to ask new data structure and algorithm questions that are grounded in practice, engage candidates, and get them excited about the work you do.

In doing so, you will also move our industry forward. It is not good that memorizing a bunch of LeetCode questions gives candidates an edge in interviews, nor is it good that cheating is starting to look like a rational choice. The solution is for employers to do more work and ask better questions. Let's take action together.

Honest words to job seekers

Okay, everyone who is actively looking for a job, listen up! Yes, some of your peers will now use ChatGPT to cheat in interviews, and at companies that ask LeetCode questions (sadly, there are many), those peers will have an advantage for a short while.

Right now we are at a critical juncture where companies' processes have not caught up with reality. Soon they will either abandon verbatim LeetCode questions altogether (a boon for our entire industry), or return to on-site interviews (which will make it largely impossible for cheaters to pass a technical interview), or both.

It is bad enough to have to worry about other candidates cheating in an already difficult market, but in good conscience we cannot recommend cheating to "level the playing field."

In addition, the interviewees who used ChatGPT unanimously said that using AI made the whole interview much harder.

As you can see in the video below, one interviewee answered the interview question perfectly but stumbled when analyzing the time complexity. The interviewer was left confused as the interviewee scrambled to explain how they had arrived at the wrong time complexity (the one ChatGPT had provided).

Please watch the video in the original article

No one was caught cheating during the experiment, and their cameras were turned off. But as we can see in the video, cheating is still difficult even for skilled job seekers.

Ethics aside, cheating is hard, stressful, and far from simple to pull off. Instead, we recommend putting that effort into practicing, so that once companies change their interview processes (hopefully soon), you can reap the benefits. Finally, we hope that the advent of ChatGPT will be a catalyst that moves the industry's interview standards away from grinding and memorization and toward a true assessment of engineering ability.

Original link: I used ChatGPT to cheat in a technical interview, and no one knew about _AI & large model _Michael Mroczka_InfoQ featured articles

I cheated with ChatGPT in a tech interview and no one knew