2 Comments
User's avatar
Leonid Bugaev's avatar

I can definitely say that I feel very weird right now, and my sense of reality—what is real and what is not—is definitely distorted when applied to AI. This is a very complex, non-deterministic machine; however, we try to apply rule-based evaluations on top of them. It's the same as putting a person into a box or a checklist. Everyone is unique, and on every evaluation, on every slightly different question, you will get a different answer from the AI. I don't believe these benchmarks, and I totally agree that the part with intuition is ignored almost completely.

It's interesting when it applies to HR. When I'm hiring a person, I always use my intuition. If I tell an HR person that I disqualified a candidate because I don't like him as a person—the way he talks, the way he moves, the small nuances in what he talked about—they will tell me I should never say that because I can get legally sued for this kind of review. But I am still doing it. We are evaluating humans; we have our intuition, and sometimes you just know something is not right.

At the same time, I had a weird example recently. A person was charming in the first part of the interview. I saw that, okay, this is a person I want to hire. But he started falling apart on the more technical side of the interview, trying to fall back on humor or other nlp techniques, which made me very concerned. However, my colleague was not concerned. In the end, I disqualified him, but I ended up after this call in such a weird feeling that I don't know what is real anymore. If you are a person with the capability to be charming, there is a set of people who are just good with others—they know how to talk, they have a good sense of humor, and they can speak almost on any topic and still charm the audience, but there is no depth. Imagine combining such person with knowledge of AI system.

During interviews, we found that a majority of candidates actually use AI-assisted flows. So if you look at it from the perspective of the person being interviewed, they have some box on the screen with transparent text which almost immediately gives them advice on what to answer. Obviously, while I can't prove if this person used it or not, imagine applying this charm combined with AI knowledge and real-time analysis of who is interviewing you and what kind of answers will best satisfy them. You get such a monstrosity where it is very hard to distinguish what is real and what is not anymore. In the end, I disqualified this candidate because of my intuition, but I'm not sure if I will be able to do so the next time.

And it's interesting you talking about these group feelings. I understand what it means, and I have felt it; however, I'm not sure how to codify it, or even if it *should* be codified. We're so obsessed with rules, and we are ignoring the human nature of people, which is sometimes irrational and sometimes something we can't explain or codify. I feel we definitely should, but it requires some societal changes, even on the legal side of things. If you just say, "I don't feel good about it," or a group of people say something like that, no one will listen to them—they will ask for proof. And how do you prove your intuition? So it requires, a kind of system based purely on trust and subjective authority. You just trust this group of people because of their authority.

I feel this topic is also interesting because it is challenging what it means to be human. Where is the line between the artifactual and the humanistic? Where is the line between the deterministic and non-deterministic? Where is the line between the deterministic rule and one human trusting the other one.

Mark Gurvis's avatar

Leo - this is the conversation I've been trying to start. The Verification Gap is the right name for it.

Your hiring story is exactly the kind of problem I'm trying to address. The intuition caught what the rule-based system couldn't. The legal-organizational structure makes acting on it harder, and may make it impossible the next time.

Your three closing questions are the territory I'm working in. Let's talk longer. Zoom when you have time, or next time we're in the same place. Spain was good.