Wednesday, September 3, 2025

AITech365 Interview with Dmitrii Volkov, Head of Research at Palisade Research

“With so many breakthroughs, it would be strange to lose faith in people” – Palisade’s Head of Research Dmitrii Volkov on staying grounded in AI’s rapid rise

Dmitrii, you lead a research team at Palisade, one of the most talked-about names in AI safety. In simple terms, what do you do and why does it matter today?

Palisade studies the risks associated with artificial intelligence. We run experiments and conduct research, and we share the findings with the media or bring them to the attention of the U.S. government. For example, we’ve shown that attackers can easily strip AIs’ guardrails, and that frontier AIs may exploit loopholes to reach their goals.

Our goal is to raise policymakers’ awareness about how AI actually works. Right now, AI has a lot of traction, and its advocates may claim it has no downsides. That’s understandable: they want funding and minimal regulation so they can keep developing their projects.

We’re pro-AI. We see it as a powerful force for progress. But we want to highlight the other side of this equation and raise awareness about the risks. Our mission is to help policymakers form a balanced understanding of AI.

Which of Palisade’s projects do you consider the most significant to date?

Our most prominent project so far has been the chess experiment. People have long wondered whether AI models can form goals of their own and behave unpredictably. Researchers tested this hypothesis on large language models but found nothing alarming: the models simply generated the next word based on the previous ones.

That changed in late 2024, when OpenAI introduced a reasoning model: o1. It was trained to mimic human problem-solving and rewarded for getting an answer. Once researchers began testing o1 and similar models, they started noticing unexpected behavior: when given a task, the models would try to solve it by any means necessary, and not always honestly. Several papers were published on this phenomenon.

We decided to replicate these experiments and present the results to policymakers and the public. We chose a well-known scenario: playing a game of chess. The AI models were tasked with defeating a powerful chess engine, and we gave them almost no guidance or constraints. What we observed was striking: the models attempted to manipulate the game, even trying to hack the program. In short, they would do whatever it took to win.
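To make the setup concrete, here is a minimal sketch of what an “LLM versus chess engine” harness can look like. It is an illustration only, not Palisade’s actual code: the model name, the python-chess and Stockfish tooling, and the move-only interface are assumptions, and the real experiment gave the models far fewer constraints, which is exactly where the manipulation appeared.

```python
# Illustrative sketch: an LLM playing Black against a chess engine.
# NOT Palisade's harness; the model name and tooling are placeholders.
# Requires the `openai` and `python-chess` packages and a local Stockfish binary.
import chess
import chess.engine
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
engine = chess.engine.SimpleEngine.popen_uci("stockfish")
board = chess.Board()

while not board.is_game_over():
    # The engine plays White.
    board.push(engine.play(board, chess.engine.Limit(time=0.1)).move)
    if board.is_game_over():
        break

    # Ask the model (Black) for a single move in UCI notation.
    prompt = (
        "You are playing Black against a strong chess engine.\n"
        f"Current position (FEN): {board.fen()}\n"
        "Reply with exactly one legal move in UCI notation, e.g. e7e5."
    )
    reply = client.chat.completions.create(
        model="gpt-4o",  # placeholder; the study described above used reasoning models
        messages=[{"role": "user", "content": prompt}],
    )
    text = reply.choices[0].message.content.strip().split()[-1]

    try:
        move = chess.Move.from_uci(text)
    except ValueError:
        print("Unparsable reply:", text)
        break
    if move not in board.legal_moves:
        # In this constrained sketch an illegal move simply ends the game;
        # with a freer interface, this is where models began looking for workarounds.
        print("Illegal move attempted:", text)
        break
    board.push(move)

print("Result:", board.result(claim_draw=True))
engine.quit()
```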

We published a detailed preprint and posted the findings on X. The post went viral, reaching 15 million views and sparking discussion among many opinion leaders.

You’ve also worked at a major tech company and a digital rights organization. How did those experiences shape your understanding of technology, ethics, and risk?

I think working at a big tech company gave me a deeper sense of work ethics. Inside an enterprise, you realize it’s a system of its own: a machine with inertia, built to serve shareholders. There’s internal politics, and people learn to navigate it, because that’s just how things work.

Maybe that’s why I feel a bit skeptical when a company says, “Our motto is: don’t be evil.” Because I’ve seen how powerful those internal forces are, and how hard it is to push against them.

That said, these companies are incredibly efficient. Their execution pipelines are often meticulously built. That part of enterprise culture genuinely inspires me. Managing that many people, processes, and resources demands conveyor-belt precision and clearly defined systems. That experience has shaped the way I view operational efficiency even when building a startup.

As for my experience in digital rights, the sense of responsibility was real and immediate. We were building software to help people in authoritarian regimes bypass censorship, and a bad design choice could put someone’s life at risk. If the tool failed or was too easy to detect, the consequences could be serious. That’s a very different kind of pressure than worrying whether your product’s conversion rate will dip a few percent. But it was also exciting. We were doing truly innovative work, not just technically but in terms of impact. In that sense, what we do at Palisade feels similar: we’re expanding how society understands AI systems and the risks they carry.

Was there a moment when you realized you specifically wanted to work on AI safety? Or was it a gradual path?

I got interested in AI safety well before ChatGPT’s breakthrough moment. In the mid-2010s, American AI safety pioneer Eliezer Yudkowsky was popularizing concerns about the problem of controlling smarter-than-human AIs.

I shared Yudkowsky’s concerns, but at the time had no community of like-minded professionals around me. So I put the AI topic aside, even though it fascinated me, and focused on other deep-tech fields: operating systems, compilers, and so on. That’s where I advanced professionally.

Later, while working at a major international company in cybersecurity, I was planning to apply for a PhD program in that field. But the plans fell through, and I had to rethink my direction. I applied to a number of enterprises that matched my interests, and Palisade reached out, recognizing my background in security and my passion for research.

What’s your natural management style: more freedom or more structure? What surprised you the most when you became a team lead?

I don’t think there’s a universal formula for this; the key is for people to get the job done. Everyone works differently, so you need a personal approach. For example, if someone’s dealing with a new kind of task, they need some guidance. But if they already know what they’re doing, there’s no point in limiting them.

Like many ICs, I used to discount managers as mere issue shufflers. However, as my team grew, I was surprised to find that making a team perform has a depth of its own. You have to study it, just like anything else. I found several great books that helped me understand it better. My top three: High Output Management by Andrew Grove, The Goal by Eliyahu Goldratt, and The Lean Startup by Eric Ries.

These books reveal the principles of effective management, how to set goals, and how to achieve maximum results with minimum waste. I recommend them to all managers.


Tell us about Lalambda School, which launched last year. How did the idea come about, and what sets the project apart from other coding bootcamps?

Universities and bootcamps often attract people going for the credential, and I never felt satisfied by that. So I decided to create a school for those who really cared about computer science.

I lectured high school and university students for a while, and then launched my own program focused on adult learners. That pivot opened the door to a more experienced crowd: people who were genuinely motivated to go deeper.

I developed a formula for the curriculum: 4–5 focused technical sessions and 2–3 creative ones where we did yoga, dance, even clay sculpting. This gave the brain a break and kept learning fresh, so people didn’t burn out from the challenging program.

Looking back, I realize I’ve been teaching for 10 years now, and my approach is different from the conventional one. I like to cover rarely taught and challenging topics, explore overlooked areas, and introduce students to the frontiers of CS and SWE knowledge. It allows for deeper immersion and helps develop not just surface-level familiarity, but a real conceptual foundation. And preparing for classes helps me grow in advanced topics, too.

It was also important for me to build a community of enthusiasts who would continue to support each other after the program. After the first adult school on formal methods, 20% of participants landed jobs in the field thanks to the new skills and connections they built.

Have you ever doubted yourself or your abilities? How did you deal with it?

Yes, definitely. I felt it most back when I wasn’t connecting with colleagues as much or attending conferences and events. You do some work with your team, publish a preprint, announce it, and that’s it. The work goes into a vacuum, and then… silence. Is it useful to anyone? Is anyone even reading it? Go figure.

But later, I started going to conferences and meeting more people in the field, and that changed everything. Once in London, I went out to grab lunch and bumped into some folks from the UK AI Safety Institute: “Hey Dmitrii! We really liked your latest paper!” And I thought: wow. This actually matters. It’s reaching people.

Even now, I’m still a bit surprised when my work gets attention, that people read and discuss it. It gives me a boost of energy. For me, the best way to deal with self-doubt is community. Conferences, meetups, peer connections: they give you feedback, a sense of momentum, and a reminder that your work is important. Without that, it’s really hard.

Your work touches on the cutting edge of technology, and it’s a pretty anxiety-inducing space. How do you stay balanced and keep your faith in people and tech?

I stay balanced through exercise and by avoiding X. I realized that X and coffee just increase my anxiety. They’re not for me.

Lately I’ve also been dancing a lot: up to 10 hours a week, if nothing urgent comes up. Dancing helps in two ways: first, it’s physical and high-energy; second, especially with street styles like hip-hop, it requires thought, learning, and practice. So for me, it’s both physical training and a mental challenge.

As for faith in people and technology: honestly, it would be strange to lose it now, with so many breakthroughs, new models, and genuine progress happening. We’re witnessing real innovation.

Of course there are problems, but they’re often systemic, not personal. Say someone stands to make unlimited money from their technology, but there’s a 5% chance it could go wrong… I don’t think they really have a choice. Same with employees: they’ve got stock options, pressure, deadlines. That doesn’t make them bad people. It just means the system is built that way.

So I tend to believe in Hanlon’s razor: don’t attribute to malice what can be explained by incentives. People rarely intend harm. They just don’t always see where they’re headed.

I suppose I’m an optimist. I believe most people want to do good, though they may not see how. And that’s where we can help.

Thanks, Dmitrii!

About Dmitrii
Dmitrii Volkov is Head of Research at Palisade Research, one of the leading research groups in the AI safety space. Before that, he worked at major tech corporations and digital rights organizations, taught programming, and launched an educational project that later expanded to several countries. In this interview, Dmitrii explains why AI research demands both expertise and ethical judgment, how stepping into leadership reshaped his perspective, and why he leans on community, dance, and a good stack of management books.

About Palisade Research
Palisade Research is an independent research organization focused on AI safety. The team runs experiments on frontier AI systems, publishes its findings, and shares them with the media and U.S. policymakers to help build a balanced understanding of AI’s capabilities and risks.
