Rootly Now Grades Hiring Take-Homes on AI Transcripts, Not Just Code

"Everyone can ship clean code with AI." That blunt assessment from Rootly CTO Quentin Rousseau is the reasoning behind a hiring change that more engineering teams will probably copy before the year is out.

Rootly, the incident response platform, has restructured its take-home coding assessment. Candidates now submit three things: a GitHub repo with working code, their full AI session transcripts, and a Loom video (under five minutes) walking through their design decisions. The instructions explicitly state that using AI is "expected and strongly encouraged."

The transcripts are the main event.

What the Transcripts Actually Reveal

Rousseau describes two candidates who submitted functionally identical code. Traditional review would have rated them equally. The transcripts told a different story.

One candidate asked targeted questions about Slack API edge cases and caught a mistake in the AI's webhook retry handling before it became a bug. The other pasted the entire spec into a single prompt, got code back, and kept resubmitting variations until something passed.

Same output. Completely different engineering capability. Without the transcripts, you'd never know.

Rootly's reviewers now focus on four signals in the transcripts:

  • Problem decomposition - Did the candidate break requirements into logical chunks, or dump everything at once?
  • Course correction - When the AI went off-track, how fast did they catch it?
  • Domain understanding - Do the prompts show the person actually understands what they're building?
  • Iteration quality - Are they refining based on understanding, or just reprompting randomly?

The Uncomfortable Implication for Developers

This approach assumes that the code itself is no longer the hard part. For a lot of tasks, that's already true. Tools like Cursor, Claude Code, and GitHub Copilot can produce working implementations from decent prompts. The skill gap has shifted from "can you write this function" to "do you know what function needs to be written, and can you tell when the AI got it wrong."

That's a meaningful distinction. A junior developer who deeply understands the problem domain and uses AI as a force multiplier might outperform a senior developer who treats it as a magic box. The transcript makes that visible in a way that code review alone cannot.

There's a practical concern here too. Requiring candidates to share their full AI chat history demands a level of process transparency that cuts both ways. It shows genuine thinking, but it also means candidates need to be comfortable with someone reading every false start and backtrack. Some strong engineers might find that invasive.

Still, compared to the alternative - pretending AI tools don't exist and hoping candidates didn't use them - this feels like a more honest approach. The industry has been stuck in an awkward limbo where take-homes implicitly ban AI assistance while knowing full well that everyone uses it. Rootly's model at least acknowledges reality and evaluates what actually matters: whether you can think through a problem, not whether you can type out the solution.