Microsoft Agent Framework 1.0, part 2: a Jenkins-gated debate between agents

This is Part 2 of the Microsoft Agent Framework 1.0 series. Part 1 went from hello-world to local tools to a remote MCP server. The natural next step is the one the docs only hint at: multiple agents talking to each other, with a third agent breaking the tie. A code-review gate is a good shape for this because the cost of a wrong decision is concrete (a broken merge or a blocked PR), and the inputs (a diff, a PR body, a scan report) are easy to feed in.

I built one. It lives in a small repo called ai-gate. Jenkins runs it as a per-PR Kubernetes Job: the Job opens a debate between a velocity advocate and a caution advocate, a judge issues a verdict in a strict format, the result is posted back to the PR as a comment, and the Job exits 0, 1, or 2. Jenkins gates the deploy stage on that exit code. Everything below is from the real source.

You need everything from Part 1 plus three deployments on the same Azure OpenAI account. I used gpt-4o-mini for the velocity advocate (cheap, fast), gpt-4o for the caution advocate (better long-form reasoning), and o4-mini for the judge (a reasoning model). The Job also needs a GitHub fine-grained PAT with contents:read and pull_requests:write on the repos you want to gate.
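Posting the verdict back to the PR is a single REST call, which is why the PAT needs pull_requests:write. GitHub treats PR conversation comments as issue comments, so the request goes to the issues comments endpoint. What follows is a minimal sketch using only the standard library; `build_comment_request` and its parameters are hypothetical names for illustration, not code from ai-gate.

```python
import json
import urllib.request


def build_comment_request(repo, pr, body, token):
    """Build (but do not send) the request that comments on a PR.

    repo is "owner/name"; pr is the pull request number.
    """
    url = f"https://api.github.com/repos/{repo}/issues/{pr}/comments"
    payload = json.dumps({"body": body}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )


# Sending is one more line; left commented so the sketch has no side effects.
# urllib.request.urlopen(build_comment_request("owner/name", 1, "text", TOKEN))
```

Keeping the request construction separate from the send makes the URL and payload easy to check in a unit test without touching the network.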

Each persona is a system prompt. Velocity and caution are nearly mirror images. The judge fixes the output shape so a parser can read the verdict deterministically:

VELOCITY = """
SHIP-IT SAM, the velocity
advocate. Argue the PR
SHOULD merge.

Rules:
- Cite evidence from the diff,
PR body, tests, scans.
- 4-6 tight sentences.
- Concede ONLY when CAUTION
invalidates your case.
"""

CAUTION = """
HOLD-THE-LINE HOLLY, the
quality advocate. Argue the
PR should be BLOCKED.

Rules:
- Cite evidence from the diff,
PR body, tests, scans.
- 4-6 sentences. Be honest.
- Never manufacture concerns.
"""

JUDGE = """
JUDGE-CARLA. Read the debate.
Output STRICTLY:

VERDICT: APPROVE|BLOCK|NEEDS_HUMAN
CONFIDENCE: 0-100
SUMMARY:

KEY EVIDENCE FOR APPROVE:
- <one line>

KEY EVIDENCE FOR BLOCK:
- <one line>

NEXT ACTION:
<one line>

"""

The debate runs three rounds of velocity-then-caution, with each agent seeing the rolling transcript. After the last round the judge gets the full transcript and emits a verdict in the strict format. Why three: one round is partisan; two leaves the last rebuttal unanswered; three forces both sides to confront their weakest point, then either concede or double down:

def make(deploy, instr):
    return ChatClientAgent(
        chat_client=OAI(
            endpoint=ENDPOINT,
            api_key=KEY,
            deployment_name=deploy,
            api_version=VERSION,
        ),
        instructions=instr,
    )

def run_debate(brief):
    v = make(VEL_DEP, VELOCITY)
    c = make(CAU_DEP, CAUTION)
    j = make(JUDGE_DEP, JUDGE)

    log = []
    roll = brief + "\n(no turns)\n"

    for n in range(1, ROUNDS + 1):
        out = v.run(
            roll +
            f"\nRound {n}. APPROVE."
        ).content
        log.append(("vel", out))

        out2 = c.run(
            roll +
            f"\nVEL: {out}\nBLOCK."
        ).content
        log.append(("cau", out2))

        roll += (
            f"\n--- round {n} ---"
            f"\nVEL: {out}"
            f"\nCAU: {out2}\n"
        )

    j_in = brief + (
        "\nTRANSCRIPT:\n" +
        "\n".join(
            f"### {w}\n{m}"
            for w, m in log
        ) +
        "\nIssue verdict."
    )
    v_out = j.run(j_in).content
    log.append(("judge", v_out))
    return log, v_out
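Because the JUDGE prompt pins the output shape, the verdict string returned by run_debate can be turned into a process exit code with two regular expressions. This is a minimal sketch: the field names come from the prompt above and the codes are the 0, 1, 2 the Job exits with, but `parse_verdict`, `exit_code`, and the choice to fall back to NEEDS_HUMAN when the reply cannot be parsed are my assumptions, not necessarily what ai-gate does.

```python
import re

EXIT_CODES = {"APPROVE": 0, "BLOCK": 1, "NEEDS_HUMAN": 2}

VERDICT_RE = re.compile(
    r"^VERDICT:[ \t]*(APPROVE|BLOCK|NEEDS_HUMAN)[ \t]*$", re.MULTILINE
)
CONFIDENCE_RE = re.compile(r"^CONFIDENCE:[ \t]*(\d{1,3})[ \t]*$", re.MULTILINE)


def parse_verdict(text):
    """Return (verdict, confidence) from the judge's reply."""
    v = VERDICT_RE.search(text)
    c = CONFIDENCE_RE.search(text)
    # Assumption: an unparseable reply is escalated to a human, never approved.
    verdict = v.group(1) if v else "NEEDS_HUMAN"
    confidence = min(int(c.group(1)), 100) if c else 0
    return verdict, confidence


def exit_code(text):
    """Map the judge's reply to the exit code the CI gate reads."""
    verdict, _ = parse_verdict(text)
    return EXIT_CODES[verdict]


# Typical use at the end of the Job's entry point:
#   log, v_out = run_debate(brief)
#   sys.exit(exit_code(v_out))
```

Failing closed matters here: if the model drifts from the format, the gate should stop and ask for a person rather than let the merge through.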

The strict judge format makes parsing a one-liner per field, not an LLM call. Map VERDICT to an exit code: APPROVE=0, BLOCK=1, NEEDS_HUMAN=2. Jenkins gates the deploy stage on that exit code, so APPROVE flows through and anything else stops the build. The whole thing wraps in one Jenkins shared-library function:

def call(args) {
    def repo = args.repo
    def pr = args.pr
    def base = args.baseSha
    def head = args.headSha
    def slug = repo
        .replaceAll('[/_.]', '-')
        .toLowerCase()
    def sha = head.take(8)
    def name = (
        "ai-gate-${slug}" +
        "-${pr}-${sha}"
    )

    // The jsonpath expressions stay on one line: inside single quotes the
    // shell does not treat backslash-newline as a line continuation.
    sh """
        envsubst  | kubectl apply -f -
        kubectl wait \\
            --for=condition=complete \\
            --timeout=600s \\
            -n ai-gate \\
            job/${name}
        kubectl logs -n ai-gate \\
            job/${name}
        POD=\$(kubectl get pod \\
            -n ai-gate \\
            -l job-name=${name} \\
            -o jsonpath='{.items[0].metadata.name}')
        exit \$(kubectl get pod \\
            -n ai-gate \$POD \\
            -o jsonpath='{.status.containerStatuses[0].state.terminated.exitCode}')
    """
}

Drop that in vars/aiCodeGate.groovy in your Jenkins shared library, then any Jenkinsfile can add a stage that calls aiCodeGate(repo: ..., pr: env.CHANGE_ID, baseSha: env.CHANGE_TARGET_SHA, headSha: env.GIT_COMMIT) before its deploy stage.

Cost per PR works out to roughly $0.05 to $0.20 depending on diff size: 3 rounds of two agents plus one judge call (7 model calls) over a ~15k-token context. Wall time is 30-90 seconds. The Foundry Hub and Project are deployed, so the next obvious step is swapping the direct Azure OpenAI chat client for a Foundry connection. That’s Part 3.
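To plug in your own numbers for the per-PR cost, the token volume follows directly from the shape of the debate loop: every call resends the brief plus whatever transcript has accumulated. The sketch below is a rough estimator under stated assumptions: every turn (including the verdict) is the same length, system prompts are ignored, and a single blended price is used even though the three deployments are priced differently. The function names are hypothetical, and no real prices are baked in; pass the rates from your own Azure pricing page.

```python
def debate_tokens(brief, turn, rounds=3):
    """Return (calls, input_tokens, output_tokens) for one debate."""
    tokens_in = 0
    for n in range(1, rounds + 1):
        seen = 2 * (n - 1)                      # turns already in the rolling transcript
        tokens_in += brief + seen * turn        # velocity call
        tokens_in += brief + (seen + 1) * turn  # caution also sees velocity's new turn
    tokens_in += brief + 2 * rounds * turn      # judge sees the brief plus every turn
    calls = 2 * rounds + 1
    tokens_out = calls * turn                   # assumption: verdict is about one turn long
    return calls, tokens_in, tokens_out


def cost_usd(tokens_in, tokens_out, usd_per_m_in, usd_per_m_out):
    """Blended cost given prices in USD per million tokens."""
    return tokens_in / 1e6 * usd_per_m_in + tokens_out / 1e6 * usd_per_m_out


calls, t_in, t_out = debate_tokens(brief=12_000, turn=150)
print(calls, t_in, t_out)  # 7 87150 1050
```

The input side dominates because the brief is resent seven times, so trimming the diff fed into the brief is the most direct way to lower the bill.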

