  • Coming back to endpoint management: from SMS 2.0 to Autopilot

    I have been doing endpoint management for a long time. I started on SMS 2.0 – Microsoft Systems Management Server – back when there were rumors the product was going to be cancelled and it was not clear it would survive. It did. It became SCCM, then Configuration Manager, and a good chunk of this blog from 2010 to 2015 is me working through that era: OSD task sequences, building WinPE boot images, driver packages, collections and advertisements, the trickle install, TSConfig.INI, and running a task sequence after OSD. Then I moved on to other things and have not touched endpoint deployment in a while.

    I wanted to catch up – to see how the new technology actually works now, not just read about it. So I rebuilt a Windows 11 bare-metal deployment from scratch in a dev environment, deliberately without a management server, to find out which parts have changed and which are exactly the same. This time I took it all the way: bare metal to a managed, Entra-joined machine with its apps installed.

    What I built. A Windows 11 machine that PXE-boots with no install media attached and provisions itself end to end. The chain: PXE and DHCP hand the machine a boot file over TFTP; that chainloads iPXE, which uses wimboot to pull WinPE over HTTP; WinPE partitions the disk, pulls the Windows image over HTTP, applies it with DISM, writes the boot files – and, before it reboots, computes the machine’s Autopilot hardware hash and registers it with the tenant itself. The machine then reboots into the out-of-box experience, joins Entra, enrolls in Intune, and installs its apps. No distribution point, no site server, no client agent, and no one collecting a hardware hash by hand.
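
    The iPXE-to-WinPE hop in that chain is small enough to sketch. The snippet below renders the kind of iPXE script that hands off to wimboot. The server URL and file paths are placeholders made up for illustration, not the ones from this build, and the layout follows the standard example in the wimboot documentation rather than anything specific to this setup.

```python
def render_ipxe_script(base_url: str) -> str:
    """Return an iPXE script that boots WinPE over HTTP via wimboot.

    base_url is a hypothetical HTTP root that serves wimboot, the BCD
    store, boot.sdi and boot.wim; adjust the paths to match your server.
    """
    base = base_url.rstrip("/")
    lines = [
        "#!ipxe",
        # wimboot is loaded as the "kernel"; iPXE fetches it over HTTP.
        f"kernel {base}/wimboot",
        # Each initrd line maps a served file to the name wimboot expects.
        f"initrd {base}/boot/bcd BCD",
        f"initrd {base}/boot/boot.sdi boot.sdi",
        f"initrd {base}/sources/boot.wim boot.wim",
        "boot",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder host name, for illustration only.
    print(render_ipxe_script("http://deploy.example.internal/winpe"))
```

    Everything after the final boot line happens inside WinPE, so the partitioning, DISM apply, and Autopilot registration steps live in the WinPE startup script rather than here.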

    The old-to-new mapping. This is the part I actually wanted to see, and most of the pieces I knew are still here – they just moved. The OSD task sequence became a WinPE script doing a DISM apply, or, for the cloud path, Autopilot; the task sequence engine is gone and what is left is the primitives it was wrapping. The boot image I used to build inside Configuration Manager is still WinPE, still a boot.wim, but now I serve it over HTTP with wimboot instead of staging it on a distribution point – the same idea as my old WinPE wim/ISO post, fifteen years on. Driver injection is still a real problem: I hit the modern version of my own drivers-missing post when WinPE could not see the disk because the virtual SCSI controller had no inbox driver. Domain join and client push became Entra join, Autopilot, and Intune. Group Policy became Intune configuration profiles. WSUS and the software update point became Windows Update for Business and update rings. Collections and advertisements became Intune groups and policies. Software distribution became Intune apps delivered at enrollment.

    Apps moved out of the image. This was the biggest practical change. In the SCCM days I either baked applications into the image – the thick image – or laid them down in the task sequence. Now the image is just the operating system. Intune installs the applications after the machine enrolls: I deployed Office and Visual Studio Code, and the Enrollment Status Page holds the desktop until the required apps finish, which is the closest modern thing to a task sequence installing software before handing the machine over. One snag worth recording: the tenant did not have the Microsoft Store integration turned on, so Visual Studio Code could not go in as a Store app. It had to be a Win32 app, which meant building the .intunewin package and its encrypted upload myself instead of using the packaging tool. The primitives are still here; you just assemble them yourself.

    What is the same, and what is not. The same: under everything, bare-metal first boot is still DHCP, TFTP, and WinPE. If you want a machine to image itself from power-on, you are still standing up the same plumbing I was standing up in 2010. That surprised me – I expected the cloud to have absorbed more of it. I even had to add PowerShell to the WinPE boot image to make the registration call, the same optional-component surgery I did on boot images years ago. What is not: the server in the middle is gone. SMS and then SCCM were one big box that did imaging, software, updates, inventory, and reporting. The modern version is disaggregated – PXE and WinPE for first boot, Autopilot for provisioning, Intune for management – and most of it is cloud-side. You assemble it from parts instead of installing one product, and you still need something to orchestrate the parts; mine is a CI pipeline. The other shift is identity: a device used to be identified after the fact by a client push, and now it is identified up front by a hardware hash you register before it ever boots. BIOS and MBR also gave way to UEFI, GPT, and a TPM, which Autopilot requires, and that bit me with a boot-loader detail that did not exist in the BIOS world.

    Where I landed. It works end to end – power-on to a managed, Entra-joined Windows 11 machine with Office and VS Code installed, no management server in the path. The only manual step left is a single sign-in at the out-of-box experience with a work account; that is user-driven Autopilot. Self-deploying mode removes even that, but it needs a real, attestable TPM, which my virtual machine does not have – on physical hardware it would. Coming back to this after a decade, the surprise was not how much changed but how much did not: it is still WinPE and a boot.wim and DISM at the bottom, with the one big server in the middle replaced by a handful of cloud services you wire together yourself.


  • Azure Build Update Part 2

    From June 3 through June 5, Microsoft pushed about 40 Azure updates. Microsoft Foundry IQ went GA, Microsoft Discovery went GA, HorizonDB and DocumentDB kept extending, Azure Monitor kept closing the OpenTelemetry gap, and a cluster of agent-side updates landed that line up directly with the Agent Framework posts from June 1 and June 2. The Agent Framework follow-on is the most useful framing for the drop, so it goes first.

    Follow-on to the Agent Framework posts

    If you followed the Agent Framework posts last week, six of these updates are direct follow-ons. Part 1 walked from a single ChatClientAgent to one wired up to a remote MCP server. Part 2 ran three of them in a debate and gated a Jenkins deploy on the verdict. This week Microsoft shipped first-class versions of most of the moving parts those posts hand-rolled.

    Agent-to-agent (A2A) for Prompt and Hosted agents in Foundry (preview)

    The Part 2 debate was hand-rolled: three ChatClientAgents sharing a transcript across three rounds, with a judge persona constrained by a strict-format prompt. A2A is Microsoft formalizing exactly that pattern as a first-class Foundry primitive for prompt agents and hosted agents. Worth watching whether the preview converges on the same multi-round, single-transcript shape the debate post used, or on a different topology.

    Rubric evaluator and Intelligent Trace Sampling evaluations (preview)

    The judge persona in Part 2 was a hand-built rubric (VERDICT: APPROVE / BLOCK / NEEDS_HUMAN plus a confidence score). The rubric evaluator is the productized version of that pattern. Intelligent trace sampling is the answer to the question that breaks most agent-eval pipelines: how do you score the runs without paying to evaluate every single one? It picks representative traces instead of evaluating wall-to-wall.

    Foundry for VS Code (GA, June Build 2026 refresh), code-first observability for Foundry Agents in VS Code (preview), and observability developer experience in Azure Developer CLI (preview)

    Both Agent Framework posts were code-first Python against an agent-framework==1.0 install, run from a terminal. This week Microsoft shipped the IDE and CLI dev loop for that exact workflow. The VS Code refresh is GA; the in-editor agent observability and the azd observability experience are both in preview. Pair them and the local agent dev loop in VS Code starts to look like the local API loop did five years ago.

    Tool search in Foundry toolboxes (preview)

    Part 1, example 3 connected to the Microsoft Learn remote MCP server as a tool source. Tool search is the catalog UX for that pattern once a Foundry toolbox holds more than one MCP server. Useful the moment a team has more than two or three.

    Microsoft Foundry IQ (GA) and two new knowledge sources (preview)

    Foundry IQ is the managed knowledge layer for grounding agents in enterprise data: connect SharePoint, OneLake, Azure Blob, and other sources, and Foundry IQ handles the retrieval pipeline that previously had to be rebuilt for each project. The MCP-server-as-tool-source pattern from Part 1 covers half of grounding; Foundry IQ packages the retrieval half behind one managed surface. The two new sources land alongside the GA: Azure SQL Database becomes a first-class knowledge source, and a Microsoft Fabric Ontology can be queried as a federated source. The Fabric Ontology one is the more interesting half. Agents query the semantic layer Fabric customers already curate, instead of a parallel definition built only for retrieval.

    User feedback logging in Microsoft Foundry (preview)

    Part 2 writes a PR comment and exits 0, 1, or 2. The Jenkins gate decides the next step; the eventual human verdict on whether the gate was right never makes it back into the agent. Feedback logging is the Foundry-native equivalent for capturing that verdict so the next evaluator run has a ground-truth signal to train against.

    The pattern across all six is the one the agent space has been on for a year. A community or hand-rolled approach gets validated, then absorbed into a managed surface.

    Microsoft Discovery (GA)

    Microsoft Discovery is generally available as an enterprise platform for building and governing agentic AI workflows for R&D organizations across scientific and engineering disciplines. This was previewed earlier in the year. The GA is the signal that the agentic-workflow surface inside Microsoft is no longer just the Foundry Agent Service.

    Foundry and AI Search platform updates

    Four more Foundry and AI Search items that are not Agent Framework follow-ons but are worth knowing.

    Private connectivity for AI Search and Foundry Knowledge Bases (GA)

    Ingestion, enrichment, retrieval, and agent traffic between AI Search and Foundry Knowledge Bases can now stay on Shared Private Link or Network Security Perimeter end-to-end. Together with the Purview integration that went GA in the June 2 drop, the retrieval layer is closing the same governance and networking gaps the data-plane services closed years ago.

    APIM support for Foundry Models in Azure AI Search (preview)

    Azure API Management can now front all Foundry model integrations used by Azure AI Search. The reason this matters for platform teams: it puts a single throttling, key-vault, and observability surface between AI Search and the underlying model deployments, instead of each search workload calling models directly.

    Content Understanding chunking and image verbalization in AI Search (preview)

    The Content Understanding pipeline can now chunk and verbalize images as part of AI Search indexing. The output is searchable text derived from images, which lets a single retrieval query span text and visual content.

    Domain filter in the Foundry model catalog (preview)

    The Foundry model catalog adds a domain filter that narrows the 1,900-plus models to the ones trained for a specific industry or use case, with filters for domains like robotics and biomedical sciences. A small UX change, but the catalog crossed the size where browse-by-name stopped scaling a while ago. This is the model-catalog equivalent of the tool search update above – the same problem (too many things in the catalog) solved one layer down.

    Databases

    HorizonDB AI pipelines (preview)

    HorizonDB, the Postgres-compatible database introduced in the June 2 drop, now lets you describe an AI ingestion workflow (chunking, embedding, extraction, generation, ranking) declaratively in SQL and run it as a fault-tolerant pipeline inside the engine. Same play as the rest of HorizonDB: keep the RAG pipeline on the operational database instead of stitching together a separate service per stage.

    DocumentDB advanced full-text search (preview)

    Advanced full-text search lands in DocumentDB, alongside the instant free-tier clusters that shipped on June 2. HorizonDB and DocumentDB are clearly the two databases Microsoft wants to push for new workloads.

    Postgres Flexible Server DuckDB extension (GA)

    The DuckDB extension is now GA in Azure Database for PostgreSQL Flexible Server. DuckDB-in-Postgres turns the operational instance into a competent analytics endpoint for parquet and CSV in blob storage without moving the data. For the small-to-medium analytics that do not justify a Fabric or Synapse footprint, this is the simplest viable answer.

    Azure Monitor

    Three Monitor updates landed together and all went GA.

    OTLP ingestion is GA: send OpenTelemetry Protocol signals straight from instrumented applications and platforms to Azure Monitor with no Application Insights SDK in between. Dynamic thresholds for log search alerts went GA, so the platform calculates the threshold instead of asking the operator to guess. And Azure Monitor Service Level Indicators reached GA. Combined with the simple log alerts and OpenTelemetry metrics that shipped on June 2, Azure Monitor is methodically closing the gap with the open observability stack.

    A day later, on June 5, Metrics Usage Insights added an Ingestion Volume Change dashboard in preview, for comparing ingestion volume over time and spotting spikes or drops in time-series counts and event ingestion rates. It is a cost-and-noise control surface more than an observability one – the dashboard you open when the Monitor bill jumps and you need to know which stream moved.

    Confidential computing

    Three updates on the confidential side. Confidential Clean Rooms gets a preview of multiparty analytics, a managed service for partners to jointly analyze privacy-sensitive datasets with Apache Spark without exposing the underlying data. Confidential live migration for Intel TDX VMs is in development, which is the last big operational gap separating confidential VMs from regular ones. And Azure Confidential Ledger gains a GA backup tool for audit retention of ledger files.

    GitHub Copilot modernization agent (GA)

    The GitHub Copilot modernization agent is GA. It coordinates application assessments and upgrades across a whole portfolio, not just a single repo. For the Java-on-old-Spring or .NET-Framework migration backlog most enterprises still carry, this is the first serious estate-wide automation Microsoft has shipped.

    Migration

    Azure Files assessments worldwide in Azure Migrate (GA)

    Azure Migrate now discovers and assesses SMB and NFS file shares hosted on Windows and Linux servers, worldwide. File shares were the awkward gap in most migrate-to-Azure-Files plans: you could assess the servers but had to size and plan the share targets by hand. This closes that gap and gives a data-driven view of the file-share estate. Read it alongside the Copilot modernization agent above – one is the estate-level story for application code, this is the estate-level story for file data.

    App Service Flex Consumption: rolling updates (GA)

    Rolling updates are GA in the Flex Consumption plan. Instead of restarting all instances during a deploy, the platform rolls them. Zero-downtime deploys on Flex no longer need a slot-swap or external front door.

    Compute

    Lasv5 and Laosv5 storage-optimized VMs (preview)

    Storage-optimized VM series based on the 5th-generation AMD EPYC (Turin). Lasv5 targets high disk capacity, throughput, and I/O. Laosv5 targets the same shape with a different storage profile.

    Azure Infrastructure Resiliency Manager (preview)

    A new preview service for orchestrating resiliency testing and recovery across an Azure estate. It sits in the same conversation as Chaos Studio but framed for the resiliency-program owner rather than the SRE writing fault-injection experiments.

    Guest RDMA on Azure Boost (private preview)

    Landing on June 5, Guest RDMA is in private preview on Azure Boost, starting in UK South. It brings high-throughput, ultra-low-latency RDMA networking directly into guest VMs within a region. With the work offloaded to Azure Boost, networking that used to require specialized HPC SKUs now shows up as a general guest-VM capability, which matters for tightly coupled HPC and AI-training traffic that is sensitive to latency between nodes.

    Speech, voice, and language

    MAI-Voice-2 is in preview in Foundry. Custom Voice portal and self-serve custom photo avatar creation both went GA. Voice Live API also picks up avatar voice sync with custom voices in preview, pairing a branded or persona-specific text-to-speech voice with a real-time avatar – the piece that ties the custom-voice and custom-avatar tracks together. On the language side, Text Analytics for Health NextGen Playground is GA, and the Conversational and Text PII NextGen playgrounds shipped updates. The pattern: the NextGen playgrounds are now the default front door for the language services.

    Region

    Azure Red Hat OpenShift is GA in Belgium Central.


  • The June 2, 2026 Azure Update Drop: What Shipped and Why It Matters

    On June 2, Microsoft pushed out over a hundred Azure updates in one coordinated drop. Two areas dominated: AI agents and databases. Here are the topics worth knowing and why each one is interesting.

    Azure HorizonDB

    HorizonDB is a Postgres-compatible database, in preview since Ignite. Microsoft already has two managed Postgres options, so the reason for a third matters: it uses shared storage with scale-out compute – the same architecture as AWS Aurora and Google AlloyDB – so compute and storage scale separately and there’s no sharding to manage. Microsoft’s existing distributed Postgres (the Citus engine) shards data across nodes, which is harder to run. HorizonDB also has vector search built into the engine using Microsoft’s DiskANN index, so RAG retrieval can run inside the operational database instead of a separate search service. It’s Microsoft’s answer to Aurora and AlloyDB, and a play for the Postgres migrations coming off Oracle.

    Cosmos DB and DocumentDB

    Cosmos DB: per-partition automatic failover, changing partition keys on the NoSQL API, and an all-versions-and-deletes change feed mode all went GA. In preview: distributed transactions, native Azure Backup, a cost estimator, and a relational-to-NoSQL migration assistant. DocumentDB, the Mongo-compatible engine, added instant free-tier clusters and service-managed failover. Changing partition keys and distributed transactions are the notable ones – both were long-standing gaps in Cosmos.

    MCP across the platform

    MCP (Model Context Protocol) showed up as a built-in surface in several services the same day: an MCP Toolkit for Cosmos DB (GA) and DocumentDB (preview), a data-plane MCP server in API Center, MCP servers as knowledge sources in Foundry IQ, and a workspace-wide MCP for Databricks Genie in Copilot Studio. The interesting part is the timing: MCP went from a community pattern to something Azure services ship natively, across several products in one release wave.

    Foundry agent features

    Several agent capabilities went GA: Hosted Agents in the Foundry Agent Service (managed agent hosting) and Voice Live integration. Two new models landed in the catalog, MAI-Image-2.5 and MAI-Transcribe-1.5. Logic Apps can now call Foundry Agents directly, and a new Logic Apps automation SKU targets agent workflows, which puts agents in the low-code integration layer. On governance, Azure Policy now covers the Model Router and Global PTU reservations became region-agnostic, so reserved capacity isn’t tied to one region.

    Azure AI Search

    OneLake catalog integration and a GenAI prompt skill went GA. In preview, Purview sensitivity labels and access auditing now apply to knowledge sources. The Purview integration is the interesting part: it adds data-governance controls to the retrieval layer that RAG apps depend on.

    Speech and Translator

    GA: the LLM Speech API, Speech SDK 1.50, a Unified Text Translation API, and image, PDF, and Office document translation in AI Translator. HD voice HDv2.5 is in preview. The Unified Text Translation API consolidates the translation features under one API.

    Compute and VMs

    New Arm-based Cobalt 200 VM series (Dpsv7, Dplsv7, Epsv7, Mpsv4, Lpsv5) are in development, Azure Linux 4.0 is in preview, and VMSS Flex got automatic OS image upgrades. Cobalt 200 is Microsoft’s own Arm silicon, following AWS Graviton – more of the fleet moving to in-house chips.

    Containers and AKS

    Azure Container Linux is GA on AKS, AKS Fleet Manager now manages Arc-enabled clusters, and App Gateway for Containers added Istio service-mesh integration. The Fleet Manager change extends multi-cluster management to clusters running outside Azure.

    API Management

    Wildcard custom hostnames went GA on Premium v2 and Standard v2. A Unified Model API for multi-model AI apps is in preview, which lets one API front multiple AI models – relevant if you route across models.

    Monitoring

    Simple log alerts and dynamic thresholds for log search alerts went GA, along with OpenTelemetry metrics and visualizations in Azure Monitor for VMs and Arc servers. Simple log alerts are a lower-friction path than the existing log-search alert rules.

    Storage and Files

    A file-share-centric management model (Microsoft.FileShares) for Azure Files went GA. Secure access to Azure Files on macOS with Entra ID is in preview – native Entra-authenticated SMB from a Mac, which previously needed workarounds.


  • Microsoft Agent Framework 1.0, part 2: a Jenkins-gated debate between agents

    This is Part 2 of the Microsoft Agent Framework 1.0 series. Part 1 went from hello-world to local tools to a remote MCP server. The natural next step is the one the docs only hint at: multiple agents talking to each other, with a third agent breaking the tie. A code-review gate is a good shape for this because the cost of a wrong decision is concrete (a broken merge or a blocked PR), and the inputs (a diff, a PR body, a scan report) are easy to feed in.

    I built one. It lives in a small repo called ai-gate. Jenkins runs it as a per-PR Kubernetes Job: the Job opens a debate between a velocity advocate and a caution advocate, a judge issues a verdict in a strict format, the result is posted back to the PR as a comment, and the Job exits 0, 1, or 2. Jenkins gates the deploy stage on that exit code. Everything below is from the real source.

    You need everything from Part 1 plus three deployments on the same Azure OpenAI account. I used gpt-4o-mini for the velocity advocate (cheap, fast), gpt-4o for the caution advocate (better long-form reasoning), and o4-mini for the judge (a reasoning model). The Job also needs a GitHub fine-grained PAT with contents:read and pull_requests:write on the repos you want to gate.

    Each persona is a system prompt. Velocity and caution are nearly mirror images. The judge fixes the output shape so a parser can read the verdict deterministically:

    VELOCITY = """
    SHIP-IT SAM, the velocity
    advocate. Argue the PR
    SHOULD merge.
    
    Rules:
    - Cite evidence from the diff,
    PR body, tests, scans.
    - 4-6 tight sentences.
    - Concede ONLY when CAUTION
    invalidates your case.
    """
    
    CAUTION = """
    HOLD-THE-LINE HOLLY, the
    quality advocate. Argue the
    PR should be BLOCKED.
    
    Rules:
    - Cite evidence from the diff,
    PR body, tests, scans.
    - 4-6 sentences. Be honest.
    - Never manufacture concerns.
    """
    
    JUDGE = """
    JUDGE-CARLA. Read the debate.
    Output STRICTLY:
    
    VERDICT: APPROVE|BLOCK|NEEDS_HUMAN
    CONFIDENCE: 0-100
    SUMMARY:
    
    KEY EVIDENCE FOR APPROVE:
    - <one line>
    
    KEY EVIDENCE FOR BLOCK:
    
    - <one line>
    
    NEXT ACTION:
    <one line>
    
    """
    
    

    Three rounds of velocity-then-caution, each agent seeing the rolling transcript. After the last round the judge gets the full transcript and emits a verdict in the strict format. One round is partisan; two leaves the last rebuttal unanswered; three forces both sides to confront their weakest point, then either concede or double down:

    def make(deploy, instr):
        return ChatClientAgent(
            chat_client=OAI(
                endpoint=ENDPOINT,
                api_key=KEY,
                deployment_name=deploy,
                api_version=VERSION,
            ),
            instructions=instr,
        )
    
    def run_debate(brief):
        v = make(VEL_DEP, VELOCITY)
        c = make(CAU_DEP, CAUTION)
        j = make(JUDGE_DEP, JUDGE)
    
        log = []
        roll = brief + "\n(no turns)\n"
    
        for n in range(1, ROUNDS + 1):
            out = v.run(
                roll +
                f"\nRound {n}. APPROVE."
            ).content
            log.append(("vel", out))
    
            out2 = c.run(
                roll +
                f"\nVEL: {out}\nBLOCK."
            ).content
            log.append(("cau", out2))
    
            roll += (
                f"\n--- round {n} ---"
                f"\nVEL: {out}"
                f"\nCAU: {out2}\n"
            )
    
        j_in = brief + (
            "\nTRANSCRIPT:\n" +
            "\n".join(
                f"### {w}\n{m}"
                for w, m in log
            ) +
            "\nIssue verdict."
        )
        v_out = j.run(j_in).content
        log.append(("judge", v_out))
        return log, v_out
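
    To check the control flow without spending tokens, the same loop can be run against stub agents. The sketch below is not from ai-gate: StubAgent and its canned replies are hypothetical stand-ins that only mimic the run(...).content shape used above, and the loop is a condensed copy of run_debate that takes the agents as arguments.

```python
from dataclasses import dataclass

ROUNDS = 3


@dataclass
class Reply:
    content: str


class StubAgent:
    """Hypothetical stand-in for an agent: returns a canned line per call."""

    def __init__(self, name: str):
        self.name = name
        self.calls = 0

    def run(self, prompt: str) -> Reply:
        self.calls += 1
        return Reply(f"{self.name} turn {self.calls}")


def run_debate(brief, v, c, j, rounds=ROUNDS):
    log = []
    roll = brief + "\n(no turns)\n"
    for n in range(1, rounds + 1):
        out = v.run(roll + f"\nRound {n}. APPROVE.").content
        log.append(("vel", out))
        # Caution sees the rolling transcript plus velocity's newest turn.
        out2 = c.run(roll + f"\nVEL: {out}\nBLOCK.").content
        log.append(("cau", out2))
        roll += f"\n--- round {n} ---\nVEL: {out}\nCAU: {out2}\n"
    transcript = "\n".join(f"### {w}\n{m}" for w, m in log)
    # The judge is called once, after the last round, with the full log.
    verdict = j.run(
        brief + "\nTRANSCRIPT:\n" + transcript + "\nIssue verdict."
    ).content
    log.append(("judge", verdict))
    return log, verdict


if __name__ == "__main__":
    log, verdict = run_debate(
        "PR brief", StubAgent("vel"), StubAgent("cau"), StubAgent("judge")
    )
    for who, text in log:
        print(who, "->", text)
```

    With three rounds the log holds seven entries: six advocate turns followed by the judge.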
    

    The strict judge format makes parsing a one-liner per field, not an LLM call. Map VERDICT to an exit code: APPROVE=0, BLOCK=1, NEEDS_HUMAN=2. Jenkins gates the deploy stage on that exit code, so APPROVE flows through and anything else stops the build. The whole thing wraps in one Jenkins shared-library function:

    def call(args) {
        def repo = args.repo
        def pr = args.pr
        def base = args.baseSha
        def head = args.headSha
        def slug = repo
            .replaceAll('[/_.]', '-')
            .toLowerCase()
        def sha = head.take(8)
        def name = (
            "ai-gate-${slug}" +
            "-${pr}-${sha}"
        )
    
        sh """
            envsubst  | kubectl apply -f -
            kubectl wait \\
                --for=condition=complete \\
                --timeout=600s \\
                -n ai-gate \\
                job/${name}
            kubectl logs -n ai-gate \\
                job/${name}
            # Each jsonpath stays on one line: a backslash-newline
            # inside single quotes is not a line continuation.
            POD=\$(kubectl get pod \\
                -n ai-gate \\
                -l job-name=${name} \\
                -o jsonpath='{.items[0].metadata.name}')
            exit \$(kubectl get pod \\
                -n ai-gate \$POD \\
                -o jsonpath='{.status.containerStatuses[0].state.terminated.exitCode}')
        """
    }
    

    Drop that in vars/aiCodeGate.groovy in your Jenkins shared library, then any Jenkinsfile can add a stage that calls aiCodeGate(repo: ..., pr: env.CHANGE_ID, baseSha: env.CHANGE_TARGET_SHA, headSha: env.GIT_COMMIT) before its deploy stage.
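
    The verdict parsing described earlier is simple enough to sketch. This is an illustration rather than the ai-gate source: the function names, the regular expressions, and the choice to treat an unparseable verdict as NEEDS_HUMAN are assumptions.

```python
import re
from typing import Optional

# Exit codes as described in the post.
EXIT_CODES = {"APPROVE": 0, "BLOCK": 1, "NEEDS_HUMAN": 2}

VERDICT_RE = re.compile(
    r"^VERDICT:\s*(APPROVE|BLOCK|NEEDS_HUMAN)\s*$", re.MULTILINE
)
CONFIDENCE_RE = re.compile(r"^CONFIDENCE:\s*(\d{1,3})\s*$", re.MULTILINE)


def parse_verdict(text: str) -> tuple[str, Optional[int]]:
    """Return (verdict, confidence) parsed from the judge's output.

    Assumption: output that does not match the strict format is
    escalated to a human instead of being guessed at.
    """
    m = VERDICT_RE.search(text)
    verdict = m.group(1) if m else "NEEDS_HUMAN"
    c = CONFIDENCE_RE.search(text)
    confidence = int(c.group(1)) if c else None
    return verdict, confidence


def exit_code(text: str) -> int:
    """Map the judge's output to the process exit code."""
    verdict, _ = parse_verdict(text)
    return EXIT_CODES[verdict]


if __name__ == "__main__":
    sample = "VERDICT: BLOCK\nCONFIDENCE: 82\nSUMMARY:\nTests are missing."
    print(parse_verdict(sample), exit_code(sample))
```

    Falling back to NEEDS_HUMAN keeps a malformed judge reply from ever reading as an approval, which is the failure mode a deploy gate most needs to avoid.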

    The cost per PR works out to roughly $0.05 to $0.20 depending on diff size: three rounds of two agents plus one judge call (seven model calls) over a ~15k-token context. Wall time is 30-90 seconds. The Foundry Hub and Project are deployed, so the next obvious step is swapping the direct Azure OpenAI chat client for a Foundry connection – that’s Part 3.


  • Microsoft Agent Framework 1.0: three runnable Python examples

    Microsoft Agent Framework reached 1.0 GA on May 21, 2026 for both .NET and Python. It consolidates the previous Semantic Kernel and AutoGen SDKs into one supported codebase with a stable API surface and a long-term support commitment.

    To see what zero-to-agent actually looks like against my own Azure OpenAI deployment, I wrote three small Python examples that progressively add capability: a hello-world agent, an agent with local Python tools, and an agent that connects to a remote MCP server. All three are real files I ran.

    You need Python 3.10 or newer, an Azure OpenAI deployment (I used gpt-4o-mini, which is gpt-4.1-mini behind the alias), the endpoint URL and an API key, and pip install agent-framework==1.0 python-dotenv. Drop the credentials in a .env:

    AOAI_ENDPOINT=https://...
    AOAI_DEPLOYMENT=gpt-4o-mini
    AOAI_API_KEY=...
    AOAI_API_VERSION=2024-10-21
    

    The first example is a single agent that takes one prompt and returns one response. No tools, no state, no MCP. If it returns a sentence, the SDK and the deployment are both working:

    import os
    from dotenv import load_dotenv
    from agent_framework import (
        ChatClientAgent,
    )
    from agent_framework.azure import (
        AzureOpenAIChatClient,
    )
    
    load_dotenv()
    E = os.environ
    
    client = AzureOpenAIChatClient(
        endpoint=E["AOAI_ENDPOINT"],
        api_key=E["AOAI_API_KEY"],
        deployment_name=E["AOAI_DEPLOYMENT"],
        api_version=E["AOAI_API_VERSION"],
    )
    
    agent = ChatClientAgent(
        chat_client=client,
        instructions=(
            "Terse release-notes "
            "summarizer. One sentence."
        ),
    )
    
    prompt = (
        "Summarize: Azure VPN P2S "
        "now supports user-group "
        "IP pools, GA 2026-05-21."
    )
    print(agent.run(prompt).content)
    

    Adding a tool is one more parameter. The Annotated[..., "description"] pattern feeds each parameter description into the tool schema the model sees; the function docstring becomes the tool description:

    from typing import Annotated
    
    PRICES = {
        "MSFT": 412.10,
        "TSLA": 178.42,
        "PGEN": 1.23,
    }
    
    def get_price(
        sym: Annotated[
            str, "Ticker, uppercase"
        ],
    ) -> str:
        "Mock price for a ticker."
        p = PRICES.get(sym.upper())
        if p is None:
            return f"unknown {sym}"
        return f"{sym}: ${p:.2f}"
    
    agent = ChatClientAgent(
        chat_client=client,
        instructions=(
            "Portfolio assistant. "
            "Use the tool for prices."
        ),
        tools=[get_price],
    )
    
    q = "MSFT price? PGEN above $1?"
    print(agent.run(q).content)
    

    With two questions in one prompt the agent calls the tool twice and summarizes both results.
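
    To see what that Annotated metadata looks like from a framework's side, the standard library can read it back. This is a sketch of the general mechanism using typing and inspect only. describe_tool is a hypothetical helper, the get_price stub is simplified, and Agent Framework's real schema builder may work differently.

```python
import inspect
from typing import Annotated, get_args, get_origin, get_type_hints


def get_price(sym: Annotated[str, "Ticker, uppercase"]) -> str:
    "Mock price for a ticker."
    return f"{sym}: n/a"


def describe_tool(fn) -> dict:
    """Collect the docstring and per-parameter descriptions of fn."""
    # include_extras=True keeps the Annotated wrapper instead of stripping it.
    hints = get_type_hints(fn, include_extras=True)
    params = {}
    for name in inspect.signature(fn).parameters:
        hint = hints.get(name)
        if get_origin(hint) is Annotated:
            base, *meta = get_args(hint)
            # Use the first string in the metadata as the description.
            desc = next((m for m in meta if isinstance(m, str)), "")
            params[name] = {"type": base.__name__, "description": desc}
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": params,
    }


if __name__ == "__main__":
    print(describe_tool(get_price))
```

    The printed dictionary carries the function name, the docstring, and the "Ticker, uppercase" description, which is the information a tool schema needs.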

    The third example skips local tools entirely and pulls them from a remote MCP server. Agent Framework ships an MCP client, so any compliant server is a tool source with no glue code. The Microsoft Learn MCP server at learn.microsoft.com/api/mcp is public and Streamable-HTTP:

    import asyncio
    from agent_framework.mcp import (
        MCPStreamableHTTPClient,
    )
    
    URL = (
        "https://learn.microsoft.com"
        "/api/mcp"
    )
    
    async def main():
        mcp = MCPStreamableHTTPClient(
            url=URL,
        )
        async with mcp:
            tools = await mcp.list_tools()
            agent = ChatClientAgent(
                chat_client=client,
                instructions=(
                    "MS Learn assistant. "
                    "Use the MCP tools."
                ),
                tools=tools,
            )
            q = (
                "Front Door Standard "
                "vs Premium tiers?"
            )
            r = await agent.run_async(q)
            print(r.content)
    
    asyncio.run(main())
    

    Swap the URL for GitHub’s, Atlassian’s, or your own private one – the rest of the code is unchanged.

    Two gotchas worth knowing: agent-framework==1.0 moved instructions to the ChatClientAgent constructor, so preview code that set it on ChatClientAgentOptions will break. And if a model deployment rejects parallel tool calls, pass parallel_tool_calls=False on the agent constructor.


  • Azure updates: weeks 21-22, 2026

    Azure shipped a batch of updates over the last two weeks. Here are the five worth more than a line.

    Microsoft Agent Framework 1.0 GA

    Microsoft Agent Framework reached 1.0 GA on May 21 for both .NET and Python. The 1.0 release commits Microsoft to a stable API surface with long-term support. Agent Framework consolidates work that was previously split across Semantic Kernel and AutoGen.

    One breaking change from preview to GA: the Instructions setting moved off ChatClientAgentOptions and onto the ChatClientAgent constructor directly.

    Source: Azure update #560982.

    Azure Front Door WebSocket support (public preview)

    Azure Front Door Standard and Premium now support WebSocket connections with no additional configuration. The HTTP handshake is inspected, then the upgraded connection passes through.

    The practical use case is putting Web PubSub, SignalR, or a custom WebSocket workload behind Front Door for global edge presence and WAF inspection at the handshake. Two things to know: WAF rules apply only to the handshake, not to the open connection; and Web PubSub is not on Front Door Premium’s supported Private Link origin list, so a Standard Internal Load Balancer or Application Gateway is needed between Front Door and a private Web PubSub endpoint.

    Source: Azure Front Door WebSocket (Microsoft Learn).

    P2S User Groups and IP address pools (GA)

    Azure VPN Gateway Point-to-Site connections can now assign IP addresses from different pools based on the user’s Entra ID group membership. Prior to this GA, segmenting VPN users by role required parallel gateways or NSG rules keyed off unpredictable IP ranges.

    The mechanism: define User Groups on the gateway, map each group to an address pool, and set a priority for users that match multiple groups. Downstream NSGs can then key off per-group subnets.
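
    To make the priority rule concrete, here is a minimal Python sketch of the selection logic only. It is a toy model, not the Azure API: the UserGroup class, the group names, and the address ranges are all made up for illustration, and it assumes that a lower priority number wins, which should be checked against the VPN Gateway documentation.

```python
from dataclasses import dataclass
from ipaddress import IPv4Network
from typing import Optional


@dataclass(frozen=True)
class UserGroup:
    """Hypothetical model of a P2S user group; not an Azure SDK type."""
    name: str
    priority: int       # assumption: the lowest number takes precedence
    pool: IPv4Network   # address pool mapped to this group


def pick_group(memberships: set[str], groups: list[UserGroup]) -> Optional[UserGroup]:
    """Return the matching group with the best priority, or None if nothing matches."""
    matches = [g for g in groups if g.name in memberships]
    return min(matches, key=lambda g: g.priority, default=None)


# Illustrative groups and pools only.
GROUPS = [
    UserGroup("Engineering", 0, IPv4Network("172.16.10.0/24")),
    UserGroup("Finance", 1, IPv4Network("172.16.20.0/24")),
    UserGroup("Default", 100, IPv4Network("172.16.99.0/24")),
]

# A user in both Finance and Default gets an address from the Finance pool.
chosen = pick_group({"Finance", "Default"}, GROUPS)
print(chosen.name, chosen.pool)
```

    A downstream NSG rule can then use the chosen pool's CIDR as its source range, which is what makes per-group rules predictable.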

    Source: Azure update #564460.

    Entra-only identities with Azure Files (GA)

    Azure Files SMB shares can now authenticate against Microsoft Entra ID alone, without AD DS, Entra Domain Services, or hybrid identity sync.

    For organizations running cloud-only identity, this removes one of the remaining reasons to keep a domain controller running just for file-share access. Hybrid environments still have their own configuration paths.

    Source: Azure update #562359.

    Virtual network flow logs connector with Microsoft Sentinel (GA)

    Azure now offers a native data connector that sends Virtual Network flow logs directly into Microsoft Sentinel. Before this, getting VNet or NSG flow logs into Sentinel for SecOps correlation required a custom ingestion pipeline, typically built on Storage, Event Hub, and a Function App or Logic App.

    For Sentinel deployments that already ingest the other Azure first-party signal sources (Activity Log, Entra ID sign-ins, Defender for Cloud), this adds network-layer telemetry without parallel ingestion infrastructure. Typical detection use cases include lateral movement, anomalous east-west flows, and exfiltration over uncommon ports.

    The connector is GA, not preview.

    Source: Azure update #564689.


  • Running eclipse-mosquitto in an Azure IoT Edge Module

    I wanted to run an MQTT broker on an IoT Edge device so that devices could communicate with each other locally. The most important step is to modify the “Container Create Options” so that the container’s port is reachable on the local network. I also wanted to use the no-auth mosquitto config (this is still a development setup), so I overrode the Cmd section.

    With those changes, the device pulled the standard Docker image “eclipse-mosquitto:latest”, started it locally, and made port 1883 available to devices on the local network:

    {
      "NetworkingConfig": {
        "EndpointsConfig": {
          "host": {}
        }
      },
      "ExposedPorts": {
        "1883/tcp": {}
      },
      "HostConfig": {
        "PortBindings": {
          "1883/tcp": [
            {
              "HostPort": "1883"
            }
          ]
        },
        "NetworkMode": "host"
      },
      "Cmd": [
        "/usr/sbin/mosquitto",
        "-c",
        "/mosquitto-no-auth.conf"
      ]
    }
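
    A quick way to confirm the port really is reachable from another machine is a plain TCP connect. This is a small sketch using only the Python standard library; it checks that something is listening on 1883 and does not speak MQTT, so it will not catch a broker that accepts connections but rejects clients. The port_open name is my own.

```python
import socket


def port_open(host: str, port: int = 1883, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

    From a laptop on the same network, port_open("192.168.1.50") should return True once the module is running, where 192.168.1.50 is a placeholder for the edge device's address.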
    

  • How to find the Raspberry Pi model

    Note to self:

    How to find the Raspberry Pi model:

    cat /proc/device-tree/model
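
    The same lookup from Python, as a small sketch. The device-tree string is NUL-terminated, so the raw read ends with a \x00 byte that is worth stripping before comparing or logging it. The pi_model name is my own, and the path argument is only there so the function can be pointed at a different file.

```python
from pathlib import Path


def pi_model(path: str = "/proc/device-tree/model") -> str:
    """Return the board model string without the trailing NUL byte."""
    raw = Path(path).read_bytes()
    return raw.rstrip(b"\x00").decode("utf-8", errors="replace").strip()
```

    On a Pi, print(pi_model()) gives the same text as the cat command above.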