Polaris ML/AI Training

AI Agents Bootcamp

Advanced | 60h | 45 lessons

Build Atlas — an autonomous AI agent that answers your team's questions from the codebase (the source of truth) plus Foundry, Salesforce, Confluence and observability data, then triages issues on Slack and email, files and prioritizes Jira tickets, reports a daily digest, and escalates only what it can't handle. 45 hands-on lessons covering agent architecture, RAG over code, MCP integrations, evaluation, and production deployment — built on the Claude Agent SDK with LangGraph and OpenAI Agents SDK comparisons. Part 2 of a path: the AI Chatbot Bootcamp (Part 1) builds the grounded conversational foundation, and here the same Atlas learns to take action.

AI Agent, Claude Agent SDK, MCP, RAG, Slack Bot, Jira Automation, Salesforce, Palantir Foundry, LangGraph, OpenAI Agents SDK, Vector Search, Triage, Prioritization, Evaluation, LLM-as-Judge, Production AI, TypeScript, Python

Lesson Overview

1

Lesson 1: The Tech Lead Bottleneck — Why Build a Support Agent

Meet the Nimbus Logistics scenario, quantify the interrupt cost that makes a tech lead a bottleneck, and define exactly what Atlas must do: answer, triage, file, prioritize, report, and escalate.

2

Lesson 2: Anatomy of an AI Agent — Loop, Tools, Memory, Model

Strip an 'AI agent' down to its four moving parts: a model, a tool interface, a memory/state store, and the control loop that ties them together. Build the smallest possible agent loop from scratch so the abstractions are never magic.

3

Lesson 3: Choosing Your Stack — Claude Agent SDK vs LangGraph vs OpenAI Agents SDK vs Raw

A hands-on comparison of four ways to build the agent: the Claude Agent SDK (our primary choice), LangGraph, the OpenAI Agents SDK, and raw provider function-calling. Decide with a scorecard instead of hype.

4

Lesson 4: The Reference Architecture — From Slack Message to Jira Ticket

The end-to-end blueprint for Atlas: ingress (Slack/email) → orchestrator → knowledge layer (RAG + MCP servers) → action layer (Jira/Slack) → reporting & escalation. The diagram you'll refer back to in every later lesson.

5

Lesson 5: Setting Up Your Dev Environment & Claude API Access

Get a clean, reproducible workspace: API keys and a secrets strategy, the Claude Agent SDK installed, a sandbox for the nimbus-platform sample repo, and a 'hello, Atlas' agent that proves the whole toolchain works end to end.

6

Lesson 6: Your First Q&A Agent — Answer One Real Question End to End

Wire the pieces from Module A into a working agent that answers one real engineering question from a small hardcoded knowledge base, with a citation and a graceful 'I don't know — escalating.' This is the skeleton Atlas grows into.

7

Lesson 7: RAG Fundamentals — Embeddings, Chunking, and Vector Search

The mental model behind retrieval-augmented generation: how text becomes vectors, why we chunk, how nearest-neighbor search works, and why RAG — not fine-tuning — is the right tool for grounding Atlas in an ever-changing codebase.

8

Lesson 8: Indexing a Codebase — AST-Aware Chunking vs Naive Splitting

Code is not prose. Learn why fixed-size text splitting wrecks code retrieval, and build structure-aware chunking that respects functions, classes, and files — plus the metadata (path, language, symbol) that makes citations and filtering possible.

9

Lesson 9: Choosing a Vector Store — pgvector vs Pinecone vs Qdrant vs Chroma

A practical comparison of vector databases for Atlas: managed vs self-hosted, metadata filtering, hybrid search support, scale, and cost. Pick one with a scorecard and stand it up locally.

10

Lesson 10: Building the Codebase Retrieval Tool

Turn the vector store into a clean tool the agent can call: search_codebase(query, filters) → ranked chunks with citations. Define the contract, handle empty results honestly, and wire it into the Lesson 6 agent.

11

Lesson 11: Hybrid Search — Combining BM25 Keyword + Semantic Vectors

Pure vector search misses exact identifiers; pure keyword search misses paraphrases. Combine them with reciprocal rank fusion and add a re-ranker so Atlas finds both 'RetryPolicy' and 'how we re-attempt uploads.'
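
As a preview of the fusion step, here is a minimal sketch of reciprocal rank fusion in Python. It assumes each retriever returns an ordered list of document ids; the file names and the constant k=60 (a commonly used default) are illustrative assumptions, not values taken from the course, and the re-ranker is left out.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties break on doc id so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for the query "how we re-attempt uploads".
bm25_hits = ["retry_policy.py", "upload_client.py", "README.md"]
vector_hits = ["upload_client.py", "backoff.py", "retry_policy.py"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

A document that both retrievers rank highly rises above one that only a single retriever finds, which is the behavior the lesson relies on.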

12

Lesson 12: Keeping the Index Fresh — Incremental Re-indexing on Git Push

A stale index makes Atlas confidently wrong. Build an incremental pipeline that re-embeds only changed files on each push using git diffs and content hashes — so the source of truth stays true.
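
The content-hash half of that pipeline might look roughly like the sketch below. It assumes the index keeps a path-to-hash map from the previous run; the function names and data shapes are hypothetical, and the git diff step that narrows the candidate files is omitted.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a file's contents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(current_files: dict, indexed_hashes: dict):
    """Compare the working tree against the stored hashes.

    current_files:  path -> file text at the pushed commit (assumed shape)
    indexed_hashes: path -> hash recorded when the file was last embedded
    Returns (paths to re-embed, paths to delete from the index).
    """
    to_embed = sorted(
        path
        for path, text in current_files.items()
        if indexed_hashes.get(path) != content_hash(text)
    )
    to_delete = sorted(p for p in indexed_hashes if p not in current_files)
    return to_embed, to_delete
```

Unchanged files are skipped because their hash matches, so the embedding cost scales with the size of the push rather than the size of the repository.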

13

Lesson 13: Citing Sources — Grounding Answers in File + Line References

An uncited answer is a rumor. Make Atlas attach precise, clickable file+line citations to every claim, enforce citation in the prompt, and detect ungrounded sentences so trust is earned, not assumed.

14

Lesson 14: Evaluating Retrieval Quality — Recall@k, MRR, and Golden Sets

You can't improve what you don't measure. Build a golden question→source dataset and compute recall@k, MRR, and nDCG so every retrieval change (chunking, hybrid, re-ranking, k) is judged by data, not vibes.
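
For orientation, two of these metrics can be sketched in a few lines of Python. The sketch assumes each golden-set entry pairs the retrieved ids (in rank order) with the set of ids that should have been found; the function names are illustrative and nDCG is left out.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant sources that appear in the top k results."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over a list of (retrieved, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)
```

Recall@k answers "did the right source make it into the context at all?", while MRR rewards putting it near the top.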

15

Lesson 15: MCP Crash Course — One Protocol for All Your Tools

Why every integration in Atlas is an MCP server: the Host→Client→Server model, tools/resources/prompts, the JSON-RPC wire format, and the schema validation that stops the model from passing garbage arguments. Build a minimal server.

16

Lesson 16: Integrating Confluence / the Internal Wiki

Build the Confluence MCP server so Atlas can read architecture docs and runbooks — while treating the wiki as supporting context, not source of truth, and flagging staleness against the code.

17

Lesson 17: Integrating Palantir Foundry — Ontology & Datasets

Connect Atlas to Palantir Foundry so it can answer data questions grounded in the ontology and live datasets — using the OSDK, mapping ontology objects to MCP tools, and respecting Foundry's governance.

18

Lesson 18: Integrating Salesforce — Cases, Accounts, and Knowledge

Connect Atlas to Salesforce so it can tie an engineering question to the customer impact behind it: which accounts are affected, what support cases reference this bug, and what the Salesforce Knowledge base already says.

19

Lesson 19: Integrating GitHub/GitLab — PRs, Issues, and Code History

Static code answers 'what'; git history answers 'who, when, and why it changed.' Build the GitHub MCP server so Atlas can cite the PR that introduced a behavior, find related issues, and avoid duplicate bug reports.

20

Lesson 20: Integrating Observability — Logs, Metrics, and Errors (Datadog/Sentry)

Connect Atlas to Datadog and Sentry so it can ground incident questions in real telemetry: error rates, recent exceptions, and traces — turning 'is something broken?' from speculation into evidence.

21

Lesson 21: Unifying Sources — A Federated Knowledge Router

Six sources, one question. Build the router that decides which sources to consult for a given query, fans out in parallel, merges results, and stays within a latency and token budget — so Atlas is fast and focused, not exhaustive.

22

Lesson 22: Conflicting Sources & Source-of-Truth Precedence

When the code, the wiki, and Salesforce disagree, Atlas needs a rule — not a coin flip. Encode an explicit precedence policy, detect conflicts, resolve them deterministically, and surface the disagreement to the user.
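
One possible shape for such a policy is sketched below. Only "code first, wiki as supporting context" comes from the course description; the rest of the ordering, the source labels, and the function name are assumptions made for illustration.

```python
# Assumed precedence, most authoritative first. Only the position of "code"
# ahead of "confluence" is stated by the course; the rest is illustrative.
PRECEDENCE = ["code", "observability", "foundry", "salesforce", "confluence"]


def resolve(claims, precedence=PRECEDENCE):
    """Pick a winner among (source, value) claims.

    Returns (winning_value, winning_source, conflict), where conflict is True
    when the known sources do not all agree, so the caller can surface it.
    """
    rank = {source: i for i, source in enumerate(precedence)}
    known = [c for c in claims if c[0] in rank]
    if not known:
        return None, None, False
    winner = min(known, key=lambda c: rank[c[0]])
    conflict = len({value for _, value in known}) > 1
    return winner[1], winner[0], conflict
```

Because the ordering is a plain list, the same inputs always yield the same winner, and the conflict flag lets the answer mention the disagreement instead of hiding it.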

23

Lesson 23: Slack App Setup — Events API, Socket Mode, and Tokens

Stand up the Slack app that is Atlas's front door: scopes, bot vs app tokens, Events API vs Socket Mode, signature verification, and the 3-second ack rule that trips up every first-time Slack developer.
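
Signature verification is the piece most easily shown in isolation. The sketch below follows Slack's documented v0 scheme (an HMAC-SHA256 over `v0:{timestamp}:{raw body}`, compared with the `X-Slack-Signature` header); the function name, its parameters, and the five-minute replay window passed as `max_age` are choices made for this example.

```python
import hashlib
import hmac
import time


def verify_slack_signature(signing_secret, timestamp, body, signature,
                           now=None, max_age=300):
    """Return True if the request plausibly came from Slack.

    timestamp: value of the X-Slack-Request-Timestamp header
    body:      raw, unparsed request body as a string
    signature: value of the X-Slack-Signature header
    """
    now = time.time() if now is None else now
    try:
        ts = int(timestamp)
    except (TypeError, ValueError):
        return False
    # Reject stale requests to limit replay attacks.
    if abs(now - ts) > max_age:
        return False
    base = f"v0:{timestamp}:{body}".encode("utf-8")
    digest = hmac.new(signing_secret.encode("utf-8"), base, hashlib.sha256)
    expected = "v0=" + digest.hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"),
                               signature.encode("utf-8"))
```

The check must run against the raw body, before any JSON or form parsing, or the computed digest will not match.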

24

Lesson 24: Receiving & Threading — Conversations, Mentions, and Context

Turn raw Slack events into coherent conversations: reply in threads, gather thread history for context, decide when Atlas should and shouldn't chime in, and normalize everything into the canonical InboundRequest from Lesson 4.

25

Lesson 25: Rich Slack Responses — Block Kit, Buttons, and Modals

Plain text undersells a good answer. Use Block Kit to format citations, add 'File a ticket?' / 'Escalate' / 'This helped' buttons, and capture structured feedback and human-in-the-loop approvals right inside Slack.

26

Lesson 26: Email Intake — Parsing Inbound Email into Requests

Bring the shared eng-help inbox into Atlas: receive mail via SES/Gmail, strip quoted replies and signatures, extract the real question and any attachments, and normalize into the same InboundRequest the Slack path produces.

27

Lesson 27: Replying by Email & Maintaining Conversation Context

Send replies that thread correctly, read well in any mail client, carry citations as links, and remember the conversation — plus the deliverability basics (SPF/DKIM/DMARC) that keep Atlas out of spam folders.

28

Lesson 28: Multi-Channel Identity — Mapping Users Across Slack, Email, and Jira

The same person is U07X in Slack, jordan@nimbus in email, and jlee in Jira. Build the identity resolution layer so Atlas knows who's asking across channels — enabling correct ticket reporters, permissions, and personalized context.

29

Lesson 29: Issue Triage — Classifying Questions, Bugs, and Requests

Before Atlas acts, it must understand intent: is this a question to answer, a bug to file, or a feature request to capture? Build a reliable, schema-constrained classifier and measure it against a labeled set.

30

Lesson 30: Severity & Impact Scoring with an LLM Rubric

Turn 'this seems bad' into a defensible severity score. Build an explicit rubric the model applies consistently — blending blast radius, customer impact (from Salesforce), and telemetry — so every ticket carries a justified severity.

31

Lesson 31: Deduplication — Detecting Duplicate Issues Before Filing

An agent that files a fresh ticket for every report becomes a tracker-spamming menace. Build duplicate detection across Jira and GitHub using embeddings + an LLM confirm, then link or comment instead of creating noise.

32

Lesson 32: Creating Jira Tickets via the API

Build the Jira MCP server that turns a triaged, deduped, scored issue into a well-formed ticket: correct project/type/fields, a reproducible description with citations, and an idempotent create that never double-files.

33

Lesson 33: Smart Field Population — Inferring Assignee, Epic, and Sprint

Go beyond a bare ticket: infer the right component owner from CODEOWNERS, link to the relevant epic, suggest a sprint based on severity and capacity — and know when to leave a field blank rather than guess wrong.

34

Lesson 34: Prioritization Engine — WSJF and RICE Scoring for the Backlog

Help the tech lead decide what's next. Implement transparent prioritization models (WSJF and RICE), feed them Atlas's gathered signals, and produce a ranked, explainable backlog rather than an opaque 'AI says do this.'
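
Both models reduce to short formulas, sketched here in Python. WSJF is taken as cost of delay (business value + time criticality + risk reduction) divided by job size, and RICE as reach × impact × confidence ÷ effort; the `BacklogItem` fields, the ticket keys, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    key: str
    business_value: float    # WSJF inputs, relative scales
    time_criticality: float
    risk_reduction: float
    job_size: float
    reach: float             # RICE inputs
    impact: float
    confidence: float        # 0.0 to 1.0
    effort: float            # e.g. person-weeks


def wsjf(item: BacklogItem) -> float:
    """Weighted Shortest Job First: cost of delay divided by job size."""
    cost_of_delay = (item.business_value + item.time_criticality
                     + item.risk_reduction)
    return cost_of_delay / item.job_size


def rice(item: BacklogItem) -> float:
    """RICE: reach * impact * confidence / effort."""
    return item.reach * item.impact * item.confidence / item.effort


def rank(items, score):
    """Return (key, score) pairs, highest score first."""
    return sorted(((i.key, score(i)) for i in items), key=lambda p: -p[1])


# Hypothetical tickets with made-up inputs.
a = BacklogItem("NIM-1", 8, 5, 3, 4, reach=500, impact=2, confidence=0.5, effort=4)
b = BacklogItem("NIM-2", 3, 2, 1, 1, reach=100, impact=1, confidence=1.0, effort=1)
```

Returning the score next to each key keeps the ranking explainable: the tech lead can see which input drove an item up or down.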

35

Lesson 35: The Triage Decision Tree — Answer, File, or Escalate

Tie Module E together into one decision policy: given intent, confidence, severity, and dedup results, what does Atlas DO? Build an explicit, testable decision tree so the agent's autonomy is bounded and predictable.
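
To make "explicit and testable" concrete, a decision policy of this kind can be written as a pure function. Everything in the sketch is an assumption for illustration: the action names, the 0.7 confidence floor, and the convention that severity 1 is the most severe.

```python
def decide(intent, confidence, severity, duplicate_of=None,
           min_confidence=0.7, page_severity=1):
    """Map triage signals to one action.

    intent:       "question", "bug", or "request"
    confidence:   0.0 to 1.0 estimate that Atlas is right
    severity:     1 (worst) to 4 (minor); assumed scale
    duplicate_of: key of an existing ticket, or None
    """
    # Low confidence always goes to a human, whatever the intent.
    if confidence < min_confidence:
        return "escalate"
    if intent == "question":
        return "answer"
    # Known issue: link to it instead of filing a new ticket.
    if duplicate_of is not None:
        return "link_duplicate"
    # New, severe bugs are filed and also raised with the tech lead.
    if intent == "bug" and severity <= page_severity:
        return "file_and_escalate"
    return "file"
```

Because the function has no side effects, each branch can be covered by a one-line unit test, which is what keeps the agent's autonomy bounded and predictable.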

36

Lesson 36: Confidence Estimation — Knowing When It Doesn't Know

Calibrated humility is Atlas's most important safety property. Learn why raw model confidence is unreliable, and build practical confidence signals — retrieval strength, self-consistency, groundedness — that actually predict correctness.

37

Lesson 37: The Orchestrator — Routing Between Tools and Sub-Agents

Assemble the brain that runs the whole show: a top-level orchestrator that owns the loop, routes to specialized sub-agents (answerer, triager, filer), manages the context budget, and enforces the decision policy.

38

Lesson 38: Memory & State — Short-Term Context vs Long-Term Store

Give Atlas a memory: conversation buffers for the current thread, and a durable long-term store for 'we already answered this' and 'this bug is already filed' — plus the summarization that keeps context affordable.

39

Lesson 39: Human-in-the-Loop — Approval Gates for Risky Actions

Decide what Atlas may do alone and what needs a human's yes. Build a permission policy keyed to action risk and reversibility, render approvals as fast Slack interactions, and make gated actions auditable.

40

Lesson 40: Escalation Protocols — Paging the Tech Lead the Right Way

Escalation is Atlas's most important safety valve — and the easiest to ruin. Build tiered escalation (DM vs page), pack escalations with context so the human can act in seconds, and tune thresholds to avoid alert fatigue.

41

Lesson 41: Daily Digest & Reporting — What Atlas Did While You Slept

Turn Atlas's activity into a trusted daily report for the tech lead: what it answered, filed, and escalated; what it's unsure about; and the trends worth your attention — delivered on a schedule, skimmable in two minutes.

42

Lesson 42: Guardrails & Safety — Prompt Injection, PII, and Permissioning

An agent that reads untrusted text and takes actions is an attack surface. Defend against prompt injection, handle PII and secrets safely, enforce least-privilege permissions, and contain the blast radius when something goes wrong.

43

Lesson 43: Evaluating the Whole Agent — Eval Sets, LLM-as-Judge, Trajectories

Retrieval eval (Lesson 14) is not enough on its own. Build end-to-end agent evaluation: an offline eval set, LLM-as-judge for answer quality, trajectory scoring for whether Atlas took the right actions, and a regression gate that protects all of it.

44

Lesson 44: Deploying to Production — Hosting, Secrets, Observability, and Cost

Take Atlas from your laptop to reliable production: a durable ingress queue, where to host the orchestrator and MCP servers, secrets and config, tracing and metrics, and the cost controls that keep an LLM agent affordable.

45

Lesson 45: Capstone — The Autonomous Engineering Support Agent, End to End

Assemble everything into the complete Atlas: a real Slack message flows through ingress, knowledge, triage, dedup, filing, and escalation, reports in the digest, and is protected by gates, eval, and observability. Ship it in shadow mode.