Core Building Blocks / AI Agent

Testing

Test agents with core testing helpers and scripted model providers.

Agent testing helpers are exported from @purista/core.

Level	API	Validates
Handler context	`createAgentContextMock(...)`	Direct `setRunFunction` logic with fake resources and model handles.
Runtime harness	`createAgentTestHarness(...)`	Full attached agent definition with model bindings, output validation, skill bindings, and stream execution.
Skill fixtures	`createAgentSkillTestRuntime(...)`	Temporary `SKILL.md` bindings for agents that declare `.useSkills(...)`.

Runtime harness

Use createAgentTestHarness(...) when you want to test the generated attached agent definition without starting a full service:

import { createAgentTestHarness, createScriptedHarnessModel } from '@purista/core'
import { triageTicketAgentBuilder } from './triageTicketAgentBuilder.js'

test('classifies a ticket', async () => {
  const definition = await triageTicketAgentBuilder.getDefinition()
  const model = createScriptedHarnessModel()

  model.enqueueObject({
    object: {
      priority: 'high',
      reason: 'mentions production outage',
    },
    usage: { inputTokens: 0, outputTokens: 0, totalTokens: 0 },
    finishReason: 'stop',
  })

  const harness = await createAgentTestHarness(definition, {
    models: {
      primary: {
        provider: model,
        model: 'fake-object',
        capabilities: ['object'],
      },
    },
  })

  await expect(
    harness.run({
      payload: {
        ticketId: 'T-1',
        text: 'Production outage for an enterprise customer',
      },
    }),
  ).resolves.toEqual({
    priority: 'high',
    reason: 'mentions production outage',
  })
})

Handler context mock

Use createAgentContextMock(...) for narrow tests around custom setRunFunction(...) logic:

import { createAgentContextMock, createScriptedHarnessModel } from '@purista/core'

test('uses the primary model alias', async () => {
  const model = createScriptedHarnessModel()
  model.enqueueObject({
    object: { answer: 'Reset your password from settings.' },
    usage: { inputTokens: 0, outputTokens: 0, totalTokens: 0 },
    finishReason: 'stop',
  })

  const context = createAgentContextMock({
    payload: { query: 'How do I reset my password?' },
    models: {
      primary: model as never,
    },
  })

  const result = await runSupportAgent(context)

  expect(result.answer).toContain('password')
})

Stream tests

The runtime harness can execute the generated stream path and collect chunks:

const result = await harness.stream({
  payload: {
    ticketId: 'T-2',
    text: 'Stream this run',
  },
})

expect(result.chunks.length).toBeGreaterThan(0)

What to cover

success output from scripted model responses
invalid model output rejected by the agent output schema
missing runtime model aliases
capability mismatches between declared models and runtime bindings
command tool and child-agent failure behavior
stream success and stream failure paths
long-running response mode returning jobId, runId, and status or stream URLs