Test an agent

Use @purista/core/testing for deterministic tests. Unit and integration tests should not call real model providers.

The testing helpers let you:

execute an attached agent definition without starting a full service
inject fake model providers
enqueue scripted text, object, embedding, and rerank responses
assert output validation, capability failures, missing aliases, stream errors, and provider failures
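
To make "scripted" concrete: a scripted fake can be thought of as a first-in, first-out queue of canned responses, where each provider call consumes the next entry. The sketch below is a minimal stand-alone illustration of that idea, not PURISTA's implementation; `ResponseScript`, `enqueueError`, and `remaining` are hypothetical names chosen for this example.

```typescript
// Illustrative only: a tiny FIFO script, not the @purista/core/testing implementation.
type ScriptedEntry<T> = { kind: 'value'; value: T } | { kind: 'error'; error: Error }

class ResponseScript<T> {
  private readonly queue: ScriptedEntry<T>[] = []
  private served = 0

  // Queue a successful response for a later call.
  enqueue(value: T): this {
    this.queue.push({ kind: 'value', value })
    return this
  }

  // Queue a failure, for example to simulate a provider outage.
  enqueueError(error: Error): this {
    this.queue.push({ kind: 'error', error })
    return this
  }

  // Serve the oldest queued entry. An exhausted script fails loudly, so an
  // unexpected extra provider call breaks the test instead of passing silently.
  next(): T {
    const entry = this.queue.shift()
    if (!entry) {
      throw new Error(`No scripted response left (call #${this.served + 1})`)
    }
    this.served += 1
    if (entry.kind === 'error') {
      throw entry.error
    }
    return entry.value
  }

  // Number of entries the code under test has not consumed yet.
  get remaining(): number {
    return this.queue.length
  }
}
```

Asserting that `remaining` is zero at the end of a test is one way to catch a provider call that was expected but never made.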

Success path

import { createAgentTestHarness, createScriptedHarnessModel } from '@purista/core/testing'

const model = createScriptedHarnessModel()
model.enqueueObject({
  object: {
    priority: 'high',
    reason: 'mentions outage',
  },
  usage: {
    inputTokens: 0,
    outputTokens: 0,
    totalTokens: 0,
  },
  finishReason: 'stop',
})

const harness = createAgentTestHarness(triageAgent, {
  models: {
    primary: {
      provider: model,
      model: 'fake-object',
      capabilities: ['object'],
    },
  },
})

await expect(
  harness.run({
    payload: {
      ticketId: 'T-1',
      text: 'Production outage for enterprise customer',
    },
    message: { id: 'msg-1' },
  }),
).resolves.toEqual({
  priority: 'high',
  reason: 'mentions outage',
})

Invalid model output

Use invalid fake output to prove the PURISTA output schema is enforced.

const failingModel = createScriptedHarnessModel()
failingModel.enqueueObject({
  object: { priority: 'unknown' },
  usage: {
    inputTokens: 0,
    outputTokens: 0,
    totalTokens: 0,
  },
  finishReason: 'stop',
})

const failingHarness = createAgentTestHarness(triageAgent, {
  models: {
    primary: {
      provider: failingModel,
      model: 'fake-object',
      capabilities: ['object'],
    },
  },
})

await expect(
  failingHarness.run({
    payload: {
      ticketId: 'T-2',
      text: 'The request is ambiguous',
    },
  }),
).rejects.toThrow(/output validation failed/i)
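
The rejection above comes from validating the model output against the agent's output schema. As a rough picture of what that check does, here is a hand-written validator standing in for the schema. It assumes the triage output has a `priority` of `low`, `medium`, or `high` and a non-empty `reason`; only `high` and `reason` appear in the examples here, so the other values and the `parseTriageOutput` name are assumptions.

```typescript
// Hypothetical stand-in for the agent's output schema.
type TriageOutput = { priority: 'low' | 'medium' | 'high'; reason: string }

function parseTriageOutput(value: unknown): TriageOutput {
  // Treat anything that is not a plain object as an empty candidate.
  const candidate: Record<string, unknown> =
    typeof value === 'object' && value !== null ? (value as Record<string, unknown>) : {}
  const priority = candidate['priority']
  const reason = candidate['reason']
  const issues: string[] = []

  if (priority !== 'low' && priority !== 'medium' && priority !== 'high') {
    issues.push('priority must be one of low, medium, high')
  }
  if (typeof reason !== 'string' || reason.length === 0) {
    issues.push('reason must be a non-empty string')
  }
  // Collect every issue before failing so the error names all problems at once.
  if (issues.length > 0) {
    throw new Error(`Output validation failed: ${issues.join('; ')}`)
  }
  return { priority: priority as TriageOutput['priority'], reason: reason as string }
}
```

With this stand-in, `{ priority: 'unknown' }` fails on both fields, which is why a single invalid fake response is enough to prove enforcement.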

Missing alias

Runtime startup should fail when a declared model alias is not bound.

await expect(
  createAgentTestHarness(triageAgent, {
    models: {},
  }).run({
    payload: {
      ticketId: 'T-3',
      text: 'Missing model binding',
    },
  }),
).rejects.toThrow(/missing runtime model binding/i)

Capability mismatch

Assert that tests catch provider capability drift before production startup does.

// Bind a text-only model to the alias the agent uses for object output.
const textOnlyModel = createScriptedHarnessModel()

await expect(
  createAgentTestHarness(triageAgent, {
    models: {
      primary: {
        provider: textOnlyModel,
        model: 'fake-text',
        capabilities: ['text'],
      },
    },
  }).run({
    payload: {
      ticketId: 'T-4',
      text: 'Provider cannot produce objects',
    },
  }),
).rejects.toThrow(/capabil/i)
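
The missing-alias and capability checks share one idea: compare what the agent declares with what the runtime binds. The sketch below shows that comparison in isolation. It is not PURISTA's startup code; `findBindingProblems`, `AliasRequirement`, and the message wording are assumptions, and only the capability names are taken from the examples on this page.

```typescript
type Capability = 'text' | 'object' | 'embeddings' | 'rerank'
type ModelBinding = { model: string; capabilities: Capability[] }
// Hypothetical description of what an agent needs from one alias.
type AliasRequirement = { alias: string; requires: Capability[] }

function findBindingProblems(
  requirements: AliasRequirement[],
  bindings: Record<string, ModelBinding | undefined>,
): string[] {
  const problems: string[] = []
  for (const { alias, requires } of requirements) {
    const binding = bindings[alias]
    if (!binding) {
      // Declared by the agent, but nothing is bound at runtime.
      problems.push(`Missing runtime model binding for alias "${alias}"`)
      continue
    }
    // Bound, but the model does not advertise everything the agent needs.
    const missing = requires.filter(capability => binding.capabilities.indexOf(capability) === -1)
    if (missing.length > 0) {
      problems.push(`Alias "${alias}" (${binding.model}) lacks capabilities: ${missing.join(', ')}`)
    }
  }
  return problems
}
```

Returning every problem instead of throwing on the first one keeps the failure message useful when several aliases drift at once.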

Embeddings and rerank

Fake provider calls can cover retrieval flows without a vector provider or external model.

const retrievalModel = createScriptedHarnessModel()

// Scripted response for the embedding call.
retrievalModel.enqueueEmbedding({
  embeddings: [{ index: 0, vector: [0.1, 0.2, 0.3] }],
  usage: { inputTokens: 0, outputTokens: 0, totalTokens: 0 },
})

// Scripted response for the rerank call.
retrievalModel.enqueueRerank({
  results: [{ id: 'doc-2', index: 1, score: 0.92 }],
  usage: { inputTokens: 0, outputTokens: 0, totalTokens: 0 },
})

// One scripted model backs all three aliases; each binding declares its own capability.
const answerHarness = createAgentTestHarness(answerAgent, {
  models: {
    retrieval: {
      provider: retrievalModel,
      model: 'fake-embedding',
      capabilities: ['embeddings'],
    },
    ranker: {
      provider: retrievalModel,
      model: 'fake-rerank',
      capabilities: ['rerank'],
    },
    writer: {
      provider: retrievalModel,
      model: 'fake-object',
      capabilities: ['object'],
    },
  },
})

For full RAG tests, keep the vector index as a fake PURISTA resource and assert the handler passes tenant filters and candidate text correctly.
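
One possible shape for such a fake is a small in-memory index that records every query it receives, so the test can assert on the tenant filter afterwards. This is a sketch under assumptions: `FakeVectorIndex`, the query shape, and the precomputed `score` field are hypothetical and do not reflect a specific PURISTA resource interface.

```typescript
// Hypothetical query and result shapes for illustration.
type VectorQuery = { vector: number[]; topK: number; filter: { tenantId: string } }
type VectorMatch = { id: string; text: string; score: number }
type StoredDocument = VectorMatch & { tenantId: string }

class FakeVectorIndex {
  // Every query is recorded so tests can assert on filters and topK.
  readonly queries: VectorQuery[] = []
  private readonly documents: StoredDocument[]

  constructor(documents: StoredDocument[]) {
    this.documents = documents
  }

  // Scores are precomputed on the documents; no similarity math is performed.
  async query(query: VectorQuery): Promise<VectorMatch[]> {
    this.queries.push(query)
    return this.documents
      .filter(document => document.tenantId === query.filter.tenantId)
      .sort((a, b) => b.score - a.score)
      .slice(0, query.topK)
      .map(({ id, text, score }) => ({ id, text, score }))
  }
}
```

Seeding the fake with a high-scoring document from another tenant makes a missing tenant filter show up as a failing assertion instead of a silent leak.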

Streams

For HTTP stream behavior, assert on the stream chunks the agent run emits rather than on a real provider's wire protocol.

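// Assumed minimal chunk shape: only the fields the assertions below read.
type StreamChunkLike = { data?: { type?: string } }

const chunks: StreamChunkLike[] = []

await harness.stream(
  {
    payload: {
      ticketId: 'T-5',
      text: 'Stream this run',
    },
  },
  {
    // Collect every chunk the run writes.
    write: async chunk => {
      chunks.push(chunk as StreamChunkLike)
    },
  },
)

expect(chunks.some(chunk => chunk.data?.type === 'response.created')).toBe(true)
expect(chunks.some(chunk => chunk.data?.type === 'response.completed')).toBe(true)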

Also test stream writer failures. A failed writer should reject the stream run and call the failure path instead of losing the error.
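
A writer failure can be simulated with a writer that throws on the nth write. The helpers below are a sketch, not part of `@purista/core/testing`; `createRecordingWriter`, `createFailingWriter`, and the chunk shape are assumptions. The only thing taken from the example above is that a writer is an object with an async `write` function.

```typescript
// Assumed minimal shapes; a real chunk may carry more fields.
type StreamChunk = { data?: { type?: string } }
type StreamWriter = { write: (chunk: StreamChunk) => Promise<void> }
type RecordingWriter = StreamWriter & { chunks: StreamChunk[] }

// Accepts every chunk and keeps it for later assertions.
function createRecordingWriter(): RecordingWriter {
  const chunks: StreamChunk[] = []
  return {
    chunks,
    write: async chunk => {
      chunks.push(chunk)
    },
  }
}

// Accepts chunks until write number `failOnWrite` (1-based), then rejects.
function createFailingWriter(
  failOnWrite: number,
  error: Error = new Error('client disconnected'),
): RecordingWriter {
  const chunks: StreamChunk[] = []
  let writes = 0
  return {
    chunks,
    write: async chunk => {
      writes += 1
      if (writes >= failOnWrite) {
        throw error
      }
      chunks.push(chunk)
    },
  }
}
```

Passing `createFailingWriter(2)` as the second argument to `harness.stream(...)` and expecting the call to reject with `/client disconnected/` would then cover the failure path, assuming the harness propagates writer errors as described.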

Command tools and child agents

When an agent declares canInvoke(...) or canInvokeAgent(...), test both success and failure behavior.

await expect(
  harness.run({
    payload: {
      ticketId: 'T-6',
      text: 'Needs enrichment',
    },
    appContext: {
      service: fakeServiceWithCommandFailure,
    },
  }),
).rejects.toThrow(/customer lookup failed/i)

Useful assertions:

the expected command or child agent was called once
payload and parameter values are schema-shaped
command failure propagates or maps to the intended agent output
child-agent invalid output fails validation
cancellation stops downstream calls
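
Several of these assertions need a fake that records how it was called. A minimal sketch follows; `createFakeCommand` and its `invoke(payload, parameter, signal)` signature are hypothetical and only illustrate call recording, failure propagation, and a cancellation check. The `signal` parameter is typed structurally so a standard `AbortSignal` can be passed.

```typescript
type CommandCall = { payload: unknown; parameter: unknown }
// Structural subset of AbortSignal, enough for the cancellation check.
type CancellationSignal = { readonly aborted: boolean }

function createFakeCommand<T>(behavior: (call: CommandCall) => T) {
  const calls: CommandCall[] = []
  return {
    // Recorded calls, in order, for "called once" and payload-shape assertions.
    calls,
    invoke: async (payload: unknown, parameter: unknown, signal?: CancellationSignal): Promise<T> => {
      // A cancelled run must not reach the downstream command at all.
      if (signal?.aborted) {
        throw new Error('cancelled')
      }
      calls.push({ payload, parameter })
      // The behavior may return a result or throw to simulate a command failure.
      return behavior({ payload, parameter })
    },
  }
}
```

A fake built with a throwing behavior covers the unhappy path, and `calls.length` covers both the "called once" and the "not called after cancellation" assertions.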

Integration tests

Keep a small number of service-level integration tests around the generated PURISTA artifacts:

service startup fails without queueBridge
service startup fails without ai.models
aggregate command returns validated output
stream endpoint emits lifecycle chunks and closes with final output
long-running response mode returns jobId, runId, statusUrl, or streamUrl
queue worker retries and dead-letter behavior follow the configured queue bridge

Live-provider smoke tests

Live-provider tests are optional and should be isolated from normal CI. Use them only to verify credentials, endpoint configuration, provider options, and model availability.

Normal CI should run against fake providers.
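
One way to keep live tests out of normal CI is an explicit opt-in gate. The sketch below assumes two environment variables, `RUN_LIVE_PROVIDER_TESTS` and `LIVE_PROVIDER_API_KEY`; both names and the `shouldRunLiveProviderTests` helper are hypothetical and should be adapted to the project.

```typescript
// Live tests run only when explicitly requested and when credentials exist.
function shouldRunLiveProviderTests(env: Record<string, string | undefined>): boolean {
  const optedIn = env['RUN_LIVE_PROVIDER_TESTS'] === 'true'
  const hasCredentials = (env['LIVE_PROVIDER_API_KEY'] ?? '').length > 0
  return optedIn && hasCredentials
}
```

If the test runner is Vitest, the result can feed `describe.skipIf(!shouldRunLiveProviderTests(process.env))`, so the live suite is reported as skipped instead of failing when it is not enabled.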

Checklist

no unit test calls a real provider
fake provider tests cover success and invalid output
missing alias and capability mismatch are covered
command tool and child-agent unhappy paths are covered
stream success and writer failure paths are covered
long-running queue behavior has integration coverage
