Performance

Measurement, bottlenecks, optimization strategies

Performance in PURISTA comes primarily from horizontal scaling rather than from faster code. Because services are stateless and communicate through messages, the first lever is adding instances; optimizing algorithms comes second, once measurements show it is needed.

The scaling model

flowchart LR
    LB["Load Balancer<br/>or Broker"] --> I1["Instance 1"]
    LB --> I2["Instance 2"]
    LB --> I3["Instance 3"]
    I1 --> DB[(Database)]
    I2 --> DB
    I3 --> DB
  • The broker distributes messages across service instances
  • No session affinity required
  • Instances are interchangeable — start more, stop some, no data loss
  • Scale per service — User Service needs 3 instances, Email Service needs 1
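To make the per-service sizing above concrete, here is a rough estimate based on Little's law (messages in flight = arrival rate × processing time). It is only a sketch: `estimateInstances`, `LoadProfile`, and all the numbers are hypothetical and not part of PURISTA.

```typescript
// Hypothetical helper, not part of PURISTA: rough instance-count estimate.
interface LoadProfile {
  messagesPerSecond: number      // expected arrival rate
  avgProcessingSeconds: number   // mean handling time per message
  concurrencyPerInstance: number // messages one instance handles in parallel
  targetUtilization: number      // e.g. 0.7 keeps 30% headroom
}

function estimateInstances(load: LoadProfile): number {
  // Little's law: messages in flight = arrival rate × time in system
  const inFlight = load.messagesPerSecond * load.avgProcessingSeconds
  const usablePerInstance = load.concurrencyPerInstance * load.targetUtilization
  // Always keep at least one instance running
  return Math.max(1, Math.ceil(inFlight / usablePerInstance))
}

// Assumed example loads for two services
const userService = estimateInstances({
  messagesPerSecond: 200,
  avgProcessingSeconds: 0.05,
  concurrencyPerInstance: 5,
  targetUtilization: 0.7,
})
const emailService = estimateInstances({
  messagesPerSecond: 2,
  avgProcessingSeconds: 0.2,
  concurrencyPerInstance: 5,
  targetUtilization: 0.7,
})
console.log({ userService, emailService })
```

Treat the result as a starting point for load tests, not as a guarantee. Real processing times vary, and the database or broker may saturate first.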

Measuring performance

Latency

Measure end-to-end latency with OpenTelemetry traces:

// Every message is automatically traced
// Check your Jaeger/Tempo/Zipkin dashboard for:
// - event_bridge.route duration
// - command execution duration
// - subscription processing duration
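Trace dashboards usually report latency as percentiles rather than averages, because a few slow requests can hide behind a healthy mean. The sketch below shows a nearest-rank percentile calculation over span durations. The `percentile` helper and the sample values are hypothetical, assuming the durations were exported from your tracing backend.

```typescript
// Nearest-rank percentile over a list of durations in milliseconds.
function percentile(durationsMs: readonly number[], p: number): number {
  if (durationsMs.length === 0) throw new Error('no samples')
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]')
  const sorted = [...durationsMs].sort((a, b) => a - b)
  // Rank is 1-based: the smallest value covering p percent of the samples
  const rank = Math.ceil((p * sorted.length) / 100)
  return sorted[rank - 1]
}

// Hypothetical sample: command execution durations taken from traces
const samples = [12, 15, 11, 14, 250, 13, 16, 12, 18, 900]
const p50 = percentile(samples, 50)
const p90 = percentile(samples, 90)
const p99 = percentile(samples, 99)
console.log({ p50, p90, p99 })
```

In this sample the median stays low while the upper percentiles expose the two slow outliers, which is why SLAs are normally stated against p95 or p99.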

Throughput

Monitor message rates:

// Messages per second per command/subscription
// Queue backlog depth
// Subscription consumer lag
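The three metrics above combine into one useful question: is the backlog shrinking, and how long until it is gone? A minimal sketch follows, assuming you can read the backlog depth and both rates from your broker's metrics. `QueueSample` and `secondsToDrain` are hypothetical names.

```typescript
// Hypothetical helper: time until a queue backlog drains at current rates.
interface QueueSample {
  backlog: number           // messages currently waiting
  arrivalPerSecond: number  // rate at which new messages are produced
  consumedPerSecond: number // rate at which consumers finish messages
}

function secondsToDrain(sample: QueueSample): number {
  if (sample.backlog === 0) return 0
  const netDrain = sample.consumedPerSecond - sample.arrivalPerSecond
  // If consumers are not faster than producers, the backlog never drains
  if (netDrain <= 0) return Infinity
  return sample.backlog / netDrain
}

const keepingUp = secondsToDrain({ backlog: 600, arrivalPerSecond: 80, consumedPerSecond: 100 })
const fallingBehind = secondsToDrain({ backlog: 600, arrivalPerSecond: 120, consumedPerSecond: 100 })
console.log({ keepingUp, fallingBehind })
```

An infinite drain time means consumer lag will keep growing until consumers are added or arrivals slow down.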

Resource usage

  • CPU per service instance
  • Memory per service instance
  • Database connection pool utilization
  • Broker queue depth

Common bottlenecks

| Bottleneck | Symptom | Solution |
| --- | --- | --- |
| Slow database queries | High command latency | Add indexes, optimize queries, use connection pooling |
| Single hot command | One instance overloaded | Scale that service independently |
| Large payloads | High serialization cost | Split into smaller messages, use references |
| Synchronous external calls | Command blocks for seconds | Use queues for async work |
| Missing indexes | Database scans | Add indexes for query patterns |
| In-memory caching | State lost on restart | Use Redis state store |

Optimization strategies

1. Scale horizontally

Add instances for the service that needs more capacity:

# Scale User Service to 5 replicas
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 5

2. Use queues for long work

Don’t block commands with slow operations. Declare .canEnqueue(queueId, payloadSchema) on the builder to get the typed context.queue.enqueue.queueId(payload) helper:

// ❌ Bad: command blocks for minutes
.setCommandFunction(async function (context, payload) {
  await processLargeFile(payload.fileId) // blocks for 5 minutes
})

// ✅ Good: declare enqueue access, then enqueue and return immediately
.canEnqueue('processFile', z.object({ fileId: z.string() }))
.setCommandFunction(async function (context, payload) {
  const job = await context.queue.enqueue.processFile({ fileId: payload.fileId })
  return { jobId: job.id, status: 'queued' }
})

3. Batch operations

Process multiple items in one command:

.addPayloadSchema(z.object({
  items: z.array(z.object({ id: z.string() })).max(100),
}))
.setCommandFunction(async function (context, payload) {
  const results = await Promise.all(
    payload.items.map(item => processItem(item))
  )
  return { processed: results.length }
})
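Because the schema above caps a batch at 100 items, a caller with more items has to split them before invoking the command. The sketch below shows one way to do that on the caller side. The `chunk` helper and the sample ids are hypothetical and not part of PURISTA.

```typescript
// Split `items` into consecutive batches of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error('size must be a positive integer')
  }
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}

// Hypothetical input: 250 items that must respect the max(100) schema limit
const ids = Array.from({ length: 250 }, (_, i) => ({ id: `item-${i}` }))
const batches = chunk(ids, 100)
const batchSizes = batches.map(batch => batch.length)
console.log(batchSizes)
```

Each batch can then be sent as one command invocation. Note that `Promise.all` inside the command starts every item in a batch at once, so the batch size also bounds the concurrency against downstream resources.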

4. Cache with state stores

.setCommandFunction(async function (context, payload) {
  const cacheKey = `user:${payload.userId}`
  const cached = await context.states.getState(cacheKey)

  if (cached[cacheKey]) {
    return cached[cacheKey]
  }

  const user = await context.resources.db.getUser(payload.userId)
  await context.states.setState(cacheKey, user)
  return user
})
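The example above never expires its entries, so a changed user could be served stale indefinitely. One way to add a TTL without relying on store-specific expiry is to save an expiry timestamp next to the value. This is a sketch only: `CacheEntry`, `wrapWithTtl`, and `readIfFresh` are hypothetical helpers, and whether your state store offers native expiry is not assumed here.

```typescript
// Hypothetical envelope stored as the cached value; not a PURISTA type.
interface CacheEntry<T> {
  value: T
  expiresAt: number // epoch milliseconds
}

// Wrap a value with an expiry timestamp before writing it to the store.
function wrapWithTtl<T>(value: T, ttlMs: number, now: number = Date.now()): CacheEntry<T> {
  return { value, expiresAt: now + ttlMs }
}

// Return the cached value if present and not expired, otherwise undefined.
function readIfFresh<T>(entry: CacheEntry<T> | undefined, now: number = Date.now()): T | undefined {
  if (!entry || now >= entry.expiresAt) return undefined
  return entry.value
}

// Fixed clock values keep the example deterministic
const t0 = 1_000_000
const entry = wrapWithTtl({ id: 'u1', name: 'Ada' }, 60_000, t0)
const hit = readIfFresh(entry, t0 + 59_999)
const miss = readIfFresh(entry, t0 + 60_000)
console.log({ hit, miss })
```

In the command, you would store `wrapWithTtl(user, ttlMs)` through `context.states.setState` and pass `cached[cacheKey]` to `readIfFresh`, falling back to the database when it returns `undefined`.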

5. Tune queue bridge settings

Queue bridges have their own configuration for batch sizes and recovery behavior. Tuning these affects how quickly jobs are claimed and retried after a worker crash.

RedisQueueBridge exposes scheduleBatchSize (how many scheduled jobs are promoted to pending per poll cycle) and recoveryBatchSize (how many expired leases are reclaimed per cycle):

import { RedisQueueBridge } from '@purista/redis-queue-bridge'

const queueBridge = new RedisQueueBridge({
  config: { url: process.env.REDIS_URL },
  keyPrefix: 'myapp:queue:',
  scheduleBatchSize: 50,   // jobs promoted from scheduled→pending per poll
  recoveryBatchSize: 20,   // expired leases reclaimed per poll cycle
})

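NatsQueueBridge uses a NATS JetStream KV store. To increase throughput, prefer running more worker instances over tuning the bridge, since NATS handles distribution automatically. The bridge still exposes releaseBatchSize to control how many expired leases are released per cycle: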

import { NatsQueueBridge } from '@purista/nats-queue-bridge'

const queueBridge = new NatsQueueBridge({
  connectionOptions: { servers: process.env.NATS_URL },
  subjectPrefix: 'myapp',
  releaseBatchSize: 20,  // expired leases released back to pending per cycle
})

6. Connection pooling

Database and external API connection pools are not managed by PURISTA — configure them in your resources. A common pattern is to share a pool across all commands in a service:

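import { Pool } from 'pg' // assumption: node-postgres; substitute your own driver's pool

// One pool per service instance, shared by all commands
const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 20 })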

const myService = await myV1Service.getInstance(eventBridge, {
  resources: { db: pool },
})

Keep max pool size proportional to the number of concurrent jobs per instance — a worker instance handling 10 parallel jobs typically needs 10–20 database connections.
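Pool size also multiplies with the number of instances, so it is worth checking the total against the database's connection limit before scaling out. The arithmetic can be sketched as follows. `PoolPlan` and `connectionBudget` are hypothetical names, and the limit of 100 is an assumed value (it matches PostgreSQL's default `max_connections`).

```typescript
// Hypothetical check: do all instances together fit the database's connection limit?
interface PoolPlan {
  instances: number
  poolMaxPerInstance: number
  databaseMaxConnections: number
  reservedConnections: number // kept free for migrations, admin tools, etc.
}

function connectionBudget(plan: PoolPlan): { required: number; available: number; fits: boolean } {
  // Worst case: every instance opens its full pool
  const required = plan.instances * plan.poolMaxPerInstance
  const available = plan.databaseMaxConnections - plan.reservedConnections
  return { required, available, fits: required <= available }
}

// 5 replicas with max: 20 each, against an assumed limit of 100
const budget = connectionBudget({
  instances: 5,
  poolMaxPerInstance: 20,
  databaseMaxConnections: 100,
  reservedConnections: 10,
})
console.log(budget)
```

When the plan does not fit, the usual options are a smaller per-instance pool, a connection pooler in front of the database, or a higher database limit.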

When to optimize

  • Latency exceeds SLA
  • Throughput cannot keep up with demand
  • Resource costs are too high
  • User experience degrades

When NOT to optimize

  • Before you have measured (premature optimization)
  • When the change is a micro-optimization that hurts readability
  • When the bottleneck sits in a different layer (code vs. infrastructure)

Common pitfalls

  • Optimizing before measuring. Profile first. Optimize the bottleneck.
  • Ignoring the broker. A slow broker affects all services.
  • Over-caching. Stale cache causes bugs. Use TTL.
  • Blocking the event loop. Use queues for CPU-intensive work.

Checklist

  • Latency is measured end-to-end with traces
  • Throughput is monitored per command/subscription
  • Bottlenecks are identified before optimizing
  • Long work uses queues, not blocking commands
  • Caching uses state stores with TTL
  • Scaling is horizontal (more instances) before vertical (bigger instances)
  • Load tests verify performance under realistic conditions
