API Reference

All chaos options and model wrappers

For the base cruel(...) function API, see Core API. This page focuses on cruel/ai-sdk.

cruelModel

Wraps a language model with chaos injection.

import { cruelModel } from "cruel/ai-sdk"

const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.2,
  delay: [100, 500],
})
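The probability and range options can be read as follows — a minimal sketch of how `rateLimit: 0.2` and `delay: [100, 500]` are interpreted, not the library's actual implementation:

```typescript
// Illustration only: how a probability option and a [min, max] range
// are typically interpreted by a chaos wrapper.

// Fires with probability p, given a uniform roll in [0, 1).
function shouldFire(p: number, roll: number = Math.random()): boolean {
  return roll < p
}

// Samples a delay in ms from either a fixed number or a [min, max] tuple.
function sampleDelay(delay: number | [number, number], roll: number = Math.random()): number {
  if (typeof delay === "number") return delay
  const [min, max] = delay
  return min + Math.floor(roll * (max - min))
}
```

So `rateLimit: 0.2` means roughly one in five calls fails with a 429 before the request is made.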

cruelEmbeddingModel

Wraps an embedding model.

import { cruelEmbeddingModel } from "cruel/ai-sdk"

const model = cruelEmbeddingModel(openai.embedding("text-embedding-3-small"), {
  rateLimit: 0.2,
})

cruelImageModel

Wraps an image model.

import { cruelImageModel } from "cruel/ai-sdk"

const model = cruelImageModel(openai.image("dall-e-3"), {
  rateLimit: 0.2,
})

cruelSpeechModel

Wraps a speech model.

import { cruelSpeechModel } from "cruel/ai-sdk"

const model = cruelSpeechModel(openai.speech("tts-1"), {
  rateLimit: 0.1,
})

cruelTranscriptionModel

Wraps a transcription model.

import { cruelTranscriptionModel } from "cruel/ai-sdk"

const model = cruelTranscriptionModel(openai.transcription("whisper-1"), {
  rateLimit: 0.1,
})

cruelVideoModel

Wraps a video model.

import { cruelVideoModel } from "cruel/ai-sdk"

const model = cruelVideoModel(google.video("veo-2.0-generate-001"), {
  rateLimit: 0.2,
})

cruelProvider

Wraps an entire provider and automatically dispatches to the correct wrapper based on model type.

import { cruelProvider } from "cruel/ai-sdk"

const chaos = cruelProvider(openai, {
  rateLimit: 0.1,
  models: {
    "gpt-4o": { rateLimit: 0.5 },
  },
})

chaos("gpt-4o")                    // cruelModel
chaos.embeddingModel("text-embedding") // cruelEmbeddingModel
chaos.imageModel("dall-e-3")           // cruelImageModel
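The merge behavior implied by the example above — provider-level defaults with per-model overrides winning on conflict — can be sketched like this (illustrative only; `resolveOptions` is not an exported function):

```typescript
type ChaosOptions = {
  rateLimit?: number
  overloaded?: number
  delay?: number | [number, number]
}

// Per-model entries override the provider-level defaults; models without
// an entry fall back to the defaults unchanged.
function resolveOptions(
  base: ChaosOptions,
  models: Record<string, ChaosOptions>,
  modelId: string
): ChaosOptions {
  return { ...base, ...(models[modelId] ?? {}) }
}
```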

cruelMiddleware

Creates AI SDK middleware for chaos injection.

import { cruelMiddleware } from "cruel/ai-sdk"
import { wrapLanguageModel } from "ai"

const model = wrapLanguageModel({
  model: openai("gpt-4o"),
  middleware: cruelMiddleware({ rateLimit: 0.1 }),
})

cruelTool / cruelTools

Wraps tool execution with chaos.

import { cruelTool, cruelTools } from "cruel/ai-sdk"

const tool = cruelTool(myTool, { toolFailure: 0.2 })
const tools = cruelTools({ search, calc }, { toolFailure: 0.1 })
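Conceptually, `toolFailure: 0.2` wraps each tool's execute function so it throws before running about 20% of the time. A minimal sketch of that idea (assumed shape, not the library's implementation):

```typescript
// Wraps an async execute function so it throws with probability toolFailure.
// The injectable roll makes the behavior testable.
function withToolFailure<A, R>(
  execute: (args: A) => Promise<R>,
  toolFailure: number,
  roll: () => number = Math.random
): (args: A) => Promise<R> {
  return async (args: A) => {
    if (roll() < toolFailure) {
      throw new Error("Tool execution failed (injected chaos)")
    }
    return execute(args)
  }
}
```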

Model ID Override

If MODEL is set in the environment, Cruel swaps the model ID used by wrappers:

MODEL=gpt-6 bun run your-script.ts
  • gpt-4o -> gpt-6
  • openai/gpt-4o -> openai/gpt-6
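The rewrite rule shown above replaces the final segment of the model ID while preserving any provider prefix. A sketch of that rule (`overrideModelId` is illustrative, not an exported function):

```typescript
// Replaces the last path segment of a model ID with the MODEL override,
// keeping a "provider/" prefix intact if one is present.
function overrideModelId(id: string, envModel: string | undefined): string {
  if (!envModel) return id
  const slash = id.lastIndexOf("/")
  return slash === -1 ? envModel : id.slice(0, slash + 1) + envModel
}
```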

Chaos Options

Unless noted otherwise, options are probabilities between 0 and 1 (0 = never, 1 = always). Options such as delay, finishReason, and tokenUsage take values rather than probabilities.

Pre-call Failures

These fire before the API request is made.

| option | type | description |
| --- | --- | --- |
| rateLimit | number \| { rate, retryAfter? } | simulates 429 rate limit (retryable) |
| overloaded | number | simulates 529 model overloaded (retryable) |
| modelUnavailable | number | simulates 503 model not available (retryable) |
| fail | number | simulates 500 generation failed (retryable) |
| invalidApiKey | number | simulates 401 invalid key (not retryable) |
| quotaExceeded | number | simulates 402 quota exceeded (not retryable) |
| contextLength | number | simulates 400 context too long (not retryable) |
| contentFilter | number | simulates 400 content filtered (not retryable) |
| emptyResponse | number | simulates 200 with empty body (not retryable) |
| timeout | number | hangs forever (never resolves) |
| delay | number \| [min, max] | adds latency in ms before the call |
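The retryable/non-retryable split matters for the code under test: a client should retry 429/500/503/529 responses but fail fast on 400/401/402. A sketch of such a loop, with an assumed error shape (`{ status, retryable }` is illustrative, not the library's exact error type):

```typescript
type ApiError = { status: number; retryable: boolean }

// Retries retryable errors with exponential backoff (100, 200, 400ms...);
// non-retryable errors propagate immediately. The sleep function is
// injectable so tests can skip real waiting.
async function withRetries<T>(
  call: () => Promise<T>,
  maxRetries: number,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms))
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call()
    } catch (e) {
      const err = e as ApiError
      if (!err.retryable || attempt >= maxRetries) throw e
      await sleep(2 ** attempt * 100)
    }
  }
}
```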

Post-call Mutations

These modify the response after a successful API call.

| option | type | description |
| --- | --- | --- |
| partialResponse | number | truncates the response text randomly |
| finishReason | string | overrides the finish reason |
| tokenUsage | { inputTokens?, outputTokens? } | overrides token counts |
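The effect of partialResponse can be pictured as a probabilistic truncation of the response text — a sketch under assumed semantics, not the library's implementation:

```typescript
// With probability p, cuts the text at a random point; otherwise returns
// it unchanged. The injectable roll makes the behavior deterministic in tests.
function maybeTruncate(text: string, p: number, roll: () => number = Math.random): string {
  if (roll() >= p) return text
  const cut = Math.max(1, Math.floor(roll() * text.length))
  return text.slice(0, cut)
}
```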

Stream Transforms

These modify the token stream in real-time.

| option | type | description |
| --- | --- | --- |
| slowTokens | number \| [min, max] | adds delay between each token in ms |
| streamCut | number | kills the stream mid-transfer |
| corruptChunks | number | replaces random characters with the replacement character (U+FFFD) |
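Chunk corruption can be pictured as swapping one random character per affected chunk for U+FFFD — an illustrative sketch, not the library's implementation:

```typescript
// With probability p per chunk, replaces one random character with
// "\uFFFD" (the Unicode replacement character).
function maybeCorrupt(chunk: string, p: number, roll: () => number = Math.random): string {
  if (chunk.length === 0 || roll() >= p) return chunk
  const i = Math.floor(roll() * chunk.length)
  return chunk.slice(0, i) + "\uFFFD" + chunk.slice(i + 1)
}
```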

Tool Options

| option | type | description |
| --- | --- | --- |
| toolFailure | number | tool execution throws an error |
| toolTimeout | number | tool execution hangs forever |

Presets

import { presets } from "cruel/ai-sdk"

| preset | rateLimit | overloaded | streamCut | delay |
| --- | --- | --- | --- | --- |
| realistic | 0.02 | 0.01 | - | 50-200ms |
| unstable | 0.1 | 0.05 | 0.05 | 100-500ms |
| harsh | 0.2 | 0.1 | 0.1 | 200-1000ms |
| nightmare | 0.3 | 0.15 | 0.15 | 500-2000ms |
| apocalypse | 0.4 | 0.2 | 0.2 | 1000-5000ms |
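Assuming the exported presets are plain option objects, a preset can serve as a base with individual fields overridden by a spread. The `unstable` values below are copied from the table above for illustration:

```typescript
// Values mirror the `unstable` row of the presets table.
const unstable = {
  rateLimit: 0.1,
  overloaded: 0.05,
  streamCut: 0.05,
  delay: [100, 500] as [number, number],
}

// Override a single field while keeping the rest of the preset.
const custom = { ...unstable, rateLimit: 0.3 }
```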

onChaos Callback

Every chaos event fires a callback with the event type and model ID.

const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.2,
  onChaos: (event) => {
    console.log(event.type, event.modelId)
  },
})

Event types: rateLimit, overloaded, contextLength, contentFilter, modelUnavailable, invalidApiKey, quotaExceeded, emptyResponse, fail, timeout, delay, streamCut, slowTokens, corruptChunk, partialResponse, toolFailure, toolTimeout.
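One common use of onChaos is tallying events by type. A small self-contained counter whose callback matches the `{ type, modelId }` event shape shown above:

```typescript
type ChaosEvent = { type: string; modelId: string }

// Returns a shared counts map and an onChaos-compatible callback that
// increments the count for each event type it sees.
function makeCounter() {
  const counts = new Map<string, number>()
  const onChaos = (event: ChaosEvent) => {
    counts.set(event.type, (counts.get(event.type) ?? 0) + 1)
  }
  return { counts, onChaos }
}
```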

Diagnostics

Programmatic chaos reporting for test suites. Track events, record results, compute stats, and print reports.

import { cruelModel, diagnostics } from "cruel/ai-sdk"

const ctx = diagnostics.context()

const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.3,
  delay: [100, 500],
  onChaos: diagnostics.tracker(ctx),
})

for (let i = 1; i <= 10; i++) {
  diagnostics.before(ctx, i)
  const start = performance.now()
  try {
    const result = await generateText({ model, prompt: "test", maxRetries: 2 })
    diagnostics.success(ctx, i, Math.round(performance.now() - start), result.text)
  } catch (e) {
    diagnostics.failure(ctx, i, Math.round(performance.now() - start), e)
  }
}

diagnostics.print(ctx)

Raw Stats for Assertions

const s = diagnostics.stats(ctx)

s.total          // number of requests
s.succeeded      // number of successes
s.failed         // number of failures
s.successRate    // 0-1
s.duration       // total ms
s.totalEvents    // number of chaos events
s.events         // [{ type, count, percent }]
s.errors         // failed requests with status, retryable, event chain
s.requests       // all requests with events
s.latency.success // { avg, p50, p99, min, max }
s.latency.failure // { avg, p50, p99, min, max }
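For reference, p50/p99 percentiles over raw durations can be computed with the nearest-rank method — a sketch of the general technique, not necessarily how diagnostics.stats computes them internally:

```typescript
// Nearest-rank percentile: sorts a copy of the durations and picks the
// ceil(p% * n)-th smallest value.
function percentile(durations: number[], p: number): number {
  const sorted = [...durations].sort((a, b) => a - b)
  const rank = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1)
  return sorted[rank]
}
```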

Use in Tests

test("survives 30% rate limits", async () => {
  const ctx = diagnostics.context()
  const model = cruelModel(openai("gpt-4o"), {
    rateLimit: 0.3,
    onChaos: diagnostics.tracker(ctx),
  })

  for (let i = 1; i <= 20; i++) {
    diagnostics.before(ctx, i)
    const start = performance.now()
    try {
      await generateText({ model, prompt: "test", maxRetries: 2 })
      diagnostics.success(ctx, i, Math.round(performance.now() - start), "ok")
    } catch (e) {
      diagnostics.failure(ctx, i, Math.round(performance.now() - start), e)
    }
  }

  const s = diagnostics.stats(ctx)
  expect(s.successRate).toBeGreaterThan(0.5)
  expect(s.latency.success.p99).toBeLessThan(5000)
})