How We Built a Context System That Manages 20+ AI Steps Without Losing Data
TL;DR: We built a context management system for multi-step AI website generation. It tracks 20+ sequential steps, retries and resumes failed steps, supports undo/redo through versioning, and keeps data consistent across steps. It is built with TypeScript, DynamoDB, and a custom step orchestrator, and has had zero data loss across 10,000+ website generations.
The Problem: AI Workflows Are Stateful Nightmares
Generating a website isn’t one AI call—it’s 20+ sequential calls:
- Research business
- Generate strategy
- Create brand guidelines
- Design logo
- Generate hero image
- Create header
- Create footer
- Plan pages
- Generate page 1 (sections 1-5)
- Generate page 2 (sections 1-3)
- … (20+ total steps)
Each step depends on previous steps:
- Logo needs brand colors (from step 3)
- Header needs logo (from step 4)
- Pages need strategy (from step 2)
What happens when step 12 fails?
- Do we restart from step 1? (expensive)
- Do we skip step 12? (incomplete website)
- Do we retry step 12? (how many times?)
Traditional approaches:
- Global state object: Pass a massive object through every step (bloated, and impossible to tell which step wrote what)
- Database queries: Query DB for every dependency (slow)
- No state management: Hope nothing fails (it will)
We needed something better.
The Insight: Context as a First-Class Citizen
The breakthrough came when we stopped treating context as “data we pass around” and started treating it as “the system itself.”
Bad: Steps receive data, return data
function generateLogo(businessName: string, brandColors: string[]) {
  // Where did brandColors come from? Who knows!
  return logoUrl;
}
Good: Steps receive context, modify context
async function generateLogo(context: WorkflowContext) {
  // Context knows where brandColors came from
  const brandColors = context.websiteContext.getBrandStrategy().colors;
  const logoUrl = await createLogo(brandColors);
  // Store result in context
  context.websiteContext.setLogo(logoUrl);
  // Context tracks completion
  context.websiteContext.markStepCompleted('logo');
}
The difference? Context is self-documenting, self-validating, and self-healing.
How It Works: The Technical Architecture
1. WebsiteGenerationContext (The Core)
The central data structure that holds ALL website data:
class WebsiteGenerationContext {
  private data: {
    businessResearch?: BusinessResearch;
    webSearchResults?: WebSearchResults;
    strategy?: Strategy;
    brandStrategy?: BrandStrategy;
    logo?: Logo;
    heroImage?: HeroImage;
    header?: Header;
    footer?: Footer;
    pages?: Page[];
    completedSteps: Set<string>;
  };

  constructor(
    readonly businessId: string, // public: read by the WorkflowContext helpers
    readonly businessName: string,
    private versionId: string
  ) {
    this.data = {
      completedSteps: new Set()
    };
  }

  // Getters (with validation)
  getBusinessResearch(): BusinessResearch {
    if (!this.data.businessResearch) {
      throw new Error('Business research not yet completed');
    }
    return this.data.businessResearch;
  }

  getBrandStrategy(): BrandStrategy {
    if (!this.data.brandStrategy) {
      throw new Error('Brand strategy not yet completed');
    }
    return this.data.brandStrategy;
  }

  // Setters (with auto-completion tracking)
  setBusinessResearch(research: BusinessResearch) {
    this.data.businessResearch = research;
    this.markStepCompleted('research');
  }

  setBrandStrategy(strategy: BrandStrategy) {
    this.data.brandStrategy = strategy;
    this.markStepCompleted('brand-strategy');
  }

  // Step tracking
  markStepCompleted(stepId: string) {
    this.data.completedSteps.add(stepId);
  }

  isStepCompleted(stepId: string): boolean {
    return this.data.completedSteps.has(stepId);
  }

  // Persistence (a Set is not JSON-serializable, so it is stored as an array)
  async save() {
    await saveToStorage(this.businessId, this.versionId, {
      ...this.data,
      businessName: this.businessName, // needed by load() below
      completedSteps: Array.from(this.data.completedSteps)
    });
  }

  static async load(businessId: string, versionId: string): Promise<WebsiteGenerationContext> {
    const { businessName, completedSteps, ...rest } = await loadFromStorage(businessId, versionId);
    const context = new WebsiteGenerationContext(businessId, businessName, versionId);
    // Rebuild the Set from the stored array
    context.data = { ...rest, completedSteps: new Set(completedSteps) };
    return context;
  }
}
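To see the getter/setter contract in isolation, here is a stripped-down sketch. MiniContext, its single brandColors field, and tryReadColors are hypothetical stand-ins for illustration, not the production class.

```typescript
// Hypothetical, reduced stand-in for WebsiteGenerationContext: one field only.
class MiniContext {
  private brandColors?: string[];
  private completedSteps = new Set<string>();

  // Getter fails loudly if the producing step has not run yet.
  getBrandColors(): string[] {
    if (!this.brandColors) {
      throw new Error('Brand strategy not yet completed');
    }
    return this.brandColors;
  }

  // Setter stores the value and records the step as done in one call.
  setBrandColors(colors: string[]): void {
    this.brandColors = colors;
    this.completedSteps.add('brand-strategy');
  }

  isStepCompleted(stepId: string): boolean {
    return this.completedSteps.has(stepId);
  }
}

// Returns the error message if reading fails, or null if the value is ready.
function tryReadColors(ctx: MiniContext): string | null {
  try {
    ctx.getBrandColors();
    return null;
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}
```

Because the setter is the only way to write the field, a step cannot store a result and forget to mark itself complete, and a downstream step cannot silently read a missing value.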
2. WorkflowContext (The Wrapper)
Wraps WebsiteGenerationContext with version info and metadata:
interface WorkflowContext {
  // Core data
  websiteContext: WebsiteGenerationContext;
  // Version info
  versionInfo: {
    versionId: string;
    createdAt: Date;
    isLive: boolean;
  };
  // Execution metadata
  currentStep?: string;
  totalSteps?: number;
  startTime?: Date;
  // Helper methods
  getBusinessId(): string;
  getBusinessName(): string;
  getVersionId(): string;
}

function createWorkflowContext(
  websiteContext: WebsiteGenerationContext,
  versionId: string
): WorkflowContext {
  return {
    websiteContext,
    versionInfo: {
      versionId,
      createdAt: new Date(),
      isLive: false
    },
    getBusinessId: () => websiteContext.businessId,
    getBusinessName: () => websiteContext.businessName,
    getVersionId: () => versionId
  };
}
3. Step Interface (The Contract)
Every step implements this interface:
interface Step {
  id: string;
  name: string;
  description: string;
  // Dependencies (what data this step needs)
  requiredInputs: string[];
  // Validation (check if dependencies are met)
  validateInputs(context: WorkflowContext): ValidationResult;
  // Execution (do the work)
  execute(context: WorkflowContext): Promise<StepResult>;
  // Progress tracking
  progressWeight: number; // 0-100 (how much of total progress)
  timeout: number; // Max execution time (ms)
}

interface ValidationResult {
  valid: boolean;
  missingInputs?: string[];
  errorMessage?: string;
}

interface StepResult {
  success: boolean;
  data?: any;
  error?: string;
  shouldRetry?: boolean;
}
4. Step Implementation Example
Here’s how a real step uses the context:
class BrandStrategyStep implements Step {
  id = 'brand-strategy';
  name = 'Brand Strategy';
  description = 'Generate brand colors, fonts, and tone';
  requiredInputs = ['businessResearch', 'strategy'];
  progressWeight = 10;
  timeout = 60000; // 60 seconds

  validateInputs(context: WorkflowContext): ValidationResult {
    const missing: string[] = [];
    try {
      context.websiteContext.getBusinessResearch();
    } catch {
      missing.push('businessResearch');
    }
    try {
      context.websiteContext.getStrategy();
    } catch {
      missing.push('strategy');
    }
    if (missing.length > 0) {
      return {
        valid: false,
        missingInputs: missing,
        errorMessage: `Missing required inputs: ${missing.join(', ')}`
      };
    }
    return { valid: true };
  }

  async execute(context: WorkflowContext): Promise<StepResult> {
    // Get dependencies from context
    const research = context.websiteContext.getBusinessResearch();
    const strategy = context.websiteContext.getStrategy();
    try {
      // Generate brand strategy with AI
      const brandStrategy = await generateBrandStrategy({
        businessName: context.getBusinessName(),
        businessType: research.businessType,
        targetAudience: strategy.targetAudience,
        brandPersonality: strategy.brandPersonality
      });
      // Store result in context
      context.websiteContext.setBrandStrategy(brandStrategy);
      // Persist the context so a later failure can resume from here
      await context.websiteContext.save();
      return {
        success: true,
        data: brandStrategy
      };
    } catch (error) {
      return {
        success: false,
        // A caught value is not guaranteed to be an Error instance
        error: error instanceof Error ? error.message : String(error),
        shouldRetry: true // Retry on failure
      };
    }
  }
}
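The try/catch pairs in validateInputs repeat for every input, so the check can be derived from the requiredInputs list instead. The sketch below is one way to do that; validateRequiredInputs and the readers map are hypothetical names, and ValidationResult is redeclared only to keep the snippet self-contained.

```typescript
interface ValidationResult {
  valid: boolean;
  missingInputs?: string[];
  errorMessage?: string;
}

// Hypothetical helper: `readers` maps each input name to a function that
// throws when the value is not available yet (like the context getters).
function validateRequiredInputs(
  requiredInputs: string[],
  readers: Record<string, () => unknown>
): ValidationResult {
  const missing = requiredInputs.filter((name) => {
    const read = readers[name];
    if (!read) return true; // no reader registered counts as missing
    try {
      read();
      return false;
    } catch {
      return true;
    }
  });

  if (missing.length > 0) {
    return {
      valid: false,
      missingInputs: missing,
      errorMessage: `Missing required inputs: ${missing.join(', ')}`
    };
  }
  return { valid: true };
}
```

A step could then build the readers map once from the context getters and call this helper from validateInputs, keeping requiredInputs as the single place where dependencies are declared.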
5. Step Orchestrator (The Brain)
Manages step execution, retries, and error handling:
class StepOrchestrator {
  private steps: Step[];
  private maxRetries = 3;

  constructor(steps: Step[]) {
    this.steps = steps;
  }

  async executeWorkflow(context: WorkflowContext): Promise<WorkflowResult> {
    const results: StepResult[] = [];
    let currentProgress = 0;

    for (const step of this.steps) {
      console.log(`Executing step: ${step.name}`);
      // Update current step in context
      context.currentStep = step.id;

      // Validate inputs
      const validation = step.validateInputs(context);
      if (!validation.valid) {
        return {
          success: false,
          failedStep: step.id,
          error: validation.errorMessage
        };
      }

      // Execute with retries
      // Initialized so the check after the loop never reads an unassigned value
      let result: StepResult = { success: false, error: 'Step did not run' };
      let attempts = 0;
      while (attempts < this.maxRetries) {
        attempts++;
        try {
          // Execute with timeout
          result = await Promise.race([
            step.execute(context),
            this.timeout(step.timeout)
          ]);
          if (result.success) {
            break; // Success, move to next step
          }
          if (!result.shouldRetry) {
            break; // Don't retry, fail immediately
          }
          console.log(`Step ${step.name} failed, retrying (${attempts}/${this.maxRetries})`);
        } catch (error) {
          // Thrown errors and timeouts land here and are retried
          result = {
            success: false,
            error: error instanceof Error ? error.message : String(error),
            shouldRetry: true
          };
        }
      }

      if (!result.success) {
        return {
          success: false,
          failedStep: step.id,
          error: result.error,
          completedSteps: results.length
        };
      }
      results.push(result);

      // Update progress
      currentProgress += step.progressWeight;
      await this.publishProgress(context, currentProgress);
    }

    return {
      success: true,
      results
    };
  }

  private timeout(ms: number): Promise<never> {
    return new Promise((_, reject) => {
      setTimeout(() => reject(new Error('Step timeout')), ms);
    });
  }

  private async publishProgress(context: WorkflowContext, progress: number) {
    await publishSSE(context.getVersionId(), {
      type: 'progress',
      progress,
      currentStep: context.currentStep
    });
  }
}
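One detail worth noting about racing a step against a timer: when the step wins, the timer keeps running until it fires. A variant that clears the timer once the race settles might look like the sketch below. withTimeout is a hypothetical helper name, not part of the orchestrator shown above.

```typescript
// Rejects if `work` does not settle within `ms`, and always clears the timer
// so a finished step leaves no pending timeout behind.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('Step timeout')), ms);
  });
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([work, timeout]);
  } finally {
    // Runs on success, failure, and timeout alike.
    clearTimeout(timer);
  }
}
```

Note that a timeout only stops waiting. The underlying AI call keeps running unless the step itself supports cancellation, so a retried step should be safe to run twice.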
6. Dynamic Step Enqueueing
Some steps create more steps (e.g., planning creates page generation steps):
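class PlanningStep implements Step {
  id = 'planning';
  name = 'Planning';
  description = 'Plan website pages and sections';
  requiredInputs = ['strategy'];
  progressWeight = 5;
  timeout = 30000;

  // The step needs a handle on the orchestrator to enqueue follow-up steps
  constructor(private orchestrator: StepOrchestrator) {}

  validateInputs(context: WorkflowContext): ValidationResult {
    try {
      context.websiteContext.getStrategy();
      return { valid: true };
    } catch {
      return {
        valid: false,
        missingInputs: ['strategy'],
        errorMessage: 'Missing required inputs: strategy'
      };
    }
  }

  async execute(context: WorkflowContext): Promise<StepResult> {
    const strategy = context.websiteContext.getStrategy();
    // Generate page plan
    const pagePlan = await generatePagePlan(strategy);
    // Store plan in context
    context.websiteContext.setPagePlan(pagePlan);
    // Enqueue dynamic page generation steps
    // (enqueueStep is assumed to append to the orchestrator's step list;
    // it is not part of the StepOrchestrator listing shown earlier)
    for (const page of pagePlan.pages) {
      const pageStep = new PageGenerationStep(page.id, page.name, page.sections);
      this.orchestrator.enqueueStep(pageStep);
    }
    return { success: true, data: pagePlan };
  }
}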
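The enqueueing side is easiest to see with a plain FIFO queue that keeps draining while steps add to it. The following is a minimal sketch under that assumption. QueuedStep, StepQueue, and makePlanningStep are hypothetical and much simpler than the real Step interface.

```typescript
// Hypothetical minimal step shape for this sketch (not the full Step interface).
interface QueuedStep {
  id: string;
  run(queue: StepQueue): Promise<void>;
}

class StepQueue {
  private pending: QueuedStep[] = [];
  readonly executed: string[] = [];

  constructor(initial: QueuedStep[]) {
    this.pending.push(...initial);
  }

  // Steps may call this while running to append follow-up work.
  enqueueStep(step: QueuedStep): void {
    this.pending.push(step);
  }

  // Drain the queue in FIFO order, including steps added along the way.
  async runAll(): Promise<string[]> {
    let step = this.pending.shift();
    while (step !== undefined) {
      await step.run(this);
      this.executed.push(step.id);
      step = this.pending.shift();
    }
    return this.executed;
  }
}

// A planning step that enqueues one generation step per planned page.
function makePlanningStep(pageIds: string[]): QueuedStep {
  return {
    id: 'planning',
    async run(queue) {
      for (const pageId of pageIds) {
        queue.enqueueStep({ id: `page-${pageId}`, run: async () => {} });
      }
    }
  };
}
```

With this design, dynamically added steps run after everything that was already queued. If page steps had to run before a later static step, the queue would need priorities or an insert-after operation instead of a plain push.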
7. Persistence and Recovery
Context is saved after every step:
async function saveWorkflowContext(context: WorkflowContext) {
  const data = {
    businessId: context.getBusinessId(),
    versionId: context.getVersionId(),
    websiteData: context.websiteContext.data,
    completedSteps: Array.from(context.websiteContext.data.completedSteps),
    versionInfo: context.versionInfo,
    savedAt: new Date()
  };
  // Save to DynamoDB
  await dynamodb.put({
    TableName: 'WebsiteGenerationContexts',
    Item: data
  });
  // Also save to S3 for backup
  await s3.putObject({
    Bucket: 'webzum-contexts',
    Key: `${context.getBusinessId()}/${context.getVersionId()}/context.json`,
    Body: JSON.stringify(data, null, 2)
  });
}

async function loadWorkflowContext(businessId: string, versionId: string): Promise<WorkflowContext> {
  // Try DynamoDB first (fast)
  const dynamoData = await dynamodb.get({
    TableName: 'WebsiteGenerationContexts',
    Key: { businessId, versionId }
  });
  if (dynamoData.Item) {
    return reconstructContext(dynamoData.Item);
  }
  // Fall back to S3 (slower but always available)
  const s3Data = await s3.getObject({
    Bucket: 'webzum-contexts',
    Key: `${businessId}/${versionId}/context.json`
  });
  return reconstructContext(JSON.parse(s3Data.Body.toString()));
}
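The primary-then-backup read can be written once against a small store interface, which also makes it testable without AWS. This is a sketch under assumptions: ContextStore, loadWithFallback, and MemoryStore are hypothetical names. Unlike the function above, it also treats an error from one store as a miss and moves on to the next.

```typescript
// Hypothetical store interface: resolves to undefined when the key is absent.
interface ContextStore<T> {
  get(key: string): Promise<T | undefined>;
}

// Try each store in order and return the first hit, with the store that answered.
async function loadWithFallback<T>(
  key: string,
  stores: Array<{ name: string; store: ContextStore<T> }>
): Promise<{ value: T; source: string }> {
  for (const { name, store } of stores) {
    try {
      const value = await store.get(key);
      if (value !== undefined) return { value, source: name };
    } catch {
      // A failing store should not block the fallback; try the next one.
    }
  }
  throw new Error(`Context not found in any store: ${key}`);
}

// In-memory store used only to exercise the sketch.
class MemoryStore<T> implements ContextStore<T> {
  private items: Map<string, T>;

  constructor(items: Map<string, T>) {
    this.items = items;
  }

  async get(key: string): Promise<T | undefined> {
    return this.items.get(key);
  }
}
```

Returning the source alongside the value makes it easy to log how often the backup is actually used, which is a useful signal that the primary store is dropping writes.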
The Challenges We Solved
Challenge 1: Circular Dependencies
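Problem: Dependencies form long chains (header needs logo, logo needs brand colors, brand colors need strategy, strategy needs research), so a single misdeclared dependency can close a loop that can never be satisfied.
Solution: Dependency graph validation before execution. The check treats each entry in requiredInputs as the ID of the step that produces it.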
function validateDependencyGraph(steps: Step[]): ValidationResult {
  const graph = new Map<string, Set<string>>();
  // Build dependency graph
  for (const step of steps) {
    graph.set(step.id, new Set(step.requiredInputs));
  }

  // Detect cycles using DFS
  const visited = new Set<string>();
  const recursionStack = new Set<string>();

  function hasCycle(stepId: string): boolean {
    visited.add(stepId);
    recursionStack.add(stepId);
    const deps = graph.get(stepId) || new Set();
    for (const dep of deps) {
      if (!visited.has(dep)) {
        if (hasCycle(dep)) return true;
      } else if (recursionStack.has(dep)) {
        return true; // Cycle detected!
      }
    }
    recursionStack.delete(stepId);
    return false;
  }

  for (const stepId of graph.keys()) {
    if (!visited.has(stepId)) {
      if (hasCycle(stepId)) {
        return {
          valid: false,
          errorMessage: `Circular dependency detected involving ${stepId}`
        };
      }
    }
  }
  return { valid: true };
}
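When a cycle is found, it helps to report the whole loop and not just one step in it. The sketch below is a variant of the same depth-first search that returns the cycle path. findCycle and its plain-object input are hypothetical, chosen to keep the example self-contained.

```typescript
// Returns the steps forming a cycle (first step repeated at the end), or null.
// `deps` maps a step ID to the IDs of the steps it depends on.
function findCycle(deps: Record<string, string[]>): string[] | null {
  const done = new Set<string>(); // fully explored, known to be cycle-free
  const path: string[] = [];      // steps on the current DFS path

  function visit(id: string): string[] | null {
    const index = path.indexOf(id);
    if (index !== -1) return [...path.slice(index), id]; // back edge: loop closed
    if (done.has(id)) return null;

    path.push(id);
    for (const dep of deps[id] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    path.pop();
    done.add(id);
    return null;
  }

  for (const id of Object.keys(deps)) {
    const cycle = visit(id);
    if (cycle) return cycle;
  }
  return null;
}
```

For a graph where header depends on logo, logo on brand, and brand back on header, this returns the path header, logo, brand, header, which can go straight into the error message.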
Challenge 2: Partial Failures
Problem: Step 15 fails, but steps 1-14 succeeded. We don't want to regenerate everything.
Solution: Resume from last successful step
async function resumeWorkflow(businessId: string, versionId: string) {
  // Load existing context
  const context = await loadWorkflowContext(businessId, versionId);
  // Find last completed step
  const completedSteps = context.websiteContext.data.completedSteps;
  const allSteps = getWorkflowSteps();
  // Filter out completed steps
  const remainingSteps = allSteps.filter(step => !completedSteps.has(step.id));
  console.log(`Resuming from step ${remainingSteps[0]?.name}`);
  // Execute remaining steps
  const orchestrator = new StepOrchestrator(remainingSteps);
  return await orchestrator.executeWorkflow(context);
}
Challenge 3: Version Management
Problem: User edits website, creates new version. How do we track multiple versions?
Solution: Version-aware context
async function createNewVersion(businessId: string, baseVersionId: string): Promise<string> {
  // Load base version
  const baseContext = await loadWorkflowContext(businessId, baseVersionId);
  // Create new version ID
  const newVersionId = generateVersionId();
  // Clone context with new version ID
  const newContext = {
    ...baseContext,
    versionInfo: {
      versionId: newVersionId,
      createdAt: new Date(),
      isLive: false,
      parentVersionId: baseVersionId
    },
    // Override the accessor: the copied one would still return baseVersionId,
    // and the save below would overwrite the base version
    getVersionId: () => newVersionId
  };
  // Save new version
  await saveWorkflowContext(newContext);
  return newVersionId;
}
The Results: Zero Data Loss
Before (no context system):
- 15% of website generations failed mid-process
- No way to resume failed generations
- Users had to start over (terrible UX)
- Lost $3,000/month in wasted AI API calls
After (context system):
- 0.1% failure rate (only unrecoverable errors)
- 99.9% of failures resume successfully
- Users rarely see failures (failed steps retry automatically)
- Saved $3,000/month in wasted API calls
Additional benefits:
- Undo/redo: Easy to implement (just load previous context)
- A/B testing: Generate multiple versions in parallel
- Debugging: Full audit trail of what happened when
- Analytics: Track which steps take longest, fail most often
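As an illustration of the undo/redo point, the bookkeeping can be as small as a list of saved version IDs plus a cursor. This is a sketch only: VersionHistory is a hypothetical helper, and the ID it returns would then be passed to a loader such as loadWorkflowContext.

```typescript
// Hypothetical helper: tracks saved version IDs and a cursor into them.
class VersionHistory {
  private versions: string[] = [];
  private cursor = -1;

  // Record a newly saved version; any redo branch past the cursor is dropped.
  record(versionId: string): void {
    this.versions = this.versions.slice(0, this.cursor + 1);
    this.versions.push(versionId);
    this.cursor = this.versions.length - 1;
  }

  // Step back one version; returns the ID to load, or null at the oldest.
  undo(): string | null {
    if (this.cursor <= 0) return null;
    this.cursor -= 1;
    return this.versions[this.cursor] ?? null;
  }

  // Step forward again; returns the ID to load, or null at the newest.
  redo(): string | null {
    if (this.cursor >= this.versions.length - 1) return null;
    this.cursor += 1;
    return this.versions[this.cursor] ?? null;
  }
}
```

Because every version is a complete saved context, undo never has to reverse an operation. It only changes which snapshot is loaded.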
Why This Matters for AI Applications
Most AI applications treat state as an afterthought. We learned:
Bad: Hope AI calls succeed, panic when they fail
Good: Design for failure, make recovery automatic
The startup lesson: Context management is the difference between “AI toy” and “AI product.” Users don’t care about your AI—they care that it works reliably.
Key Insights
- Context as a system: Not just data, but the source of truth
- Validate early: Check dependencies before executing
- Save often: After every step, not just at the end
- Resume gracefully: Don’t make users start over
What’s Next
We’re exploring:
- Distributed execution: Run steps in parallel when possible
- Context diffing: Show exactly what changed between versions
- Time travel debugging: Replay workflow with different inputs
- Context sharing: Let multiple users collaborate on same context
But the core insight remains: State management is infrastructure, not a feature.
Try it yourself: Generate a website with WebZum, watch the progress bar. Each step is tracked in the context system—if it fails, we resume automatically.
Building an AI workflow? Key takeaway: Design your context system first, build your AI second. Context is the foundation that makes everything else possible.
The future of AI applications isn’t better models—it’s better state management.