Periskope AI Self Learning
AI Infrastructure · Periskope
I built an automated AI training pipeline for Periskope that extracts knowledge from successful WhatsApp conversations, keeping the AI's knowledge base fresh without manual data labeling.

The Knowledge Decay Problem
In the fast-moving world of WhatsApp business communication, information becomes stale quickly. At Periskope, we noticed a recurring friction point: support teams were too busy to update their documentation, leading to AI agents providing outdated answers. The manual 'Review-Identify-Update' cycle was a bottleneck that prevented our AI from scaling.
I wanted to build a 'Self-Learning' loop—a system that could identify when the AI successfully navigated a complex customer query and automatically internalize that knowledge for the next time. The goal wasn't just to automate data entry, but to turn every successful customer interaction into a training sample for the model's long-term memory.
The best training data isn't in a static PDF; it's buried in the successful resolutions of yesterday's support tickets.
Architecting the Learning Pipeline
To handle this at scale across hundreds of organizations, I designed a multi-stage pipeline triggered by staggered cron jobs. Instead of running all training at midnight, we distribute the load into 24-hour slots to prevent API rate-limiting spikes. This process begins by emitting an organization ID and a time range into a RabbitMQ training queue.
The first worker picks up these jobs and fetches up to 300 candidate conversations where the AI agent successfully provided an internal reply. This initial 'broad net' approach ensures we don't miss nuanced interactions, but it introduces a significant amount of noise that requires a tiered filtering strategy.
```typescript
interface TrainingJob {
  orgId: string;
  timeRange: { start: Date; end: Date };
  status: 'pending' | 'processing' | 'completed';
}

// Staggered trigger to prevent an API thundering herd
async function triggerStaggeredCron() {
  const orgs = await db.orgs.findMany();
  orgs.forEach((org, index) => {
    const delay = index * 2 * 60000; // 2-minute intervals
    setTimeout(() => enqueueTraining(org.id), delay);
  });
}
```

Cost-Efficient Pre-Screening
Processing 300 conversations through a high-reasoning model like Gemini or GPT-4 would be prohibitively expensive. To optimize for cost, I implemented a pre-screening layer using Ministral 3B via OpenRouter. This lightweight model processes candidates in batches of 10, performing a binary classification: is this conversation suitable training material?
We look for conversations that are resolved, professional, and contain generalizable knowledge rather than specific PII. This stage aggressively filters the 300 candidates down to the top 50 high-quality interactions, ensuring our heavy-lifting models only see the 'gold' data.
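The pre-screening pass can be sketched roughly as follows. The `Conversation` shape, the prompt wording, and the environment variable name are illustrative; the OpenRouter endpoint and the `mistralai/ministral-3b` model slug are the standard ones, but treat the exact request shape as an assumption rather than our production code.

```typescript
// Illustrative shape of a candidate conversation (not the production schema).
interface Conversation {
  id: string;
  transcript: string;
}

// Split candidates into fixed-size batches for the classifier.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Ask a lightweight model for a YES/NO verdict per conversation,
// 10 conversations at a time.
async function preScreen(candidates: Conversation[]): Promise<string[]> {
  const suitable: string[] = [];
  for (const batch of chunk(candidates, 10)) {
    const res = await fetch('https://openrouter.ai/api/v1/chat/completions', {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: 'mistralai/ministral-3b',
        messages: [{
          role: 'user',
          content:
            'For each numbered conversation, answer YES on its own line if it is ' +
            'resolved, professional, and contains generalizable (non-PII) ' +
            'knowledge; otherwise answer NO.\n\n' +
            batch.map((c, i) => `[${i}] ${c.transcript}`).join('\n---\n'),
        }],
      }),
    });
    const data = await res.json();
    const verdicts: string[] = data.choices[0].message.content.trim().split('\n');
    batch.forEach((c, i) => {
      if (verdicts[i]?.includes('YES')) suitable.push(c.id);
    });
  }
  return suitable;
}
```

Batching ten transcripts per request amortizes prompt overhead, which is where most of the cost sits at this volume.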
Topical Grouping with Union-Find
A common issue in automated training is redundancy—learning the same 'Refund Policy' update ten times from ten different chats. To solve this, I used a Union-Find (Disjoint Set Union) algorithm to group conversations based on tag overlap. If two chats share at least two metadata tags (e.g., #billing and #refund), they are unified into the same topical cluster.
By identifying these clusters, we can select the 'best' representative conversation (the most comprehensive one) for the final training stage. This drastically reduces the number of tokens sent to the final LLM and prevents the creation of duplicate knowledge base entries.
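A minimal sketch of the clustering step, assuming chats arrive with their tag arrays already attached (the shapes here are illustrative, not our production schema):

```typescript
// Union-Find (Disjoint Set Union) with path compression.
class UnionFind {
  private parent: number[];

  constructor(n: number) {
    this.parent = Array.from({ length: n }, (_, i) => i);
  }

  find(x: number): number {
    if (this.parent[x] !== x) {
      this.parent[x] = this.find(this.parent[x]); // path compression
    }
    return this.parent[x];
  }

  union(a: number, b: number): void {
    this.parent[this.find(a)] = this.find(b);
  }
}

// Unify any two chats sharing at least two tags, then collect the
// resulting topical clusters as arrays of chat indices.
function clusterByTags(chats: { tags: string[] }[]): number[][] {
  const uf = new UnionFind(chats.length);
  for (let i = 0; i < chats.length; i++) {
    for (let j = i + 1; j < chats.length; j++) {
      const overlap = chats[i].tags.filter(t => chats[j].tags.includes(t));
      if (overlap.length >= 2) uf.union(i, j); // same topical cluster
    }
  }
  const clusters = new Map<number, number[]>();
  chats.forEach((_, i) => {
    const root = uf.find(i);
    clusters.set(root, [...(clusters.get(root) ?? []), i]);
  });
  return [...clusters.values()];
}
```

The pairwise comparison is quadratic, but at 50 post-screening candidates per organization that is trivially cheap.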
The Agentic Training Stage
For the actual extraction of knowledge, I used Gemini 1.5 Flash. It operates as a tool-calling agent with three primary capabilities: creating a new context entry, updating an existing one, or skipping the entry entirely if it doesn't offer new value.
The agent is prompted to generate the knowledge in a specific format: a question with three varied phrasings to improve vector search recall, and an answer that supports `<INSTRUCTION>` tags for conditional logic. For example, 'If the customer is on a legacy plan, tell them X; otherwise, tell them Y.' This allows the AI to learn not just facts, but business rules.
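To make the format concrete, a generated entry might look like this (the content here is purely illustrative; only the shape, the three phrasings, and the `<INSTRUCTION>` tag convention come from the pipeline):

```typescript
// Illustrative only: the shape of one generated knowledge entry.
const exampleEntry = {
  q: [
    'Can I get a refund on my annual plan?',
    'How do refunds work for yearly subscriptions?',
    'Is my annual subscription refundable?',
  ],
  a: '<INSTRUCTION>If the customer is on a legacy plan, tell them X; ' +
     'otherwise, tell them Y.</INSTRUCTION>',
  tags: ['billing', 'refund'],
};
```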
```typescript
const trainingTools = {
  create_context: (params: { q: string[]; a: string; tags: string[] }) => {
    return db.ai_contexts.create({ ...params, status: 'pending_approval' });
  },
  update_context: (params: { id: string; new_info: string }) => {
    // Appends information to the existing KB entry
    return db.ai_contexts.update({ where: { id: params.id }, data: { ... } });
  },
};
```

Human-in-the-Loop & Safety
Even with high-quality models, fully automated AI learning is risky. Every entry generated by the pipeline is stored in `tbl_ai_contexts` with a `pending_approval` status. We auto-generate an embedding vector using OpenAI's `text-embedding-3-small` so the entry is ready for retrieval the moment it's approved.
Each entry also retains metadata pointing back to the source `chat_id`. When a human reviewer looks at a suggested knowledge base update, they can click a link to view the exact conversation the AI learned it from. This 'provenance' is crucial for building trust with our users.
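The storage step can be sketched as below. `db` and the column names are stand-ins for our persistence layer; the embeddings endpoint and the `text-embedding-3-small` model name are OpenAI's real ones, though the exact request plumbing here is an assumption.

```typescript
// `db` is a stand-in for the actual persistence layer.
declare const db: any;

// Join the question phrasings and answer into one input string so a
// single vector covers all recall variants.
function embeddingInput(questions: string[], answer: string): string {
  return [...questions, answer].join('\n');
}

async function storePendingContext(entry: {
  questions: string[];
  answer: string;
  sourceChatId: string;
}) {
  const res = await fetch('https://api.openai.com/v1/embeddings', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'text-embedding-3-small',
      input: embeddingInput(entry.questions, entry.answer),
    }),
  });
  const { data } = await res.json();
  return db.tbl_ai_contexts.create({
    questions: entry.questions,
    answer: entry.answer,
    embedding: data[0].embedding,
    source_chat_id: entry.sourceChatId, // provenance: link back to the chat
    status: 'pending_approval',         // retrievable only after human approval
  });
}
```

Because the vector is computed up front, approval is a single status flip with no extra API call in the review path.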
Provenance is the key to AI trust. Always provide a link back to the source data that influenced a model's decision.
Reflections and Future Improvements
The pipeline has been a game-changer for our customers, particularly those with rapidly evolving product catalogs. However, the current Union-Find approach relies heavily on the quality of manual tags applied by agents. If I were to rebuild this today, I would move toward a semantic clustering approach using vector embeddings to group conversations, making the pipeline less dependent on human tagging.
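The embedding-based alternative would hinge on a similarity measure rather than tag overlap; a hypothetical sketch of the core comparison (threshold and grouping policy left open):

```typescript
// Cosine similarity between two embedding vectors. Two conversations
// would be unioned when this exceeds a tuned threshold, replacing the
// "share at least two tags" rule.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```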
I'm also exploring a 'Self-Correction' mechanism where the AI can flag its own knowledge base entries for deletion if it notices a conversation where a previously 'learned' fact was corrected by a human agent in real-time. This would turn the system into a truly self-healing knowledge ecosystem.