
Architecture

The AI Log Inspector Agent follows a modular, tool-based architecture built on top of Symfony AI components. This design enables flexible AI-powered log analysis with semantic search, conversational debugging, and extensible tool integration.

High-Level Architecture

┌──────────────────────────────────────────────────────────┐
│ Application Layer │
│ (Your Code: Controllers, Commands, Services) │
└──────────────┬───────────────────────────────────────────┘


┌──────────────────────────────────────────────────────────┐
│ LogInspectorAgent / LogInspectorChat │
│ Orchestrates AI interactions, tool selection, context │
└───────┬──────────────────────────────────┬───────────────┘
│ │
▼ ▼
┌─────────────────┐ ┌──────────────────────┐
│ Tool Layer │ │ Platform Layer │
│ │ │ │
│ • LogSearchTool │ │ • OpenAI Platform │
│ • RequestTool │◄─────────────│ • Anthropic Platform │
│ • Custom Tools │ │ • Ollama Platform │
└────────┬────────┘ └──────────────────────┘


┌──────────────────────────────────────────────────────────┐
│ Retriever Layer │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ LogRetriever (wraps Symfony Retriever) │ │
│ │ • Vectorizes query → Searches store → Returns │ │
│ └─────────────────────────────────────────────────────┘ │
└───────────────────────────┬──────────────────────────────┘


┌──────────────────────────────────────────────────────────┐
│ Indexer Layer │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ VectorLogDocumentIndexer │ │
│ │ • Loader → Transformers → Vectorizer → Store │ │
│ └─────────────────────────────────────────────────────┘ │
└───────────────────────────┬──────────────────────────────┘


┌──────────────────────────────────────────────────────────┐
│ Storage Layer │
│ │
│ ┌────────────────┐ ┌──────────────────────┐ │
│ │ Vector Store │ │ Message Store │ │
│ │ • InMemory │ │ (Chat History) │ │
│ │ • Chroma │ │ │ │
│ │ • Pinecone │ └──────────────────────┘ │
│ └────────────────┘ │
└──────────────────────────────────────────────────────────┘

Core Components

1. LogInspectorAgent

The main orchestrator that handles user queries and coordinates tools and AI platforms.

use Hakam\AiLogInspector\Agent\LogInspectorAgent;

$agent = new LogInspectorAgent(
    platform: $platform,          // AI platform (OpenAI, Anthropic, etc.)
    tools: [$logSearchTool],      // Tools the agent can use
    systemPrompt: $customPrompt   // Optional custom behavior
);

$result = $agent->ask('Why did payment fail?');

Responsibilities:

  • Parse and understand user questions
  • Select appropriate tools based on query
  • Coordinate multi-tool workflows
  • Format and return results
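
When several tools are registered, the model routes each question to whichever tool fits it. A minimal sketch of that routing, reusing the constructor shown above (tool construction omitted; the variable names are placeholders):

// Register both built-in tools; the LLM selects one per question.
$agent = new LogInspectorAgent(
    platform: $platform,
    tools: [$logSearchTool, $requestContextTool]
);

// Routed to LogSearchTool (semantic search over log content)
$agent->ask('What database errors occurred last night?');

// Routed to RequestContextTool (lookup by request/trace identifier)
$agent->ask('Show me everything that happened for request req_12345');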

2. Tool System

Tools are specialized components that the agent can invoke to perform specific tasks.

LogSearchTool

Semantic search across log entries with AI-powered root cause analysis.

#[AsTool(
    name: 'log_search',
    description: 'Search logs for relevant entries'
)]
class LogSearchTool
{
    public function __invoke(string $query): array
    {
        // 1. Retrieve via LogRetriever (vectorize + search)
        // 2. Filter by relevance threshold
        // 3. AI analysis of matching logs
        // 4. Return structured results with evidence
    }
}
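
Because the tool is an invokable object, it can be called directly as well as by the agent. A hedged usage sketch (construction details are omitted, and the return keys are assumed to match the fields described in the Agent Response section further down):

// Illustrative only: the exact return shape depends on your version.
$result = $logSearchTool('payment gateway timeouts in the last hour');

foreach ($result['evidence_logs'] ?? [] as $log) {
    echo $log['content'], PHP_EOL;
}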

RequestContextTool

Trace complete request lifecycles across distributed systems.

#[AsTool(
    name: 'request_context',
    description: 'Fetch all logs related to a request_id or trace_id'
)]
class RequestContextTool
{
    public function __invoke(string $requestId): array
    {
        // 1. Find all logs with matching request ID
        // 2. Sort chronologically
        // 3. Build timeline
        // 4. Return complete context
    }
}

3. Platform Abstraction

Platform-agnostic AI integration supporting multiple providers.

use Hakam\AiLogInspector\Platform\LogDocumentPlatformFactory;

// OpenAI
$platform = LogDocumentPlatformFactory::create([
    'provider' => 'openai',
    'api_key' => $apiKey,
    'model' => ['name' => 'gpt-4o-mini']
]);

// Anthropic
$platform = LogDocumentPlatformFactory::create([
    'provider' => 'anthropic',
    'api_key' => $apiKey,
    'model' => ['name' => 'claude-3-5-sonnet-20241022']
]);

// Ollama (local)
$platform = LogDocumentPlatformFactory::create([
    'provider' => 'ollama',
    'host' => 'http://localhost:11434',
    'model' => ['name' => 'llama3.2:1b']
]);

Features:

  • Unified interface across providers
  • Model capability detection
  • Automatic fallbacks
  • Configuration abstraction

4. Vector Store

Semantic similarity search using embeddings.

use Hakam\AiLogInspector\Store\VectorLogDocumentStore;

$store = new VectorLogDocumentStore($internalStore);

// Add documents
$store->add($vectorDocument);

// Query by similarity
$results = $store->queryForVector($queryVector, ['maxItems' => 10]);

Supported Backends:

  • InMemoryStore: Development/testing
  • ChromaStore: Production-ready, self-hosted
  • PineconeStore: Managed cloud service
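
The wrapper accepts whichever backend you hand it, so switching environments is a constructor change. A minimal sketch, assuming the backend store objects are created elsewhere (the backend variable names below are placeholders, not library classes):

use Hakam\AiLogInspector\Store\VectorLogDocumentStore;

// Development: in-memory backend, nothing persisted between runs.
$store = new VectorLogDocumentStore($inMemoryBackend);

// Production: same wrapper, with a Chroma- or Pinecone-backed store injected instead.
$store = new VectorLogDocumentStore($chromaBackend);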

5. Indexer Pipeline

The VectorLogDocumentIndexer provides a complete pipeline for loading, transforming, vectorizing, and storing log documents.

use Hakam\AiLogInspector\Indexer\LogFileIndexer;
use Hakam\AiLogInspector\Document\CachedLogsDocumentLoader;
use Hakam\AiLogInspector\Store\VectorLogDocumentStore;

// Create a loader for your log files
$loader = new CachedLogsDocumentLoader('/var/log/app');

// Create the indexer
$indexer = new LogFileIndexer(
    embeddingPlatform: $platform,
    model: 'text-embedding-3-small',
    loader: $loader,
    logStore: new VectorLogDocumentStore(),
    chunkSize: 500,    // Characters per text chunk
    chunkOverlap: 100  // Overlap between chunks for context
);

// Index methods
$indexer->indexLogFile('app.log'); // Single file
$indexer->indexLogFiles(['a.log', 'b.log']); // Multiple files
$indexer->indexAllLogs(); // All .log files

Process:

  1. Loader reads log files from disk
  2. TextSplitTransformer chunks large documents
  3. Vectorizer converts chunks to embeddings via AI API
  4. VectorDocuments stored in Vector Store
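
To make step 2 concrete, here is a plain-PHP illustration of what chunkSize and chunkOverlap mean. This is not the actual TextSplitTransformer, just a sketch of the sliding-window behaviour it provides:

// Hypothetical helper, for illustration only (the real chunking is done by
// Symfony AI's TextSplitTransformer inside the indexer pipeline).
function splitIntoChunks(string $text, int $chunkSize = 500, int $chunkOverlap = 100): array
{
    $chunks = [];
    $step = $chunkSize - $chunkOverlap;

    for ($offset = 0; $offset < strlen($text); $offset += $step) {
        $chunks[] = substr($text, $offset, $chunkSize);
    }

    return $chunks;
}

// A 1200-character stack trace becomes 3 overlapping chunks:
// [0..500], [400..900], [800..1200] — the 100-character overlap preserves
// context across chunk boundaries.
$chunks = splitIntoChunks($longLogEntry, 500, 100);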

6. Retriever

The LogRetriever handles vectorization and search in a single step, wrapping Symfony's Retriever internally:

use Hakam\AiLogInspector\Retriever\LogRetriever;

// The retriever handles vectorization + search in one step
$retriever = new LogRetriever(
    embeddingPlatform: $platform->getPlatform(),
    model: 'text-embedding-3-small',
    logStore: $store
);

// Retrieve semantically similar logs
$results = $retriever->retrieve('payment timeout errors', ['maxItems' => 15]);

The LogRetriever implements LogRetrieverInterface and is used by both LogSearchTool and RequestContextTool for semantic search. The Vectorizer is still used internally by the Indexer pipeline for document ingestion.
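
Because the retriever is exposed through LogRetrieverInterface, your own tools can reuse it the same way the built-in ones do. A hedged sketch (the interface namespace and constructor injection are assumptions; the retrieve() call mirrors the one shown above):

use Hakam\AiLogInspector\Retriever\LogRetrieverInterface; // namespace assumed alongside LogRetriever
use Symfony\AI\Agent\Toolbox\Attribute\AsTool;

#[AsTool(name: 'slow_query_finder', description: 'Find slow database queries in logs')]
class SlowQueryFinderTool
{
    public function __construct(private LogRetrieverInterface $retriever)
    {
    }

    public function __invoke(string $query): array
    {
        // Reuse the same semantic search the built-in tools rely on.
        $documents = $this->retriever->retrieve('slow query '.$query, ['maxItems' => 20]);

        // Shape the evidence however your agent prompt expects it.
        return ['matches' => $documents];
    }
}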

Data Flow

Flow 1: Log File to Store (Indexing)

This flow shows how raw log files are ingested, transformed, and stored as searchable vector documents.

┌─────────────────────────────────────────────────────────────────────┐
│ LOG FILES ON DISK │
│ /var/log/app/*.log │
└──────────────────────────────┬──────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────┐
│ 1. LOADER │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ CachedLogsDocumentLoader / TextFileLoader / InMemoryLoader │ │
│ │ │ │
│ │ • Reads raw log files from disk (or memory) │ │
│ │ • Produces iterable<TextDocument> │ │
│ │ • Each TextDocument = content string + metadata │ │
│ └────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────┬──────────────────────────────────────┘
│ TextDocument[]

┌─────────────────────────────────────────────────────────────────────┐
│ 2. TRANSFORMER │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ TextSplitTransformer │ │
│ │ │ │
│ │ • Splits large documents into smaller chunks │ │
│ │ • chunkSize: 500 characters per chunk │ │
│ │ • chunkOverlap: 100 characters overlap for context │ │
│ │ • One large log → multiple TextDocument chunks │ │
│ └────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────┬──────────────────────────────────────┘
│ TextDocument[] (chunked)

┌─────────────────────────────────────────────────────────────────────┐
│ 3. VECTORIZER │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ Symfony Vectorizer (via VectorizerFactory) │ │
│ │ │ │
│ │ • Calls AI Embedding API (e.g., text-embedding-3-small) │ │
│ │ • Converts each text chunk → 1536-dim float vector │ │
│ │ • TextDocument → VectorDocument (text + vector + metadata) │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │
│ Uses: PlatformInterface (OpenAI / Anthropic / Ollama) │
└──────────────────────────────┬──────────────────────────────────────┘
│ VectorDocument[]

┌─────────────────────────────────────────────────────────────────────┐
│ 4. STORE │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ VectorLogDocumentStore │ │
│ │ │ │
│ │ • Persists VectorDocuments with embeddings + metadata │ │
│ │ • Supports queryForVector() for similarity search │ │
│ │ • Backend: InMemoryStore / ChromaStore / PineconeStore │ │
│ └────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘

Orchestrated by:

┌─────────────────────────────────────────────────────────────────────┐
│ AbstractLogIndexer (base class) │
│ ├── LogFileIndexer → indexLogFile() / indexAllLogs() │
│ └── LogDocumentIndexer → indexLogDocuments() │
│ │
│ Constructor wires: Loader + Transformer + Vectorizer + Store │
│ into a Symfony AI Indexer pipeline that runs steps 1→2→3→4 │
└─────────────────────────────────────────────────────────────────────┘
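
LogFileIndexer was shown in section 5; its sibling LogDocumentIndexer follows the same wiring but indexes documents you have already loaded. A heavily hedged sketch, assuming a constructor analogous to LogFileIndexer's (check the class for the exact signature):

use Hakam\AiLogInspector\Indexer\LogDocumentIndexer;

// Assumed wiring, mirroring LogFileIndexer; verify against the actual constructor.
$documentIndexer = new LogDocumentIndexer(
    embeddingPlatform: $platform,
    model: 'text-embedding-3-small',
    logStore: $store
);

// Index documents built in memory (e.g. pulled from a database or a queue).
$documentIndexer->indexLogDocuments($textDocuments);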

Flow 2: User Ask and Chat

This flow shows how a user question is processed, routed to the right tool, and answered with evidence from the vector store.

Single Question (LogInspectorAgent)

┌─────────────────────────────────────────────────────────────────────┐
│ USER QUESTION │
│ "Why did the payment fail for order 12345?" │
└──────────────────────────────┬──────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────┐
│ 1. LogInspectorAgent │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ • Wraps Symfony AI Agent │ │
│ │ • Creates MessageBag (system prompt + user question) │ │
│ │ • Sends to LLM via LogDocumentPlatform │ │
│ │ • LLM decides which tool to call based on question │ │
│ └────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────┬──────────────────────────────────────┘
│ LLM selects tool
┌──────────┴──────────┐
▼ ▼
┌──────────────────────────┐ ┌──────────────────────────────────────┐
│ LogSearchTool │ │ RequestContextTool │
│ (general log search) │ │ (request/trace ID lookup) │
│ │ │ │
│ Triggered by: │ │ Triggered by: │
│ "payment errors" │ │ "debug request req_12345" │
│ "database timeouts" │ │ "trace trace_abc123" │
│ "what errors occurred?" │ │ "session sess_xyz789" │
└────────────┬─────────────┘ └──────────────────┬───────────────────┘
│ │
▼ ▼
┌─────────────────────────────────────────────────────────────────────┐
│ 2. SEMANTIC SEARCH (primary strategy) │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ LogRetriever │ │
│ │ • Wraps Symfony Retriever internally │ │
│ │ • Vectorizes query text → embedding vector │ │
│ │ • Searches VectorLogDocumentStore by cosine similarity │ │
│ │ • Returns VectorDocument[] ranked by relevance │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │
│ On \Throwable → automatic fallback ▼ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ KEYWORD SEARCH (fallback) │ │
│ │ • Retrieves all docs via neutral vector │ │
│ │ • Scores by: direct match, category, level, tags, synonyms │ │
│ │ • Sorts by score, applies threshold │ │
│ └────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────┬──────────────────────────────────────┘
│ VectorDocument[] (filtered)

┌─────────────────────────────────────────────────────────────────────┐
│ 3. FILTER & FORMAT │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ LogSearchTool │ RequestContextTool │ │
│ │ • Filter by relevance ≥ 0.7 │ • Filter by identifier │ │
│ │ • Extract: content, log_id, │ • Sort chronologically │ │
│ │ timestamp, level, source, tags │ • Build request timeline │ │
│ │ │ • Group by service │ │
│ │ │ • Calculate time span │ │
│ └────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────┬──────────────────────────────────────┘
│ evidence logs

┌─────────────────────────────────────────────────────────────────────┐
│ 4. AI ANALYSIS │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ LogDocumentPlatform.__invoke() │ │
│ │ │ │
│ │ • Sends evidence logs to LLM with analysis prompt │ │
│ │ • LLM generates: root cause explanation, summary │ │
│ │ • On failure → fallback to pattern-based analysis │ │
│ │ (regex matching: timeout, database, payment, etc.) │ │
│ └────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────┬──────────────────────────────────────┘
│ structured result

┌─────────────────────────────────────────────────────────────────────┐
│ 5. AGENT RESPONSE │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ • Tool returns structured array to Symfony Agent │ │
│ │ • Agent incorporates evidence into final LLM call │ │
│ │ • LLM generates natural language response with citations │ │
│ │ • Returns ResultInterface to user │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │
│ Response includes: │
│ • success: bool • evidence_logs: [{id, content, ...}] │
│ • reason: string • search_method: 'semantic'|'keyword-based'│
│ • root_cause: string • services_involved (RequestContextTool) │
└─────────────────────────────────────────────────────────────────────┘
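
Putting this together, calling the agent and reading its answer is brief. A minimal sketch (assuming the returned ResultInterface exposes its text via the standard getContent() accessor; adjust to the result type your version returns):

// The final answer is natural language with citations drawn from the tool's
// structured evidence (field names listed above).
$result = $agent->ask('Why did the payment fail for order 12345?');

echo $result->getContent();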

Conversational Flow (LogInspectorChat)

┌─────────────────────────────────────────────────────────────────────┐
│ INVESTIGATION START │
│ $chat->startInvestigation('Payment incident - Jan 29') │
│ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ LogInspectorChat │ │
│ │ • Creates system prompt with investigation context │ │
│ │ • Initializes MessageStore (Session or InMemory) │ │
│ │ • Wraps Symfony Chat for conversation management │ │
│ └────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────┬──────────────────────────────────────┘


┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐
TURN 1: "What payment errors occurred?"
│ │
┌───────────────┐ ┌───────────────┐ ┌──────────────────────┐
│ │ User Message │───▶│ Message Store │───▶│ LogInspectorAgent │ │
│ │ │ (append) │ │ (full history) │
│ └───────────────┘ └───────────────┘ └──────────┬───────────┘ │

│ Runs Ask Flow above │

│ ┌───────────────┐ ┌───────────────┐ │ │
│ AI Response │◀───│ Message Store │◀──────────────┘
│ │ + log refs │ │ (append) │ │
└───────────────┘ └───────────────┘
└ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘


┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐
TURN 2: "Were there database issues related to those?"
│ │
┌───────────────┐ ┌───────────────┐ ┌──────────────────────┐
│ │ User Message │───▶│ Message Store │───▶│ LogInspectorAgent │ │
│ │ │ (Turn 1 + │ │ (sees Turn 1 context │
│ └───────────────┘ │ Turn 2) │ │ + new question) │ │
└───────────────┘ └──────────┬───────────┘
│ │ │
Runs Ask Flow above
│ with conversation │
context preserved
│ │ │
┌───────────────┐ ┌───────────────┐ │
│ │ AI Response │◀───│ Message Store │◀──────────────┘ │
│ (contextual) │ │ (append) │
│ └───────────────┘ └───────────────┘ │
└ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘


┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐
TURN N: "What was the root cause?"
│ │
Agent has full conversation history (Turn 1..N-1)
│ and can correlate findings across all previous answers │
└ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘
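
In code, the conversational flow above boils down to starting an investigation and asking follow-up questions against the same message store. A minimal sketch (the namespace and constructor arguments are assumptions; the method calls mirror the flow above):

use Hakam\AiLogInspector\Chat\LogInspectorChat; // namespace assumed

// Constructor wiring is an assumption; see the setup sections above.
$chat = new LogInspectorChat($agent, $messageStore);

$chat->startInvestigation('Payment incident - Jan 29');

// Each turn is appended to the message store, so later questions can refer
// back to earlier findings ("those", "the root cause", ...).
$first  = $chat->ask('What payment errors occurred?');
$second = $chat->ask('Were there database issues related to those?');
$final  = $chat->ask('What was the root cause?');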

Semantic Search Explained

How Vector Similarity Works

// Example: Query vs Log Comparison

// Query: "payment failed"
$queryVector = [0.2, 0.8, 0.1, 0.9, 0.3, ...]; // 1536 dimensions

// Log 1: "PaymentGatewayException: timeout"
$log1Vector = [0.2, 0.7, 0.1, 0.9, 0.4, ...]; // Similar!

// Log 2: "User logged in successfully"
$log2Vector = [0.9, 0.1, 0.8, 0.2, 0.1, ...]; // Different!

// Cosine similarity calculation
$similarity1 = cosineSimilarity($queryVector, $log1Vector); // 0.94 ✅
$similarity2 = cosineSimilarity($queryVector, $log2Vector); // 0.23 ❌

// Only keep similarity > 0.7
$relevantLogs = [$log1]; // Log 1 passes threshold
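
The cosineSimilarity() call above is not part of the library; for reference, a straightforward PHP implementation looks like this:

// Cosine similarity: dot(a, b) / (|a| * |b|). For embedding vectors the
// result is effectively in [0, 1], where higher means more similar.
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot   += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}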

Keyword Search                          | Semantic Search
----------------------------------------|---------------------------------------------------
"payment" only matches the exact word   | Understands "transaction", "checkout", "billing"
No context understanding                | Knows "failed" relates to "error", "exception"
Case-sensitivity issues                 | Case-insensitive by nature
No synonym handling                     | Automatic synonym matching
Boolean operators required              | Natural language queries

Extensibility

Custom Tools

Create your own tools to extend functionality:

use Symfony\AI\Agent\Toolbox\Attribute\AsTool;

#[AsTool(
    name: 'security_analyzer',
    description: 'Detect security threats in logs'
)]
class SecurityAnalyzerTool implements LogInspectorToolInterface
{
    public function __invoke(string $timeRange): array
    {
        // Custom security analysis logic
        return [
            'threats' => [...],
            'severity' => 'high',
            'recommendations' => [...]
        ];
    }
}

// Register with agent
$agent = new LogInspectorAgent(
    $platform,
    [$logSearchTool, $requestContextTool, new SecurityAnalyzerTool()]
);

Custom Platforms

Integrate with your own AI infrastructure:

use Symfony\AI\Platform\PlatformInterface;

class CustomAIPlatform implements PlatformInterface
{
    public function complete(Message $message, array $options = []): ResultInterface
    {
        // Your custom AI implementation
    }

    public function embed(string $text, string $model): Vector
    {
        // Your embedding logic
    }
}

$platform = new LogDocumentPlatform(
    new CustomAIPlatform(),
    new Model('your-model', [Capability::TEXT, Capability::EMBEDDING])
);

Performance Considerations

Vector Store Scaling

Store Type | Capacity  | Query Speed | Use Case
-----------|-----------|-------------|--------------------------
InMemory   | ~10K logs | < 10ms      | Development
Chroma     | Millions  | ~50ms       | Production (self-hosted)
Pinecone   | Billions  | ~100ms      | Enterprise (managed)

Token Optimization

// Only send relevant logs to AI (reduces costs)
$relevantLogs = array_filter($allLogs, fn($log) => $log->similarity > 0.7);

// Truncate very long log entries
$truncatedContent = substr($log->content, 0, 1000);

// Batch questions in conversations
$chat->ask('Question 1'); // Reuses context
$chat->ask('Question 2'); // No re-indexing
$chat->ask('Question 3'); // Efficient!

Caching Strategies

// Cache vectorized queries
$cacheKey = hash('sha256', $query);
if ($cached = $cache->get($cacheKey)) {
    return $cached;
}

// Cache frequent log patterns
$frequentPatterns = $cache->remember('frequent_errors', 3600, function () {
    return $this->analyzePatterns();
});

Security Considerations

API Key Management

// ❌ Never hardcode keys
$apiKey = 'sk-abc123...'; // BAD!

// ✅ Use environment variables
$apiKey = $_ENV['OPENAI_API_KEY'];

// ✅ Or secrets manager
$apiKey = $secretsManager->get('openai-api-key');

Log Data Privacy

// Sanitize sensitive data before indexing
$sanitizer = new LogSanitizer();
$cleanLog = $sanitizer->removePII($rawLog); // Remove emails, IPs, etc.

// Encrypt sensitive fields
$metadata['user_id'] = encrypt($userId);

Access Control

// Restrict agent access per user
if (!$user->can('view-production-logs')) {
    throw new AccessDeniedException();
}

// Filtered vector store
$store = new FilteredVectorStore($baseStore, $user->getPermissions());

Next Steps