Architecture at a Glance - RightNow-AI/openfang

OpenFang is an open-source Agent Operating System built in Rust, designed as a modular Cargo workspace with 14 crates organized in a layered dependency architecture. The system provides a comprehensive platform for autonomous agent execution, featuring capability-based security, multi-protocol communication support (MCP, A2A), and extensive integration capabilities.

Crate Structure Overview

The project follows a strict layered architecture where dependencies flow downward—lower crates never depend on higher-level ones. This design ensures clean separation of concerns and enables independent testing and reuse of core components.

| Crate | Layer | Primary Responsibility |
| --- | --- | --- |
| openfang-types | Foundation | Core type definitions: `AgentManifest`, `AgentId`, `Capability`, `Event`, `ToolDefinition`, `KernelConfig`, taint tracking, manifest signing |
| openfang-memory | Data | SQLite-backed memory substrate with semantic search, knowledge graph, session management |
| openfang-runtime | Execution | Agent loop, LLM drivers, tool execution, WASM sandbox, MCP/A2A protocols |
| openfang-kernel | Coordination | Central coordinator assembling all subsystems, workflow engine, RBAC |
| openfang-api | Interface | HTTP/WebSocket API server with 76 endpoints, OpenAI compatibility |
| openfang-channels | Integration | 40 channel adapters for external communication platforms |
| openfang-wire | Networking | P2P protocol (OFP) with HMAC-SHA256 authentication |
| openfang-cli | User Interface | Command-line interface with daemon auto-detection |
| openfang-desktop | User Interface | Tauri 2.0 native desktop application |
| openfang-skills | Extensibility | 60 bundled skills with marketplace integration |
| openfang-migrate | Migration | OpenClaw YAML to TOML conversion engine |

The foundation crate openfang-types defines all shared data structures used across the kernel, runtime, memory substrate, and wire protocol without containing any business logic (crates/openfang-types/src/lib.rs:1-24). The kernel crate exposes all major subsystems through its library interface, including authentication, capabilities, scheduling, supervision, and workflow management (crates/openfang-kernel/src/lib.rs:1-29).

[Diagram: layered crate architecture; not rendered in this export]

Architecture Diagram Explanation:

Layered Dependency Flow: All dependencies point downward—UI crates depend on API, API depends on kernel, kernel depends on runtime and foundation layers
Foundation Isolation: openfang-types contains zero business logic, only type definitions shared across all layers
Kernel as Hub: The kernel crate coordinates multiple subsystems (channels, wire, skills) while delegating execution to runtime
Runtime Independence: The runtime layer is self-contained and can operate without the kernel for testing purposes

The complete crate structure with dependency hierarchy is documented in the architecture specification (docs/architecture.md:25-66).

Kernel Core Subsystems

The OpenFangKernel struct serves as the central coordinator, assembling all subsystems required for agent operation. It manages agent lifecycles, memory, permissions, scheduling, and inter-agent communication through a collection of specialized components.

Core Components

The kernel struct contains over 40 fields representing distinct subsystems (crates/openfang-kernel/src/kernel.rs:60-100):

| Subsystem | Type | Responsibility |
| --- | --- | --- |
| `registry` | `AgentRegistry` | Concurrent agent storage using DashMap |
| `capabilities` | `CapabilityManager` | Capability grants and inheritance validation |
| `event_bus` | `EventBus` | Async broadcast channel for system events |
| `scheduler` | `AgentScheduler` | Quota tracking with hourly window reset |
| `supervisor` | `Supervisor` | Health monitoring with panic/restart counters |
| `workflows` | `WorkflowEngine` | Workflow registration and execution (run cap: 200) |
| `triggers` | `TriggerEngine` | Event pattern matching for reactive behaviors |
| `background` | `BackgroundExecutor` | Background agent execution |
| `audit_log` | `Arc<AuditLog>` | Merkle hash chain audit trail |
| `metering` | `Arc<MeteringEngine>` | Cost tracking with model pricing catalog |
| `auth` | `AuthManager` | RBAC authentication manager |
| `wasm_sandbox` | `WasmSandbox` | WASM execution with fuel+epoch metering |
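The `scheduler` row above mentions quota tracking with an hourly window reset. As a rough illustration of that idea only, here is a minimal fixed-window sketch. The `QuotaWindow` type, its fields, and the caller-supplied timestamps in seconds are all assumptions made for this example; they are not taken from `AgentScheduler`.

```rust
/// Hypothetical fixed-window quota tracker (illustrative; not the OpenFang API).
#[derive(Debug)]
pub struct QuotaWindow {
    limit: u64,
    used: u64,
    window_start_secs: u64,
}

/// One hour, matching the "hourly window reset" described in the table.
const WINDOW_SECS: u64 = 3600;

impl QuotaWindow {
    pub fn new(limit: u64, now_secs: u64) -> Self {
        Self { limit, used: 0, window_start_secs: now_secs }
    }

    /// Try to consume `amount` units at `now_secs`.
    /// Returns false, without consuming anything, if the quota would be exceeded.
    pub fn try_consume(&mut self, amount: u64, now_secs: u64) -> bool {
        // Start a fresh window once an hour has elapsed.
        if now_secs.saturating_sub(self.window_start_secs) >= WINDOW_SECS {
            self.window_start_secs = now_secs;
            self.used = 0;
        }
        match self.used.checked_add(amount) {
            Some(total) if total <= self.limit => {
                self.used = total;
                true
            }
            _ => false,
        }
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic; a real scheduler would more likely read a monotonic clock internally.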

Responsibility Boundaries

The kernel's primary responsibilities include:

Agent Lifecycle: Spawn, kill, and manage agent processes with state persistence
Message Dispatch: Route messages between agents and external channels
Workflow Execution: Orchestrate multi-step agent workflows with step agents
Trigger Evaluation: Match events against patterns for reactive automation
Capability Validation: Enforce capability inheritance and access control
Graceful Shutdown: Persist state and clean up resources on termination

The kernel explicitly delegates to the runtime layer for:

LLM completion requests
Tool execution
WASM sandbox operations
MCP/A2A protocol handling

This separation is documented in the architecture overview (docs/architecture.md:52-58).

Key Data Structures

rust
// Excerpt: selected fields of the OpenFangKernel struct

// Core kernel configuration
pub config: KernelConfig,

// Agent management
pub registry: AgentRegistry,
pub running_tasks: dashmap::DashMap<AgentId, tokio::task::AbortHandle>,

// Protocol support
pub mcp_connections: tokio::sync::Mutex<Vec<McpConnection>>,
pub mcp_tools: std::sync::Mutex<Vec<ToolDefinition>>,
pub a2a_task_store: A2aTaskStore,
pub a2a_external_agents: std::sync::Mutex<Vec<(String, AgentCard)>>,

The kernel supports cancellation through running_tasks, which maps each agent ID to the Tokio AbortHandle of its task, so a long-running agent can be stopped on request.

Runtime and Agent Execution

The runtime crate (openfang-runtime) manages the agent execution loop, LLM driver abstraction, tool execution, and WASM sandboxing for untrusted code. It serves as the execution engine that the kernel delegates to for actual agent operations.

Module Organization

The runtime exposes 58 modules covering all aspects of agent execution (crates/openfang-runtime/src/lib.rs:1-58):

| Module Category | Modules | Purpose |
| --- | --- | --- |
| Agent Loop | `agent_loop`, `loop_guard`, `session_repair` | Core execution cycle with loop detection |
| LLM Integration | `llm_driver`, `llm_errors`, `routing`, `model_catalog` | Multi-provider LLM support |
| Tool System | `tool_runner`, `tool_policy`, `host_functions` | Built-in tools and execution policies |
| Sandboxing | `sandbox`, `docker_sandbox`, `subprocess_sandbox`, `workspace_sandbox` | Isolated code execution |
| Protocols | `mcp`, `mcp_server`, `a2a` | External protocol support |
| Media | `media_understanding`, `image_gen`, `tts` | Multimodal capabilities |
| Web | `web_search`, `web_fetch`, `web_content`, `web_cache` | Internet access with SSRF protection |

Agent Loop Architecture

The agent loop is the core execution cycle that processes messages and generates responses:

Message Processing: Parse incoming message and build context
Tool Discovery: Collect available tools (builtin + MCP + skills)
LLM Completion: Request completion from configured provider
Response Handling: Process streaming events or complete responses
Tool Execution: Execute requested tools with capability checks
Loop Detection: SHA256-based detection prevents infinite tool loops
Context Compaction: Block-aware compaction when context exceeds limits
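The loop-detection step can be pictured as counting repeated, identical tool calls by a digest of their name and arguments. The sketch below is an assumption-laden illustration: the source describes SHA256-based detection, whereas this example substitutes the standard library's `DefaultHasher` to stay dependency-free, and the `LoopGuard` type and `max_repeats` threshold are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical loop guard: counts identical (tool, arguments) calls.
/// NOTE: uses DefaultHasher as a stand-in for the SHA256 digest named in the text.
pub struct LoopGuard {
    max_repeats: u32,
    seen: HashMap<u64, u32>,
}

impl LoopGuard {
    pub fn new(max_repeats: u32) -> Self {
        Self { max_repeats, seen: HashMap::new() }
    }

    /// Returns true while the call may proceed, and false once the same
    /// (tool, args) pair has been requested more than `max_repeats` times.
    pub fn check(&mut self, tool: &str, args_json: &str) -> bool {
        let mut hasher = DefaultHasher::new();
        tool.hash(&mut hasher);
        args_json.hash(&mut hasher);
        let key = hasher.finish();

        let count = self.seen.entry(key).or_insert(0);
        *count += 1;
        *count <= self.max_repeats
    }
}
```

A call with different arguments hashes to a different key, so only exact repeats count toward the threshold.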

MCP Protocol Integration

The Model Context Protocol (MCP) integration enables discovery and execution of external tools through JSON-RPC 2.0. Tool names are namespaced to prevent collisions:

rust
// Tool namespacing: mcp_{server}_{tool}
pub fn format_mcp_tool_name(server: &str, tool: &str) -> String {
    format!("mcp_{}_{}", normalize_name(server), normalize_name(tool))
}

The MCP module handles server name extraction with hyphen normalization (crates/openfang-runtime/src/mcp.rs:548-590):

rust
// Handles server names with hyphens (e.g., "bocha-search")
pub fn extract_mcp_server_from_known<'a>(
    tool_name: &str,
    server_names: &[&'a str],
) -> Option<&'a str> {
    // Copy the &'a str items out of the slice so the returned reference
    // carries the 'a lifetime rather than the lifetime of the slice borrow.
    let mut sorted: Vec<&'a str> = server_names.to_vec();
    // Sort by length descending for longest match first
    sorted.sort_by_key(|name| std::cmp::Reverse(name.len()));
    for name in sorted {
        let prefix = format!("mcp_{}_", normalize_name(name));
        if tool_name.starts_with(&prefix) {
            return Some(name);
        }
    }
    None
}
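To see why the longest-match ordering matters, the following self-contained sketch combines the two functions above with a stand-in `normalize_name`. The normalization rule used here (lowercase, hyphens mapped to underscores) is an assumption for illustration; the real helper is not shown in this page.

```rust
/// Assumed normalization rule: lowercase, hyphens mapped to underscores.
/// The real `normalize_name` may differ.
pub fn normalize_name(name: &str) -> String {
    name.to_lowercase().replace('-', "_")
}

/// Namespacing as described above: mcp_{server}_{tool}.
pub fn format_mcp_tool_name(server: &str, tool: &str) -> String {
    format!("mcp_{}_{}", normalize_name(server), normalize_name(tool))
}

/// Recover the server from a namespaced tool name, trying longer server
/// names first so "bocha-search" wins over a shorter server named "bocha".
pub fn extract_mcp_server_from_known<'a>(
    tool_name: &str,
    server_names: &[&'a str],
) -> Option<&'a str> {
    let mut sorted: Vec<&'a str> = server_names.to_vec();
    sorted.sort_by_key(|name| std::cmp::Reverse(name.len()));
    sorted
        .into_iter()
        .find(|name| tool_name.starts_with(&format!("mcp_{}_", normalize_name(name))))
}
```

Without the descending sort, a tool registered by `bocha-search` could be attributed to `bocha`, because both prefixes match the namespaced tool name.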

A2A Protocol Support

Agent-to-Agent (A2A) communication enables inter-agent task delegation. The protocol defines message structures for task lifecycle management (crates/openfang-runtime/src/a2a.rs:156-200):

rust
/// A2A message in a task conversation
pub struct A2aMessage {
    pub role: String,        // "user" or "agent"
    pub parts: Vec<A2aPart>, // Content parts (text, file, etc.)
}

/// A2A message content part
pub enum A2aPart {
    Text { text: String },
    File { name: String, mime_type: String, data: String },
    // ... additional variants
}

Task status transitions follow a defined state machine:

Submitted → Working → Completed
                   → Failed
                   → Cancelled

The A2aTaskStore manages task lifecycle with thread-safe operations for status updates, completion, failure, and cancellation.
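The state machine above can be encoded as an enum with an explicit transition check. This is a minimal sketch that follows the diagram literally; the `TaskStatus` name, the `transition` helper, and the choice to reject any transition not drawn in the diagram (for example, cancelling a task that is still Submitted) are assumptions rather than documented behavior of `A2aTaskStore`.

```rust
/// Hypothetical task status enum mirroring the diagram above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskStatus {
    Submitted,
    Working,
    Completed,
    Failed,
    Cancelled,
}

impl TaskStatus {
    /// Terminal states have no outgoing transitions.
    pub fn is_terminal(self) -> bool {
        matches!(self, TaskStatus::Completed | TaskStatus::Failed | TaskStatus::Cancelled)
    }

    /// Only the edges drawn in the diagram are accepted.
    pub fn can_transition_to(self, next: TaskStatus) -> bool {
        use TaskStatus::*;
        matches!(
            (self, next),
            (Submitted, Working) | (Working, Completed) | (Working, Failed) | (Working, Cancelled)
        )
    }
}

/// Apply a transition, returning an error message for edges not in the diagram.
pub fn transition(current: TaskStatus, next: TaskStatus) -> Result<TaskStatus, String> {
    if current.can_transition_to(next) {
        Ok(next)
    } else {
        Err(format!("invalid transition: {current:?} -> {next:?}"))
    }
}
```

Centralizing the allowed edges in one `matches!` expression makes it straightforward to audit the state machine against the protocol description.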

API Layer and External Interfaces

The API crate provides the HTTP/WebSocket interface for external communication, built on Axum 0.8 with comprehensive middleware support.

API Structure

The API server is organized into specialized modules (crates/openfang-api/src/lib.rs:1-17):

| Module | Purpose |
| --- | --- |
| `routes` | REST endpoint definitions |
| `ws` | WebSocket handler for real-time chat |
| `openai_compat` | OpenAI-compatible endpoints (`/v1/chat/completions`, `/v1/models`) |
| `middleware` | Auth, rate limiting, logging, security headers |
| `stream_chunker` | SSE streaming response handling |
| `webchat` | Web chat interface support |
| `channel_bridge` | External channel integration |

Endpoint Categories

The API exposes 76 endpoints across multiple categories (docs/architecture.md:58-60):

Agent Management: Spawn, list, chat, kill agents
Workflow Operations: Create, run, monitor workflows
Trigger Management: Event-driven automation configuration
Memory Access: Session and knowledge graph queries
Channel Configuration: Setup and manage external channels
Model/Provider Management: List available models and providers
Skill Operations: Install, search, manage skills
Health/Status: System health and version information

Middleware Stack

The API implements a comprehensive middleware pipeline:

Bearer Token Auth: Validates API keys for protected endpoints
Request ID Injection: Unique IDs for request tracing
Structured Logging: JSON-formatted request logs
GCRA Rate Limiting: Cost-aware rate limiting per user
Security Headers: CSP, X-Frame-Options, etc.
Health Redaction: Sanitizes health endpoint output
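For readers unfamiliar with GCRA (the generic cell rate algorithm), the sketch below shows how a cost-aware variant can work: each request advances a "theoretical arrival time" by its cost, and a request is rejected if that would run too far ahead of the clock. The `Gcra` type, its parameters, and the millisecond timestamps are hypothetical; the page does not show how OpenFang's middleware is implemented, and a per-user limiter would keep one such state per user.

```rust
/// Hypothetical cost-aware GCRA limiter (illustrative; not the OpenFang middleware).
pub struct Gcra {
    /// Time "paid" per unit of cost, in milliseconds.
    emission_interval_ms: u64,
    /// How many cost units may be spent back to back.
    burst: u64,
    /// Theoretical arrival time: when the accumulated cost is fully paid off.
    tat_ms: u64,
}

impl Gcra {
    pub fn new(emission_interval_ms: u64, burst: u64) -> Self {
        Self { emission_interval_ms, burst, tat_ms: 0 }
    }

    /// Returns true and records the request if it conforms; otherwise
    /// returns false and leaves the limiter state unchanged.
    pub fn check(&mut self, now_ms: u64, cost: u64) -> bool {
        let increment = self.emission_interval_ms.saturating_mul(cost);
        let new_tat = self.tat_ms.max(now_ms).saturating_add(increment);
        let limit = self.emission_interval_ms.saturating_mul(self.burst);
        // new_tat >= now_ms by construction, so the subtraction cannot underflow.
        if new_tat - now_ms > limit {
            return false;
        }
        self.tat_ms = new_tat;
        true
    }
}
```

With a 1000 ms emission interval and a burst of 5, a fresh limiter accepts requests totalling 5 cost units at once and then admits roughly one unit per second.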

OpenAI Compatibility

The API provides drop-in compatibility with OpenAI's API format:

POST /v1/chat/completions  → Agent chat completion
GET  /v1/models            → List available models

This enables integration with existing tools and SDKs designed for OpenAI's API.

Extended Capabilities

Beyond core agent execution, the kernel integrates advanced capabilities for multimodal interaction and external system integration.

Browser Automation

The browser module provides Playwright-based browser automation with cross-platform Chromium detection (crates/openfang-runtime/src/browser.rs:745-790):

rust
fn chromium_candidates() -> Vec<String> {
    let mut paths = Vec::new();

    #[cfg(windows)]
    {
        // Check ProgramFiles, ProgramFiles(x86), LOCALAPPDATA.
        // The excerpt does not show how `pf` is bound; reading each base
        // directory from the environment, as below, is an assumption.
        for var in ["ProgramFiles", "ProgramFiles(x86)", "LOCALAPPDATA"] {
            if let Ok(pf) = std::env::var(var) {
                paths.push(format!("{pf}\\Google\\Chrome\\Application\\chrome.exe"));
                paths.push(format!("{pf}\\Microsoft\\Edge\\Application\\msedge.exe"));
                paths.push(format!("{pf}\\BraveSoftware\\Brave-Browser\\Application\\brave.exe"));
            }
        }
    }

    #[cfg(target_os = "macos")]
    {
        paths.push("/Applications/Google Chrome.app/Contents/MacOS/Google Chrome".into());
        paths.push("/Applications/Chromium.app/Contents/MacOS/Chromium".into());
        paths.push("/Applications/Microsoft Edge.app/Contents/MacOS/Microsoft Edge".into());
    }

    #[cfg(target_os = "linux")]
    {
        paths.push("/usr/bin/google-chrome".into());
        // ... additional Linux paths
    }

    paths
}

Browser commands support navigation, clicking, typing, screenshots, and other automation actions through a structured command enum.

Text-to-Speech Engine

The TTS engine provides text-to-speech capabilities with multi-provider support (crates/openfang-runtime/src/tts.rs:236-260):

rust
pub struct TtsConfig {
    pub enabled: bool,
    pub max_text_length: usize,  // Default: 4096
    pub timeout_secs: u64,       // Default: 30
    pub openai: OpenAiTtsConfig,
    pub elevenlabs: ElevenLabsConfig,
}

pub struct OpenAiTtsConfig {
    pub voice: String,   // Default: "alloy"
    pub model: String,   // Default: "tts-1"
    pub format: String,  // Default: "mp3"
    pub speed: f32,      // Default: 1.0
}

The engine validates text length and handles synthesis errors gracefully, returning descriptive error messages when disabled or given empty input.
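The validation described above might look roughly like the following. This is a sketch under stated assumptions: the `TtsError` enum, the `validate_tts_request` function, the error wording, and the decision to measure length in characters rather than bytes are all hypothetical and not taken from `tts.rs`.

```rust
use std::fmt;

/// Hypothetical error type for pre-synthesis validation.
#[derive(Debug, PartialEq, Eq)]
pub enum TtsError {
    Disabled,
    EmptyText,
    TextTooLong { len: usize, max: usize },
}

impl fmt::Display for TtsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TtsError::Disabled => write!(f, "TTS is disabled in configuration"),
            TtsError::EmptyText => write!(f, "cannot synthesize empty text"),
            TtsError::TextTooLong { len, max } => {
                write!(f, "text is {len} characters, limit is {max}")
            }
        }
    }
}

/// Check a request against the `enabled` and `max_text_length` settings
/// before any provider is contacted.
pub fn validate_tts_request(
    enabled: bool,
    max_text_length: usize,
    text: &str,
) -> Result<(), TtsError> {
    if !enabled {
        return Err(TtsError::Disabled);
    }
    if text.trim().is_empty() {
        return Err(TtsError::EmptyText);
    }
    // Assumption: the limit counts Unicode scalar values, not bytes.
    let len = text.chars().count();
    if len > max_text_length {
        return Err(TtsError::TextTooLong { len, max: max_text_length });
    }
    Ok(())
}
```

Validating before the network call means a misconfigured or oversized request fails quickly with a message the caller can display.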

Kernel Extended Fields

The kernel maintains extended capability contexts (crates/openfang-kernel/src/kernel.rs:105-145):

| Field | Type | Purpose |
| --- | --- | --- |
| `browser_ctx` | `BrowserManager` | Playwright bridge sessions |
| `media_engine` | `MediaEngine` | Image description, audio transcription |
| `tts_engine` | `TtsEngine` | Text-to-speech output |
| `web_ctx` | `WebToolsContext` | Multi-provider search + SSRF-protected fetch |
| `extension_registry` | `IntegrationRegistry` | Bundled MCP templates + install state |
| `credential_resolver` | `CredentialResolver` | Vault → dotenv → env var priority chain |
| `process_manager` | `ProcessManager` | Persistent processes for REPLs, servers |
| `peer_registry` | `PeerRegistry` | OFP connected peers tracking |
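The `credential_resolver` priority chain (vault, then dotenv, then environment variables) amounts to asking each source in order and returning the first hit. The sketch below models every source as an in-memory map so it stays self-contained; the `Source` struct, the `source` helper, and the `resolve` function are hypothetical names, and real sources would read from a vault, a `.env` file, and the process environment.

```rust
use std::collections::HashMap;

/// Hypothetical credential source, modelled as a named in-memory map.
pub struct Source {
    pub name: &'static str,
    pub values: HashMap<String, String>,
}

/// Convenience constructor for the example.
pub fn source(name: &'static str, pairs: &[(&str, &str)]) -> Source {
    Source {
        name,
        values: pairs
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    }
}

/// Walk the sources in priority order and return the first match,
/// together with the name of the source that supplied it.
pub fn resolve<'a>(sources: &'a [Source], key: &str) -> Option<(&'static str, &'a str)> {
    sources
        .iter()
        .find_map(|s| s.values.get(key).map(|v| (s.name, v.as_str())))
}
```

Returning the source name alongside the value makes it easy to log where a credential came from without logging the secret itself.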

Data Flow and Request Processing

The following diagram illustrates the end-to-end data flow for a typical agent chat request:

[Diagram: end-to-end data flow for an agent chat request; not rendered in this export]

Data Flow Explanation:

Request Entry: User requests enter through the API layer, passing through authentication and rate limiting middleware
Kernel Dispatch: The kernel looks up the target agent and validates capabilities before delegating to runtime
Agent Loop: Runtime executes the core agent loop, building context and collecting available tools
LLM Interaction: Completion requests are sent to configured providers with streaming support
Tool Execution: Tool calls route back through the kernel for capability validation before execution
Loop Guard: SHA256-based detection prevents infinite tool loops
Audit & Metering: All operations are logged to the Merkle hash chain audit trail and cost metering engine

Module Dependency Graph

The following diagram shows the dependency relationships between core modules:

[Diagram: module dependency graph; not rendered in this export]

Dependency Graph Explanation:

Unidirectional Flow: All dependencies flow downward from application to base layer
Types as Foundation: Every crate depends on openfang-types for shared definitions
Kernel Hub: The kernel depends on multiple coordination-layer crates but not directly on storage
Runtime Independence: Runtime can function independently for testing without kernel overhead

Core Design Decisions

1. Layered Crate Architecture

Decision: Organize code into 14 crates with strict dependency ordering.

Rationale: Enables independent testing, reduces compilation times through incremental builds, and allows reuse of core components (types, memory) in other projects.

Trade-off: Increases complexity of cross-crate refactoring but improves long-term maintainability.

2. Capability-Based Security Model

Decision: Implement capability tokens for permission management rather than role-based access control alone.

Rationale: Fine-grained permissions enable agents to have specific capabilities (e.g., "file_read:/tmp" rather than "file_access"). Capability inheritance validation prevents privilege escalation.

Evidence: CapabilityManager uses DashMap for concurrent capability grants (crates/openfang-kernel/src/kernel.rs:65-66).
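To make the inheritance rule concrete, the sketch below treats a capability as an `action:scope` pair and allows a child agent to hold only capabilities that some parent capability covers. The `Capability` struct, the `action:scope` parsing, and path-prefix scoping are assumptions built around the `file_read:/tmp` example; this is not the `CapabilityManager` implementation.

```rust
use std::path::Path;

/// Hypothetical capability of the form "action:scope", e.g. "file_read:/tmp".
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Capability {
    pub action: String,
    pub scope: String,
}

impl Capability {
    /// Parse "action:scope"; returns None if either half is missing.
    pub fn parse(s: &str) -> Option<Self> {
        let (action, scope) = s.split_once(':')?;
        if action.is_empty() || scope.is_empty() {
            return None;
        }
        Some(Self { action: action.to_string(), scope: scope.to_string() })
    }

    /// A capability covers another if the action matches and the other scope
    /// lies under this one. The comparison is component-wise, so "/tmp" does
    /// not cover "/tmpfoo". Paths are not canonicalized here; a real check
    /// would also need to handle ".." segments and symlinks.
    pub fn covers(&self, other: &Capability) -> bool {
        self.action == other.action && Path::new(&other.scope).starts_with(&self.scope)
    }
}

/// Every child capability must be covered by some parent capability;
/// otherwise the first offending capability is returned.
pub fn validate_inheritance(
    parent: &[Capability],
    child: &[Capability],
) -> Result<(), Capability> {
    match child.iter().find(|c| !parent.iter().any(|p| p.covers(c))) {
        Some(denied) => Err(denied.clone()),
        None => Ok(()),
    }
}
```

Under this rule a parent holding `file_read:/tmp` can spawn a child with `file_read:/tmp/logs`, but not one with `file_read:/` or `file_write:/tmp`.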

3. WASM Sandboxing for Untrusted Code

Decision: Execute skills and plugins in a WASM sandbox with dual fuel+epoch metering.

Rationale: Prevents malicious code from accessing system resources while enabling an extensible plugin architecture. Wasmtime provides near-native performance with strong isolation guarantees.

Trade-off: Adds complexity to tool execution, but is essential for security.

4. Multi-Protocol Support (MCP + A2A)

Decision: Support both Model Context Protocol and Agent-to-Agent protocols natively.

Rationale: MCP enables integration with external tool providers (e.g., GitHub, databases). A2A enables agent collaboration and task delegation across instances.

Evidence: MCP tool namespacing prevents collisions (crates/openfang-runtime/src/mcp.rs:548-555).

5. SQLite for Memory Substrate

Decision: Use SQLite as the primary storage backend rather than PostgreSQL or embedded key-value stores.

Rationale: Zero-configuration deployment, single-file database simplifies backup/restore, sufficient performance for typical agent workloads. Schema migrations are explicit and versioned.

Trade-off: Limits horizontal scaling but appropriate for single-node deployment.

6. Async-First Architecture

Decision: Build all I/O operations on the Tokio async runtime.

Rationale: Enables handling thousands of concurrent agent operations without per-thread overhead. This is a natural fit for LLM API calls, which are latency-bound.

Evidence: Kernel uses tokio::sync::Mutex for MCP connections (crates/openfang-kernel/src/kernel.rs:97-98).

7. Merkle Hash Chain Audit Trail

Decision: Implement audit logging as a Merkle hash chain rather than a simple append-only log.

Rationale: Enables cryptographic verification of audit log integrity. Each entry includes the hash of the previous entry, making tampering detectable.

Evidence: AuditLog is stored as Arc<AuditLog> in kernel (crates/openfang-kernel/src/kernel.rs:81-82).
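The chaining idea can be shown in a few lines: each entry stores the previous entry's hash, and verification recomputes every hash from the start. In this sketch the `AuditChain` and `AuditEntry` types are hypothetical, and the standard library's `DefaultHasher` stands in for a cryptographic hash purely to keep the example dependency-free; it is not collision-resistant and would not be suitable for a real audit log.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical audit entry that links to its predecessor by hash.
#[derive(Debug, Clone)]
pub struct AuditEntry {
    pub prev_hash: u64,
    pub payload: String,
    pub hash: u64,
}

/// Stand-in digest over (previous hash, payload).
/// NOTE: DefaultHasher is NOT cryptographic; a real chain would use e.g. SHA-256.
fn entry_hash(prev_hash: u64, payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    prev_hash.hash(&mut h);
    payload.hash(&mut h);
    h.finish()
}

#[derive(Default)]
pub struct AuditChain {
    pub entries: Vec<AuditEntry>,
}

impl AuditChain {
    /// Append an entry whose hash covers the previous entry's hash.
    pub fn append(&mut self, payload: &str) {
        let prev_hash = self.entries.last().map_or(0, |e| e.hash);
        let hash = entry_hash(prev_hash, payload);
        self.entries.push(AuditEntry { prev_hash, payload: payload.to_string(), hash });
    }

    /// Recompute every link; any edited payload or broken link fails verification.
    pub fn verify(&self) -> bool {
        let mut expected_prev = 0u64;
        for e in &self.entries {
            if e.prev_hash != expected_prev || e.hash != entry_hash(e.prev_hash, &e.payload) {
                return false;
            }
            expected_prev = e.hash;
        }
        true
    }
}
```

Because every hash depends on all earlier entries, changing one record invalidates it and everything after it unless the attacker can rewrite the whole tail of the chain.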

8. OpenAI API Compatibility

Decision: Provide OpenAI-compatible endpoints (/v1/chat/completions, /v1/models).

Rationale: Enables drop-in replacement for existing tools and SDKs designed for OpenAI. Reduces integration friction for users migrating from other platforms.

Evidence: API includes openai_compat module (crates/openfang-api/src/lib.rs:8).

Technology Selection

| Technology | Purpose | Selection Rationale | Alternatives Considered |
| --- | --- | --- | --- |
| Rust | Core language | Memory safety, zero-cost abstractions, async support | Go, C++ |
| Tokio | Async runtime | Industry standard, excellent ecosystem, proven at scale | async-std, smol |
| Axum 0.8 | HTTP framework | Type-safe routing, Tower middleware integration | Actix-web, Warp |
| SQLite | Storage | Zero-config, single-file, sufficient performance | PostgreSQL, RocksDB |
| Wasmtime | WASM runtime | Fast compilation, fuel metering, WASI support | Wasmer, V8 |
| Serde | Serialization | Derive macros, format-agnostic, widely adopted | miniserde, speedy |
| DashMap | Concurrent maps | Sharded, fine-grained locking for low-contention concurrent access | RwLock, chashmap |
| Tracing | Observability | Structured logging, async-aware, span propagation | log, slog |
| Clap | CLI parsing | Derive macros, subcommand support, help generation | structopt, argh |
| Tauri 2.0 | Desktop app | Smaller binaries than Electron, native performance | Electron, Qt |

Configuration and Startup Flow

The kernel boot sequence follows a deterministic initialization order:

1. Configuration Loading: Read ~/.openfang/config.toml with #[serde(default)] for forward compatibility
2. Data Directory: Ensure ~/.openfang/data/ exists
3. Memory Substrate: Open SQLite database and run schema migrations (up to v5)
4. LLM Driver: Read API keys from environment, create provider-specific driver
5. Model Catalog: Build catalog with 51 builtin models, detect auth status
6. Metering Engine: Initialize cost catalog for pricing calculations
7. Core Subsystems: Initialize registry, capabilities, event bus, scheduler, supervisor
8. Extended Systems: Initialize workflows, triggers, background executor, sandbox
9. Protocol Connections: Establish MCP connections, discover A2A agents
10. Background Agents: Start configured background agents

The boot sequence is documented in the architecture specification (docs/architecture.md:69-112).

Key Configuration Structures

rust
pub struct KernelConfig {
    // LLM configuration
    pub default_model: Option<String>,
    pub providers: Vec<ProviderConfig>,

    // Memory configuration
    pub memory_decay_rate: f64,

    // Security configuration
    pub capabilities: Vec<CapabilityGrant>,

    // Protocol configuration
    pub mcp_servers: Vec<McpServerConfigEntry>,
    pub a2a_config: Option<A2aConfig>,
}

All configuration structs use #[serde(default)] for forward-compatible TOML parsing, allowing new fields to be added without breaking existing configurations.