Architecture Overview - google/adk-go

ADK-Go System Architecture Overview

ADK-Go (Agent Development Kit for Go) is a framework for building LLM-based agent systems. Its architecture is modular and supports agent-to-agent transfer, flexible tool integration, and session management. The framework follows a hierarchical agent tree model: each Agent can have SubAgents, and control passes between them based on LLM reasoning.

Core Agent Components

The Agent is the basic unit of the ADK-Go architecture and acts as a container for processing logic and SubAgents. Each Agent is defined through a Config struct that holds its essential properties: name, description, list of SubAgents, and callback functions.

Config defines the Agent's name (a non-empty string that must be unique within the agent tree), a description of its capabilities (which the LLM uses to decide whether to transfer control to it), and the list of SubAgents to which the Agent can delegate tasks (agent/agent.go:74-91).

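go
type Config struct {
    // Name must be non-empty and unique within the agent tree.
    Name string
    // Description tells the LLM what the agent can do; it drives transfer decisions.
    Description string
    // SubAgents are the agents this agent can delegate to.
    SubAgents []Agent
    // Callbacks that run before and after the agent's main logic.
    BeforeAgentCallbacks []BeforeAgentCallback
    AfterAgentCallbacks  []AfterAgentCallback
    // Run executes the agent and streams events and errors.
    Run func(ctx InvocationContext) iter.Seq2[*session.Event, error]
}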

The constructor New(cfg Config) validates that the SubAgent list contains no duplicates, then creates the Agent instance with the configured callbacks (agent/agent.go:53-72). BeforeAgentCallbacks and AfterAgentCallbacks let callers hook into execution before and after the Agent runs its main logic.
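
The validation step can be illustrated with a minimal sketch. It does not reproduce the code in agent/agent.go: the node type and the validateSubAgents function are hypothetical stand-ins, and the sketch assumes that duplicates are detected by name.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for an agent; only the name matters here.
type node struct {
	name string
}

// validateSubAgents rejects an empty agent name and any sub-agent name
// that appears more than once in the list.
func validateSubAgents(name string, subs []*node) error {
	if name == "" {
		return errors.New("agent name must not be empty")
	}
	seen := make(map[string]bool, len(subs))
	for _, s := range subs {
		if seen[s.name] {
			return fmt.Errorf("duplicate sub-agent %q under %q", s.name, name)
		}
		seen[s.name] = true
	}
	return nil
}

func main() {
	billing, support := &node{name: "billing"}, &node{name: "support"}
	fmt.Println(validateSubAgents("root", []*node{billing, support}))
	fmt.Println(validateSubAgents("root", []*node{billing, billing}))
}
```

Rejecting duplicates at construction time means later lookups by name cannot be ambiguous.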

The Agent interface exposes Name(), Description(), and SubAgents() for reading the Agent's metadata, and Run() for executing its logic (agent/agent.go:148-158). Run() returns an iterator sequence of events and errors, which allows streaming responses from the LLM to be processed as they arrive.

Model and Tool Interfaces

Model Interface

The LLM interface is the abstraction layer for interacting with large language models. It defines a GenerateContent method that accepts a context, a request, and a streaming flag, and returns an iterator sequence of responses and errors (model/llm.go:25-29).

LLMRequest packages the data sent to the model: the model name, the list of contents (the conversation history), the config, and a tools map (model/llm.go:31-38). LLMResponse carries the model's response together with metadata such as citations, grounding, and usage statistics, plus the flags Partial, TurnComplete, and Interrupted that track streaming state.

Tool Interface

The Tool interface defines the contract for tools that an Agent can call. It requires the methods Name(), Description(), and IsLongRunning() (tool/tool.go:41-50). IsLongRunning marks tools that return a resource ID first and complete the operation later.

The tool Context interface gives a tool access to invocation information and Agent actions. It extends agent.CallbackContext and adds methods such as FunctionCallID(), Actions(), SearchMemory(), ToolConfirmation(), and RequestConfirmation() (tool/tool.go:52-66). RequestConfirmation enables Human-in-the-Loop (HITL) workflows, in which a tool asks for user approval before performing a sensitive operation.

Agent Transfer Mechanism

ADK-Go implements transfer between Agents in the hierarchy through TransferToAgentTool. This tool lets the LLM decide to hand control to another Agent based on the Agents' capability descriptions.

TransferToAgentTool is defined with a description that explains its purpose: transferring the question to another Agent that is better suited to answer it (internal/llminternal/agent_transfer.go:99-132). The tool takes an agent_name parameter that names the target Agent, and AgentTransferRequestProcessor injects it automatically into the LLM Agent's request.

The transferTargets function determines which Agents are valid transfer targets:

SubAgents: the Agent's direct children
Parent Agent: included if DisallowTransferToParent is false
Peer Agents: the Agent's siblings, included if DisallowTransferToPeers is false and the parent uses AutoFlow

(internal/llminternal/agent_transfer.go:158-184)
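
The three target rules listed above can be sketched as follows. This is an illustration under simplifying assumptions, not the code in agent_transfer.go: agentNode is a hypothetical type that stores its own parent pointer, and the check that the parent uses AutoFlow is left out.

```go
package main

import "fmt"

// agentNode is a hypothetical, simplified agent; the real types differ.
type agentNode struct {
	name                     string
	parent                   *agentNode
	subs                     []*agentNode
	disallowTransferToParent bool
	disallowTransferToPeers  bool
}

// transferTargets lists the names an agent may transfer to:
// its children, then its parent, then its siblings.
func transferTargets(a *agentNode) []string {
	var out []string
	for _, s := range a.subs {
		out = append(out, s.name)
	}
	if a.parent == nil {
		return out // the root has no parent and no peers
	}
	if !a.disallowTransferToParent {
		out = append(out, a.parent.name)
	}
	if !a.disallowTransferToPeers {
		for _, p := range a.parent.subs {
			if p != a {
				out = append(out, p.name)
			}
		}
	}
	return out
}

func main() {
	root := &agentNode{name: "root"}
	billing := &agentNode{name: "billing", parent: root}
	support := &agentNode{name: "support", parent: root}
	refunds := &agentNode{name: "refunds", parent: billing}
	root.subs = []*agentNode{billing, support}
	billing.subs = []*agentNode{refunds}
	fmt.Println(transferTargets(billing)) // [refunds root support]
}
```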

The isTransferableAcrossAgentTree method in Runner checks whether transfer is possible across the agent tree. It walks from the current Agent up the parent chain and verifies that every Agent on the path is an LLMAgent and that none has DisallowTransferToParent set (runner/runner.go:390-403).
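
A rough sketch of such a parent-chain walk is shown below. The agentInfo type, the string-keyed maps, and the transferableAcrossTree function are assumptions made for illustration; the real Runner works with Agent values rather than names.

```go
package main

import "fmt"

// agentInfo holds the two properties the check needs (hypothetical type).
type agentInfo struct {
	isLLMAgent               bool
	disallowTransferToParent bool
}

// transferableAcrossTree walks from name up to the root using the parents map.
// It returns false as soon as one agent on the path is unknown, is not an
// LLM agent, or disallows transfer to its parent.
func transferableAcrossTree(name string, parents map[string]string, info map[string]agentInfo) bool {
	cur := name
	for {
		i, known := info[cur]
		if !known || !i.isLLMAgent || i.disallowTransferToParent {
			return false
		}
		parent, hasParent := parents[cur]
		if !hasParent {
			return true // reached the root
		}
		cur = parent
	}
}

func main() {
	parents := map[string]string{"refunds": "billing", "billing": "root"}
	info := map[string]agentInfo{
		"root":    {isLLMAgent: true},
		"billing": {isLLMAgent: true},
		"refunds": {isLLMAgent: true},
	}
	fmt.Println(transferableAcrossTree("refunds", parents, info))
}
```

The sketch assumes the parents map describes a tree; a cycle in the map would make the loop run forever.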

Tool and Function Call Handling

Function Call Processing Flow

The handleFunctionCalls function in base_flow.go processes the function calls in an LLM response. It extracts the list of function calls from the response content and runs the corresponding tools (internal/llminternal/base_flow.go:560-577).

Tool call execution is traced. When a response contains more than one tool call, a merged span is created to trace the whole batch.

Long-running Function Calls

The findLongRunningFunctionCallIDs function scans the function calls in the content and collects the IDs of calls whose tool reports IsLongRunning() as true (internal/llminternal/base_flow.go:519-530). This lets the system treat long-running asynchronous operations differently from calls that complete immediately.
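
The scan can be pictured with the sketch below. The functionCall and simpleTool types and the longRunningCallIDs function are hypothetical; only the IsLongRunning() method name comes from the Tool interface described earlier.

```go
package main

import "fmt"

// functionCall is a simplified function call as emitted by the model.
type functionCall struct {
	ID   string
	Name string
}

// simpleTool is a hypothetical tool with only the property this sketch needs.
type simpleTool struct{ longRunning bool }

func (t simpleTool) IsLongRunning() bool { return t.longRunning }

// longRunningCallIDs returns the IDs of calls whose tool is registered
// and reports itself as long-running.
func longRunningCallIDs(calls []functionCall, tools map[string]simpleTool) []string {
	var ids []string
	for _, c := range calls {
		if t, ok := tools[c.Name]; ok && t.IsLongRunning() {
			ids = append(ids, c.ID)
		}
	}
	return ids
}

func main() {
	tools := map[string]simpleTool{
		"get_time":      {longRunning: false},
		"export_report": {longRunning: true},
	}
	calls := []functionCall{
		{ID: "call-1", Name: "get_time"},
		{ID: "call-2", Name: "export_report"},
	}
	fmt.Println(longRunningCallIDs(calls, tools)) // [call-2]
}
```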

Error Handling

The newToolNotFoundError function builds a detailed error message for the case where the LLM hallucinates a function name or the tool was never registered (internal/llminternal/base_flow.go:542-558). The error message includes:

The name of the tool that was not found
The list of available tools
Possible causes (LLM hallucination, unregistered tool, typo)
A suggested fix for each cause
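
An error carrying the four pieces of information listed above might look like the sketch below. The toolNotFoundError type, the newToolNotFound function, and the message wording are all hypothetical; the actual text produced by ADK-Go is not reproduced here.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// toolNotFoundError records the missing name and the registered tool names.
type toolNotFoundError struct {
	Name      string
	Available []string
}

// Error renders the name, the available tools, likely causes, and fixes.
func (e *toolNotFoundError) Error() string {
	return fmt.Sprintf("tool %q not found; available tools: %s; "+
		"possible causes: the model invented the name, the tool was not registered, or the name has a typo; "+
		"fixes: restrict the model to the listed names, register the tool, or correct the spelling",
		e.Name, strings.Join(e.Available, ", "))
}

// newToolNotFound builds the error with a sorted, stable list of tool names.
func newToolNotFound(name string, registered map[string]bool) *toolNotFoundError {
	names := make([]string, 0, len(registered))
	for n := range registered {
		names = append(names, n)
	}
	sort.Strings(names)
	return &toolNotFoundError{Name: name, Available: names}
}

func main() {
	registered := map[string]bool{"get_weather": true, "get_time": true}
	fmt.Println(newToolNotFound("get_wether", registered))
}
```

Listing the registered names in the message gives the model, or the developer reading the log, enough context to correct the call on the next attempt.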

Session and State Management

Session Service

The Session Service provides CRUD operations for session management. Each operation has its own request and response structs:

GetRequest defines the request for retrieving a session, with filters such as NumRecentEvents and an After timestamp (session/service.go:59-75). GetResponse holds the retrieved session.

ListRequest/ListResponse and DeleteRequest define the requests for listing and deleting sessions (session/service.go:77-93). The Session Service interface abstracts over different implementations, such as in-memory, database-backed, or distributed storage.

Runner and Agent Tree Management

Runner manages Agent execution and tracks parent relationships in its parents map. The findAgent function performs a depth-first traversal to find an Agent by name, starting from the current Agent and recursing through its SubAgents (runner/runner.go:405-416).
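
A depth-first lookup by name can be sketched in a few lines. The treeAgent type is a hypothetical stand-in for the Agent interface, and this findAgent is an illustration rather than the Runner's code.

```go
package main

import "fmt"

// treeAgent is a hypothetical agent with a name and children.
type treeAgent struct {
	name string
	subs []*treeAgent
}

// findAgent returns the first agent named name in a depth-first,
// pre-order walk starting at cur, or nil if none matches.
func findAgent(cur *treeAgent, name string) *treeAgent {
	if cur == nil {
		return nil
	}
	if cur.name == name {
		return cur
	}
	for _, s := range cur.subs {
		if found := findAgent(s, name); found != nil {
			return found
		}
	}
	return nil
}

func main() {
	refunds := &treeAgent{name: "refunds"}
	billing := &treeAgent{name: "billing", subs: []*treeAgent{refunds}}
	root := &treeAgent{name: "root", subs: []*treeAgent{billing, {name: "support"}}}
	fmt.Println(findAgent(root, "refunds") == refunds) // true
	fmt.Println(findAgent(root, "unknown") == nil)     // true
}
```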

Streaming Response Aggregation

streamingResponseAggregator aggregates the partial responses produced by streaming LLM calls. It maintains state for:

Usage metadata, grounding metadata, and citation metadata
A text buffer for streamed text
Function call state (name, ID, args) for streamed function calls

(internal/llminternal/stream_aggregator.go:34-49)

The ProcessResponse method transforms a GenerateContentResponse into an LLMResponse and also yields intermediate aggregated responses when they are available (internal/llminternal/stream_aggregator.go:56-79). The processStreamingFunctionCallPart function builds a function call from partial args using JSONPath (internal/llminternal/stream_aggregator.go:144-167).
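
The text-buffer part of this aggregation can be sketched as follows. This covers only streamed text, not metadata or function calls, and the llmResponse and textAggregator types are hypothetical simplifications of the structures named above.

```go
package main

import (
	"fmt"
	"strings"
)

// llmResponse is a simplified response: some text and a Partial flag.
type llmResponse struct {
	Text    string
	Partial bool
}

// textAggregator buffers streamed text chunks until the stream ends.
type textAggregator struct {
	buf strings.Builder
}

// process records the chunk and passes it through as a partial response.
func (a *textAggregator) process(chunk string) llmResponse {
	a.buf.WriteString(chunk)
	return llmResponse{Text: chunk, Partial: true}
}

// finish returns the full buffered text as one final, non-partial response.
// The boolean is false when nothing was buffered.
func (a *textAggregator) finish() (llmResponse, bool) {
	if a.buf.Len() == 0 {
		return llmResponse{}, false
	}
	final := llmResponse{Text: a.buf.String()}
	a.buf.Reset()
	return final, true
}

func main() {
	agg := &textAggregator{}
	for _, c := range []string{"The ", "weather ", "is sunny."} {
		fmt.Printf("%+v\n", agg.process(c))
	}
	final, ok := agg.finish()
	fmt.Println(final.Text, ok)
}
```

Clients that render text as it arrives use the partial responses, while code that stores history only needs the final aggregated one.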

Overall System Architecture

Architecture layers:

Client Layer: Handles requests from client applications or over the A2A protocol
Runner Layer: Coordinates Agent execution and manages sessions and parent relationships
Agent Layer: The agent tree, with transfer between its nodes
LLM Layer: Handles communication with LLM providers and streaming aggregation
Tool Layer: Executes tools with context and confirmation support

Request Data Flow

Data flow steps:

Request Initiation: The client sends a request to the Runner, which loads or creates a session
Agent Invocation: The Runner invokes Agent.Run(), and the Agent executes its before callbacks
LLM Processing: The Flow sends the request to the LLM and processes the streaming responses
Tool Execution: When the response contains function calls, the Flow executes the tools after a confirmation check
Response Aggregation: The stream aggregator combines the partial responses into a final response
Session Update: Events are saved to the session and streamed back to the client

Module Dependencies

Dependencies:

agent: Core module that defines the Agent interface and Config; depends on session, model, and tool
llminternal: Internal module that handles LLM flow logic; depends on most of the core modules
runner: Coordinates execution; depends on agent and llminternal
tool: Defines the tool interface; depends back on agent for its context types

Practical Integration Example

A2A (Agent-To-Agent) Integration

The A2A example shows how to create an Agent and expose it over the Agent-To-Agent protocol. The newWeatherAgent function creates a simple LLM Agent with a Gemini model and the Google Search tool (examples/a2a/main.go:44-63).

go
// Imports are omitted: context, log, os, and the adk-go and genai packages used below.
func newWeatherAgent(ctx context.Context) agent.Agent {
    model, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{
        APIKey: os.Getenv("GOOGLE_API_KEY"),
    })
    if err != nil {
        log.Fatalf("Failed to create model: %v", err)
    }

    // Named a rather than agent to avoid shadowing the agent package.
    a, err := llmagent.New(llmagent.Config{
        Name:        "weather_time_agent",
        Model:       model,
        Description: "Agent to answer questions about the time and weather in a city.",
        Instruction: "I can answer your questions about the time and weather in a city.",
        Tools:       []tool.Tool{geminitool.GoogleSearch{}},
    })
    if err != nil {
        log.Fatalf("Failed to create agent: %v", err)
    }
    return a
}

The startWeatherAgentServer function starts an HTTP server that exposes the Agent over the A2A protocol, with an AgentCard that carries metadata about its capabilities and skills (examples/a2a/main.go:66-108). The Runner is configured with the Agent and the in-memory Session Service (examples/a2a/main.go:92-98).

Technical Summary Table

Technique	Purpose	Rationale	Alternatives
Go Generics	Type-safe collections	Compile-time safety, performance	Interface-based collections
iter.Seq2	Streaming iteration	Go 1.23+ native support, lazy evaluation	Channels, callback functions
google.golang.org/genai	LLM client library	Official Google SDK, feature complete	OpenAI SDK, Anthropic SDK
Hierarchical Agent Tree	Agent organization	Natural delegation model, clear ownership	Flat agent registry
InMemory Session Service	Session storage	Zero dependencies, fast for development	Redis, PostgreSQL
A2A Protocol	Agent-to-agent communication	Standard protocol, interoperability	gRPC, REST
Tool Confirmation	HITL workflow	Safety for sensitive operations	Automatic execution
Streaming Aggregation	Response handling	Real-time feedback, reduced latency	Batch processing
Telemetry/Tracing	Observability	Debug production issues	Logging only

Key Design Decisions

1. Iterator-based Streaming

ADK-Go uses iter.Seq2 from Go 1.23+ for streaming instead of channels. This gives lazy evaluation and lets the consumer control the iteration lifecycle, for example by stopping early. Evidence: model/llm.go:28 defines GenerateContent as returning iter.Seq2[*LLMResponse, error].

2. Callback-based Agent Lifecycle

Before and After callbacks inject logic into the Agent execution lifecycle without modifying the Agent's code. Callbacks run sequentially and can short-circuit execution. Evidence: agent/agent.go:89-91 defines the callback types.
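
Sequential callbacks with short-circuiting can be sketched as below. The beforeCallback signature and the runWithCallbacks function are hypothetical and much simpler than the real BeforeAgentCallback type; the sketch only shows the control flow.

```go
package main

import "fmt"

// beforeCallback inspects the input. Returning handled == true supplies
// the reply directly and skips both later callbacks and the agent itself.
type beforeCallback func(input string) (reply string, handled bool)

// runWithCallbacks runs callbacks in order and stops at the first one that
// handles the input; otherwise it falls through to the agent's own logic.
func runWithCallbacks(input string, before []beforeCallback, run func(string) string) string {
	for _, cb := range before {
		if reply, handled := cb(input); handled {
			return reply
		}
	}
	return run(input)
}

func main() {
	cache := map[string]string{"ping": "pong (cached)"}
	fromCache := func(in string) (string, bool) {
		reply, ok := cache[in]
		return reply, ok
	}
	agent := func(in string) string { return "agent answered: " + in }

	fmt.Println(runWithCallbacks("ping", []beforeCallback{fromCache}, agent))
	fmt.Println(runWithCallbacks("hello", []beforeCallback{fromCache}, agent))
}
```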

3. Tool Confirmation Pattern

Tool confirmation uses error-based flow control, with ErrConfirmationRequired and ErrConfirmationRejected, to pause tool execution and request user approval. Evidence: tool/tool.go:35-39 defines the error types.
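
The general idea of sentinel errors as control flow can be sketched with the standard errors package. The two sentinels below are local variables that only mirror the names cited above, and deleteFile and status are hypothetical; none of this is the tool package's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// Local sentinels that mirror the cited names; not the real tool package values.
var (
	errConfirmationRequired = errors.New("confirmation required")
	errConfirmationRejected = errors.New("confirmation rejected")
)

// deleteFile is a hypothetical sensitive tool. approved is nil until the
// user has answered the confirmation request.
func deleteFile(path string, approved *bool) (string, error) {
	switch {
	case approved == nil:
		return "", fmt.Errorf("delete %s: %w", path, errConfirmationRequired)
	case !*approved:
		return "", fmt.Errorf("delete %s: %w", path, errConfirmationRejected)
	}
	return "deleted " + path, nil
}

// status maps a tool error to what the caller should do next.
func status(err error) string {
	switch {
	case err == nil:
		return "done"
	case errors.Is(err, errConfirmationRequired):
		return "paused: ask the user"
	case errors.Is(err, errConfirmationRejected):
		return "cancelled"
	default:
		return "failed"
	}
}

func main() {
	_, err := deleteFile("/tmp/report.csv", nil)
	fmt.Println(status(err))

	yes := true
	out, err := deleteFile("/tmp/report.csv", &yes)
	fmt.Println(out, status(err))
}
```

Wrapping the sentinel with %w keeps the context in the message while still letting the caller branch on errors.Is.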

4. Parent Map for Agent Tree

Runner keeps parent relationships in a map instead of storing them in the Agent struct, so the same Agent instance can be reused at different positions in the tree. Evidence: runner/runner.go:391 accesses the r.parents map.

5. Streaming Function Call Building

Function calls are built incrementally from partial args using JSONPath, which lets the LLM stream function call arguments. Evidence: internal/llminternal/stream_aggregator.go:151-161 processes the partial args.

Configuration and Startup

Basic Agent Configuration

go
config := llmagent.Config{
    Name:        "my_agent",
    Model:       model,
    Description: "Agent description for LLM routing",
    Instruction: "System instruction for the agent",
    Tools:       []tool.Tool{tool1, tool2},
    SubAgents:   []agent.Agent{subAgent1, subAgent2},
}

Runner Configuration

go
runnerConfig := runner.Config{
    AppName:        "my_app",
    Agent:          rootAgent,
    SessionService: session.InMemoryService(),
}

Starting with A2A

go
// mux is an *http.ServeMux and agentPath is the route to serve; both are defined elsewhere.
executor := adka2a.NewExecutor(adka2a.ExecutorConfig{
    RunnerConfig: runnerConfig,
})
handler := a2asrv.NewHandler(executor)
mux.Handle(agentPath, a2asrv.NewJSONRPCHandler(handler))

Known Limitations

Sequential Tool Execution: Tool calls are currently executed sequentially, not concurrently. A comment in the code reads: "TODO: check feasibility of running tool.Run concurrently" (internal/llminternal/base_flow.go:563).
Experimental ConfirmationProvider: ConfirmationProvider and WithConfirmation are marked experimental and are outside the scope of the v1.0 API (tool/tool.go:182-183).
Agent Type Support: Agent transfer currently supports only LLMAgent types. A comment in the code reads: "TODO: support agent types other than LLMAgent, that have parent/subagents?" (internal/llminternal/agent_transfer.go:70).