AgentSociety: Technical Architecture

System Architecture Overview

AgentSociety's technical architecture is designed to support large-scale social simulations with thousands of LLM-driven agents interacting in a realistic societal environment. The architecture consists of three main components:

Shared Services: Common infrastructure used across all simulations
Simulation Tasks: Experiment-specific computational tasks
GUI Component: Optional visualization and interaction interface

Shared Services

LLM API: The core intelligence behind agents, providing a standard request-response interface
- Supports public services (OpenAI, DeepSeek) or local deployment (vLLM, Ollama)
- Handles token management and response parsing
MQTT Server: High-performance messaging system for inter-agent communication
- Uses the EMQX implementation for reliability and scalability
- Enables protocol-compliant message delivery between agents
Database: PostgreSQL database for storing simulation results
- Optimized for high-performance batch writing using COPY FROM commands
- Stores agent states, interactions, and experimental outcomes
Metric Recorder: MLflow-based system for tracking experimental metrics
- Centralized server capabilities for research collaboration
- Records key performance indicators and experimental results

Simulation Tasks

Each experiment corresponds to an Agent Simulation object that manages:

Environment Simulators: Run as subprocesses to maintain separation
- Urban environment (roads, POIs, transportation)
- Social environment (networks, interactions)
- Economic environment (firms, markets, policies)
Agent Groups: Organized as Ray actors operating in separate processes
- Each group contains multiple agents sharing client connections
- Enables distributed computing across multiple machines
- Balances communication costs with parallel acceleration

GUI Component

Backend: Connects to database and MQTT server
- Retrieves simulation data for visualization
- Processes user inputs for agent interaction
Frontend: Provides visualization and interaction interface
- Displays agent states, locations, and interactions
- Enables direct communication with agents through chat or surveys

Group-based Distributed Execution

AgentSociety scales to thousands of agents through a group-based execution model:

Agent Grouping Strategy

Agents are evenly distributed into multiple groups
Each group operates within a single process
Groups share client connections to shared services
Reduces TCP port resource consumption while maintaining agent independence
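
The even distribution described above can be sketched in a few lines. This is a minimal illustration, not AgentSociety's actual code: the function name split_into_groups and the use of integer agent IDs are assumptions made for the example.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_groups(agents: Sequence[T], num_groups: int) -> List[List[T]]:
    """Distribute agents as evenly as possible across num_groups groups.

    Hypothetical helper: group sizes differ by at most one agent.
    """
    if num_groups <= 0:
        raise ValueError("num_groups must be positive")
    base, extra = divmod(len(agents), num_groups)
    groups: List[List[T]] = []
    start = 0
    for i in range(num_groups):
        # The first `extra` groups each take one additional agent.
        size = base + (1 if i < extra else 0)
        groups.append(list(agents[start:start + size]))
        start += size
    return groups


# Ten agents over four groups; each group would then run in one process.
groups = split_into_groups(list(range(10)), 4)
group_sizes = [len(g) for g in groups]  # [3, 3, 2, 2]
```

In the real system each group would presumably be handed to one Ray actor, so the number of groups, not the number of agents, determines how many processes and client connections exist.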

Parallel Execution

Uses the Ray framework for multi-process parallel execution
Leverages Python's asyncio for asynchronous I/O within processes
Enables concurrent LLM requests while maximizing CPU utilization
Supports horizontal scaling across multiple machines
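
To illustrate the asyncio part of this design, the sketch below issues many simulated LLM requests concurrently inside one process while capping how many are in flight. It is a simplified illustration under stated assumptions: fake_llm_call stands in for a real API client, max_concurrency is a hypothetical parameter, and Ray is left out so the example runs with the standard library alone.

```python
import asyncio
from typing import Iterable, List


async def fake_llm_call(agent_id: int) -> str:
    """Stand-in for a real LLM request; sleeps to mimic network latency."""
    await asyncio.sleep(0.01)
    return f"reply-for-agent-{agent_id}"


async def run_group(agent_ids: Iterable[int], max_concurrency: int = 8) -> List[str]:
    """Run one step for every agent in a group, overlapping the I/O waits."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def step(agent_id: int) -> str:
        # The semaphore bounds the number of simultaneous requests.
        async with semaphore:
            return await fake_llm_call(agent_id)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(step(a) for a in agent_ids))


replies = asyncio.run(run_group(range(20)))
```

Because the agents spend most of their time waiting on the API, a single process can keep many requests open at once; running several such groups in separate processes adds CPU parallelism on top.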

Performance Optimization

Connection pooling for LLM API calls
Asynchronous environment interactions
Efficient message routing through MQTT
Batch processing of database operations

MQTT-powered Agent Messaging System

The messaging system is a critical component enabling agent-to-agent communication and external interaction:

Topic Structure

exps/<exp_uuid>/agents/<agent_uuid>/agent-chat: For agent-to-agent messages
exps/<exp_uuid>/agents/<agent_uuid>/user-chat: For user-to-agent messages
exps/<exp_uuid>/agents/<agent_uuid>/user-survey: For structured surveys
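
A small helper can make the topic layout above concrete. The function names agent_topic and parse_agent_topic are hypothetical and only illustrate the naming scheme; the "+" in the last line is the standard MQTT single-level wildcard, which a subscriber could use to receive one channel for every agent in an experiment.

```python
from typing import Dict

# The three channels listed in the topic structure above.
CHANNELS = ("agent-chat", "user-chat", "user-survey")


def agent_topic(exp_uuid: str, agent_uuid: str, channel: str) -> str:
    """Build a topic of the form exps/<exp_uuid>/agents/<agent_uuid>/<channel>."""
    if channel not in CHANNELS:
        raise ValueError(f"unknown channel: {channel}")
    return f"exps/{exp_uuid}/agents/{agent_uuid}/{channel}"


def parse_agent_topic(topic: str) -> Dict[str, str]:
    """Split a topic back into its experiment, agent, and channel parts."""
    parts = topic.split("/")
    if (
        len(parts) != 5
        or parts[0] != "exps"
        or parts[2] != "agents"
        or parts[4] not in CHANNELS
    ):
        raise ValueError(f"not an agent topic: {topic}")
    return {"exp_uuid": parts[1], "agent_uuid": parts[3], "channel": parts[4]}


topic = agent_topic("exp-1", "agent-42", "agent-chat")
parsed_topic = parse_agent_topic(topic)
# Subscription filter matching the user-chat channel of every agent in exp-1.
all_user_chat = agent_topic("exp-1", "+", "user-chat")
```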

Implementation Benefits

Supports hundreds of thousands of connected agents
Provides reliable message delivery with minimal resource consumption
Enables publish/subscribe architecture for efficient message routing
Facilitates external interaction through standardized interfaces

Performance Metrics

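MQTT (EMQX) achieves a throughput of 44,702 messages per second
Outperforms RabbitMQ (23,667 msg/s); Redis Pub/Sub is faster in raw throughput (see Messaging System Performance) but lacks built-in tools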
Provides built-in GUI tools for monitoring and debugging

Utilities and Toolbox

AgentSociety includes comprehensive utilities to support development and research:

Core Utilities

LLM API Adapter: Supports multiple LLM providers with consistent interface
Retry Mechanism: Automatically handles LLM API errors
JSON Parser: Processes structured responses from LLMs
Metric Recorder: Tracks statistical metrics during experiments
Logging and Saving: Archives simulation data in AVRO format and PostgreSQL
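
The retry mechanism and JSON parser can be pictured roughly as follows. This is a sketch of the general technique (exponential backoff plus extraction of the first JSON object from free text), not the project's implementation: call_with_retry, extract_json, and flaky_llm are hypothetical names, and a real adapter would catch provider-specific errors rather than every exception.

```python
import json
import time
from typing import Any, Callable, Dict


def call_with_retry(func: Callable[[], str], max_attempts: int = 3,
                    base_delay: float = 0.01) -> str:
    """Call func, retrying with exponential backoff when it raises."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:  # assumption: any error is treated as transient
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


def extract_json(text: str) -> Dict[str, Any]:
    """Return the first JSON object embedded in an LLM reply."""
    decoder = json.JSONDecoder()
    for idx, ch in enumerate(text):
        if ch == "{":
            try:
                obj, _ = decoder.raw_decode(text, idx)
                return obj
            except json.JSONDecodeError:
                continue  # not valid JSON from this brace; keep scanning
    raise ValueError("no JSON object found")


attempts = {"count": 0}


def flaky_llm() -> str:
    """Simulated API that fails twice before answering."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated transient API error")
    return 'Sure. {"action": "walk", "minutes": 15} Done.'


raw = call_with_retry(flaky_llm)
parsed = extract_json(raw)
```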

Social Science Toolbox

Intervention Tools:
- Agent Configuration: Modifies internal settings before simulation
- State Manipulation: Alters agent states during simulation
- Message Notification: Sends external stimuli to agents
Interview System:
- Enables direct questioning of agents
- Processes responses without interrupting ongoing actions
- Distributes questions via MQTT messaging
Survey System:
- Distributes structured questionnaires to agents
- Collects formatted responses for analysis
- Supports various response formats (multiple-choice, ranking, etc.)
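
As an illustration of what a structured questionnaire might look like on the wire, the sketch below models a survey as dataclasses, serializes it to JSON (for example, for publication on an agent's user-survey topic), and checks answers against the question type. The schema, field names, and validate_answer helper are all assumptions for the example; the source does not specify the actual survey format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, List


@dataclass
class Question:
    qid: str
    text: str
    kind: str  # "multiple-choice" or "ranking" (hypothetical type names)
    options: List[str] = field(default_factory=list)


@dataclass
class Survey:
    survey_id: str
    questions: List[Question]

    def to_payload(self) -> str:
        """Serialize the survey to a JSON string suitable for messaging."""
        return json.dumps(asdict(self))


def validate_answer(question: Question, answer: Any) -> bool:
    """Check that an agent's answer fits the question's response format."""
    if question.kind == "multiple-choice":
        return answer in question.options
    if question.kind == "ranking":
        # A ranking must be a permutation of the offered options.
        return isinstance(answer, list) and sorted(answer) == sorted(question.options)
    return False


q1 = Question("q1", "How do you usually commute?", "multiple-choice",
              ["car", "bus", "bike"])
q2 = Question("q2", "Rank these policy priorities.", "ranking",
              ["housing", "transport"])
payload = Survey("survey-1", [q1, q2]).to_payload()
decoded = json.loads(payload)
```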

Performance Evaluation

Comprehensive performance testing reveals AgentSociety's capabilities and limitations:

Environment Performance

Handles 1,000,000 agents, with step time growing far more slowly than agent count (about 20x for a 1,000x increase in agents)
Mean time per simulation step:
- 10³ agents: 8.578×10⁻³ seconds
- 10⁶ agents: 0.1680 seconds
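
The two data points above allow a quick check of how step time scales. The calculation below uses only the reported figures; fitting a power law through two points is an assumption made for illustration, so the resulting exponent is a rough indicator rather than a measured property.

```python
import math

# Reported mean time per simulation step (seconds).
agents_small, time_small = 10**3, 8.578e-3
agents_large, time_large = 10**6, 0.1680

agent_ratio = agents_large / agents_small  # 1000x more agents
time_ratio = time_large / time_small       # about 19.6x more time

# Per-agent cost falls as the simulation grows.
per_agent_small = time_small / agents_small
per_agent_large = time_large / agents_large

# If time ~ agents**k, two points give k = log(time_ratio) / log(agent_ratio).
exponent = math.log(time_ratio) / math.log(agent_ratio)  # about 0.43
```

An exponent well below 1 is consistent with sub-linear growth of the environment's step time over this range.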

Messaging System Performance

MQTT (EMQX): 44,702 messages per second
Redis Pub/Sub: 81,216 messages per second (higher throughput but lacks built-in tools)
RabbitMQ: 23,667 messages per second

Overall System Performance

Successfully simulates 10,000+ agents with realistic behaviors
LLM API calls remain the primary bottleneck
Parallel execution significantly improves performance; going from 8 to 32 processes cuts the time per round by roughly 12x:
- 10⁴ agents, 8 processes: 5,681 seconds per round
- 10⁴ agents, 32 processes: 458 seconds per round
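
The speedup implied by these two measurements can be worked out directly. The snippet uses only the reported numbers; the derived agents-per-second figures are simple ratios, not additional measurements.

```python
agents = 10**4
# processes -> seconds per round, as reported above
seconds_per_round = {8: 5681.0, 32: 458.0}

speedup = seconds_per_round[8] / seconds_per_round[32]  # about 12.4x
process_ratio = 32 / 8                                  # 4x more processes

# Effective throughput in agents processed per second of wall-clock time.
agents_per_second = {p: agents / t for p, t in seconds_per_round.items()}
```

The speedup is larger than the 4x increase in process count. The source does not explain this gap, so it is safer not to extrapolate the trend to other process counts.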

Technical Challenges and Solutions

Challenge: TCP Port Exhaustion

Problem: If each agent ran in its own process with its own client connections, the simulation would exhaust the available TCP ports (65,535 limit)

Solution: Group-based execution with connection sharing

Multiple agents operate within single processes
Shared client connections to services
Maintains agent independence through asynchronous execution
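
A back-of-the-envelope count shows why connection sharing matters. The model below is deliberately simplified and rests on assumptions: one connection per client per shared service, four services (LLM API, MQTT, database, metric recorder), and a hypothetical group size of 500 agents. Real port usage also depends on addresses and connection reuse.

```python
import math

PORT_LIMIT = 65_535
SERVICES = 4  # LLM API, MQTT, database, metric recorder (one connection each, assumed)


def connection_count(num_agents: int, agents_per_client: int,
                     services: int = SERVICES) -> int:
    """Connections needed when agents_per_client agents share one set of clients."""
    clients = math.ceil(num_agents / agents_per_client)
    return clients * services


# One set of connections per agent versus one set per group of 500 agents.
per_agent = connection_count(100_000, agents_per_client=1)   # 400000
grouped = connection_count(100_000, agents_per_client=500)   # 800

exceeds_limit = per_agent > PORT_LIMIT
```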

Challenge: LLM API Latency

Problem: LLM API calls introduce significant latency

Solution: Asynchronous execution and parallelization

Concurrent LLM requests through asyncio
Multi-process execution through Ray
Connection pooling for efficient resource utilization

Challenge: Inter-agent Communication

Problem: Efficient message routing between thousands of agents

Solution: MQTT-based messaging system

Lightweight publish/subscribe architecture
Topic-based routing for efficient delivery
Scalable to hundreds of thousands of connections

Challenge: Data Management

Problem: Storing and analyzing massive simulation data

Solution: Hybrid storage approach

PostgreSQL for structured data with COPY FROM optimization
AVRO format for local file storage
MLflow for metric tracking and experiment comparison
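
To show what batch writing with COPY FROM can look like, the sketch below accumulates rows in memory as CSV, which is the form a PostgreSQL driver can stream to the server in a single COPY statement. The table name agent_status, its columns, and the choice of driver are assumptions; the database call itself is left as a comment so the example runs without a server.

```python
import csv
import io
from typing import Iterable, Sequence


def rows_to_copy_buffer(rows: Iterable[Sequence[object]]) -> io.StringIO:
    """Serialize rows into an in-memory CSV buffer ready for COPY ... FROM STDIN."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    buf.seek(0)  # rewind so the driver reads from the start
    return buf


# Hypothetical table and columns, for illustration only.
COPY_SQL = "COPY agent_status (agent_id, step, status) FROM STDIN WITH (FORMAT csv)"

batch = [(1, 0, "idle"), (2, 0, "walking, slowly")]
buf = rows_to_copy_buffer(batch)
csv_text = buf.getvalue()

# With psycopg2 (an assumed driver), the batch could then be sent in one call:
#     cursor.copy_expert(COPY_SQL, buf)
```

Sending one COPY per batch avoids the per-row overhead of individual INSERT statements, which is the usual motivation for this optimization.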

Conclusion

AgentSociety's technical architecture represents a significant advancement in large-scale social simulation, addressing key challenges in scalability, communication, and computational efficiency. By leveraging distributed computing, asynchronous execution, and efficient messaging, the platform enables unprecedented scale and realism in agent-based social modeling.

The integration of sophisticated LLM-driven agents with a realistic societal environment and powerful simulation engine opens new possibilities for social science research, policy evaluation, and complex system modeling. As computational resources and LLM capabilities continue to advance, this architecture provides a foundation for even more ambitious simulations of human society.
