Last updated: 3/24/2025, 6:40:28 PM
AgentSociety Codebase Analysis
1. Overview
AgentSociety is a large-scale social simulation framework designed to support thousands of LLM-driven agents interacting in a realistic societal environment. The codebase is structured around three primary components:
- LLM-driven Social Generative Agents: Intelligent entities with psychological states, memory systems, and behavioral capabilities
- Realistic Societal Environment: Virtual world with urban, social, and economic spaces
- Large-Scale Simulation Engine: Computational infrastructure enabling efficient execution of simulations
The framework is implemented in Python and uses Ray for distributed computing, MQTT for messaging, and PostgreSQL for data storage. It supports various LLM providers including OpenAI, DeepSeek, and ZhipuAI.
2. Project Structure
The codebase is organized into the following main directories:
agentsociety/
├── agent/ # Base agent implementation
├── cityagent/ # City-specific agent implementations
│ ├── blocks/ # Behavioral blocks for agents
├── cli/ # Command-line interface
├── configs/ # Configuration handling
├── environment/ # Environment simulation
│ ├── economy/ # Economic simulation
│ ├── sim/ # Urban simulation
├── llm/ # LLM client implementations
├── memory/ # Memory systems
├── message/ # Messaging system
├── metrics/ # Metrics collection
├── simulation/ # Simulation management
├── survey/ # Survey system for agents
├── tools/ # Utility tools
├── utils/ # Utility functions
└── workflow/ # Workflow management
3. Core Components
3.1 Agent Architecture
The agent system is built on a hierarchical class structure with Agent as the base class. The main agent implementations include:
- SocietyAgent: Citizen agents with social behaviors
- FirmAgent: Business entities in the economic system
- BankAgent: Financial institutions
- GovernmentAgent: Government entities
- NBSAgent: National Bureau of Statistics for economic monitoring
Each agent is composed of modular blocks that handle different aspects of behavior (see Section 4.4).
3.1.1 Agent Base Class
The Agent base class (in agent_base.py) provides the foundation for all agent types with core functionality:
class Agent(ABC):
    """Agent base class"""

    def __init__(
        self,
        name: str,
        type: AgentType = AgentType.Unspecified,
        llm_client: Optional[LLM] = None,
        economy_client: Optional[EconomyClient] = None,
        messager: Optional[ray.ObjectRef] = None,
        message_interceptor: Optional[ray.ObjectRef] = None,
        simulator: Optional[Simulator] = None,
        memory: Optional[Memory] = None,
        avro_file: Optional[dict[str, str]] = None,
        copy_writer: Optional[ray.ObjectRef] = None,
    ) -> None:
        # Initialization code...
Key methods in the base class include:
- forward(): Abstract method that defines agent behavior
- run(): Entry point for executing agent logic
- generate_user_chat_response(): Generates responses to user messages
- generate_user_survey_response(): Generates responses to surveys
- send_message_to_agent(): Sends messages to other agents
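The split between run() and forward() resembles a template-method pattern, which can be sketched as follows. This is a minimal illustration rather than the framework's code: MiniAgent, EchoAgent, and the body of run() are hypothetical, and only the method names forward and run come from the list above.

```python
import asyncio
from abc import ABC, abstractmethod


class MiniAgent(ABC):
    """Hypothetical stand-in for the Agent base class (illustration only)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.steps_run = 0

    @abstractmethod
    async def forward(self) -> str:
        """Subclasses define one step of behavior here."""

    async def run(self) -> str:
        # Assumed shape: run() wraps forward() with shared bookkeeping.
        self.steps_run += 1
        return await self.forward()


class EchoAgent(MiniAgent):
    """Concrete agent that only reports that it acted."""

    async def forward(self) -> str:
        return f"{self.name} acted (step {self.steps_run})"


result = asyncio.run(EchoAgent("agent_0").run())
```

Because forward() is abstract, MiniAgent itself cannot be instantiated; only subclasses that implement forward() can be run.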
3.1.2 SocietyAgent Implementation
The SocietyAgent class extends the base agent with social behaviors and cognitive capabilities:
class SocietyAgent(CitizenAgent):
    """Agent implementation with configurable cognitive/behavioral modules and social interaction capabilities."""

    update_with_sim = UpdateWithSimulator()
    mindBlock: MindBlock
    planAndActionBlock: PlanAndActionBlock
    configurable_fields = [
        "enable_cognition",
        "enable_mobility",
        "enable_social",
        "enable_economy",
    ]
The SocietyAgent implements a modular architecture with specialized blocks:
- MindBlock: Handles cognitive processes and emotions
- PlanAndActionBlock: Coordinates planning and action execution
  - MonthPlanBlock: Long-term planning
  - NeedsBlock: Manages agent needs
  - PlanBlock: Creates short-term plans
  - MobilityBlock: Handles movement in the environment
  - SocialBlock: Manages social interactions
  - EconomyBlock: Handles economic activities
3.2 Memory System
The memory system is a sophisticated component that provides agents with different types of memory:
3.2.1 Memory Architecture
The Memory class integrates three types of memory:
class Memory:
    """A class to manage different types of memory (state, profile, dynamic)."""

    def __init__(
        self,
        config: Optional[dict[Any, Any]] = None,
        profile: Optional[dict[Any, Any]] = None,
        base: Optional[dict[Any, Any]] = None,
        activate_timestamp: bool = False,
        embedding_model: Optional[Embeddings] = None,
        faiss_query: Optional[FaissQuery] = None,
    ) -> None:
        # Initialization code...
The three memory types are:
- ProfileMemory: Static information about the agent (demographics, personality)
- StateMemory: Dynamic state information (location, status)
- StreamMemory: Time-ordered events and experiences
3.2.2 Memory Retrieval
The memory system uses vector embeddings for semantic search:
async def search(
    self,
    query: str,
    tag: Optional[MemoryTag] = None,
    top_k: int = 3,
    day_range: Optional[tuple[int, int]] = None,
    time_range: Optional[tuple[int, int]] = None,
) -> str:
    """Search stream memory with optional filters and return formatted results."""
    # Implementation...
This allows agents to retrieve relevant memories based on semantic similarity, time, and other filters.
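How the filters and the similarity ranking might combine can be sketched in plain Python. This is an assumption-laden illustration, not the framework's implementation: MemoryRecord, its field names, cosine, and search_records are hypothetical, the ranges are treated as inclusive, and the sketch returns records rather than the formatted string that search() returns.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryRecord:
    """Hypothetical stream-memory entry; field names are assumptions."""
    text: str
    tag: str
    day: int
    second: int  # seconds since the start of the day
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_records(
    records: list[MemoryRecord],
    query_embedding: list[float],
    tag: Optional[str] = None,
    top_k: int = 3,
    day_range: Optional[tuple[int, int]] = None,
    time_range: Optional[tuple[int, int]] = None,
) -> list[MemoryRecord]:
    """Apply the optional filters first, then rank the survivors by similarity."""

    def keep(record: MemoryRecord) -> bool:
        if tag is not None and record.tag != tag:
            return False
        if day_range is not None and not (day_range[0] <= record.day <= day_range[1]):
            return False
        if time_range is not None and not (time_range[0] <= record.second <= time_range[1]):
            return False
        return True

    candidates = [r for r in records if keep(r)]
    candidates.sort(key=lambda r: cosine(r.embedding, query_embedding), reverse=True)
    return candidates[:top_k]


records = [
    MemoryRecord("ate lunch", "economy", 1, 43200, [1.0, 0.0]),
    MemoryRecord("met a friend", "social", 1, 50000, [0.0, 1.0]),
    MemoryRecord("chatted online", "social", 3, 1000, [0.1, 0.9]),
]
hits = search_records(records, [0.0, 1.0], tag="social", top_k=1)
```

Filtering before ranking keeps the similarity computation limited to records that can actually be returned.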
3.3 LLM Integration
The LLM system provides a unified interface to different language model providers:
class LLM:
    """Main class for the Large Language Model (LLM) object used by Agent(Soul)."""

    def __init__(self, config: LLMRequestConfig) -> None:
        """Initializes the LLM instance."""
        self.config = config
        # Initialization code...
Key features include:
- Support for multiple LLM providers (OpenAI, DeepSeek, ZhipuAI, etc.)
- Token usage tracking and cost estimation
- Asynchronous request handling with retry logic
- Round-robin client selection for load balancing
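Two of the features above, round-robin client selection and retry logic, can be sketched together. The names RoundRobinPool and request_with_retry, the use of ConnectionError as the retryable error, and the exponential backoff schedule are all assumptions made for illustration; the source does not show how the LLM class implements either feature.

```python
import asyncio
import itertools
from typing import Awaitable, Callable, Optional


class RoundRobinPool:
    """Hypothetical pool that hands out clients (here: plain labels) in rotation."""

    def __init__(self, clients: list[str]) -> None:
        if not clients:
            raise ValueError("at least one client is required")
        self._cycle = itertools.cycle(clients)

    def next_client(self) -> str:
        return next(self._cycle)


async def request_with_retry(
    pool: RoundRobinPool,
    send: Callable[[str, str], Awaitable[str]],
    prompt: str,
    retries: int = 3,
    base_delay: float = 0.01,
) -> str:
    """Try up to `retries` times, moving to the next client after each failure."""
    last_error: Optional[Exception] = None
    for attempt in range(retries):
        client = pool.next_client()
        try:
            return await send(client, prompt)
        except ConnectionError as exc:
            last_error = exc
            # Exponential backoff: base_delay, 2x, 4x, ...
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all retries failed") from last_error


calls: list[str] = []


async def flaky_send(client: str, prompt: str) -> str:
    """Fake transport in which the first client always fails."""
    calls.append(client)
    if client == "key-a":
        raise ConnectionError("simulated failure")
    return f"{client}:{prompt}"


pool = RoundRobinPool(["key-a", "key-b"])
answer = asyncio.run(request_with_retry(pool, flaky_send, "hi", base_delay=0.0))
```

Rotating clients on every attempt means a retry naturally lands on a different API key or endpoint than the one that just failed.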
3.4 Simulation Engine
The simulation engine coordinates the entire simulation process:
class AgentSimulation:
    """A class to simulate a multi-agent system."""

    def __init__(
        self,
        config: SimConfig,
        agent_class: Union[None, type[Agent], list[type[Agent]]] = None,
        agent_class_configs: Optional[dict] = None,
        metric_extractors: Optional[list[tuple[int, Callable]]] = None,
        agent_prefix: str = "agent_",
        exp_name: str = "default_experiment",
        logging_level: int = logging.WARNING,
    ):
        # Initialization code...
Key components include:
- Agent Group Management: Organizes agents into groups for distributed processing
- Environment Simulation: Manages the urban, social, and economic environments
- Experiment Tracking: Records metrics and experiment status
- Workflow Execution: Runs simulation steps according to the defined workflow
4. Key Implementation Details
4.1 Distributed Computing with Ray
AgentSociety uses Ray for distributed computing, allowing it to scale to thousands of agents:
# Agent groups are implemented as Ray actors
group = AgentGroup.remote(
    agent_class,
    number_of_agents,
    memory_config_function_group,
    self.config,
    self._map_ref,
    self.exp_name,
    self.exp_id,
    # Other parameters...
)
This approach addresses the TCP port exhaustion problem by grouping multiple agents within a single process and sharing client connections.
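The effect of grouping on connection count can be illustrated without Ray. In this sketch, FakeConnection, MiniGroup, and build_groups are hypothetical names, and a plain object stands in for a client that would hold a TCP socket; the point is only that the number of connections scales with the number of groups rather than the number of agents.

```python
from dataclasses import dataclass, field


@dataclass
class FakeConnection:
    """Stands in for a client that would hold one TCP socket."""
    owner: str


@dataclass
class MiniGroup:
    """Hypothetical agent group: many agents, one shared connection."""
    name: str
    agent_ids: list[int]
    connection: FakeConnection = field(init=False)

    def __post_init__(self) -> None:
        # One connection per group, shared by every agent in it.
        self.connection = FakeConnection(owner=self.name)


def build_groups(agent_ids: list[int], group_size: int) -> list[MiniGroup]:
    """Split agents into consecutive chunks of at most group_size."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return [
        MiniGroup(f"group_{i // group_size}", agent_ids[i : i + group_size])
        for i in range(0, len(agent_ids), group_size)
    ]


groups = build_groups(list(range(10)), group_size=4)
```

With ten agents and a group size of four, three groups are created, so three connections are opened instead of ten.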
4.2 MQTT Messaging System
The messaging system uses MQTT for efficient communication between agents:
async def send_message_to_agent(
    self, to_agent_id: int, content: str, type: str = "social"
):
    """Send a social or economy message to another agent."""
    # Implementation...
    topic = f"exps/{self._exp_id}/agents/{to_agent_id}/agent-chat"
    await self._messager.send_message.remote(
        topic,
        payload,
        self.id,
        to_agent_id,
    )
The topic structure follows a hierarchical pattern:
- exps/<exp_uuid>/agents/<agent_uuid>/agent-chat: For agent-to-agent communication
- exps/<exp_uuid>/agents/<agent_uuid>/user-chat: For user-to-agent communication
- exps/<exp_uuid>/agents/<agent_uuid>/user-survey: For surveys
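A small helper pair for building and parsing topics in this layout might look like the following. build_topic, parse_topic, and CHANNELS are hypothetical names introduced for illustration; only the topic pattern itself comes from the list above.

```python
CHANNELS = ("agent-chat", "user-chat", "user-survey")


def build_topic(exp_id: str, agent_id: object, channel: str) -> str:
    """Build a topic of the form exps/<exp>/agents/<agent>/<channel>."""
    if channel not in CHANNELS:
        raise ValueError(f"unknown channel: {channel}")
    return f"exps/{exp_id}/agents/{agent_id}/{channel}"


def parse_topic(topic: str) -> tuple[str, str, str]:
    """Return (exp_id, agent_id, channel) or raise ValueError if malformed."""
    parts = topic.split("/")
    if (
        len(parts) != 5
        or parts[0] != "exps"
        or parts[2] != "agents"
        or parts[4] not in CHANNELS
    ):
        raise ValueError(f"not an agent topic: {topic}")
    return parts[1], parts[3], parts[4]


topic = build_topic("exp1", 7, "agent-chat")
parsed = parse_topic(topic)
```

Because MQTT topic filters accept `+` as a single-level wildcard, a subscriber could in principle listen to every agent's chat in one experiment with a filter such as `exps/exp1/agents/+/agent-chat`; whether the framework subscribes this way is not stated in the source.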
4.3 Memory Implementation
The memory system uses FAISS for efficient vector search:
class FaissQuery:
    """A class for performing vector similarity search using FAISS."""

    async def similarity_search(
        self,
        query: str,
        agent_id: int,
        k: int = 4,
        return_score_type: Literal["similarity_score", "distance"] = "similarity_score",
        filter: Optional[dict] = None,
    ) -> list[tuple[str, float, dict]]:
        # Implementation...
This enables efficient semantic search across agent memories, which is crucial for context-relevant decision making.
4.4 Agent Behavior Implementation
Agent behaviors are implemented as modular blocks that can be enabled or disabled:
class PlanAndActionBlock(Block):
    """Active workflow coordinating needs assessment, planning, and action execution."""

    # Sub-modules for different behavioral aspects
    monthPlanBlock: MonthPlanBlock
    needsBlock: NeedsBlock
    planBlock: PlanBlock
    mobilityBlock: MobilityBlock
    socialBlock: SocialBlock
    economyBlock: EconomyBlock
    otherBlock: OtherBlock
Each block handles a specific aspect of agent behavior, making the system modular and extensible.
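One way such enable/disable behavior could work is to register only the enabled blocks and route each plan step by type. The sketch below is hypothetical throughout: MiniBlock, MiniPlanAndAction, and the step dictionary keys are invented for illustration, and only the enable_* flag names echo the configurable_fields shown in Section 3.1.2.

```python
import asyncio


class MiniBlock:
    """Hypothetical behavior block; a real block would call the LLM or simulator."""

    def __init__(self, name: str) -> None:
        self.name = name

    async def forward(self, step: dict) -> str:
        return f"{self.name} handled: {step['intention']}"


class MiniPlanAndAction:
    """Routes each plan step to the block registered for its type."""

    def __init__(
        self,
        enable_mobility: bool = True,
        enable_social: bool = True,
        enable_economy: bool = True,
    ) -> None:
        flags = {
            "mobility": enable_mobility,
            "social": enable_social,
            "economy": enable_economy,
        }
        # Only enabled blocks are instantiated and registered.
        self.blocks = {name: MiniBlock(name) for name, on in flags.items() if on}

    async def forward(self, step: dict) -> str:
        block = self.blocks.get(step["type"])
        if block is None:
            return f"skipped: {step['type']} is disabled"
        return await block.forward(step)


coordinator = MiniPlanAndAction(enable_economy=False)
social_result = asyncio.run(
    coordinator.forward({"type": "social", "intention": "greet a friend"})
)
economy_result = asyncio.run(
    coordinator.forward({"type": "economy", "intention": "buy food"})
)
```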
4.5 Social Interaction
Social interactions are implemented through a messaging system with content generation:
async def process_agent_chat_response(self, payload: dict) -> str:  # type:ignore
    """Process incoming social/economic messages and generate responses."""
    if payload["type"] == "social":
        # Extract message content
        sender_id = payload.get("from")
        raw_content = payload.get("content", "")
        # (Excerpt: the code that derives content, relationship_score, and
        # chat_histories is omitted here.)
        # Generate response using LLM
        response_prompt = f"""Based on:
- Received message: "{content}"
- Our relationship score: {relationship_score}/100
- My profile: {{
"gender": "{await self.memory.status.get("gender") or ""}",
"education": "{await self.memory.status.get("education") or ""}",
"personality": "{await self.memory.status.get("personality") or ""}",
"occupation": "{await self.memory.status.get("occupation") or ""}"
}}
- My current emotion: {await self.memory.status.get("emotion_types")}
- Recent chat history: {chat_histories.get(sender_id, "")}
Generate an appropriate response that:
1. Matches my personality and background
2. Maintains natural conversation flow
3. Is concise (under 100 characters)
4. Reflects our relationship level
Response should be ONLY the message text, no explanations."""
        response = await self.llm.atext_request(
            [
                {
                    "role": "system",
                    "content": "You are helping generate a chat response.",
                },
                {"role": "user", "content": response_prompt},
            ]
        )
This approach ensures that social interactions are personalized based on the agent's profile, emotional state, and relationship with the other agent.
5. Configuration System
AgentSociety uses a comprehensive configuration system with two main components:
5.1 Simulation Configuration
The SimConfig class defines the simulation environment settings:
sim_config = (
    SimConfig()
    .SetLLMRequest(
        request_type=LLMRequestType.ZhipuAI, api_key="YOUR-API-KEY", model="GLM-4-Flash"
    )
    .SetSimulatorRequest()
    .SetMQTT(server="mqtt.example.com", username="user", port=1883, password="pass")
    .SetMapRequest(file_path="map.pb")
    .SetPostgreSql(dsn="postgresql://user:pass@localhost:5432/db", enabled=True)
    .SetMetricRequest(
        username="mlflow_user", password="mlflow_pass", mlflow_uri="http://mlflow:5000"
    )
)
This configuration includes settings for:
- LLM API access
- Simulator parameters
- MQTT messaging
- Map data
- Database connection
- Metrics collection
5.2 Experiment Configuration
The ExpConfig class defines the experiment parameters:
exp_config = (
    ExpConfig(
        exp_name="allinone_economy", llm_semaphore=200, logging_level=logging.INFO
    )
    .SetAgentConfig(
        number_of_citizen=100,
        number_of_firm=5,
        memory_config_func={SocietyAgent: memory_config_societyagent},
        agent_class_configs={
            SocietyAgent: json.load(open("society_agent_config.json"))
        },
    )
    .SetWorkFlow(
        [
            WorkflowStep(type=WorkflowType.RUN, days=10, times=1, description=""),
        ]
    )
    .SetMetricExtractors(
        metric_extractors=[(1, economy_metric), (12, gather_ubi_opinions)]
    )
)
This configuration includes:
- Experiment metadata
- Agent population settings
- Workflow definition
- Metric collection functions
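The metric_extractors argument pairs an integer with a callable. One plausible reading, which the source does not confirm, is that the integer is an interval in simulation steps, so that (1, economy_metric) runs every step and (12, gather_ubi_opinions) runs every twelfth step. Under that assumption, the scheduling logic could be sketched as follows; run_with_extractors and the two recorder functions are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable

Extractor = Callable[[int], Awaitable[None]]


async def run_with_extractors(
    total_steps: int, extractors: list[tuple[int, Extractor]]
) -> None:
    """Call each extractor whenever the step count is a multiple of its interval."""
    for step in range(1, total_steps + 1):
        # A real simulation step would run here.
        for interval, extractor in extractors:
            if step % interval == 0:  # ASSUMPTION: the integer is a step interval
                await extractor(step)


every_step_calls: list[int] = []
every_twelfth_calls: list[int] = []


async def record_every_step(step: int) -> None:
    every_step_calls.append(step)


async def record_every_twelfth(step: int) -> None:
    every_twelfth_calls.append(step)


asyncio.run(
    run_with_extractors(24, [(1, record_every_step), (12, record_every_twelfth)])
)
```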
6. Workflow System
The workflow system defines how simulations are executed:
.SetWorkFlow(
    [
        WorkflowStep(type=WorkflowType.RUN, days=10, times=1, description=""),
    ]
)
Workflow steps can be of different types:
- RUN: Run the simulation for a specified number of days
- STEP: Execute a specific number of simulation steps
- INTERVIEW: Send interview questions to agents
- SURVEY: Send surveys to agents
- INTERVENE: Modify the simulation state
- FUNCTION: Execute a custom function
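A workflow of this kind is essentially a list of typed steps consumed by a dispatcher. The sketch below shows the idea for three of the step types. StepType, MiniStep, and run_workflow are hypothetical stand-ins for the framework's WorkflowType and WorkflowStep, and treating times as a repetition count is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class StepType(Enum):
    RUN = "run"
    STEP = "step"
    FUNCTION = "function"


@dataclass
class MiniStep:
    """Hypothetical workflow step; field meanings are assumptions."""
    type: StepType
    days: int = 0
    steps: int = 0
    times: int = 1  # ASSUMPTION: how often the step is repeated
    func: Optional[Callable[[], None]] = None


def run_workflow(workflow: list[MiniStep]) -> list[str]:
    """Dispatch each step by type and return a log of what was done."""
    log: list[str] = []
    for step in workflow:
        for _ in range(step.times):
            if step.type is StepType.RUN:
                log.append(f"run {step.days} day(s)")
            elif step.type is StepType.STEP:
                log.append(f"advance {step.steps} step(s)")
            elif step.type is StepType.FUNCTION and step.func is not None:
                log.append(f"call {step.func.__name__}")
                step.func()
            else:
                raise ValueError(f"cannot execute step: {step}")
    return log


snapshots: list[str] = []


def take_snapshot() -> None:
    snapshots.append("snapshot")


workflow_log = run_workflow(
    [
        MiniStep(StepType.RUN, days=10),
        MiniStep(StepType.FUNCTION, func=take_snapshot, times=2),
    ]
)
```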
7. Example Use Cases
The repository includes several example use cases that demonstrate the capabilities of the framework:
7.1 Universal Basic Income (UBI) Simulation
This example simulates the impact of UBI on a society:
import pickle as pkl


async def gather_ubi_opinions(simulation: AgentSimulation):
    # Collect the "ubi_opinion" value from every citizen agent and save it to disk
    citizen_agents = await simulation.filter(types=[SocietyAgent])
    opinions = await simulation.gather("ubi_opinion", citizen_agents)
    with open("opinions.pkl", "wb") as f:
        pkl.dump(opinions, f)
# Configuration and execution
exp_config = (
    ExpConfig(
        exp_name="allinone_economy", llm_semaphore=200, logging_level=logging.INFO
    )
    .SetAgentConfig(
        number_of_citizen=100,
        number_of_firm=5,
        memory_config_func={SocietyAgent: memory_config_societyagent},
        agent_class_configs={
            SocietyAgent: json.load(open("society_agent_config.json"))
        },
    )
    .SetWorkFlow(
        [
            WorkflowStep(type=WorkflowType.RUN, days=10, times=1, description=""),
        ]
    )
    .SetMetricExtractors(
        metric_extractors=[(1, economy_metric), (12, gather_ubi_opinions)]
    )
)
7.2 Hurricane Impact
This example simulates the impact of a hurricane on a community, including evacuation behaviors and economic effects.
7.3 Inflammatory Message Propagation
This example studies how inflammatory messages spread through a social network and how different intervention strategies affect their propagation.
7.4 Political Polarization
This example examines how echo chambers and backfiring effects contribute to political polarization in a social network.
8. Performance Optimization
The codebase includes several optimizations for performance and scalability:
8.1 Agent Grouping
Agents are organized into groups that run as separate Ray actors:
for i, (
    agent_class,
    number_of_agents,
    memory_config_function_group,
    group_name,
    config_file,
) in enumerate(group_creation_params):
    # create async task directly
    group = AgentGroup.remote(
        agent_class,
        number_of_agents,
        memory_config_function_group,
        self.config,
        self._map_ref,
        self.exp_name,
        self.exp_id,
        # Other parameters...
    )
    self._groups[group_name] = group
This approach allows for efficient distribution of agents across available computational resources.
8.2 Asynchronous Execution
The codebase extensively uses Python's asyncio for asynchronous execution:
async def step(self, num_simulator_steps: int = 1):
    """Execute one step of the simulation where each agent performs its forward action."""
    try:
        # step
        simulator_day = await self._simulator.get_simulator_day()
        simulator_time = int(
            await self._simulator.get_simulator_second_from_start_of_day()
        )
        logger.info(
            f"Start simulation day {simulator_day} at {simulator_time}, step {self._total_steps}"
        )
        tasks = []
        for group in self._groups.values():
            tasks.append(group.step.remote())
        self._simulator.step(num_simulator_steps)
        log_messages_groups = await asyncio.gather(*tasks)
        # More code...
This allows for efficient handling of I/O-bound operations like LLM API calls and database operations.
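The benefit for I/O-bound work can be seen in a small standalone sketch that runs many simulated LLM calls concurrently while capping how many are in flight. The cap mirrors the idea behind the llm_semaphore setting shown in Section 5.2, but fake_llm_call, step_all, and the bookkeeping dictionary are hypothetical, and how the framework applies its semaphore is not shown in the source.

```python
import asyncio


async def fake_llm_call(agent_id: int, semaphore: asyncio.Semaphore, stats: dict) -> str:
    """Simulated LLM request; the sleep stands in for network latency."""
    async with semaphore:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)
        stats["active"] -= 1
    return f"agent {agent_id} done"


async def step_all(num_agents: int, limit: int) -> tuple[list[str], int]:
    """Run one call per agent concurrently, with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(fake_llm_call(i, semaphore, stats) for i in range(num_agents))
    )
    return list(results), stats["peak"]


results, peak_in_flight = asyncio.run(step_all(num_agents=10, limit=3))
```

asyncio.gather returns results in the order the awaitables were passed, so results[i] belongs to agent i even though the calls overlap in time.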
8.3 Database Optimization
The system uses batch operations for database writes:
if self._pgsql_writer is not None:
    if self._last_asyncio_pg_task is not None:
        await self._last_asyncio_pg_task
    _keys = ["id", "day", "t", "type", "speaker", "content", "created_at"]
    _data = [
        tuple([_dict[k] if k != "created_at" else _date_time for k in _keys])
        for _dict, _date_time in pg_list
    ]
    self._last_asyncio_pg_task = (
        self._pgsql_writer.async_write_dialog.remote(
            _data
        )
    )
This approach minimizes the overhead of database operations by batching writes and executing them asynchronously.
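The pattern in the excerpt, waiting for the previous write before launching the next one in the background, can be reproduced in a self-contained sketch. BatchWriter is a hypothetical class, and an in-memory list stands in for the PostgreSQL writer actor.

```python
import asyncio
from typing import Optional


class BatchWriter:
    """Hypothetical buffered writer with at most one write in flight."""

    def __init__(self) -> None:
        self.buffer: list[tuple] = []
        self.written_batches: list[list[tuple]] = []
        self._last_task: Optional[asyncio.Task] = None

    def add(self, row: tuple) -> None:
        self.buffer.append(row)

    async def _write(self, rows: list[tuple]) -> None:
        await asyncio.sleep(0)  # stands in for the database round trip
        self.written_batches.append(rows)

    async def flush(self) -> None:
        # Wait for the previous batch so writes stay ordered and bounded.
        if self._last_task is not None:
            await self._last_task
        rows, self.buffer = self.buffer, []
        if rows:
            # Launch the write in the background and return immediately.
            self._last_task = asyncio.create_task(self._write(rows))

    async def close(self) -> None:
        await self.flush()
        if self._last_task is not None:
            await self._last_task


async def demo() -> list[list[tuple]]:
    writer = BatchWriter()
    for i in range(5):
        writer.add((i, "dialog"))
        if (i + 1) % 2 == 0:
            await writer.flush()
    await writer.close()
    return writer.written_batches


batches = asyncio.run(demo())
```

Five rows are written as three batches, and the caller only blocks when a new flush would otherwise overlap with an unfinished one.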
9. Conclusion
AgentSociety represents a significant advancement in large-scale social simulation, providing a comprehensive framework for simulating complex social dynamics with LLM-driven agents. The codebase is well-structured, modular, and designed for scalability, making it suitable for a wide range of social science research applications.
Key strengths of the implementation include:
- Modular Agent Architecture: The block-based design allows for flexible agent behavior configuration
- Sophisticated Memory System: The multi-layered memory with semantic search enables context-aware decision making
- Distributed Computing: The Ray-based implementation allows for scaling to thousands of agents
- Comprehensive Environment: The integration of urban, social, and economic environments creates a realistic simulation space
- Flexible Configuration: The extensive configuration system makes it easy to set up different experiments
The framework provides a solid foundation for studying complex social phenomena through agent-based simulation, with potential applications in policy analysis, social science research, and artificial society modeling.