The Gather-Trade-Build simulation is the core scenario in the AI Economist Foundation framework. Mobile agents navigate a 2D grid world, gather scarce resources (wood and stone), trade them with each other through a commodity exchange, and use the resources to build houses and earn income. A social planner agent sets tax policy to influence the distribution of wealth.

Scenario variants

Two scenario names are registered:
| Scenario name | Layout |
| --- | --- |
| uniform/simple_wood_and_stone | Resources placed stochastically at reset with configurable clumping and gradient |
| layout_from_file/simple_wood_and_stone | Resources placed according to a fixed map file (e.g. quadrant_25x25_20each_30clump.txt) |
Both variants share the same agent types (BasicMobileAgent, BasicPlanner) and required entities (Wood, Stone).

Components used

The Gather component (registered as Gather) allows mobile agents to move around the world and collect resources. It supports heterogeneous collection skill via skill_dist (none, pareto, or lognormal). Key parameters:
  • move_labor (float, default 1.0): Labor cost per move.
  • collect_labor (float, default 1.0): Additional labor cost when collecting a resource.
  • skill_dist (str, default "none"): Skill distribution for bonus collection probability.
The Build component (registered as Build) allows agents to spend 1 wood + 1 stone to place a house landmark and earn coin. It supports heterogeneous build skill. Key parameters:
  • payment (int, default 10): Base coin earned per house built.
  • payment_max_skill_multiplier (int, default 1): Upper bound on the skill multiplier.
  • skill_dist (str, default "none"): Skill distribution for build income.
  • build_labor (float, default 10.0): Labor cost per build action.
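The build rule described above can be sketched as a small pure function. This is an illustrative sketch, not the library's implementation: the function name `try_build`, the dictionary-based inventory, and the `skill_multiplier` argument are hypothetical, and tile-occupancy checks are not modeled.

```python
def try_build(inventory, labor, payment=10, build_labor=10.0,
              skill_multiplier=1.0, payment_max_skill_multiplier=1):
    """Return (inventory, labor, built) after one hypothetical build attempt."""
    # A house costs 1 wood + 1 stone; without both, nothing happens.
    if inventory["Wood"] < 1 or inventory["Stone"] < 1:
        return inventory, labor, False

    # Assumption: the skill multiplier is capped at payment_max_skill_multiplier.
    multiplier = min(skill_multiplier, payment_max_skill_multiplier)

    updated = dict(inventory)  # copy so the caller's inventory is not mutated
    updated["Wood"] -= 1
    updated["Stone"] -= 1
    updated["Coin"] += payment * multiplier
    return updated, labor + build_labor, True


inventory, labor, built = try_build({"Wood": 1, "Stone": 1, "Coin": 0}, labor=0.0)
print(inventory, labor, built)
```

With the default parameters, one successful build converts one unit of each resource into 10 coin at a cost of 10 labor.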
The ContinuousDoubleAuction component (registered as ContinuousDoubleAuction) implements a commodity-exchange-style market in which agents submit bids and asks for resources. A trade clears when a bid meets or exceeds an ask. Key parameters:
  • max_bid_ask (int, default 10): Maximum coin value for a bid or ask.
  • order_labor (float, default 0.25): Labor cost for placing an order.
  • order_duration (int, default 50): Timesteps before an unfilled order expires.
  • max_num_orders (int, optional): Maximum open orders per resource per agent.
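To make the clearing rule concrete, here is a minimal sketch of matching one incoming bid against resting asks and of expiring stale orders. It is a toy model under stated assumptions: the `Order` class and both function names are hypothetical, the trade executes at the resting ask's price (the text above does not specify the price rule), and coin, escrow, and labor bookkeeping are omitted.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Order:
    agent: int
    price: int    # coin; assumed to lie in [0, max_bid_ask]
    age: int = 0  # timesteps the order has been open


def age_orders(orders, order_duration=50):
    """Age each open order by one timestep and drop the expired ones."""
    aged = [replace(o, age=o.age + 1) for o in orders]
    return [o for o in aged if o.age < order_duration]


def match_bid(bid, asks):
    """Fill `bid` against the cheapest compatible ask, if any.

    Returns (trade, remaining_asks); trade is None when no ask is
    priced at or below the bid.
    """
    candidates = [a for a in asks if a.price <= bid.price and a.agent != bid.agent]
    if not candidates:
        return None, asks
    best = min(candidates, key=lambda a: a.price)
    remaining = [a for a in asks if a is not best]
    # Assumption: the trade executes at the resting ask's price.
    trade = {"buyer": bid.agent, "seller": best.agent, "price": best.price}
    return trade, remaining


asks = [Order(agent=1, price=7), Order(agent=2, price=5)]
trade, asks = match_bid(Order(agent=0, price=6), asks)
print(trade, asks)
```

A bid of 6 clears against the ask of 5 but not the ask of 7, which stays on the book until it is filled or expires.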
Two redistribution components are available:
  • WealthRedistribution: Passively divides total agent coin equally each step. No planner actions required.
  • PeriodicBracketTax: The planner sets marginal tax bracket rates. Taxes are collected and redistributed as lump-sum payments at the end of each tax period.
If used, WealthRedistribution must be the last entry in the component list.
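The two redistribution schemes can be compared with a short sketch. It assumes a conventional marginal-bracket calculation and an equal lump-sum rebate; the function names, bracket cutoffs, and rates below are hypothetical and are not taken from the library.

```python
def equal_split(coins):
    """WealthRedistribution-style rule: every agent ends up with the mean coin."""
    share = sum(coins) / len(coins)
    return [share for _ in coins]


def marginal_tax(income, cutoffs, rates):
    """Tax owed under marginal brackets.

    cutoffs[i] is the lower bound of bracket i (cutoffs[0] == 0) and
    rates[i] applies only to the income that falls inside bracket i.
    """
    tax = 0.0
    for i, rate in enumerate(rates):
        lower = cutoffs[i]
        upper = cutoffs[i + 1] if i + 1 < len(cutoffs) else float("inf")
        if income > lower:
            tax += rate * (min(income, upper) - lower)
    return tax


def tax_and_redistribute(incomes, cutoffs, rates):
    """Collect bracket taxes for one period and rebate them as equal lump sums."""
    taxes = [marginal_tax(x, cutoffs, rates) for x in incomes]
    lump_sum = sum(taxes) / len(incomes)
    return [x - t + lump_sum for x, t in zip(incomes, taxes)]


cutoffs = [0, 10, 50]     # hypothetical bracket lower bounds, in coin
rates = [0.0, 0.25, 0.5]  # hypothetical marginal rates chosen by the planner
print(tax_and_redistribute([60, 0], cutoffs, rates))
print(equal_split([60, 0]))
```

Both rules conserve total coin; the bracket tax narrows the gap between agents, while the equal split removes it entirely.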

Key scenario parameters

The Uniform scenario (uniform/simple_wood_and_stone) accepts the constructor arguments below. The first three (n_agents, world_size, episode_length) are base environment arguments; the rest are specific to this scenario.
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| n_agents | int | | Number of mobile agents (passed via base env config) |
| world_size | list | | [height, width] of the grid (passed via base env config) |
| episode_length | int | | Number of timesteps per episode (passed via base env config) |
| starting_wood_coverage | float | 0.025 | Target fraction of tiles covered by wood at reset |
| wood_regen_halfwidth | int | 0 | Spatial halfwidth of the wood regeneration kernel |
| wood_regen_weight | float | 0.01 | Per-tile regen probability for wood |
| wood_max_health | int | 1 | Maximum wood units per source tile |
| wood_clumpiness | float | 0.35 | Degree of spatial wood clustering |
| starting_stone_coverage | float | 0.025 | Target fraction of tiles covered by stone at reset |
| stone_regen_halfwidth | int | 0 | Spatial halfwidth of the stone regeneration kernel |
| stone_regen_weight | float | 0.01 | Per-tile regen probability for stone |
| stone_max_health | int | 1 | Maximum stone units per source tile |
| stone_clumpiness | float | 0.5 | Degree of spatial stone clustering |
| gradient_steepness | float | 8 | How steeply wood/stone are restricted to opposite ends of the map |
| starting_agent_coin | float | 0 | Coin each agent starts with |
| isoelastic_eta | float | 0.23 | Isoelastic utility shape parameter (0 = linear, 1 = log) |
| energy_cost | float | 0.21 | Coefficient converting labor to negative utility |
| energy_warmup_constant | float | 0 | Annealing decay constant for energy cost (0 = no annealing) |
| energy_warmup_method | str | "decay" | "decay" (episode count) or "auto" (positive-reward timesteps) |
| planner_reward_type | str | "coin_eq_times_productivity" | Planner reward: "coin_eq_times_productivity", "inv_income_weighted_coin_endowment", or "inv_income_weighted_utility" |
| mixing_weight_gini_vs_coin | float | 0.0 | Weight on productivity vs. equality for the planner reward (0 = equal weighting, 1 = productivity only) |
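The table states that energy_warmup_constant anneals the energy cost but does not give the schedule. The sketch below assumes an exponential warmup of the form 1 - exp(-count / constant), where the count is completed episodes ("decay") or positive-reward timesteps ("auto"). Both the functional form and the function name are assumptions made for illustration.

```python
import math


def effective_energy_cost(energy_cost, warmup_constant, warmup_count):
    """Labor cost coefficient after annealing (assumed exponential warmup).

    warmup_count is whatever the chosen energy_warmup_method counts:
    completed episodes for "decay", positive-reward timesteps for "auto".
    """
    if warmup_constant <= 0:
        return energy_cost  # 0 disables annealing, per the table above
    scale = 1.0 - math.exp(-warmup_count / warmup_constant)
    return energy_cost * scale


for count in (0, 100, 1000):
    print(count, effective_energy_cost(0.21, warmup_constant=500, warmup_count=count))
```

Under this assumption the labor penalty starts at zero and approaches energy_cost as the count grows, so early in training agents are not discouraged from acting.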

Instantiating the environment

import ai_economist.foundation as foundation

env_config = {
    "scenario_name": "uniform/simple_wood_and_stone",

    # Basic world settings
    "n_agents": 4,
    "world_size": [25, 25],
    "episode_length": 1000,

    # Resource regeneration
    "starting_wood_coverage": 0.025,
    "wood_regen_halfwidth": 0,
    "wood_regen_weight": 0.01,
    "wood_max_health": 1,
    "wood_clumpiness": 0.35,
    "starting_stone_coverage": 0.025,
    "stone_regen_halfwidth": 0,
    "stone_regen_weight": 0.01,
    "stone_max_health": 1,
    "stone_clumpiness": 0.5,
    "gradient_steepness": 8,

    # Utility / reward
    "isoelastic_eta": 0.23,
    "energy_cost": 0.21,
    "planner_reward_type": "coin_eq_times_productivity",
    "mixing_weight_gini_vs_coin": 0.0,

    # Components
    "components": [
        {"Gather": {"move_labor": 1.0, "collect_labor": 1.0, "skill_dist": "pareto"}},
        {"Build":  {"payment": 10, "skill_dist": "pareto", "build_labor": 10.0}},
        {"ContinuousDoubleAuction": {"max_bid_ask": 10, "order_labor": 0.25, "order_duration": 50}},
        {"PeriodicBracketTax": {
            "period": 100,
            "bracket_spacing": "us-federal",
            "usd_scaling": 1000.0,
            "disable_taxes": False
        }},
    ],
}

env = foundation.make_env_instance(**env_config)
obs = env.reset()
To use the fixed-layout variant instead, change the scenario name and add the layout file parameter before calling make_env_instance:
env_config["scenario_name"] = "layout_from_file/simple_wood_and_stone"
env_config["env_layout_file"] = "quadrant_25x25_20each_30clump.txt"
env_config["resource_regen_prob"] = 0.01
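When switching variants it can help to derive the fixed-layout config from the uniform one rather than mutating it in place. The helper below is a hypothetical convenience, not part of the library. It assumes the fixed-layout scenario does not accept the uniform-only resource arguments (coverage, regeneration kernel, clumpiness, gradient), so it drops them; check the scenario's constructor before relying on this.

```python
# Resource-placement keys from the Uniform scenario's parameter table.
# Assumption: the fixed-layout scenario does not take these arguments.
UNIFORM_ONLY_KEYS = {
    "starting_wood_coverage", "wood_regen_halfwidth", "wood_regen_weight",
    "wood_max_health", "wood_clumpiness",
    "starting_stone_coverage", "stone_regen_halfwidth", "stone_regen_weight",
    "stone_max_health", "stone_clumpiness",
    "gradient_steepness",
}


def to_fixed_layout_config(uniform_config, layout_file, regen_prob=0.01):
    """Return a new config dict for layout_from_file/simple_wood_and_stone."""
    config = {k: v for k, v in uniform_config.items() if k not in UNIFORM_ONLY_KEYS}
    config["scenario_name"] = "layout_from_file/simple_wood_and_stone"
    config["env_layout_file"] = layout_file
    config["resource_regen_prob"] = regen_prob
    return config


example = {
    "scenario_name": "uniform/simple_wood_and_stone",
    "n_agents": 4,
    "wood_clumpiness": 0.35,
    "gradient_steepness": 8,
}
print(to_fixed_layout_config(example, "quadrant_25x25_20each_30clump.txt"))
```

The returned dictionary can then be passed to foundation.make_env_instance in the same way as the uniform config, and the original config is left unchanged.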

Observation space

Each mobile agent’s observation consists of:
  • A spatial tensor (egocentric or full-world view, depending on full_observability) with channels for each resource type, agent locations, house locations, and water.
  • Inventory contents: current holdings of Wood, Stone, and Coin.
  • Endogenous quantities: accumulated labor.
The planner agent receives:
  • A spatial tensor of the world (if planner_gets_spatial_info=True).
  • The inventory of each mobile agent.
  • Current tax rates (when using PeriodicBracketTax).
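The difference between an egocentric and a full-world spatial view can be illustrated on a single map channel. This is a sketch only: the function name, the `radius` argument, and the zero padding outside the map are assumptions, and the real observation stacks several such channels.

```python
def egocentric_view(grid, row, col, radius, pad_value=0):
    """Return a (2*radius + 1)-wide square window of `grid` centered on (row, col).

    Cells that fall outside the map are filled with pad_value.
    """
    height, width = len(grid), len(grid[0])
    view = []
    for r in range(row - radius, row + radius + 1):
        view.append([
            grid[r][c] if 0 <= r < height and 0 <= c < width else pad_value
            for c in range(col - radius, col + radius + 1)
        ])
    return view


wood_channel = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# An agent in the top-left corner sees padding beyond the map edge.
for line in egocentric_view(wood_channel, row=0, col=0, radius=1):
    print(line)
```

A full-world view corresponds to returning the whole grid unchanged, regardless of where the agent stands.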

Reward structure

Mobile agents receive isoelastic utility over coin minus a labor cost:
reward_agent = u(coin) - energy_cost * labor
where u(coin) is the isoelastic function parameterized by isoelastic_eta. The social planner receives one of three reward types (set via planner_reward_type):
| Type | Description |
| --- | --- |
| coin_eq_times_productivity | Product of equality (1 − Gini) and total coin productivity, weighted by mixing_weight_gini_vs_coin |
| inv_income_weighted_coin_endowment | Inverse-income-weighted average coin endowment |
| inv_income_weighted_utility | Inverse-income-weighted average utility |
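The agent reward and the coin_eq_times_productivity planner reward can be sketched in a few lines. The sketch assumes the standard isoelastic (CRRA) form for u(coin), clamps coin at 1 in the log case to avoid log(0), uses an unnormalized Gini index, and interpolates linearly between the two endpoints described for mixing_weight_gini_vs_coin. These choices and all function names are illustrative assumptions, not the library's code.

```python
import math


def isoelastic_utility(coin, eta):
    """Standard isoelastic utility: linear in coin at eta=0, logarithmic at eta=1."""
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    if eta == 1.0:
        return math.log(max(coin, 1.0))  # assumption: clamp to avoid log(0)
    return (coin ** (1.0 - eta) - 1.0) / (1.0 - eta)


def agent_reward(coin, labor, eta=0.23, energy_cost=0.21):
    """reward_agent = u(coin) - energy_cost * labor."""
    return isoelastic_utility(coin, eta) - energy_cost * labor


def gini(values):
    """Unnormalized Gini index: 0 means perfect equality."""
    n, total = len(values), sum(values)
    if total == 0:
        return 0.0
    diff_sum = sum(abs(a - b) for a in values for b in values)
    return diff_sum / (2 * n * total)


def planner_reward(coins, mixing_weight=0.0):
    """Equality times productivity; mixing_weight=1 rewards productivity only."""
    equality = 1.0 - gini(coins)
    productivity = sum(coins)
    # Assumption: linear interpolation between "equality counts" and "ignored".
    equality_term = (1.0 - mixing_weight) * equality + mixing_weight
    return equality_term * productivity


print(agent_reward(coin=50.0, labor=20.0))
print(planner_reward([10, 10]), planner_reward([0, 20]), planner_reward([0, 20], 1.0))
```

With these definitions, two agents holding 10 coin each give the planner a higher reward than one agent holding all 20, unless mixing_weight_gini_vs_coin is 1, in which case only the total matters.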

Tutorials

  • Economic simulation (basic): Interact with and visualize the simulation in Colab.
  • Economic simulation (advanced): Explore composable building blocks and custom scenario construction.
  • Optimal taxation: Use the simulation to study optimal tax policy design.
  • Multi-agent training with RLlib: Train agents with distributed RL using RLlib.