Gather, Trade & Build - AI Economist / Foundation

The Gather-Trade-Build simulation is the core scenario in the AI Economist Foundation framework. Mobile agents navigate a 2D grid world, gather scarce resources (wood and stone), trade them with each other through a commodity exchange, and use the resources to build houses and earn income. A social planner agent sets tax policy to influence the distribution of wealth.

Scenario variants

Two scenario names are registered:

Scenario name	Layout
`uniform/simple_wood_and_stone`	Resources placed stochastically at reset with configurable clumping and gradient
`layout_from_file/simple_wood_and_stone`	Resources placed according to a fixed map file (e.g. `quadrant_25x25_20each_30clump.txt`)

Both variants share the same agent types (BasicMobileAgent, BasicPlanner) and required entities (Wood, Stone).

Components used

Gather (move.py)

Registered as Gather. Allows mobile agents to move around the world and collect resources. Supports heterogeneous collection skill via skill_dist (none, pareto, or lognormal). Key parameters:

move_labor (float, default 1.0): Labor cost per move.
collect_labor (float, default 1.0): Additional labor cost when collecting a resource.
skill_dist (str, default "none"): Skill distribution for bonus collection probability.
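
As a rough illustration of how these parameters interact, the sketch below charges labor for one move and, if the destination tile holds a resource, adds the collection cost and rolls for a bonus unit. The function name `gather_step` and the rule that a successful bonus roll yields exactly one extra unit are assumptions made for illustration, not taken from move.py.

```python
import random


def gather_step(tile_has_resource, bonus_prob, move_labor=1.0,
                collect_labor=1.0, rng=random):
    """Return (units_collected, labor_spent) for a single move.

    Hypothetical helper: the one-extra-unit bonus rule is an assumption.
    """
    labor = move_labor  # every move costs move_labor
    units = 0
    if tile_has_resource:
        labor += collect_labor  # collecting costs additional labor
        units = 1
        if rng.random() < bonus_prob:  # skill-dependent bonus roll
            units += 1
    return units, labor


print(gather_step(tile_has_resource=False, bonus_prob=0.5))  # (0, 1.0)
print(gather_step(tile_has_resource=True, bonus_prob=0.0))   # (1, 2.0)
```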

Build (build.py)

Registered as Build. Allows agents to spend 1 wood + 1 stone to place a house landmark, earning coin. Supports heterogeneous build skill. Key parameters:

payment (int, default 10): Base coin earned per house built.
payment_max_skill_multiplier (int, default 1): Upper bound on the skill multiplier.
skill_dist (str, default "none"): Skill distribution for build income.
build_labor (float, default 10.0): Labor cost per build action.
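
A minimal sketch of the build transaction described above, assuming the agent's skill is expressed as a multiplier on `payment` that is clipped to the range from 1 to `payment_max_skill_multiplier`. The `build_house` helper and the dictionary-based inventory are hypothetical and only illustrate the bookkeeping.

```python
def build_house(inventory, skill_multiplier, payment=10,
                payment_max_skill_multiplier=1, build_labor=10.0):
    """Try to build one house; return (built, labor_spent).

    Hypothetical helper: mutates `inventory` in place on success.
    """
    if inventory["Wood"] < 1 or inventory["Stone"] < 1:
        return False, 0.0  # not enough resources, nothing happens

    # Clip the multiplier to [1, payment_max_skill_multiplier] (assumed rule).
    multiplier = max(1.0, min(skill_multiplier, payment_max_skill_multiplier))

    inventory["Wood"] -= 1
    inventory["Stone"] -= 1
    inventory["Coin"] += payment * multiplier
    return True, build_labor


inv = {"Wood": 2, "Stone": 1, "Coin": 0.0}
print(build_house(inv, skill_multiplier=2.5, payment_max_skill_multiplier=3))
print(inv)  # {'Wood': 1, 'Stone': 0, 'Coin': 25.0}
```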

ContinuousDoubleAuction (continuous_double_auction.py)

Registered as ContinuousDoubleAuction. Implements a commodity-exchange-style market where agents submit bids and asks for resources. Trades clear when a bid meets or exceeds an ask. Key parameters:

max_bid_ask (int, default 10): Maximum coin value for a bid or ask.
order_labor (float, default 0.25): Labor cost for placing an order.
order_duration (int, default 50): Timesteps before an unfilled order expires.
max_num_orders (int, optional): Maximum open orders per resource per agent.
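
The clearing rule can be illustrated with a toy order book for a single resource. This is a simplified sketch rather than the component's actual matching logic: the `match_orders` function is hypothetical, executing each trade at the ask price is an assumption chosen for brevity, and order expiry and per-agent order limits are ignored.

```python
def match_orders(bids, asks, max_bid_ask=10):
    """Greedily match bids and asks for one resource.

    bids, asks: lists of (agent_id, price) tuples.
    Returns a list of (buyer, seller, price) trades.
    """
    # Discard orders outside the allowed price range.
    bids = [(a, p) for a, p in bids if 0 <= p <= max_bid_ask]
    asks = [(a, p) for a, p in asks if 0 <= p <= max_bid_ask]

    bids.sort(key=lambda o: -o[1])  # highest bid first
    asks.sort(key=lambda o: o[1])   # lowest ask first

    trades = []
    # A trade clears while the best bid meets or exceeds the best ask.
    while bids and asks and bids[0][1] >= asks[0][1]:
        buyer, _ = bids.pop(0)
        seller, ask_price = asks.pop(0)
        trades.append((buyer, seller, ask_price))  # assumed pricing rule
    return trades


print(match_orders(bids=[("0", 7), ("1", 4)], asks=[("2", 5), ("3", 6)]))
# [('0', '2', 5)]
```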

WealthRedistribution / PeriodicBracketTax (redistribution.py)

Two redistribution components are available:

WealthRedistribution: Passively divides total agent coin equally each step. No planner actions required.
PeriodicBracketTax: The planner sets marginal tax bracket rates. Taxes are collected and redistributed as lump-sum payments at the end of each tax period.

WealthRedistribution must always be placed last in the component list.
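
To make the PeriodicBracketTax mechanism concrete, the sketch below applies marginal rates to each agent's income for one tax period and returns the collected total to agents as equal lump sums. The bracket cutoffs, rates, and function names are illustrative assumptions, not the values the component uses.

```python
def marginal_tax(income, cutoffs, rates):
    """Tax owed on `income` under marginal brackets.

    cutoffs: ascending lower bounds of each bracket, starting at 0.
    rates:   marginal rate applied within each bracket.
    """
    tax = 0.0
    for i, (lower, rate) in enumerate(zip(cutoffs, rates)):
        upper = cutoffs[i + 1] if i + 1 < len(cutoffs) else float("inf")
        if income > lower:
            # Only the slice of income inside this bracket is taxed at `rate`.
            tax += (min(income, upper) - lower) * rate
    return tax


def redistribute(incomes, cutoffs, rates):
    """Return post-tax, post-transfer income per agent."""
    taxes = [marginal_tax(x, cutoffs, rates) for x in incomes]
    lump_sum = sum(taxes) / len(incomes)  # equal share of total revenue
    return [x - t + lump_sum for x, t in zip(incomes, taxes)]


cutoffs = [0, 10, 50]      # hypothetical bracket boundaries (coin)
rates = [0.0, 0.25, 0.5]   # hypothetical marginal rates
print(redistribute([5, 30, 90, 10], cutoffs, rates))
# [13.75, 33.75, 68.75, 18.75]
```

Total coin is unchanged by the transfer: the four agents hold 135 coin before and after.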

Key scenario parameters

The Uniform scenario (uniform/simple_wood_and_stone) exposes the following constructor arguments in addition to the base environment arguments.

Parameter	Type	Default	Description
`n_agents`	int	—	Number of mobile agents (passed via base env config)
`world_size`	list	—	`[height, width]` of the grid (passed via base env config)
`episode_length`	int	—	Number of timesteps per episode (passed via base env config)
`starting_wood_coverage`	float	`0.025`	Target fraction of tiles covered by wood at reset
`wood_regen_halfwidth`	int	`0`	Spatial halfwidth of the wood regeneration kernel
`wood_regen_weight`	float	`0.01`	Per-tile regen probability for wood
`wood_max_health`	int	`1`	Maximum wood units per source tile
`wood_clumpiness`	float	`0.35`	Degree of spatial wood clustering
`starting_stone_coverage`	float	`0.025`	Target fraction of tiles covered by stone at reset
`stone_regen_halfwidth`	int	`0`	Spatial halfwidth of the stone regeneration kernel
`stone_regen_weight`	float	`0.01`	Per-tile regen probability for stone
`stone_max_health`	int	`1`	Maximum stone units per source tile
`stone_clumpiness`	float	`0.5`	Degree of spatial stone clustering
`gradient_steepness`	float	`8`	How steeply wood/stone are restricted to opposite ends of the map
`starting_agent_coin`	float	`0`	Coin each agent starts with
`isoelastic_eta`	float	`0.23`	Isoelastic utility shape parameter (0 = linear, 1 = log)
`energy_cost`	float	`0.21`	Coefficient converting labor to negative utility
`energy_warmup_constant`	float	`0`	Annealing decay constant for energy cost (0 = no annealing)
`energy_warmup_method`	str	`"decay"`	`"decay"` (episode count) or `"auto"` (positive-reward timesteps)
`planner_reward_type`	str	`"coin_eq_times_productivity"`	Planner reward: `"coin_eq_times_productivity"`, `"inv_income_weighted_coin_endowment"`, or `"inv_income_weighted_utility"`
`mixing_weight_gini_vs_coin`	float	`0.0`	Weight on productivity vs. equality for the planner reward (0 = equal weighting, 1 = productivity only)

Instantiating the environment

import ai_economist.foundation as foundation

env_config = {
    "scenario_name": "uniform/simple_wood_and_stone",

    # Basic world settings
    "n_agents": 4,
    "world_size": [25, 25],
    "episode_length": 1000,

    # Resource regeneration
    "starting_wood_coverage": 0.025,
    "wood_regen_halfwidth": 0,
    "wood_regen_weight": 0.01,
    "wood_max_health": 1,
    "wood_clumpiness": 0.35,
    "starting_stone_coverage": 0.025,
    "stone_regen_halfwidth": 0,
    "stone_regen_weight": 0.01,
    "stone_max_health": 1,
    "stone_clumpiness": 0.5,
    "gradient_steepness": 8,

    # Utility / reward
    "isoelastic_eta": 0.23,
    "energy_cost": 0.21,
    "planner_reward_type": "coin_eq_times_productivity",
    "mixing_weight_gini_vs_coin": 0.0,

    # Components
    "components": [
        {"Gather": {"move_labor": 1.0, "collect_labor": 1.0, "skill_dist": "pareto"}},
        {"Build":  {"payment": 10, "skill_dist": "pareto", "build_labor": 10.0}},
        {"ContinuousDoubleAuction": {"max_bid_ask": 10, "order_labor": 0.25, "order_duration": 50}},
        {"PeriodicBracketTax": {
            "period": 100,
            "bracket_spacing": "us-federal",
            "usd_scaling": 1000.0,
            "disable_taxes": False
        }},
    ],
}

env = foundation.make_env_instance(**env_config)
obs = env.reset()

To use the fixed-layout variant instead, change the scenario name and add the layout-specific parameters:

env_config["scenario_name"] = "layout_from_file/simple_wood_and_stone"
env_config["env_layout_file"] = "quadrant_25x25_20each_30clump.txt"
env_config["resource_regen_prob"] = 0.01
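
When starting from the Uniform configuration above, the switch can also be written as a small helper that copies the config, drops the stochastic-placement keys, and adds the layout settings. This is a sketch under an assumption: that the layout-from-file scenario does not accept the Uniform scenario's placement arguments, which is worth checking against that scenario's constructor. The helper name `to_layout_config` is hypothetical.

```python
# Keys that control stochastic placement in the Uniform scenario
# (taken from the parameter table; assumed not to apply to fixed layouts).
UNIFORM_ONLY_KEYS = {
    "starting_wood_coverage", "wood_regen_halfwidth", "wood_regen_weight",
    "wood_max_health", "wood_clumpiness",
    "starting_stone_coverage", "stone_regen_halfwidth", "stone_regen_weight",
    "stone_max_health", "stone_clumpiness",
    "gradient_steepness",
}


def to_layout_config(uniform_config, layout_file, resource_regen_prob=0.01):
    """Return a new config dict for the fixed-layout scenario."""
    config = {k: v for k, v in uniform_config.items()
              if k not in UNIFORM_ONLY_KEYS}
    config["scenario_name"] = "layout_from_file/simple_wood_and_stone"
    config["env_layout_file"] = layout_file
    config["resource_regen_prob"] = resource_regen_prob
    return config


example = {"scenario_name": "uniform/simple_wood_and_stone",
           "n_agents": 4, "wood_clumpiness": 0.35}
print(to_layout_config(example, "quadrant_25x25_20each_30clump.txt"))
```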

Observation space

Each mobile agent’s observation consists of:

A spatial tensor (egocentric or full-world view, depending on full_observability) with channels for each resource type, agent locations, house locations, and water.
Inventory contents: current holdings of Wood, Stone, and Coin.
Endogenous quantities: accumulated labor.

The planner agent receives:

A spatial tensor of the world (if planner_gets_spatial_info=True).
The inventory of each mobile agent.
Current tax rates (when using PeriodicBracketTax).

Reward structure

Mobile agents receive isoelastic utility over coin minus a labor cost:

reward_agent = u(coin) - energy_cost * labor

where u(coin) is the isoelastic function parameterized by isoelastic_eta. The social planner receives one of three reward types (set via planner_reward_type):

Type	Description
`coin_eq_times_productivity`	Product of equality (1 − Gini) and total coin productivity, weighted by `mixing_weight_gini_vs_coin`
`inv_income_weighted_coin_endowment`	Inverse-income-weighted average coin endowment
`inv_income_weighted_utility`	Inverse-income-weighted average utility
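
Both the agent reward and the default planner reward can be sketched in a few lines. The isoelastic form below is the standard textbook one (linear at eta = 0, logarithmic at eta = 1), and the way the mixing weight blends equality into the planner reward is one plausible reading of the descriptions above. Neither is guaranteed to match the framework's implementation term for term, and all function names are hypothetical.

```python
import math


def isoelastic_utility(coin, eta):
    """Standard isoelastic utility; assumes coin >= 0 and 0 <= eta <= 1."""
    if eta == 1.0:
        return math.log(max(coin, 1.0))  # clamp to avoid log(0) (assumption)
    return (coin ** (1.0 - eta) - 1.0) / (1.0 - eta)


def agent_reward(coin, labor, eta=0.23, energy_cost=0.21):
    """reward_agent = u(coin) - energy_cost * labor."""
    return isoelastic_utility(coin, eta) - energy_cost * labor


def gini(coins):
    """Gini index of a list of non-negative coin endowments."""
    n, total = len(coins), sum(coins)
    if total == 0:
        return 0.0
    diff_sum = sum(abs(a - b) for a in coins for b in coins)
    return diff_sum / (2 * n * total)


def planner_reward(coins, mixing_weight=0.0):
    """Equality times productivity; mixing_weight=1 ignores equality."""
    equality = 1.0 - gini(coins)
    productivity = sum(coins)
    blended = (1.0 - mixing_weight) * equality + mixing_weight
    return blended * productivity


print(agent_reward(coin=10.0, labor=4.0, eta=0.0, energy_cost=0.25))  # 8.0
print(planner_reward([10.0, 10.0]))  # 20.0
print(planner_reward([0.0, 20.0]))   # 10.0
```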

Tutorials

Economic simulation (basic)

Run and visualize the simulation interactively in Colab.

Economic simulation (advanced)

Explore composable building blocks and custom scenario construction.

Optimal taxation

Use the simulation to study optimal tax policy design.

Multi-agent training with RLlib

Train agents with distributed RL using RLlib.
