The One-Step Economy is a deliberately minimal scenario that distills the essential tax-and-labor dynamics of the full Gather-Trade-Build simulation into just two timesteps. It is designed for rapid reinforcement learning experimentation and theoretical analysis of optimal tax policy.
How it works
The scenario runs with an episode_length of 2:
- Step 1 (tax setting): The planner agent sets marginal tax bracket rates via the PeriodicBracketTax component.
- Step 2 (labor selection): Each mobile agent selects how much labor to supply via the SimpleLabor component. Each agent’s optimal labor depends on its skill level and the tax rates, but not on the choices of other agents.
Because agents make analytically tractable decisions, this scenario is well suited to studying how tax schedules affect labor supply and income distribution without the confounding dynamics of resource gathering, trading, or spatial navigation.
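As a rough illustration of why the labor decision is tractable, the sketch below picks the utility-maximizing labor level for a single agent under the "coin_minus_labor_cost" utility. It is a simplification under stated assumptions: a single flat tax rate stands in for the bracket schedule, redistribution of tax revenue is ignored, and the integer labor grid is made up. The helper name best_labor is hypothetical and not part of the library.

```python
def best_labor(skill, tax_rate, labor_levels, labor_cost=1.0, labor_exponent=2.0):
    """Return the labor level that maximizes after-tax income minus labor cost.

    Assumes income = skill * labor and a flat tax rate (a simplification of
    the bracketed schedule used by the real scenario).
    """
    def utility(labor):
        after_tax_income = (1.0 - tax_rate) * skill * labor
        return after_tax_income - labor_cost * labor ** labor_exponent

    return max(labor_levels, key=utility)


labor_grid = list(range(0, 11))  # hypothetical discrete labor choices 0..10

no_tax_choice = best_labor(skill=4.0, tax_rate=0.0, labor_levels=labor_grid)
half_tax_choice = best_labor(skill=4.0, tax_rate=0.5, labor_levels=labor_grid)
print(no_tax_choice, half_tax_choice)
```

In this toy setting the chosen labor falls from 2 to 1 when the tax rate rises from 0 to 0.5, and the result depends only on the agent's own skill and the tax rate.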
Scenario name
Defined in: ai_economist/foundation/scenarios/one_step_economy/one_step_economy.py
The registered class is OneStepEconomy (scenario name "one-step-economy"), which extends BaseEnvironment. It uses:
- Agent types: BasicMobileAgent, BasicPlanner
- Required entities: Coin
Intended components
This scenario is designed to be paired with:
| Component | Role |
|---|---|
| PeriodicBracketTax | Planner sets marginal tax rates at the start of each period |
| SimpleLabor | Agents choose a discrete labor level; income = skill × labor |
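To make "marginal tax rates" concrete, here is a minimal sketch of how the tax owed on an income could be computed from bracket cutoffs and per-bracket marginal rates. The cutoffs and rates are made-up values, and tax_due is a hypothetical helper for illustration; it is not the PeriodicBracketTax implementation.

```python
def tax_due(income, cutoffs, rates):
    """Tax owed under a marginal bracket schedule.

    cutoffs: ascending lower bounds of each bracket, starting at 0.
    rates:   marginal rate applied to income falling inside each bracket.
    """
    if len(cutoffs) != len(rates):
        raise ValueError("need exactly one rate per bracket")
    total = 0.0
    for i, rate in enumerate(rates):
        lower = cutoffs[i]
        # The top bracket has no upper bound.
        upper = cutoffs[i + 1] if i + 1 < len(cutoffs) else float("inf")
        if income > lower:
            total += rate * (min(income, upper) - lower)
    return total


# Hypothetical schedule: 0% up to 10, 25% from 10 to 40, 50% above 40.
example_cutoffs = [0.0, 10.0, 40.0]
example_rates = [0.0, 0.25, 0.5]
print(tax_due(50.0, example_cutoffs, example_rates))
```

Only the portion of income inside a bracket is taxed at that bracket's rate, so an income of 50 here owes 0.25 × 30 + 0.5 × 10 = 12.5.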
Key parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| agent_reward_type | str | "coin_minus_labor_cost" | Utility function for mobile agents. Options: "coin_minus_labor_cost", "isoelastic_coin_minus_labor" |
| isoelastic_eta | float | 0.23 | Shape parameter for the isoelastic utility function (used when agent_reward_type="isoelastic_coin_minus_labor") |
| labor_exponent | float | 2.0 | Exponent in the "coin_minus_labor_cost" utility function |
| labor_cost | float | 1.0 | Coefficient weighting the cost of labor |
| planner_reward_type | str | "inv_income_weighted_utility" | Social welfare function for the planner. Options: "inv_income_weighted_utility", "coin_eq_times_productivity", "inv_income_weighted_coin_endowment" |
| mixing_weight_gini_vs_coin | float | 0 | Weight on productivity vs. equality for "coin_eq_times_productivity" (0 = equality and productivity weighted equally, 1 = productivity only) |
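The reward-related parameters in the table above can be read against the following sketch. It is an illustration under assumptions, not the library's code: the isoelastic variant is written in the standard CRRA form with labor entering linearly, the planner weights are clipped at 1 coin to avoid division by zero, and all function bodies are written from the descriptions in this page.

```python
def coin_minus_labor_cost(coin, labor, labor_exponent=2.0, labor_cost=1.0):
    """Utility = coin - labor_cost * labor ** labor_exponent."""
    return coin - labor_cost * labor ** labor_exponent


def isoelastic_coin_minus_labor(coin, labor, isoelastic_eta=0.23, labor_cost=1.0):
    """Assumed CRRA utility over coin minus a linear labor cost.

    Requires isoelastic_eta != 1 and coin >= 0.
    """
    util_coin = (coin ** (1.0 - isoelastic_eta) - 1.0) / (1.0 - isoelastic_eta)
    return util_coin - labor_cost * labor


def inv_income_weighted_utility(coin_endowments, utilities):
    """Average of utilities weighted by inverse income (poorer agents count more)."""
    # Clipping at 1 coin is an assumption made here to keep weights finite.
    weights = [1.0 / max(coin, 1.0) for coin in coin_endowments]
    weighted_sum = sum(w * u for w, u in zip(weights, utilities))
    return weighted_sum / sum(weights)


# Made-up two-agent example.
coins = [1.0, 3.0]
labors = [0.0, 1.0]
utilities = [coin_minus_labor_cost(c, l) for c, l in zip(coins, labors)]
welfare = inv_income_weighted_utility(coins, utilities)
print(utilities, welfare)
```

With these made-up numbers the agent utilities are 1.0 and 2.0, and the poorer agent carries three times the weight of the richer one in the planner's objective.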
Instantiating the environment
import ai_economist.foundation as foundation

env_config = {
    "scenario_name": "one-step-economy",
    # Must be 2: step 1 = tax setting, step 2 = labor choice
    "episode_length": 2,
    "n_agents": 10,
    # Agent reward
    "agent_reward_type": "coin_minus_labor_cost",
    "labor_exponent": 2.0,
    "labor_cost": 1.0,
    # Planner reward
    "planner_reward_type": "inv_income_weighted_utility",
    # Components
    "components": [
        {"PeriodicBracketTax": {
            "period": 1,
            "bracket_spacing": "us-federal",
            "usd_scaling": 1000.0,
            "disable_taxes": False,
        }},
        {"SimpleLabor": {
            "mask_first_step": True,
            "payment_max_skill_multiplier": 3,
            "pareto_param": 4.0,
        }},
    ],
}

env = foundation.make_env_instance(**env_config)
obs = env.reset()

# Step through one episode.
# env.step returns `done` as a dict; its "__all__" entry flags the end of the episode.
done = {"__all__": False}
while not done["__all__"]:
    actions = {}  # supply actions for each agent
    obs, rewards, done, info = env.step(actions)
Observations
On each step, agents observe:
- Their inventory (Coin).
- Per-agent: normalized_per_capita_productivity and equality metrics (available to the planner).
- The planner observes aggregate coin endowments and the derived equality and productivity metrics.
From generate_observations in one_step_economy.py:
coin_endowments = np.array(
    [agent.total_endowment("Coin") for agent in self.world.agents]
)
equality = social_metrics.get_equality(coin_endowments)
productivity = social_metrics.get_productivity(coin_endowments)
normalized_per_capita_productivity = productivity / self.num_agents / 1000
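A small numeric walk-through of the metrics in the snippet above, as a sketch. It assumes that productivity is the sum of coin endowments and that equality is 1 − Gini; the library's social_metrics functions may normalize differently. The gini helper and the endowment values are made up for illustration.

```python
def gini(values):
    """Gini index of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values.
    weighted = sum((2 * rank - n - 1) * x for rank, x in enumerate(xs, start=1))
    return weighted / (n * total)


coin_endowments = [200.0, 400.0, 600.0, 800.0]  # hypothetical endowments
num_agents = len(coin_endowments)

productivity = sum(coin_endowments)      # assumed definition: total coin
equality = 1.0 - gini(coin_endowments)   # assumed definition: 1 - Gini
normalized_per_capita_productivity = productivity / num_agents / 1000

print(equality, normalized_per_capita_productivity)
```

Dividing by the number of agents and then by 1000 keeps the planner's productivity observation at a small scale; here 2000 total coin across 4 agents gives 0.5.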
Reward structure
Mobile agents receive utility based on agent_reward_type:
- "coin_minus_labor_cost": coin - labor_cost * labor^labor_exponent
- "isoelastic_coin_minus_labor": isoelastic utility over coin minus labor cost, shaped by isoelastic_eta
The planner receives social welfare according to planner_reward_type:
| Type | Description |
|---|---|
| "inv_income_weighted_utility" | Weighted average of agent utilities, with higher weight on lower-income agents |
| "coin_eq_times_productivity" | Product of equality (1 − Gini) and total productivity |
| "inv_income_weighted_coin_endowment" | Inverse-income-weighted average coin endowment |
Use in research
This scenario was introduced in:
The AI Economist: Optimal Economic Policy Design via Two-level Deep Reinforcement Learning (arXiv:2108.02755)
Stephan Zheng, Alexander Trott, Sunil Srinivasa, David C. Parkes, Richard Socher.
It enables tractable two-level RL experiments in which the planner’s optimal tax policy can be derived analytically and compared with learned policies.