FoundationEnvWrapper wraps a Foundation BaseEnvironment instance and decides whether environment reset and step operations run on the CPU or the GPU.
  • CPU mode (use_cuda=False): every reset() and step() call runs on the host using the standard Python environment.
  • GPU mode (use_cuda=True, requires WarpDrive): the first reset() runs on the CPU, copies all relevant data to the GPU, and all subsequent steps execute as CUDA kernels.
The wrapper also attaches observation_space and action_space attributes to the wrapped environment, making it compatible with RLlib and other Gym-style training frameworks.
from ai_economist.foundation.env_wrapper import FoundationEnvWrapper
from ai_economist import foundation

# Build a Foundation environment instance on the host.
ScenarioClass = foundation.scenarios.get("uniform/simple_wood_and_stone")
env_obj = ScenarioClass(
    components=[("Build", {"payment": 20}), ("Gather", {})],
    n_agents=4,
    world_size=[25, 25],
)

# use_cuda defaults to False, so reset() and step() run on the CPU.
wrapper = FoundationEnvWrapper(env_obj=env_obj)
obs = wrapper.reset()

# `actions` is a dict {agent_idx: action} that the caller must supply.
obs, rew, done, info = wrapper.step(actions)

recursive_obs_dict_to_spaces_dict()

recursive_obs_dict_to_spaces_dict(obs: dict) -> gym.spaces.Dict
Module-level utility function that converts an observation dictionary (keyed by agent index) into a corresponding gym.spaces.Dict of observation spaces. Used internally by FoundationEnvWrapper.__init__ to populate env.observation_space. Each leaf value is converted according to these rules:
| Python type | Gym space |
| --- | --- |
| np.ndarray | Box(low=-BIG_NUMBER, high=BIG_NUMBER, shape=v.shape, dtype=v.dtype) |
| list | Converted to np.ndarray, then treated as above |
| int / float / np.integer / np.floating | Wrapped in a 1-D np.array, then treated as above |
| dict | Recurses into a nested spaces.Dict |
BIG_NUMBER is 1e20. If that bound does not produce numerically valid Box low/high values for a leaf, it is halved repeatedly until it does.
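The halving rule can be pictured with a minimal pure-Python sketch. It assumes the bound is invalid because it does not fit the leaf's dtype (for example a 32-bit integer observation). The names shrink_bound and dtype_max are hypothetical and exist only for illustration; the real function inspects the resulting Box rather than comparing against a maximum value.

```python
BIG_NUMBER = 1e20  # starting bound, as stated above


def shrink_bound(dtype_max, start=BIG_NUMBER):
    """Halve `start` until it fits within `dtype_max` (hypothetical helper)."""
    bound = float(start)
    while bound > dtype_max:
        bound = bound // 2  # floor-halve, keeping the bound a whole number
    return bound


# Assumption: a 32-bit signed integer leaf cannot hold 1e20, so the bound shrinks.
int32_bound = shrink_bound(2**31 - 1)
# A leaf whose dtype can hold 1e20 keeps the original bound.
wide_bound = shrink_bound(1e300)
```

The loop stops at the first halved value that fits, so the final bound stays as large as the dtype allows.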

Parameters

obs
dict
required
A dictionary of observations keyed by agent index, as returned by a multi-agent Foundation environment’s reset() call.

Returns

spaces
gym.spaces.Dict
A gym.spaces.Dict whose structure mirrors the input observation dictionary, with every leaf replaced by an appropriate Box space.
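To show what "mirrors the input observation dictionary" means, here is a dependency-free sketch of the recursion. It is a stand-in, not the library function: each leaf becomes a ("Box", shape) tuple instead of a gym.spaces.Box, flat lists stand in for arrays, and the observation keys "inventory" and "time" are invented for the example.

```python
def describe_obs(obs):
    """Mirror a nested observation dict with Box-like descriptors.

    Stand-in for the real conversion: leaves become ("Box", shape) tuples.
    """
    if not isinstance(obs, dict):
        raise TypeError("expected a dict of observations")
    described = {}
    for key, value in obs.items():
        if isinstance(value, dict):
            # Nested dicts recurse, like a nested spaces.Dict.
            described[key] = describe_obs(value)
        elif isinstance(value, (int, float)):
            # Scalars are treated as 1-D arrays of length 1.
            described[key] = ("Box", (1,))
        elif isinstance(value, list):
            # Flat lists stand in for 1-D arrays in this sketch.
            described[key] = ("Box", (len(value),))
        else:
            raise TypeError(f"unsupported leaf type: {type(value).__name__}")
    return described


# Hypothetical observations for one mobile agent and the planner.
obs = {
    "0": {"inventory": [3, 0], "time": 0.5},
    "p": {"time": 0.5},
}
layout = describe_obs(obs)
```

The output has the same keys and nesting as the input; only the leaves change.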

FoundationEnvWrapper

Constructor

FoundationEnvWrapper(
    env_obj=None,
    env_name=None,
    env_config=None,
    num_envs=1,
    use_cuda=False,
    env_registrar=None,
    event_messenger=None,
    process_id=0,
)
You must supply either env_obj or the triple (env_name, env_config, env_registrar).
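The either/or rule can be expressed as a small check. This is only a sketch of the rule as stated: select_env_source is a hypothetical helper, and the wrapper itself may enforce the requirement differently (for example with assertions).

```python
def select_env_source(env_obj=None, env_name=None, env_config=None,
                      env_registrar=None):
    """Report which constructor inputs would be used (hypothetical helper)."""
    if env_obj is not None:
        # A ready-made environment takes precedence.
        return "env_obj"
    missing = [
        name
        for name, value in (
            ("env_name", env_name),
            ("env_config", env_config),
            ("env_registrar", env_registrar),
        )
        if value is None
    ]
    if missing:
        raise ValueError("without env_obj, also supply: " + ", ".join(missing))
    return "registrar"
```

Calling select_env_source(env_name="my_env") alone raises ValueError, because env_config and env_registrar are still missing.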

Parameters

env_obj
BaseEnvironment
default:"None"
A fully constructed Foundation environment instance. When provided, env_name and env_config are ignored; env_registrar is still required when use_cuda=True.
env_name
str
default:"None"
Name of an environment registered on the WarpDrive environment registrar. Required when env_obj is not provided.
env_config
dict
default:"None"
Keyword arguments passed to the environment class retrieved from env_registrar. Required when env_obj is not provided.
num_envs
int
default:"1"
Number of parallel environments to run simultaneously. Only relevant when use_cuda=True.
use_cuda
bool
default:"False"
When True, runs environment steps on the GPU via WarpDrive CUDA kernels. Requires a CUDA-capable GPU and the rl-warp-drive package.
env_registrar
EnvironmentRegistrar
default:"None"
WarpDrive EnvironmentRegistrar object that provides customized environment info (such as CUDA source paths) needed for the GPU build. Required when env_obj is not provided or when use_cuda=True.
event_messenger
multiprocessing.Event
default:"None"
Optional multiprocessing Event used to synchronise the CUDA build when multiple worker processes are launched simultaneously.
process_id
int
default:"0"
Integer identifier of the process running WarpDrive. Used to avoid build collisions in multi-process setups.
When use_cuda=True, the wrapped environment must already expose use_cuda, cuda_data_manager, and cuda_function_manager attributes, and so must env.world. A GPU must be available or the constructor raises an AssertionError.

What the constructor does

1. Attach the environment
   Stores the environment as self.env. If env_obj is provided it is used directly; otherwise the environment is instantiated from the registrar.

2. Build observation space
   Calls obs_at_reset() (which calls env.reset() once on the CPU) and passes the result to recursive_obs_dict_to_spaces_dict(). The resulting gym.spaces.Dict is stored on env.observation_space.

3. Build action space
   Iterates over mobile agents and the planner. Each agent whose multi_action_mode is True receives a MultiDiscrete space; otherwise a Discrete space. All spaces are stored in env.action_space keyed by agent index string (e.g. "0", "1", …, "p").

4. GPU initialisation (use_cuda=True only)
   Initialises CUDADataManager and CUDAFunctionManager, compiles and loads the CUDA kernels for the scenario step and each component step, and registers a CUDAEnvironmentReset resetter. All managers are attached to both self.env and self.env.world.

Properties

observation_space

wrapper.env.observation_space  # -> gym.spaces.Dict
A gym.spaces.Dict keyed by agent index strings (e.g. "0", "1", …, "p"). Built from the initial observation returned by the first reset() call. Attached to wrapper.env during construction.

action_space

wrapper.env.action_space  # -> dict[str, gym.spaces.Space]
Dictionary mapping agent index strings to their action spaces:
  • Discrete(n) when agent.multi_action_mode is False
  • MultiDiscrete([n1, n2, ...]) when agent.multi_action_mode is True
The planner is always keyed as "p". All spaces have dtype = np.int32.
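The mapping can be sketched without gym installed. Everything here is a stand-in: AgentStub is a hypothetical replacement for a Foundation agent, the action_spaces attribute name and the sizes 50 and [22, 22] are assumptions for illustration, and tuples replace the real Discrete and MultiDiscrete objects.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class AgentStub:
    """Hypothetical stand-in for a Foundation agent."""
    idx: str
    multi_action_mode: bool
    action_spaces: Union[int, List[int]]  # assumed attribute name


def describe_action_space(agents):
    """Map each agent index to a ("Discrete", n) or ("MultiDiscrete", [...]) tuple."""
    layout = {}
    for agent in agents:
        kind = "MultiDiscrete" if agent.multi_action_mode else "Discrete"
        layout[agent.idx] = (kind, agent.action_spaces)
    return layout


# Two mobile agents with a single action head, plus a multi-head planner.
agents = [
    AgentStub("0", False, 50),
    AgentStub("1", False, 50),
    AgentStub("p", True, [22, 22]),
]
layout = describe_action_space(agents)
```

The result is a plain dictionary keyed by agent index string, matching the shape described above.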

Methods

reset()

wrapper.reset()
Gym-style alias for reset_all_envs(). Use this in CPU mode or when writing code that is agnostic to the execution backend.

Returns

obs
dict
Initial observation dictionary, identical in structure to what env.reset() returns (after reformatting collated agent observations).

step()

wrapper.step(actions=None)
Gym-style alias for step_all_envs(). In CPU mode, requires actions. In GPU mode, steps happen on the CUDA device and None is returned.

Parameters

actions
dict
default:"None"
Dictionary {agent_idx: action} of agent actions. Required in CPU mode. Ignored in GPU mode (actions are managed on-device).

Returns

result
tuple | None
In CPU mode: an (obs, rew, done, info) tuple with the same structure as BaseEnvironment.step(). In GPU mode: None; arrays are updated in place on the device.
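Because the return type depends on the mode, calling code that should work with either backend needs to check for None before unpacking. The sketch below uses two hypothetical stubs (CpuStub, GpuStub) in place of a real wrapper so it runs anywhere; step_once and the placeholder values are illustrative only.

```python
class CpuStub:
    """Hypothetical wrapper stand-in that behaves like CPU mode."""

    def step(self, actions=None):
        assert actions is not None, "actions are required in CPU mode"
        # Placeholder values in the (obs, rew, done, info) layout.
        return {"0": [0.0]}, {"0": 1.0}, {"__all__": False}, {}


class GpuStub:
    """Hypothetical wrapper stand-in that behaves like GPU mode."""

    def step(self, actions=None):
        return None  # results stay on the device


def step_once(wrapper, actions=None):
    """Step the wrapper and report whether results came back to the host."""
    result = wrapper.step(actions)
    if result is None:
        return {"on_device": True}
    obs, rew, done, info = result
    return {"on_device": False, "obs": obs, "rew": rew, "done": done, "info": info}


cpu_result = step_once(CpuStub(), actions={"0": 0})
gpu_result = step_once(GpuStub())
```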

reset_all_envs()

wrapper.reset_all_envs()
Full reset implementation.
  • If reset_on_host is True: calls obs_at_reset() on the CPU. In GPU mode this also copies all data tensors to the device (expanding along the environment dimension for num_envs > 1) and then sets reset_on_host = False so all future resets run on the GPU.
  • If reset_on_host is False (GPU mode only): invokes CUDAEnvironmentReset with mode="force_reset" and returns an empty dict.
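The two branches amount to a small state machine driven by reset_on_host. The following sketch models only that flag; ResetFlowSketch is hypothetical, performs no GPU work, and logs what the wrapper would do at each call.

```python
class ResetFlowSketch:
    """Hypothetical model of the reset_on_host flag; no GPU work is done."""

    def __init__(self, use_cuda):
        self.use_cuda = use_cuda
        self.reset_on_host = True
        self.log = []

    def reset_all_envs(self):
        if self.reset_on_host:
            self.log.append("host reset")
            obs = {"0": [0.0]}  # placeholder observation
            if self.use_cuda:
                self.log.append("copy to device")
                self.reset_on_host = False  # later resets run on the GPU
            return obs
        self.log.append("device force_reset")
        return {}


gpu = ResetFlowSketch(use_cuda=True)
first_obs = gpu.reset_all_envs()   # host reset, then copy to device
second_obs = gpu.reset_all_envs()  # device-side reset, empty dict

cpu = ResetFlowSketch(use_cuda=False)
cpu.reset_all_envs()
cpu.reset_all_envs()  # CPU mode never leaves the host branch
```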

Returns

obs
dict
Initial observation dictionary on the first (CPU) reset. An empty dictionary {} for subsequent GPU resets.

reset_only_done_envs()

wrapper.reset_only_done_envs()
GPU-only method. Selectively resets only those parallel environments whose done flag is True, leaving others running. Requires use_cuda=True and a completed first reset (reset_on_host=False).
Calling this method in CPU mode or before the first GPU reset raises an AssertionError.

Returns

obs
dict
Always returns an empty dictionary {} — done-env resets modify device arrays in place.
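The selective-reset idea can be illustrated on the host with plain lists. This is a conceptual sketch only: the real method runs on device arrays, and reset_only_done, the timestep values, and initial_state are hypothetical.

```python
def reset_only_done(states, done_flags, initial_state):
    """Return new per-environment states, resetting only the finished ones."""
    if len(states) != len(done_flags):
        raise ValueError("states and done_flags must have the same length")
    return [
        initial_state if done else state
        for state, done in zip(states, done_flags)
    ]


# Three parallel environments at different timesteps; only the second is done.
timesteps = [12, 1000, 37]
done_flags = [False, True, False]
timesteps = reset_only_done(timesteps, done_flags, initial_state=0)
```

Environments that are still running keep their state, which is what lets parallel rollouts continue without waiting for the slowest episode.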

step_all_envs()

wrapper.step_all_envs(actions=None)
Full step implementation.
  • CPU mode: requires actions. Calls env.step(actions), reformats collated observations and rewards, and returns (obs, rew, done, info).
  • GPU mode: steps each component via component.component_step(), calls env.scenario_step(), then calls env.generate_rewards(). Returns None.
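The call order in the two branches can be sketched with stubs that record what was invoked. ComponentStub, EnvStub, and the env.components attribute are assumptions made for this example, and the CPU branch omits the observation and reward reformatting for brevity.

```python
class ComponentStub:
    """Hypothetical component that records its step call."""

    def __init__(self, name, calls):
        self.name = name
        self.calls = calls

    def component_step(self):
        self.calls.append(f"component_step:{self.name}")


class EnvStub:
    """Hypothetical environment exposing the methods named above."""

    def __init__(self):
        self.calls = []
        self.components = [
            ComponentStub("Build", self.calls),
            ComponentStub("Gather", self.calls),
        ]

    def scenario_step(self):
        self.calls.append("scenario_step")

    def generate_rewards(self):
        self.calls.append("generate_rewards")

    def step(self, actions):
        self.calls.append("step")
        return {}, {}, {"__all__": False}, {}  # placeholder tuple


def step_all_envs_sketch(env, use_cuda, actions=None):
    """Dispatch one step following the CPU/GPU split described above."""
    if not use_cuda:
        assert actions is not None, "actions are required in CPU mode"
        return env.step(actions)
    for component in env.components:
        component.component_step()
    env.scenario_step()
    env.generate_rewards()
    return None


gpu_env = EnvStub()
gpu_result = step_all_envs_sketch(gpu_env, use_cuda=True)
cpu_env = EnvStub()
cpu_result = step_all_envs_sketch(cpu_env, use_cuda=False, actions={"0": 0})
```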

Parameters

actions
dict
default:"None"
Action dictionary. Must be provided in CPU mode; ignored in GPU mode.

Returns

result
tuple | None
(obs, rew, done, info) in CPU mode; None in GPU mode.

obs_at_reset()

wrapper.obs_at_reset()
Calls env.reset() on the CPU and applies _reformat_obs(). Used internally during construction to build the observation space and during the first GPU reset to populate device data.

Returns

obs
dict
Initial observation dictionary after reformatting.

GPU mode

GPU-accelerated simulation requires the rl-warp-drive package and a CUDA-capable GPU.
pip install rl-warp-drive
The wrapper detects available GPUs at import time using GPUtil. If no GPUs are found, simulation runs on the CPU; requesting use_cuda=True without a GPU still raises an AssertionError in the constructor.
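# CPU mode (use_cuda=False): reset() and step() return Python objects.
wrapper = FoundationEnvWrapper(env_obj=env_obj, use_cuda=False)
obs = wrapper.reset()
# `actions` is a dict {agent_idx: action}, required in CPU mode.
obs, rew, done, info = wrapper.step(actions)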
In GPU mode, step() returns None. Observations and rewards are written directly to device memory and must be read via the CUDADataManager — they are not returned as Python dicts.

Internal helper methods

_reformat_obs()

wrapper._reformat_obs(obs: dict) -> dict
If the environment uses collated observations (key "a" present), expands each agent’s slice into a separate str(agent_id) key and removes the "a" key. Otherwise returns the dictionary unchanged.

_reformat_rew()

wrapper._reformat_rew(rew: dict) -> dict
If the environment uses collated rewards (key "a" present), expands each agent’s scalar into a separate str(agent_id) key and removes the "a" key. Returns the dictionary unchanged otherwise.
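Both helpers follow the same pattern, sketched here with plain lists. The functions reformat_obs_sketch and reformat_rew_sketch are illustrative stand-ins, the keys "coin" and "time" are invented, and the sketch indexes a flat per-agent list; how the wrapper slices array-valued observations is not specified here.

```python
def reformat_obs_sketch(obs, n_agents):
    """Split collated observations under "a" into one entry per agent."""
    if "a" not in obs:
        return obs
    collated = obs.pop("a")  # also removes the "a" key
    for agent_id in range(n_agents):
        obs[str(agent_id)] = {
            key: values[agent_id] for key, values in collated.items()
        }
    return obs


def reformat_rew_sketch(rew, n_agents):
    """Split collated rewards under "a" into one scalar per agent."""
    if "a" not in rew:
        return rew
    collated = rew.pop("a")
    for agent_id in range(n_agents):
        rew[str(agent_id)] = collated[agent_id]
    return rew


obs = reformat_obs_sketch(
    {"a": {"coin": [5.0, 7.0]}, "p": {"time": [0.1]}}, n_agents=2
)
rew = reformat_rew_sketch({"a": [1.5, -0.5], "p": 0.0}, n_agents=2)
```

Entries that are not collated, such as the planner's "p" key, pass through untouched.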