`entity_gym.env`

The environment module defines the core interfaces that make up an Environment.

Actions

Actions are how agents interact with the environment. There are three parts to every action:

ActionSpace defines the shape of the action. For example, a categorical action space consisting of the 4 discrete choices “up”, “down”, “left”, and “right”.
ActionMask is used to further constrain the available actions on a specific timestep. For example, only “up” and “down” may be available on some timestep.
Action represents the actual action that is chosen by an agent. For example, the “down” action may have been chosen.
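
The relationship between these three parts can be sketched in plain Python. This is only an illustration: it uses ordinary lists rather than the library's classes, and the helper name `available_labels` is hypothetical.

```python
from typing import List

# Action space: the full set of discrete choices.
labels: List[str] = ["up", "down", "left", "right"]

# Action mask for one timestep: True marks a choice that is available.
mask: List[bool] = [True, True, False, False]


def available_labels(labels: List[str], mask: List[bool]) -> List[str]:
    """Return the labels the mask allows on this timestep (hypothetical helper)."""
    return [label for label, allowed in zip(labels, mask) if allowed]


# Action: the index the agent actually chose; it must be allowed by the mask.
chosen_index = 1
assert mask[chosen_index]
chosen_label = labels[chosen_index]
```

Here the mask leaves only “up” and “down” available, and the chosen action is “down”.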

There are currently three different action spaces:

GlobalCategoricalActionSpace allows the agent to choose a single option from a discrete set of actions.
CategoricalActionSpace allows multiple entities to choose a single option from a discrete set of actions.
SelectEntityActionSpace allows multiple entities to choose another entity.

Observations

Observations are how agents receive information from the environment. Each Environment must define an ObsSpace, which specifies the shape of the observations returned by this environment. On each timestep, the environment returns an Observation object, which contains all the entities and features that are visible to the agent.

Classes

Environment: Abstract base class for all environments.
ObsSpace: Defines what features can be observed by an agent.
Entity: Defines the set of features for an entity type.
Observation: Observation returned by the environment on one timestep.
EntityName: alias of str
ActionName: alias of str
CategoricalActionSpace: Defines a discrete set of actions that can be taken by multiple entities.
CategoricalAction: Outcome of a categorical action.
CategoricalActionMask: Action mask for categorical action that specifies which agents can perform the action, and includes a dense mask that further constrains the choices available to each agent.
GlobalCategoricalActionSpace: Defines a discrete set of actions that can be taken on each timestep.
GlobalCategoricalAction: Outcome of a global categorical action.
GlobalCategoricalActionMask: Action mask for global categorical action.
SelectEntityAction: Outcome of a select entity action.
SelectEntityActionSpace: Allows multiple entities to each select another entity.
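SelectEntityActionMask: Action mask for select entity action that specifies which agents can perform the action, and includes a dense mask that further constrains what other entities can be selected by each actor.
VecEnv: Interface for vectorized environments. The main goal of VecEnv is to allow for maximally efficient environment implementations.
EnvList: Interface for vectorized environments. The main goal of VecEnv is to allow for maximally efficient environment implementations.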
ParallelEnvList: We fork the subprocessing from the stable-baselines implementation, but use RaggedBuffers for collecting batches
VecObs: A batch of observations from a vectorized environment.
VecCategoricalActionMask: VecCategoricalActionMask(actors: RaggedBufferI64, mask: Optional[RaggedBufferBool])
VecSelectEntityActionMask: VecSelectEntityActionMask(actors: RaggedBufferI64, actees: RaggedBufferI64)
ValidatingEnv: Abstract base class for all environments.
AddMetricsWrapper: Interface for vectorized environments. The main goal of VecEnv is to allow for maximally efficient environment implementations.

class entity_gym.env.Environment

Abstract base class for all environments.

abstract act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) → entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters: actions – Maps the name of each action type to the action to perform.

act_filter(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]], obs_filter: entity_gym.env.environment.ObsSpace) → entity_gym.env.environment.Observation: Performs the given action and returns the resulting observation. Any entities or features that are not present in the filter are removed from the observation.

abstract action_space() → Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]: Defines the types of actions that can be taken in the environment.

close() → None: Closes the environment.

abstract obs_space() → entity_gym.env.environment.ObsSpace: Defines the shape of observations returned by the environment.

render(**kwargs: Any) → numpy.ndarray[Any, numpy.dtype[numpy.uint8]]

Renders the environment.

Parameters: kwargs – a dictionary of arguments to send to the rendering process

abstract reset() → entity_gym.env.environment.Observation: Resets the environment and returns the initial observation.

reset_filter(obs_filter: entity_gym.env.environment.ObsSpace) → entity_gym.env.environment.Observation: Resets the environment and returns the initial observation. Any entities or features that are not present in the filter are removed from the observation.
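
The filtering performed by act_filter and reset_filter can be pictured with a minimal sketch. It assumes observations and filters are plain dictionaries, and that columns keep their original order after filtering; the function name `filter_observation` and both assumptions are illustrative, not the library's implementation.

```python
from typing import Dict, List

FeatureRows = List[List[float]]


def filter_observation(
    space: Dict[str, List[str]],
    features: Dict[str, FeatureRows],
    obs_filter: Dict[str, List[str]],
) -> Dict[str, FeatureRows]:
    """Drop entity types and feature columns that are absent from obs_filter."""
    filtered: Dict[str, FeatureRows] = {}
    for entity_type, wanted in obs_filter.items():
        if entity_type not in space:
            continue  # the filter names a type this environment never emits
        names = space[entity_type]
        columns = [i for i, name in enumerate(names) if name in wanted]
        rows = features.get(entity_type, [])
        filtered[entity_type] = [[row[i] for i in columns] for row in rows]
    return filtered


# Hypothetical environment with two entity types.
space = {"Player": ["x", "y", "health"], "Enemy": ["x", "y"]}
features = {"Player": [[1.0, 2.0, 10.0]], "Enemy": [[3.0, 4.0], [5.0, 6.0]]}

# The filter keeps only two Player features and no Enemy entities at all.
filtered = filter_observation(space, features, {"Player": ["x", "health"]})
```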

class entity_gym.env.ObsSpace(global_features: typing.List[str] = <factory>, entities: typing.Dict[str, entity_gym.env.environment.Entity] = <factory>)

Defines what features can be observed by an agent.

entities: Dict[str, entity_gym.env.environment.Entity]: Defines the types of entities that can be observed. On a given timestep, an Observation may contain multiple entities of each type.

global_features: List[str]: fixed size list of features that are observable on each timestep

class entity_gym.env.Entity(features: List[str])

Defines the set of features for an entity type.
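
As a rough illustration of how ObsSpace and Entity fit together, the sketch below declares an observation space with one global feature and two entity types. The two dataclasses are local stand-ins that mirror the documented constructor signatures so the snippet runs on its own; real code would import `ObsSpace` and `Entity` from `entity_gym.env`. The entity and feature names are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# Stand-ins mirroring the documented signatures, not the library classes.
@dataclass
class Entity:
    features: List[str]


@dataclass
class ObsSpace:
    global_features: List[str] = field(default_factory=list)
    entities: Dict[str, Entity] = field(default_factory=dict)


# Hypothetical environment: one global feature and two entity types.
obs_space = ObsSpace(
    global_features=["step"],
    entities={
        "Player": Entity(features=["x", "y", "health"]),
        "Enemy": Entity(features=["x", "y"]),
    },
)

# Number of features each entity of a given type carries.
feature_counts = {name: len(e.features) for name, e in obs_space.entities.items()}
```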

class entity_gym.env.Observation(*, done: bool, reward: float, visible: Optional[Mapping[str, Union[numpy.ndarray[Any, numpy.dtype[numpy.bool_]], Sequence[bool]]]] = None, entities: Optional[Mapping[str, Optional[Union[numpy.ndarray[Any, numpy.dtype[numpy.float32]], Sequence[Sequence[float]], Tuple[Union[numpy.ndarray[Any, numpy.dtype[numpy.float32]], Sequence[Sequence[float]]], Sequence[Any]]]]]] = None, features: Optional[Mapping[str, Union[numpy.ndarray[Any, numpy.dtype[numpy.float32]], Sequence[Sequence[float]]]]] = None, ids: Optional[Mapping[str, Sequence[Any]]] = None, global_features: Optional[Union[numpy.ndarray[Any, numpy.dtype[numpy.float32]], Sequence[float]]] = None, actions: Optional[Mapping[str, Union[entity_gym.env.action.CategoricalActionMask, entity_gym.env.action.SelectEntityActionMask, entity_gym.env.action.GlobalCategoricalActionMask]]] = None, metrics: Optional[Dict[str, float]] = None)

Observation returned by the environment on one timestep.

Parameters

features – Maps each entity type to a list of features for the entities of that type.
actions – Maps each action type to an ActionMask specifying which entities can perform the action.
reward – Reward received on this timestep.
done – Whether the episode has ended.
ids – Maps each entity type to a list of entity ids for the entities of that type.
visible – Optional mask for each entity type that prevents the policy but not the value function from observing certain entities.
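
One way to read the features and ids parameters is that, for each entity type, the list of ids lines up with the rows of features. The check below is a sketch under that assumption, using plain dictionaries and lists; `mismatched_entity_types` is a hypothetical helper, not part of the library.

```python
from typing import Any, Dict, List, Sequence


def mismatched_entity_types(
    features: Dict[str, Sequence[Sequence[float]]],
    ids: Dict[str, Sequence[Any]],
) -> List[str]:
    """Return entity types whose id count differs from their feature-row count."""
    return sorted(
        entity_type
        for entity_type, rows in features.items()
        if entity_type in ids and len(ids[entity_type]) != len(rows)
    )


features = {"Player": [[1.0, 2.0, 10.0]], "Enemy": [[3.0, 4.0], [5.0, 6.0]]}
ids = {"Player": ["player_0"], "Enemy": ["enemy_0"]}  # one Enemy id is missing

problems = mismatched_entity_types(features, ids)
```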

entity_gym.env.EntityName: alias of str

entity_gym.env.ActionName: alias of str

class entity_gym.env.CategoricalActionSpace(index_to_label: List[str])

Defines a discrete set of actions that can be taken by multiple entities.

index_to_label: List[str]: list of human-readable labels for each action

class entity_gym.env.CategoricalAction(actors: Sequence[Any], indices: numpy.ndarray[Any, numpy.dtype[numpy.int64]], index_to_label: List[str], probs: Optional[numpy.ndarray[Any, numpy.dtype[numpy.float32]]] = None)

Outcome of a categorical action.

actors: Sequence[Any]: the ids of the entities that chose the actions

index_to_label: List[str]: mapping from action indices to human readable labels

indices: numpy.ndarray[Any, numpy.dtype[numpy.int64]]: the indices of the actions that were chosen

property labels: List[str]: the human readable labels of the actions that were performed

probs: Optional[numpy.ndarray[Any, numpy.dtype[numpy.float32]]] = None: the probability assigned to each action by each agent
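
The relationship between actors, indices, index_to_label, and labels can be sketched with a stand-in class. The class below is not the library's CategoricalAction: it omits probs, takes a plain sequence of ints where the library documents an int64 NumPy array, and assumes that labels is a simple lookup of each index in index_to_label.

```python
from dataclasses import dataclass
from typing import Any, List, Sequence


@dataclass
class CategoricalActionSketch:
    """Stand-in with the documented fields (probs omitted); not the library class."""

    actors: Sequence[Any]
    indices: Sequence[int]  # the library documents an int64 NumPy array here
    index_to_label: List[str]

    @property
    def labels(self) -> List[str]:
        # Assumed behaviour: look up each chosen index in index_to_label.
        return [self.index_to_label[i] for i in self.indices]


action = CategoricalActionSketch(
    actors=["robot_0", "robot_1"],
    indices=[3, 0],
    index_to_label=["up", "down", "left", "right"],
)

# Pair each actor id with the label of the action it chose.
chosen = dict(zip(action.actors, action.labels))
```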

class entity_gym.env.CategoricalActionMask(actor_ids: Optional[Sequence[Any]] = None, actor_types: Optional[Sequence[str]] = None, mask: Optional[Union[Sequence[Sequence[bool]], numpy.ndarray]] = None)

Action mask for categorical action that specifies which agents can perform the action, and includes a dense mask that further constrains the choices available to each agent.

actor_ids: Optional[Sequence[Any]] = None: The ids of the entities that can perform the action. If None, all entities can perform the action. Mutually exclusive with actor_types.

actor_types: Optional[Sequence[str]] = None: The types of the entities that can perform the action. If None, all entities can perform the action. Mutually exclusive with actor_ids.

mask: Optional[Union[Sequence[Sequence[bool]], numpy.ndarray]] = None: A boolean array of shape (len(actor_ids), len(choices)) that prevents specific actions from being available to certain entities. If mask[i, j] is True, then the entity with id actor_ids[i] can perform action j.
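
The mask[i, j] convention can be illustrated with nested lists, which the documented type also permits. The helper `choices_per_actor` and the actor ids are hypothetical; the sketch only shows how a row of the mask maps to the choices open to one actor.

```python
from typing import Any, Dict, List, Sequence


def choices_per_actor(
    actor_ids: Sequence[Any],
    mask: Sequence[Sequence[bool]],
    index_to_label: List[str],
) -> Dict[Any, List[str]]:
    """Map each actor id to the labels its row of the mask allows."""
    return {
        actor: [index_to_label[j] for j, allowed in enumerate(row) if allowed]
        for actor, row in zip(actor_ids, mask)
    }


index_to_label = ["up", "down", "left", "right"]
actor_ids = ["robot_0", "robot_1"]

# Shape (len(actor_ids), len(index_to_label)): one row per actor.
mask = [
    [True, True, False, False],  # robot_0 may only move vertically
    [False, False, True, True],  # robot_1 may only move horizontally
]

allowed = choices_per_actor(actor_ids, mask, index_to_label)
```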

class entity_gym.env.GlobalCategoricalActionSpace(index_to_label: List[str])

Defines a discrete set of actions that can be taken on each timestep.

For example, the following action space allows the agent to choose between four actions “up”, “down”, “left”, and “right”:

GlobalCategoricalActionSpace(["up", "down", "left", "right"])

index_to_label: List[str]: list of human-readable labels for each action

class entity_gym.env.GlobalCategoricalAction(index: int, label: str, probs: Optional[numpy.ndarray[Any, numpy.dtype[numpy.float32]]] = None)

Outcome of a global categorical action.

index: int: the index of the action that was chosen

label: str: the human readable label of the action that was chosen

probs: Optional[numpy.ndarray[Any, numpy.dtype[numpy.float32]]] = None: the probability assigned to the action by each agent

class entity_gym.env.GlobalCategoricalActionMask(mask: Optional[Union[Sequence[Sequence[bool]], numpy.ndarray]] = None)

Action mask for global categorical action.

mask: Optional[Union[Sequence[Sequence[bool]], numpy.ndarray]] = None: An optional boolean array of shape (len(choices),). If mask[i] is True, then action choice i can be performed.

class entity_gym.env.SelectEntityAction(actors: Sequence[Any], actees: Sequence[Any], probs: Optional[numpy.ndarray[Any, numpy.dtype[numpy.float32]]] = None)

Outcome of a select entity action.

actees: Sequence[Any]: the ids of the entities that were selected by the actors

actors: Sequence[Any]: the ids of the entities that chose the action

probs: Optional[numpy.ndarray[Any, numpy.dtype[numpy.float32]]] = None: the probability assigned to each selection by each agent
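
A natural reading of actors and actees is that they are parallel sequences, with actors[i] having selected actees[i]. The sketch below assumes that positional pairing; `selections` is a hypothetical helper and the ids are made up.

```python
from typing import Any, Dict, Sequence


def selections(actors: Sequence[Any], actees: Sequence[Any]) -> Dict[Any, Any]:
    """Pair each actor id with the id of the entity it selected."""
    if len(actors) != len(actees):
        raise ValueError("actors and actees must have the same length")
    return dict(zip(actors, actees))


picked = selections(["archer_0", "archer_1"], ["orc_2", "orc_0"])
```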

class entity_gym.env.SelectEntityActionSpace

Allows multiple entities to each select another entity.

class entity_gym.env.SelectEntityActionMask(actor_ids: Optional[Sequence[Any]] = None, actor_types: Optional[Sequence[str]] = None, actee_types: Optional[Sequence[str]] = None, actee_ids: Optional[Sequence[Any]] = None, mask: Optional[numpy.ndarray[Any, numpy.dtype[numpy.bool_]]] = None)

Action mask for select entity action that specifies which agents can perform the action, and includes a dense mask that further constrains what other entities can be selected by each actor.

actee_ids: Optional[Sequence[Any]] = None: The ids of the entities of each type that can be selected by each actor. If None, all entities can be selected by each actor.

actee_types: Optional[Sequence[str]] = None: The types of entities that can be selected by each actor. If None, all entity types can be selected by each actor.

actor_ids: Optional[Sequence[Any]] = None: The ids of the entities that can perform the action. If None, all entities can perform the action.

actor_types: Optional[Sequence[str]] = None: The types of the entities that can perform the action. If None, all entities can perform the action.

mask: Optional[numpy.ndarray[Any, numpy.dtype[numpy.bool_]]] = None: A boolean array of shape (len(actor_ids), len(actee_ids)). If mask[i, j] is True, then the agent with id actor_ids[i] can select the entity with id actee_ids[j]. (NOT CURRENTLY IMPLEMENTED)

class entity_gym.env.VecEnv

Interface for vectorized environments. The main goal of VecEnv is to allow for maximally efficient environment implementations.

abstract act(actions: Mapping[str, RaggedBufferI64], obs_filter: entity_gym.env.environment.ObsSpace) → entity_gym.env.vec_env.VecObs: Performs the given actions on the underlying environments and returns the resulting observations. Any environment that reaches the end of its episode is reset and returns the initial observation of the next episode.

abstract action_space() → Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]: Returns a dictionary mapping the name of actions to their action space.

abstract obs_space() → entity_gym.env.environment.ObsSpace: Returns a dictionary mapping the name of observable entities to their type.

abstract reset(obs_config: entity_gym.env.environment.ObsSpace) → entity_gym.env.vec_env.VecObs: Resets all environments and returns the initial observations.
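
The auto-reset behaviour described for act can be sketched with a toy environment. Everything here is a stand-in: `CountdownEnv` and `vec_act` are hypothetical, observations are plain integers, and the sketch assumes the done flag is still reported for the step on which the episode ended.

```python
from typing import List, Tuple


class CountdownEnv:
    """Toy stand-in environment: an episode lasts `length` steps."""

    def __init__(self, length: int) -> None:
        self.length = length
        self.remaining = length

    def reset(self) -> int:
        self.remaining = self.length
        return self.remaining

    def act(self) -> Tuple[int, bool]:
        """Advance one step and return (observation, done)."""
        self.remaining -= 1
        return self.remaining, self.remaining == 0


def vec_act(envs: List[CountdownEnv]) -> Tuple[List[int], List[bool]]:
    """Step every environment, resetting any whose episode just ended."""
    observations: List[int] = []
    dones: List[bool] = []
    for env in envs:
        obs, done = env.act()
        if done:
            # Return the first observation of the next episode instead.
            obs = env.reset()
        observations.append(obs)
        dones.append(done)
    return observations, dones


envs = [CountdownEnv(2), CountdownEnv(3)]
for env in envs:
    env.reset()

first = vec_act(envs)
second = vec_act(envs)  # the first environment finishes here and is reset
```

With this design the caller never has to reset individual environments: a finished environment keeps producing observations on every call.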

class entity_gym.env.EnvList(create_env: Callable[[], entity_gym.env.environment.Environment], num_envs: int)

act(actions: Mapping[str, RaggedBufferI64], obs_space: entity_gym.env.environment.ObsSpace) → entity_gym.env.vec_env.VecObs: Performs the given actions on the underlying environments and returns the resulting observations. Any environment that reaches the end of its episode is reset and returns the initial observation of the next episode.

action_space() → Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]: Returns a dictionary mapping the name of actions to their action space.

obs_space() → entity_gym.env.environment.ObsSpace: Returns a dictionary mapping the name of observable entities to their type.

reset(obs_space: entity_gym.env.environment.ObsSpace) → entity_gym.env.vec_env.VecObs: Resets all environments and returns the initial observations.

class entity_gym.env.ParallelEnvList(create_env: Callable[[], entity_gym.env.environment.Environment], num_envs: int, num_processes: int, start_method: Optional[str] = None)

The subprocess handling is forked from the stable-baselines implementation, but uses RaggedBuffers for collecting batches.

Citation here: https://github.com/DLR-RM/stable-baselines3/blob/master/CITATION.bib

act(actions: Mapping[str, RaggedBufferI64], obs_space: entity_gym.env.environment.ObsSpace) → entity_gym.env.vec_env.VecObs: Performs the given actions on the underlying environments and returns the resulting observations. Any environment that reaches the end of its episode is reset and returns the initial observation of the next episode.

action_space() → Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]: Returns a dictionary mapping the name of actions to their action space.

obs_space() → entity_gym.env.environment.ObsSpace: Returns a dictionary mapping the name of observable entities to their type.

reset(obs_space: entity_gym.env.environment.ObsSpace) → entity_gym.env.vec_env.VecObs: Resets all environments and returns the initial observations.

class entity_gym.env.VecObs(features: Dict[str, RaggedBufferF32], visible: Dict[str, RaggedBufferBool], action_masks: Dict[str, Union[entity_gym.env.vec_env.VecCategoricalActionMask, entity_gym.env.vec_env.VecSelectEntityActionMask]], reward: numpy.ndarray[Any, numpy.dtype[numpy.float32]], done: numpy.ndarray[Any, numpy.dtype[numpy.bool_]], metrics: Dict[str, entity_gym.env.vec_env.Metric])

A batch of observations from a vectorized environment.

class entity_gym.env.VecCategoricalActionMask(actors: RaggedBufferI64, mask: Optional[RaggedBufferBool])

class entity_gym.env.VecSelectEntityActionMask(actors: RaggedBufferI64, actees: RaggedBufferI64)

class entity_gym.env.ValidatingEnv(env: entity_gym.env.environment.Environment)

act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) → entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters: actions – Maps the name of each action type to the action to perform.

action_space() → Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]: Defines the types of actions that can be taken in the environment.

obs_space() → entity_gym.env.environment.ObsSpace: Defines the shape of observations returned by the environment.

render(**kwargs: Any) → numpy.ndarray[Any, numpy.dtype[numpy.uint8]]

Renders the environment.

Parameters: kwargs – a dictionary of arguments to send to the rendering process

reset() → entity_gym.env.environment.Observation: Resets the environment and returns the initial observation.

class entity_gym.env.AddMetricsWrapper(env: entity_gym.env.vec_env.VecEnv, filter: Optional[numpy.ndarray[Any, numpy.dtype[numpy.bool_]]] = None)

act(actions: Mapping[str, RaggedBufferI64], obs_filter: entity_gym.env.environment.ObsSpace) → entity_gym.env.vec_env.VecObs: Performs the given actions on the underlying environments and returns the resulting observations. Any environment that reaches the end of its episode is reset and returns the initial observation of the next episode.

action_space() → Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]: Returns a dictionary mapping the name of actions to their action space.

obs_space() → entity_gym.env.environment.ObsSpace: Returns a dictionary mapping the name of observable entities to their type.

reset(obs_config: entity_gym.env.environment.ObsSpace) → entity_gym.env.vec_env.VecObs: Resets all environments and returns the initial observations.

entity_gym.env.EntityID

alias of typing.Any

entity_gym.env.Action

alias of Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]

entity_gym.env.ActionSpace

alias of Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]

entity_gym.env.ActionMask

alias of Union[entity_gym.env.action.CategoricalActionMask, entity_gym.env.action.SelectEntityActionMask, entity_gym.env.action.GlobalCategoricalActionMask]

entity_gym.env.VecActionMask

alias of Union[entity_gym.env.vec_env.VecCategoricalActionMask, entity_gym.env.vec_env.VecSelectEntityActionMask]
