entity_gym.examples

Classes

  • MoveToOrigin: Task with a single Spaceship that is rewarded for moving as close to the origin as possible.

  • CherryPick: The CherryPick environment is initialized with a list of 32 cherries of random quality.

  • PickMatchingBalls: The PickMatchingBalls environment is initialized with a list of 32 balls of different colors.

  • Minefield: Task with a Vehicle entity that has to reach a target point, receiving a reward of 1.

  • MultiSnake: Turn-based version of Snake with multiple snakes.

  • MultiArmedBandit: Task with a single categorical action with 5 choices that gives a reward of 1 for choosing action 0 and a reward of 0 otherwise.

  • NotHotdog: On each timestep, there is either a generic “Object” entity with an is_hotdog property, or a “Hotdog” object.

  • Xor: There are three entity types, each with one instance on each timestep.

  • Count: There are between 0 and 10 “Bean” entities.

  • FloorIsLava: The player is surrounded by 8 tiles, 7 of which are lava and 1 of which is high ground.

  • MineSweeper: The MineSweeper environment contains two types of objects, mines and robots.

  • RockPaperScissors: This environment tests giving additional information to the value function that cannot be observed by the policy.

  • TreasureHunt: Example environment in which a player moves around a map to collect treasure.

class entity_gym.examples.MoveToOrigin(x_pos: float = 0.0, y_pos: float = 0.0, x_velocity: float = 0.0, y_velocity: float = 0.0, step: int = 0)

Task with a single Spaceship that is rewarded for moving as close to the origin as possible. The Spaceship has two actions for accelerating in the x and y directions.

Inheritance

Inheritance diagram of MoveToOrigin
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.
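
All of the example environments follow the same interface documented above: construct the environment, inspect its spaces, call reset() for the first Observation, then step it with act(), passing a mapping from action-type name to action. A minimal sketch for MoveToOrigin, using only the methods listed above (the constructor keywords come from the class signature):

    from entity_gym.examples import MoveToOrigin

    # Start the spaceship away from the origin.
    env = MoveToOrigin(x_pos=3.0, y_pos=-4.0)

    print(env.obs_space())     # entity types and their features
    print(env.action_space())  # maps each action name to its action space

    obs = env.reset()          # initial Observation
    # Each subsequent step passes a {action_name: action} mapping to env.act().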

class entity_gym.examples.CherryPick(num_cherries: int = 32, cherries: typing.List[float] = <factory>, last_reward: float = 0.0, step: int = 0)

The CherryPick environment is initialized with a list of 32 cherries of random quality. On each timestep, the player can pick up one of the cherries. The player receives a reward of the quality of the cherry picked. The environment ends after 16 steps. The quality of the top 16 cherries is normalized so that the maximum total achievable reward is 1.0.

Inheritance

Inheritance diagram of CherryPick
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.
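
Picking a cherry is presumably an entity-selection action (the act() signature above accepts SelectEntityAction). The sketch below shows how a single pick would be issued; the actors/actees field names and the placeholder entity ids are assumptions not documented in this reference, and the real ids must be taken from the Observation returned by reset()/act():

    from entity_gym.env.action import SelectEntityAction
    from entity_gym.examples import CherryPick

    env = CherryPick()
    obs = env.reset()

    # Assuming a single action type, look up its name rather than hard-coding it.
    (action_name,) = env.action_space().keys()

    # Assumption: SelectEntityAction takes the selecting entities (actors) and the
    # selected entities (actees) as sequences of entity ids. The ids below are
    # placeholders for illustration only.
    player_id, cherry_id = 0, 1
    actions = {action_name: SelectEntityAction(actors=[player_id], actees=[cherry_id])}
    # obs = env.act(actions)  # apply once real ids are filled in from the observation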

class entity_gym.examples.PickMatchingBalls(max_balls: int = 32, balls: typing.List[entity_gym.examples.pick_matching_balls.Ball] = <factory>, one_hot: bool = False, randomize: bool = False)

The PickMatchingBalls environment is initialized with a list of 32 balls of different colors. On each timestep, the player can pick up one of the balls. The episode ends when the player picks up a ball of a different color from the last one. The player receives a reward equal to the number of balls picked up divided by the maximum number of balls of the same color.

Inheritance

Inheritance diagram of PickMatchingBalls
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

class entity_gym.examples.Minefield(vehicle: entity_gym.examples.minefield.Vehicle = <factory>, target: entity_gym.examples.minefield.Target = <factory>, mine: entity_gym.examples.minefield.Mine = <factory>, max_mines: int = 10, max_steps: int = 200, translate: bool = False, width: float = 200.0)

Task with a Vehicle entity that has to reach a target point, receiving a reward of 1. If the vehicle collides with any of the randomly placed mines, the episode ends without reward. The available actions turn the vehicle left, turn it right, or keep it going straight.

Inheritance

Inheritance diagram of Minefield
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

act_filter(action: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]], obs_filter: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation. Any entities or features that are not present in the filter are removed from the observation.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

reset_filter(obs_space: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Resets the environment and returns the initial observation. Any entities or features that are not present in the filter are removed from the observation.
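
Minefield also exposes the filtered variants of reset and act. Passing the environment's full observation space keeps every entity and feature, while a reduced ObsSpace would strip anything not listed in it. A minimal sketch using only the methods documented above (the constructor keywords come from the class signature):

    from entity_gym.examples import Minefield

    env = Minefield(max_mines=10, max_steps=200)

    # reset_filter behaves like reset, but drops any entities or features that are
    # missing from the given ObsSpace. Passing the full space keeps the observation intact.
    obs = env.reset_filter(env.obs_space())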

class entity_gym.examples.MultiSnake(board_size: int = 10, num_snakes: int = 2, num_players: int = 1, max_snake_length: int = 11, max_steps: int = 180)

Turn-based version of Snake with multiple snakes. Each snake has a different color. For each snake, Food of that color is placed randomly on the board. Snakes can only eat Food of their color. When a snake eats Food of the same color, it grows by one unit. Whenever a snake whose length is less than 11 grows, the player receives a reward of 0.1 / num_snakes. The game ends when a snake collides with another snake, runs into a wall, eats Food of another color, or all snakes reach a length of 11.

Inheritance

Inheritance diagram of MultiSnake
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.
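
A quick check of the maximum episode reward implied by the description above, assuming each snake starts at length 1 (the starting length is not stated in this reference):

    num_snakes = 2
    growth_events_per_snake = 11 - 1            # growing from length 1 up to the cap of 11
    reward_per_growth = 0.1 / num_snakes
    max_total_reward = num_snakes * growth_events_per_snake * reward_per_growth
    print(max_total_reward)                     # 1.0, independent of num_snakes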

class entity_gym.examples.MultiArmedBandit

Task with a single categorical action with 5 choices that gives a reward of 1 for choosing action 0 and a reward of 0 otherwise.

Inheritance

Inheritance diagram of MultiArmedBandit
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.
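
A sketch of stepping the bandit. It assumes the single action is exposed as a global categorical action and that GlobalCategoricalAction is constructed from an index and a label, with the labels being the stringified choice indices; none of this is spelled out in this reference, so treat it as an illustration of the act() call rather than a verified snippet:

    from entity_gym.env.action import GlobalCategoricalAction
    from entity_gym.examples import MultiArmedBandit

    env = MultiArmedBandit()
    obs = env.reset()

    # Look up the name of the single action type rather than hard-coding it.
    (action_name,) = env.action_space().keys()

    # Assumption: index/label fields and "0".."4" labels. Index 0 is the rewarded arm.
    obs = env.act({action_name: GlobalCategoricalAction(index=0, label="0")})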

class entity_gym.examples.NotHotdog

On each timestep, there is either a generic “Object” entity with an is_hotdog property, or a “Hotdog” object. The “Player” entity is always present, and has an action to classify the other entity as hotdog or not hotdog.

Inheritance

Inheritance diagram of NotHotdog
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

class entity_gym.examples.Xor

There are three entity types, each with one instance on each timestep. The Bit1 and Bit2 entities are randomly set to 0 or 1. The Output entity has one action that should be set to the output of the XOR between the two bits.

Inheritance

Inheritance diagram of Xor
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

act_filter(action: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]], obs_filter: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation. Any entities or features that are not present in the filter are removed from the observation.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

reset_filter(obs_space: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Resets the environment and returns the initial observation. Any entities or features that are not present in the filter are removed from the observation.

class entity_gym.examples.Count(masked_choices: int = 10)

There are between 0 and 10 “Bean” entities. The “Player” entity gets 1 reward for counting the correct number of beans and 0 otherwise.

This environment also randomly masks off some of the incorrect answers.

Masking by default allows all actions, which is equivalent to disabling masking.

Inheritance

Inheritance diagram of Count
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

act_filter(action: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]], obs_filter: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation. Any entities or features that are not present in the filter are removed from the observation.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

reset_filter(obs_space: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Resets the environment and returns the initial observation. Any entities or features that are not present in the filter are removed from the observation.

class entity_gym.examples.FloorIsLava

The player is surrounded by 8 tiles, 7 of which are lava and 1 of which is high ground. The player must move to one of the tiles. The player receives a reward of 1 if they move to the high ground, and 0 otherwise.

Inheritance

Inheritance diagram of FloorIsLava
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

act_filter(action: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]], obs_filter: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation. Any entities or features that are not present in the filter are removed from the observation.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

reset_filter(obs_space: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Resets the environment and returns the initial observation. Any entities or features that are not present in the filter are removed from the observation.

class entity_gym.examples.MineSweeper(width: int = 6, height: int = 6, nmines: int = 5, nrobots: int = 2, orbital_cannon: bool = False, cooldown_period: int = 5)

The MineSweeper environment contains two types of objects, mines and robots. The player controls all robots in the environment. On every step, each robot may move in one of four cardinal directions, or stay in place and defuse all adjacent mines. Defused mines are removed from the environment. If a robot steps on a mine, the robot is removed from the environment and the player loses the game. The player wins the game when all mines are defused.

Inheritance

Inheritance diagram of MineSweeper
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.
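
The constructor parameters in the signature above control the difficulty; beyond their names, their exact semantics are inferred, so the values below are only illustrative:

    from entity_gym.examples import MineSweeper

    # A larger 8x8 board with 10 mines and 3 robots; orbital_cannon enables the
    # optional mechanic gated by the cooldown_period parameter.
    env = MineSweeper(width=8, height=8, nmines=10, nrobots=3,
                      orbital_cannon=True, cooldown_period=5)

    obs = env.reset()
    print(env.action_space())   # the action types available to the robots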

class entity_gym.examples.RockPaperScissors(cheat: bool = False)

This environment tests giving additional information to the value function that cannot be observed by the policy.

On each timestep, the opponent randomly chooses rock, paper, or scissors with probabilities of 50%, 30%, and 20%, respectively. The value function can observe the opponent’s choice, but the policy cannot. The agent must choose rock, paper, or scissors. If the agent beats the opponent, it receives a reward of 2.0; otherwise it receives a reward of 0.0. The optimal strategy is therefore to always choose paper, for an average reward of 1.0. Since the value function can observe the opponent’s choice, it can perfectly predict the reward.
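
The expected rewards follow directly from the stated opponent distribution and the 2.0 win reward; a quick check:

    # Opponent plays rock/paper/scissors with probability 0.5/0.3/0.2; a win pays 2.0,
    # draws and losses pay 0.0.
    opponent = {"rock": 0.5, "paper": 0.3, "scissors": 0.2}
    beats = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

    expected = {move: 2.0 * opponent[beats[move]] for move in beats}
    print(expected)  # {'rock': 0.4, 'paper': 1.0, 'scissors': 0.6} -> paper is optimal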

Inheritance

Inheritance diagram of RockPaperScissors
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

act_filter(action: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]], obs_filter: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation. Any entities or features that are not present in the filter are removed from the observation.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

reset_filter(obs_space: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Resets the environment and returns the initial observation. Any entities or features that are not present in the filter are removed from the observation.

class entity_gym.examples.TreasureHunt

Inheritance

Inheritance diagram of TreasureHunt
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.