entity_gym.examples

Classes

  • MoveToOrigin: Task with a single Spaceship that is rewarded for moving as close to the origin as possible.

  • CherryPick: The CherryPick environment is initialized with a list of 32 cherries of random quality.

  • PickMatchingBalls: The PickMatchingBalls environment is initialized with a list of 32 balls of different colors.

  • Minefield: Task with a Vehicle entity that has to reach a target point, receiving a reward of 1.

  • MultiSnake: Turn-based version of Snake with multiple snakes.

  • MultiArmedBandit: Task with a single categorical action with 5 choices that gives a reward of 1 for choosing action 0 and a reward of 0 otherwise.

  • NotHotdog: On each timestep, there is either a generic “Object” entity with an is_hotdog property, or a “Hotdog” object.

  • Xor: There are three entity types, each with one instance on each timestep.

  • Count: There are between 0 and 10 “Bean” entities.

  • FloorIsLava: The player is surrounded by 8 tiles, 7 of which are lava and 1 of which is high ground.

  • MineSweeper: The MineSweeper environment contains two types of objects, mines and robots.

  • RockPaperScissors: This environment tests giving additional information to the value function that cannot be observed by the policy.

  • TreasureHunt: Example environment in which a player moves around a map to collect treasure.

class entity_gym.examples.MoveToOrigin(x_pos: float = 0.0, y_pos: float = 0.0, x_velocity: float = 0.0, y_velocity: float = 0.0, step: int = 0)

Task with a single Spaceship that is rewarded for moving as close to the origin as possible. The Spaceship has two actions for accelerating in the x and y directions.

Inheritance

Inheritance diagram of MoveToOrigin
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.
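
All of the example environments follow the same interface documented above: construct the environment, inspect its spaces, call reset() for the first Observation, then step it with act(), passing a mapping from action-type name to action. A minimal sketch for MoveToOrigin, using only the methods listed above (the constructor keywords come from the class signature):

    from entity_gym.examples import MoveToOrigin

    # Start the spaceship away from the origin.
    env = MoveToOrigin(x_pos=3.0, y_pos=-4.0)

    print(env.obs_space())     # entity types and their features
    print(env.action_space())  # maps each action name to its action space

    obs = env.reset()          # initial Observation
    # Each subsequent step passes a {action_name: action} mapping to env.act().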

class entity_gym.examples.CherryPick(num_cherries: int = 32, cherries: typing.List[float] = <factory>, last_reward: float = 0.0, step: int = 0)

The CherryPick environment is initialized with a list of 32 cherries of random quality. On each timestep, the player can pick up one of the cherries. The player receives a reward of the quality of the cherry picked. The environment ends after 16 steps. The quality of the top 16 cherries is normalized so that the maximum total achievable reward is 1.0.

Inheritance

Inheritance diagram of CherryPick
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.
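
Picking a cherry is presumably an entity-selection action (the act() signature above accepts SelectEntityAction). The sketch below shows how a single pick would be issued; the actors/actees field names and the placeholder entity ids are assumptions not documented in this reference, and the real ids must be taken from the Observation returned by reset()/act():

    from entity_gym.env.action import SelectEntityAction
    from entity_gym.examples import CherryPick

    env = CherryPick()
    obs = env.reset()

    # Assuming a single action type, look up its name rather than hard-coding it.
    (action_name,) = env.action_space().keys()

    # Assumption: SelectEntityAction takes the selecting entities (actors) and the
    # selected entities (actees) as sequences of entity ids. The ids below are
    # placeholders for illustration only.
    player_id, cherry_id = 0, 1
    actions = {action_name: SelectEntityAction(actors=[player_id], actees=[cherry_id])}
    # obs = env.act(actions)  # apply once real ids are filled in from the observation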

class entity_gym.examples.PickMatchingBalls(max_balls: int = 32, balls: typing.List[entity_gym.examples.pick_matching_balls.Ball] = <factory>, one_hot: bool = False, randomize: bool = False)

The PickMatchingBalls environment is initialized with a list of 32 balls of different colors. On each timestep, the player can pick up one of the balls. The episode ends when the player picks up a ball of a different color from the last one. The player receives a reward equal to the number of balls picked up divided by the maximum number of balls of the same color.

Inheritance

Inheritance diagram of PickMatchingBalls
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

class entity_gym.examples.Minefield(vehicle: entity_gym.examples.minefield.Vehicle = <factory>, target: entity_gym.examples.minefield.Target = <factory>, mine: entity_gym.examples.minefield.Mine = <factory>, max_mines: int = 10, max_steps: int = 200, translate: bool = False, width: float = 200.0)

Task with a Vehicle entity that has to reach a target point, receiving a reward of 1. If the vehicle collides with any of the randomly placed mines, the episode ends without reward. The available actions turn the vehicle left, turn it right, or keep it going straight.

Inheritance

Inheritance diagram of Minefield
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

act_filter(action: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]], obs_filter: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation. Any entities or features that are not present in the filter are removed from the observation.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

reset_filter(obs_space: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Resets the environment and returns the initial observation. Any entities or features that are not present in the filter are removed from the observation.
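
Minefield also exposes the filtered variants of reset and act. Passing the environment's full observation space keeps every entity and feature, while a reduced ObsSpace would strip anything not listed in it. A minimal sketch using only the methods documented above (the constructor keywords come from the class signature):

    from entity_gym.examples import Minefield

    env = Minefield(max_mines=10, max_steps=200)

    # reset_filter behaves like reset, but drops any entities or features that are
    # missing from the given ObsSpace. Passing the full space keeps the observation intact.
    obs = env.reset_filter(env.obs_space())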

class entity_gym.examples.MultiSnake(board_size: int = 10, num_snakes: int = 2, num_players: int = 1, max_snake_length: int = 11, max_steps: int = 180)

Turn-based version of Snake with multiple snakes. Each snake has a different color. For each snake, Food of that color is placed randomly on the board. Snakes can only eat Food of their color. When a snake eats Food of the same color, it grows by one unit. Whenever a snake whose length is less than 11 grows, the player receives a reward of 0.1 / num_snakes. The game ends when a snake collides with another snake, runs into a wall, eats Food of another color, or all snakes reach a length of 11.

Inheritance

Inheritance diagram of MultiSnake
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.
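
A quick check of the maximum episode reward implied by the description above, assuming each snake starts at length 1 (the starting length is not stated in this reference):

    num_snakes = 2
    growth_events_per_snake = 11 - 1            # growing from length 1 up to the cap of 11
    reward_per_growth = 0.1 / num_snakes
    max_total_reward = num_snakes * growth_events_per_snake * reward_per_growth
    print(max_total_reward)                     # 1.0, independent of num_snakes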

class entity_gym.examples.MultiArmedBandit

Task with a single categorical action with 5 choices that gives a reward of 1 for choosing action 0 and a reward of 0 otherwise.

Inheritance

Inheritance diagram of MultiArmedBandit
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.
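
A sketch of stepping the bandit. It assumes the single action is exposed as a global categorical action and that GlobalCategoricalAction is constructed from an index and a label, with the labels being the stringified choice indices; none of this is spelled out in this reference, so treat it as an illustration of the act() call rather than a verified snippet:

    from entity_gym.env.action import GlobalCategoricalAction
    from entity_gym.examples import MultiArmedBandit

    env = MultiArmedBandit()
    obs = env.reset()

    # Look up the name of the single action type rather than hard-coding it.
    (action_name,) = env.action_space().keys()

    # Assumption: index/label fields and "0".."4" labels. Index 0 is the rewarded arm.
    obs = env.act({action_name: GlobalCategoricalAction(index=0, label="0")})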

class entity_gym.examples.NotHotdog

On each timestep, there is either a generic “Object” entity with an is_hotdog property, or a “Hotdog” object. The “Player” entity is always present, and has an action to classify the other entity as hotdog or not hotdog.

Inheritance

Inheritance diagram of NotHotdog
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

class entity_gym.examples.Xor

There are three entity types, each with one instance on each timestep. The Bit1 and Bit2 entities are randomly set to 0 or 1. The Output entity has one action that should be set to the output of the XOR between the two bits.

Inheritance

Inheritance diagram of Xor
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

act_filter(action: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]], obs_filter: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation. Any entities or features that are not present in the filter are removed from the observation.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

reset_filter(obs_space: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Resets the environment and returns the initial observation. Any entities or features that are not present in the filter are removed from the observation.

class entity_gym.examples.Count(masked_choices: int = 10)

There are between 0 and 10 “Bean” entities. The “Player” entity gets 1 reward for counting the correct number of beans and 0 otherwise.

This environment also randomly masks off some of the incorrect answers.

Masking by default allows all actions, which is equivalent to disabling masking.

Inheritance

Inheritance diagram of Count
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

act_filter(action: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]], obs_filter: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation. Any entities or features that are not present in the filter are removed from the observation.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

reset_filter(obs_space: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Resets the environment and returns the initial observation. Any entities or features that are not present in the filter are removed from the observation.

class entity_gym.examples.FloorIsLava

The player is surrounded by 8 tiles, 7 of which are lava and 1 of which is high ground. The player must move to one of the tiles. The player receives a reward of 1 if they move to the high ground, and 0 otherwise.

Inheritance

Inheritance diagram of FloorIsLava
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

act_filter(action: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]], obs_filter: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation. Any entities or features that are not present in the filter are removed from the observation.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

reset_filter(obs_space: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Resets the environment and returns the initial observation. Any entities or features that are not present in the filter are removed from the observation.

class entity_gym.examples.MineSweeper(width: int = 6, height: int = 6, nmines: int = 5, nrobots: int = 2, orbital_cannon: bool = False, cooldown_period: int = 5)

The MineSweeper environment contains two types of objects, mines and robots. The player controls all robots in the environment. On every step, each robot may move in one of four cardinal directions, or stay in place and defuse all adjacent mines. Defused mines are removed from the environment. If a robot steps on a mine, the robot is removed from the environment and the player loses the game. The player wins the game when all mines are defused.

Inheritance

Inheritance diagram of MineSweeper
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.
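
The constructor parameters in the signature above control the difficulty; beyond their names, their exact semantics are inferred, so the values below are only illustrative:

    from entity_gym.examples import MineSweeper

    # A larger 8x8 board with 10 mines and 3 robots; orbital_cannon enables the
    # optional mechanic gated by the cooldown_period parameter.
    env = MineSweeper(width=8, height=8, nmines=10, nrobots=3,
                      orbital_cannon=True, cooldown_period=5)

    obs = env.reset()
    print(env.action_space())   # the action types available to the robots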

class entity_gym.examples.RockPaperScissors(cheat: bool = False)

This environment tests giving additional information to the value function that cannot be observed by the policy.

On each timestep, the opponent randomly chooses rock, paper, or scissors with probabilities of 50%, 30%, and 20%, respectively. The value function can observe the opponent’s choice, but the policy cannot. The agent must choose rock, paper, or scissors. If the agent beats the opponent, it receives a reward of 2.0; otherwise it receives a reward of 0.0. The optimal strategy is therefore to always choose paper, for an average reward of 1.0. Since the value function can observe the opponent’s choice, it can perfectly predict the reward.
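
The expected rewards follow directly from the stated opponent distribution and the 2.0 win reward; a quick check:

    # Opponent plays rock/paper/scissors with probability 0.5/0.3/0.2; a win pays 2.0,
    # draws and losses pay 0.0.
    opponent = {"rock": 0.5, "paper": 0.3, "scissors": 0.2}
    beats = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

    expected = {move: 2.0 * opponent[beats[move]] for move in beats}
    print(expected)  # {'rock': 0.4, 'paper': 1.0, 'scissors': 0.6} -> paper is optimal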

Inheritance

Inheritance diagram of RockPaperScissors
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

act_filter(action: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]], obs_filter: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation. Any entities or features that are not present in the filter are removed from the observation.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.

reset_filter(obs_space: entity_gym.env.environment.ObsSpace) entity_gym.env.environment.Observation

Resets the environment and returns the initial observation. Any entities or features that are not present in the filter are removed from the observation.

class entity_gym.examples.TreasureHunt

Inheritance

Inheritance diagram of TreasureHunt
act(actions: Mapping[str, Union[entity_gym.env.action.CategoricalAction, entity_gym.env.action.SelectEntityAction, entity_gym.env.action.GlobalCategoricalAction]]) entity_gym.env.environment.Observation

Performs the given action and returns the resulting observation.

Parameters

actions – Maps the name of each action type to the action to perform.

action_space() Dict[str, Union[entity_gym.env.action.CategoricalActionSpace, entity_gym.env.action.SelectEntityActionSpace, entity_gym.env.action.GlobalCategoricalActionSpace]]

Defines the types of actions that can be taken in the environment.

obs_space() entity_gym.env.environment.ObsSpace

Defines the shape of observations returned by the environment.

reset() entity_gym.env.environment.Observation

Resets the environment and returns the initial observation.