AFABL Agents

An AFABL agent acts in a particular world, is composed of independent behavior modules pursuing their own continuing goals, and has a central command arbitrator that uses an agent-level reward function to learn when it should listen to each module. As the code above shows, AFABL allows programmers to express these components concisely, with very little cognitive distance between the concepts that make up the agent and the code that represents them. As with modules, AfablAgent uses features of the Scala programming language to make the syntax more convenient. For example, there is only one explicit type annotation in the AFABL Pac Man agent code below, but behind the scenes a carefully written factory method in the companion object allows Scala's static type inferencer to infer the type parameters of the AfablAgent constructor, determine the return types of the anonymous functions, and assign a concrete type to a path-dependent abstract type member. Working out these details and wrestling with Scala's type checker directly is not easy. Writing an AFABL agent is easy.
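
To make that concrete, here is a minimal sketch of the pattern, not AFABL's actual source: an invented two-parameter factory (AfablModuleSketch, with type parameters WS and MS) whose companion apply method lets the call site omit all type arguments.

class AfablModuleSketch[WS, MS](
  val stateAbstraction: WS => MS,
  val moduleReward: MS => Double
)

object AfablModuleSketch {
  // Because WS and MS both appear in the parameter types, Scala's type
  // inferencer fixes them from the arguments at the call site, so the
  // caller never writes AfablModuleSketch[PacManState, FindFoodState](...).
  def apply[WS, MS](stateAbstraction: WS => MS,
                    moduleReward: MS => Double): AfablModuleSketch[WS, MS] =
    new AfablModuleSketch(stateAbstraction, moduleReward)

  // At a call site, WS and MS are inferred from the annotated lambda
  // parameter and its result type, e.g.:
  //   AfablModuleSketch(
  //     stateAbstraction = (ws: PacManState) => FindFoodState(ws.pacMan, ws.food),
  //     moduleReward     = (ms: FindFoodState) =>
  //       if (ms.food.contains(ms.pacMan)) 1.0 else -0.1)
}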

The final step in creating a Pac Man agent in AFABL is to combine the modules we created above and add an agent-level reward function.

val pacMan = AfablAgent(
  world = new PacManWorld,

  modules = Seq(findFood, avoidGhost, findCherries),

  agentLevelReward = (state: PacManState) => {
    // Check the cherry-powered ghost capture before the plain ghost
    // collision; otherwise the 10.0 branch is unreachable.
    if (state.cherryActive && (state.pacMan == state.ghost)) 10.0
    else if (state.pacMan == state.ghost) 0.0
    else if (state.food.contains(state.pacMan)) 1.0
    else if (state.cherries.contains(state.pacMan)) 1.0
    else 0.1
  }
)

The agent-level reward function helps the AFABL agent learn a control policy akin to the brain's executive function. Each module has a selfish preferred action in each state, and these modules are often at odds with one another. The command arbitrator, playing the role of an executive function, uses the agent-level reward function to learn how to prioritize the modules in each state. For example, the Pac Man agent would learn to listen to the avoidGhost module when the ghost is near, to the findFood module most of the time, and to the findCherries module in states where it is possible to eat a cherry and then eat the ghost.
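
The arbitration mechanism can be pictured as ordinary reinforcement learning over the choice of which module to obey. The sketch below, using simplified Int state and action types, illustrates this idea with a tabular Q-learning arbitrator; every name in it is invented for illustration, and it is not AFABL's actual implementation.

import scala.util.Random
import scala.collection.mutable

object ArbitrationSketch {
  type State  = Int
  type Action = Int

  // Each module reports the action it would selfishly take in a state.
  trait Module { def preferredAction(s: State): Action }

  // The arbitrator learns Q(state, module): the long-term value, under the
  // agent-level reward, of obeying a given module in a given state.
  class Arbitrator(modules: Seq[Module],
                   alpha: Double = 0.1, gamma: Double = 0.9, epsilon: Double = 0.1) {
    private val q   = mutable.Map.empty[(State, Int), Double].withDefaultValue(0.0)
    private val rng = new Random

    // Epsilon-greedy choice of which module to obey in state s.
    def chooseModule(s: State): Int =
      if (rng.nextDouble() < epsilon) rng.nextInt(modules.size)
      else modules.indices.maxBy(i => q((s, i)))

    // Act by delegating to the chosen module; return the action and the
    // index of the module that produced it.
    def act(s: State): (Action, Int) = {
      val i = chooseModule(s)
      (modules(i).preferredAction(s), i)
    }

    // One Q-learning step driven by the agent-level reward.
    def update(s: State, module: Int, agentReward: Double, s2: State): Unit = {
      val best = modules.indices.map(i => q((s2, i))).max
      q((s, module)) += alpha * (agentReward + gamma * best - q((s, module)))
    }
  }
}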

The complete code for an AFABL Pac Man agent is given below. This code would normally be in a single file.

case class FindFoodState(pacMan: Location, food: Seq[Location])

val findFood = AfablModule(
  world = new PacManWorld,

  stateAbstraction = (worldState: PacManState) => {
    FindFoodState(worldState.pacMan, worldState.food)
  },

  moduleReward = (moduleState: FindFoodState) => {
    if (moduleState.food.contains(moduleState.pacMan)) 1.0
    else -0.1
  }
)

case class AvoidGhostState(pacMan: Location, ghost: Location)

val avoidGhost = AfablModule(
  world = new PacManWorld,

  stateAbstraction = (worldState: PacManState) => {
    AvoidGhostState(worldState.pacMan, worldState.ghost)
  },

  moduleReward = (moduleState: AvoidGhostState) => {
    if (moduleState.pacMan == moduleState.ghost) -1.0
    else 0.5
  }
)

case class FindCherriesState(pacMan: Location, cherries: Seq[Location])

val findCherries = AfablModule(
  world = new PacManWorld,

  stateAbstraction = (worldState: PacManState) => {
    FindCherriesState(worldState.pacMan, worldState.cherries)
  },

  moduleReward = (moduleState: FindCherriesState) => {
    // The bonus for eating the ghost while a cherry is active belongs in the
    // agent-level reward; this module's state sees only pacMan and cherries.
    if (moduleState.cherries.contains(moduleState.pacMan)) 1.0
    else -0.1
  }
)

val pacMan = AfablAgent(
  world = new PacManWorld,

  modules = Seq(findFood, avoidGhost, findCherries),

  agentLevelReward = (state: PacManState) => {
    // Check the cherry-powered ghost capture before the plain ghost
    // collision; otherwise the 10.0 branch is unreachable.
    if (state.cherryActive && (state.pacMan == state.ghost)) 10.0
    else if (state.pacMan == state.ghost) 0.0
    else if (state.food.contains(state.pacMan)) 1.0
    else if (state.cherries.contains(state.pacMan)) 1.0
    else 0.1
  }
)
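
Once defined, the agent is trained in its world before acting. The two-line sketch below is hypothetical usage: the method names train and bestAction, and the episodes parameter, are assumptions for illustration, not AFABL's documented API.

// Hypothetical usage: train and bestAction are assumed method names,
// not AFABL's documented API.
pacMan.train(episodes = 10000)               // learn module and arbitration values
val action = pacMan.bestAction(currentState) // currentState: a PacManState placeholder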