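This time we start by reading the following two lines from the step function. The action they return is the behavior the unit actually carries out.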
- gen_action = self.priority[action_type]
- action = gen_action(obj_id, valid_actions[action_type])
Tracing self.priority leads to this definition:
- self.priority = {
- ActionType.Occupy: self.gen_occupy,
- ActionType.Shoot: self.gen_shoot,
- ActionType.GuideShoot: self.gen_guide_shoot,
- ActionType.JMPlan: self.gen_jm_plan,
- ActionType.LayMine: self.gen_lay_mine,
- ActionType.ActivateRadar: self.gen_activate_radar,
- ActionType.ChangeAltitude: self.gen_change_altitude,
- ActionType.GetOn: self.gen_get_on,
- ActionType.GetOff: self.gen_get_off,
- ActionType.Fork: self.gen_fork,
- ActionType.Union: self.gen_union,
- ActionType.EnterFort: self.gen_enter_fort,
- ActionType.ExitFort: self.gen_exit_fort,
- ActionType.Move: self.gen_move,
- ActionType.RemoveKeep: self.gen_remove_keep,
- ActionType.ChangeState: self.gen_change_state,
- ActionType.StopMove: self.gen_stop_move,
- ActionType.WeaponLock: self.gen_WeaponLock,
- ActionType.WeaponUnFold: self.gen_WeaponUnFold,
- ActionType.CancelJMPlan: self.gen_cancel_JM_plan
- } # choose action by priority
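On closer inspection, this works much like a table of function pointers: a dict maps each ActionType key to a method, and the gen_* methods themselves are defined elsewhere in the class. Since the dict preserves insertion order, the order of the entries also acts as the priority order.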
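The lookup-then-call pattern can be tried in isolation. Below is a minimal sketch, not the original agent: the `ActionType` enum here is a two-member stand-in (the real one may not be an Enum at all), and the `Agent` class, its `dispatch` helper, and the returned dict fields are assumptions made for illustration.

```python
from enum import Enum


class ActionType(Enum):
    # Hypothetical stand-in for the real ActionType; only two members shown.
    Move = 1
    Shoot = 2


class Agent:
    def __init__(self):
        # Keys are action types; values are bound methods that are stored,
        # not called, at this point.
        self.priority = {
            ActionType.Shoot: self.gen_shoot,
            ActionType.Move: self.gen_move,
        }

    def gen_shoot(self, obj_id, candidate):
        return {"obj_id": obj_id, "type": ActionType.Shoot, "target": candidate[0]}

    def gen_move(self, obj_id, candidate):
        return {"obj_id": obj_id, "type": ActionType.Move}

    def dispatch(self, action_type, obj_id, candidate):
        gen_action = self.priority[action_type]  # look up the method
        return gen_action(obj_id, candidate)     # then call it


agent = Agent()
action = agent.dispatch(ActionType.Move, 7, None)
```

Storing `self.gen_move` without parentheses is what makes this a table of callables; adding a new action type only needs one more dict entry and one more method.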
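Take gen_move as an example: it returns a move action whose move_path field holds the route, or returns nothing when no move applies.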
- def gen_move(self, obj_id, candidate):
- """Generate move action to a random city."""
- bop = self.get_bop(obj_id)
- if bop["sub_type"] == 3:
- return
- destination = random.choice(
- [city["coord"] for city in self.observation["cities"]]
- )
- if self.my_direction:
- destination = self.my_direction["info"]["target_pos"]
- if bop and bop["cur_hex"] != destination:
- move_type = self.get_move_type(bop)
- route = self.map.gen_move_route(bop["cur_hex"], destination, move_type)
- return {
- "actor": self.seat,
- "obj_id": obj_id,
- "type": ActionType.Move,
- "move_path": route,
- }
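- Get the unit (bop) for obj_id through self.get_bop.
- If the unit's sub_type is 3, return immediately with no action, because that unit cannot perform a move.
- Pick a random city as the destination; if self.my_direction is set, use its target_pos as the destination instead.
- If the unit exists and its current hex (cur_hex) differs from the destination, generate a move action, where actor is the seat issuing the action (self.seat), obj_id is the ID of the unit performing it, type is ActionType.Move, and move_path is the unit's route.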
The map.gen_move_route and self.get_move_type functions used here are in turn defined elsewhere.
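The steps above can be restated as a standalone function for experimenting without the game engine. This is a sketch under stated assumptions: `gen_move_sketch`, its parameters, the toy `straight_line` router, the integer hex coordinates, and the `"Move"` string are all placeholders, not the platform's API. It also checks that the unit exists before reading sub_type, whereas the quoted code reads `bop["sub_type"]` first, which would fail if get_bop ever returned None.

```python
import random


def gen_move_sketch(bop, cities, my_direction, seat, route_fn):
    """Standalone restatement of the gen_move steps; every parameter is a stand-in."""
    # Guard first: no unit, or a unit type that does not move.
    if not bop or bop["sub_type"] == 3:
        return None
    # Default destination: a random city.
    destination = random.choice([city["coord"] for city in cities])
    # A higher-level directive, when present, overrides the random choice.
    if my_direction:
        destination = my_direction["info"]["target_pos"]
    # Already there: nothing to do.
    if bop["cur_hex"] == destination:
        return None
    return {
        "actor": seat,
        "obj_id": bop["obj_id"],
        "type": "Move",  # placeholder for ActionType.Move
        "move_path": route_fn(bop["cur_hex"], destination),
    }


def straight_line(start, end):
    # Toy router on a 1-D line of integer "hexes"; stands in for map.gen_move_route.
    step = 1 if end > start else -1
    return list(range(start + step, end + step, step))


bop = {"obj_id": 1, "sub_type": 0, "cur_hex": 10}
cities = [{"coord": 13}]
action = gen_move_sketch(bop, cities, None, seat=0, route_fn=straight_line)
```

Returning None on every "no move" path matters for the caller, which treats a falsy result as "try the next action type".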
——
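As the previous post described, the calling side iterates over the units, finds a valid action type for each one, and then calls the matching generator function to obtain the concrete action.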
- # loop all bops and their valid actions
- for obj_id, valid_actions in observation["valid_actions"].items():
- if obj_id not in self.controllable_ops:
- continue
- for (
- action_type
- ) in self.priority: # 'dict' is order-preserving since Python 3.6
- if action_type not in valid_actions:
- continue
- # find the action generation method based on type
- gen_action = self.priority[action_type]
- action = gen_action(obj_id, valid_actions[action_type])
- if action:
- total_actions.append(action)
- break # one action per bop at a time
- # if total_actions:
- # print(
- # f'{self.color} actions at step: {observation["time"]["cur_step"]}', end='\n\t')
- # print(total_actions)
- return total_actions
That is the passage above; the key part is the two lines quoted at the start of this post.
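To see how the priority order and the `if action:` check interact, here is a minimal sketch of the same loop with toy data. The function name `choose_actions`, the string action types, and the lambda generators are assumptions for illustration only; the real generators are the gen_* methods.

```python
def choose_actions(valid_actions_by_unit, controllable, priority):
    """Pick at most one action per controllable unit, trying types in priority order."""
    total_actions = []
    for obj_id, valid_actions in valid_actions_by_unit.items():
        if obj_id not in controllable:
            continue
        for action_type, gen_action in priority.items():
            if action_type not in valid_actions:
                continue
            action = gen_action(obj_id, valid_actions[action_type])
            if action:  # a generator may decline by returning None
                total_actions.append(action)
                break   # one action per unit per step
    return total_actions


# Toy generators: Shoot declines when it has no candidates, Move always succeeds.
priority = {
    "Shoot": lambda obj_id, cand: {"obj_id": obj_id, "type": "Shoot"} if cand else None,
    "Move": lambda obj_id, cand: {"obj_id": obj_id, "type": "Move"},
}
valid = {
    1: {"Move": None, "Shoot": [{"target": 9}]},
    2: {"Move": None, "Shoot": []},
    3: {"Move": None},
}
actions = choose_actions(valid, controllable={1, 2}, priority=priority)
```

Unit 1 shoots because Shoot comes first in the dict; unit 2 falls through to Move because its Shoot generator returned None; unit 3 is skipped because it is not controllable.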
With the basic flow now clear, how does one go about writing a strategy?