Contrastive Learning + Multi-Agent Reinforcement Learning: A Summary of Related Work

Consensus Learning for Cooperative Multi-Agent Reinforcement Learning

AAAI 2023

Different agents can infer the same consensus in a discrete space.

Contrastive Identity-Aware Learning for Multi-Agent Value Decomposition

AAAI 2023

Leverages contrastive learning to maximize the mutual information between the temporal credits and the identity representations of different agents.

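Compared with the QMIX implementation in pymarl, the paper's code has only two changes (the code style is good and it is commented, which deserves praise):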

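In the learner file, a section that computes the contrastive learning loss is added after the pass through the mixing network; note that the loss on the last line is nn.CrossEntropyLoss().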

arange([start,] stop[, step,], dtype=None) generates an arithmetic sequence (evenly spaced values).
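To make the two notes above concrete, here is a minimal sketch of an InfoNCE-style contrastive loss built on nn.CrossEntropyLoss(), with torch.arange supplying the labels. It is not the paper's code: the function name contrastive_identity_loss, the cosine-similarity scoring, the temperature value, and the tensor shapes are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def contrastive_identity_loss(credit_emb, identity_emb, temperature=0.1):
    """Hypothetical helper: row i of credit_emb should match row i of identity_emb."""
    # Cosine similarity between every (credit, identity) pair: (n_agents, n_agents)
    credit_emb = F.normalize(credit_emb, dim=-1)
    identity_emb = F.normalize(identity_emb, dim=-1)
    logits = credit_emb @ identity_emb.t() / temperature
    # arange(n) gives 0, 1, ..., n-1, so the positive "class" for row i is column i
    labels = torch.arange(logits.size(0), device=logits.device)
    return nn.CrossEntropyLoss()(logits, labels)


torch.manual_seed(0)
n_agents, dim = 4, 8
identity = torch.randn(n_agents, dim)

# Embeddings aligned with their own identity give a lower loss than shifted ones
aligned = contrastive_identity_loss(identity.clone(), identity)
shuffled = contrastive_identity_loss(identity[[1, 2, 3, 0]], identity)

print(torch.arange(4))         # tensor([0, 1, 2, 3])
print(torch.arange(1, 10, 3))  # tensor([1, 4, 7])
```

Cross-entropy over the similarity matrix treats "which identity does this credit belong to" as an n-way classification problem, which is why the labels are simply the row indices produced by arange.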

In the qmix file, the return value has one extra item: the gradient (the temporal credit attribution).
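One way a mixer could return such a gradient alongside Q_tot is sketched below with torch.autograd.grad. This is an assumption-laden illustration, not the paper's implementation: ToyMixer is a hypothetical single-layer monotonic mixer that is far simpler than QMIX's, and the shapes and the reading of "gradient" as the derivative of Q_tot with respect to each agent's Q value are my assumptions.

```python
import torch
import torch.nn as nn


class ToyMixer(nn.Module):
    """Hypothetical single-layer monotonic mixer (much simpler than QMIX's)."""

    def __init__(self, n_agents, state_dim):
        super().__init__()
        self.hyper_w = nn.Linear(state_dim, n_agents)  # state-conditioned weights
        self.hyper_b = nn.Linear(state_dim, 1)         # state-conditioned bias

    def forward(self, agent_qs, state):
        # agent_qs: (batch, n_agents), state: (batch, state_dim)
        w = torch.abs(self.hyper_w(state))  # non-negative weights keep mixing monotonic
        q_tot = (w * agent_qs).sum(dim=-1, keepdim=True) + self.hyper_b(state)
        # d q_tot / d agent_qs; create_graph=True lets a loss on the gradient backprop
        grad = torch.autograd.grad(q_tot.sum(), agent_qs, create_graph=True)[0]
        return q_tot, grad


torch.manual_seed(0)
mixer = ToyMixer(n_agents=3, state_dim=6)
agent_qs = torch.randn(5, 3, requires_grad=True)  # stands in for agent network outputs
state = torch.randn(5, 6)

q_tot, grad = mixer(agent_qs, state)
print(q_tot.shape, grad.shape)
```

Because this toy mixer is linear in agent_qs, the returned gradient equals the non-negative mixing weights; with a deeper mixing network the same autograd call would yield a state- and value-dependent credit per agent.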

Learning to Ground Decentralized Multi-Agent Communication with Contrastive Learning

This paper is posted on arXiv and is probably not the final version yet.

Considers the communicative messages sent between agents as different incomplete views of the environment state.

The note below categorizes and summarizes the useful material I have shared on Zhihu, mainly about multi-agent (deep) reinforcement learning.

Edited on 2023-04-27 17:39 · IP location: Beijing
