Matrix Stochastic Game with Q-learning for Multi-agent Systems