### Abstract

We consider an agent-target assignment problem in an unknown environment modeled as an undirected graph. Agents incur a cost or collect a reward while traveling along the edges of this graph. Agents know neither the graph nor the locations of the targets on it; however, they can obtain local information about both through sensing and by communicating with other agents within a limited range. To solve this problem, we propose a new distributed algorithm that integrates Q-learning with a distributed auction. The Q-learning component estimates the assignment benefit of each agent-target pair, computed by summing rewards over the graph edges, while the auction component assigns agents to targets in a distributed fashion. The algorithm is shown to terminate with a near-optimal assignment in finite time. Here, optimality refers to maximizing the total assignment benefit, which can depend both on the intrinsic value of each agent-target pair and on the routing cost incurred by the agent to reach its target.
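The abstract couples two components: Q-learning to estimate the benefit of each agent-target pair over the graph, and an auction to compute the assignment. A minimal single-process sketch of that coupling follows; the toy graph, parameter values, and function names are illustrative assumptions, and the paper's actual update rules, communication model, and distributed bidding protocol are not reproduced here.

```python
import random

# Toy undirected graph as an adjacency map with positive edge costs
# (an illustrative instance, not the one from the paper).
GRAPH = {
    0: {1: 1.0, 2: 4.0},
    1: {0: 1.0, 2: 1.0, 3: 5.0},
    2: {0: 4.0, 1: 1.0, 3: 1.0},
    3: {1: 5.0, 2: 1.0},
}

def q_learn_benefit(start, target, episodes=3000, alpha=0.5, gamma=1.0, explore=0.3):
    """Estimate the benefit (negative travel cost) of reaching `target`
    from `start` with tabular Q-learning; each move earns -edge_cost."""
    rng = random.Random(0)
    Q = {u: {v: 0.0 for v in GRAPH[u]} for u in GRAPH}
    for _ in range(episodes):
        u = start
        for _ in range(20):                # cap episode length
            if u == target:
                break
            # epsilon-greedy choice of the next node to move to
            if rng.random() < explore:
                v = rng.choice(list(GRAPH[u]))
            else:
                v = max(Q[u], key=Q[u].get)
            reward = -GRAPH[u][v]
            future = 0.0 if v == target else max(Q[v].values())
            Q[u][v] += alpha * (reward + gamma * future - Q[u][v])
            u = v
    return 0.0 if start == target else max(Q[start].values())

def auction(benefit, eps=0.01):
    """Bertsekas-style auction for a square benefit matrix: unassigned agents
    bid up the price of their best target; terminates with a total benefit
    within n*eps of the optimum."""
    n = len(benefit)
    prices = [0.0] * n
    owner = [None] * n                     # owner[j] = agent currently holding target j
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        values = [benefit[i][j] - prices[j] for j in range(n)]
        best = max(range(n), key=lambda j: values[j])
        second = max(values[j] for j in range(n) if j != best) if n > 1 else values[best]
        prices[best] += values[best] - second + eps   # bid = value margin + eps
        if owner[best] is not None:
            unassigned.append(owner[best])            # outbid agent re-enters
        owner[best] = i
    return {owner[j]: j for j in range(n)}

# Agents start at nodes 0 and 3; targets sit at nodes 3 and 0.
agents, targets = (0, 3), (3, 0)
benefits = [[q_learn_benefit(a, t) for t in targets] for a in agents]
assignment = auction(benefits)             # agent index -> target index
```

In the paper's setting each agent's benefit row would be learned online and exchanged only with agents in communication range; the sketch computes everything in one process so the interplay of the two components stays visible.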

Original language | English (US)
---|---
Title of host publication | 2010 49th IEEE Conference on Decision and Control, CDC 2010
Pages | 5450-5455
Number of pages | 6
DOIs | https://doi.org/10.1109/CDC.2010.5718180
State | Published - Dec 1 2010
Event | 2010 49th IEEE Conference on Decision and Control, CDC 2010 - Atlanta, GA, United States; Duration: Dec 15 2010 → Dec 17 2010

### Publication series

Name | Proceedings of the IEEE Conference on Decision and Control
---|---
ISSN (Print) | 0191-2216

### Other

Other | 2010 49th IEEE Conference on Decision and Control, CDC 2010
---|---
Country | United States
City | Atlanta, GA
Period | 12/15/10 → 12/17/10


### All Science Journal Classification (ASJC) codes

- Control and Systems Engineering
- Modeling and Simulation
- Control and Optimization

### Cite this

Sadikhov, Teymur; Zhu, Minghui; Martínez, Sonia. **A distributed joint-learning and auction algorithm for target assignment.** In *2010 49th IEEE Conference on Decision and Control, CDC 2010* (pp. 5450-5455). Article 5718180. (Proceedings of the IEEE Conference on Decision and Control). https://doi.org/10.1109/CDC.2010.5718180

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

ISBN (Print): 9781424477456 · Scopus: 79953150786