RS-Agent employs an LLM to understand the user's requirements.
RS-Agent can utilize multiple tools and engage in multi-turn conversations.
RS-Agent is capable of answering questions in specialized fields.
RS-Agent integrates existing high-performance remote sensing tools. Acting as a Central Controller, it interprets user intent and addresses user needs through planning, reasoning, and action. It is also capable of handling professional and technical knowledge in remote sensing.
When ${M}_{c}$ receives a query $Q$ and an image $I$, it transmits the solution requirement ${r}_{s}$ to ${M}_{s}$. ${M}_{s}$ employs the FIASS algorithm to derive solution guidance ${g}_{s}$, which helps ${M}_{c}$ select the appropriate tools $\hat{T}$ after it dispatches the tool requirement ${r}_{t}$ to the tool space $T$. If ${M}_{c}$ requires additional knowledge guidance ${g}_{k}$, ${M}_{k}$ provides it from ${D}_{k}$ according to the knowledge requirement ${k}_{s}$. ${M}_{c}$ then invokes $\hat{T}$ and produces the final answer $A$ along with the processed image $\hat{I}$.
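The control flow described above can be illustrated with a minimal Python sketch. It is not the RS-Agent implementation: every function name, the keyword-based lookups, and the toy tools are hypothetical stand-ins, chosen only to show the order in which ${r}_{s}$, ${g}_{s}$, ${r}_{t}$, ${g}_{k}$, $\hat{T}$, $A$ and $\hat{I}$ appear. In particular, the retrieval step of ${M}_{s}$ is replaced here by a simple keyword match, and images are represented as plain strings.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A "tool" here is any function that maps an image reference to a processed one.
Tool = Callable[[str], str]

# Hypothetical tool space T and knowledge database D_k (toy data, not from the paper).
TOOLS: Dict[str, Tool] = {
    "detect": lambda img: img + "|detect",
    "count": lambda img: img + "|count",
    "caption": lambda img: img + "|caption",
}
D_K: Dict[str, str] = {"sar": "SAR sensors actively emit microwave pulses."}


def solution_searcher(r_s: str) -> List[str]:
    """Stand-in for M_s: return solution guidance g_s as an ordered list of tool names.

    Assumption: a keyword match replaces the retrieval step used by the real system.
    """
    if "count" in r_s.lower():
        return ["detect", "count"]
    return ["caption"]


def knowledge_searcher(r_k: str, d_k: Dict[str, str]) -> Optional[str]:
    """Stand-in for M_k: look up knowledge guidance g_k in D_k by keyword."""
    for key, text in d_k.items():
        if key in r_k.lower():
            return text
    return None


def needs_knowledge(query: str) -> bool:
    """Hypothetical rule for when the controller asks for knowledge guidance."""
    return query.lower().startswith("what is")


def central_controller(
    query: str, image: str, tools: Dict[str, Tool], d_k: Dict[str, str]
) -> Tuple[str, str]:
    """Stand-in for M_c: orchestrate the searchers and tools, return (A, I_hat)."""
    r_s = query                                   # solution requirement (here: the raw query)
    g_s = solution_searcher(r_s)                  # solution guidance from M_s
    r_t = g_s                                     # tool requirement sent to the tool space T
    t_hat = [tools[name] for name in r_t if name in tools]  # selected tools T_hat
    g_k = knowledge_searcher(query, d_k) if needs_knowledge(query) else None

    i_hat = image
    for tool in t_hat:                            # invoke the selected tools in order
        i_hat = tool(i_hat)

    answer = f"ran {len(t_hat)} tool(s)"
    if g_k is not None:
        answer += f"; knowledge: {g_k}"
    return answer, i_hat
```

Calling `central_controller("Count the planes", "scene.png", TOOLS, D_K)` runs the two hypothetical tools `detect` and `count` in sequence, while a query beginning with "What is" additionally triggers the knowledge lookup. In a real system each stand-in would be replaced by an LLM call, a retrieval step, or an actual remote sensing model.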
Here is a demonstration of the RS-Agent in action.
Watch the video below to see how RS-Agent automates remote sensing tasks.
@misc{xu2024rsagentautomatingremotesensing,
title={RS-Agent: Automating Remote Sensing Tasks through Intelligent Agents},
author={Wenjia Xu and Zijian Yu and Yixu Wang and Jiuniu Wang and Mugen Peng},
year={2024},
eprint={2406.07089},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2406.07089},
}
This website is adapted from Nerfies, licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. We are thankful to LLaVA, Qwen, DeepSeek, GeoChat and LHRS-Bot for releasing their models and code as open-source contributions.