AI Agent Architecture — ReAct, Memory, Planning and Multi-Agent Systems · ReAct — the Reasoning and Acting Loop

ReAct on knowledge-intensive tasks — HotpotQA and FEVER as case studies

Introduction

This lesson examines two key benchmarks from the original ReAct paper (Yao et al. 2023). HotpotQA poses multi-hop questions whose answers require combining evidence from two or more Wikipedia documents. FEVER is a claim-verification task in which each claim is labelled SUPPORTS, REFUTES, or NOT ENOUGH INFO against Wikipedia facts. The lesson covers the paper's experimental setup, the metrics (EM accuracy, F1, evidence accuracy), comparative results for ReAct against CoT, CoT-SC, and Act-Only, and the error mechanisms specific to each task.
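To make the answer-level metrics concrete, here is a minimal Python sketch of exact match (EM), token-level F1, and label accuracy. It assumes the SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles) that HotpotQA evaluation commonly uses. The function names `normalize`, `exact_match`, `token_f1`, and `label_accuracy` are illustrative and are not taken from the paper's evaluation code.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization, not the paper's exact script.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def label_accuracy(predicted: list[str], gold: list[str]) -> float:
    """Fraction of FEVER-style labels (SUPPORTS / REFUTES / NOT ENOUGH INFO) that match."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)
```

EM rewards only a fully correct answer string, while F1 gives partial credit for overlapping tokens. A verbose but correct answer can therefore score 0 on EM and still score well on F1.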