Agentic Research Assistant
A research assistant built with FastAPI, LangChain, Mistral Small, Tavily Search, web scraping, streaming responses, source quality scoring, citation-aware reports, critic feedback, and SQLite research history.
Why This Project Matters
This project demonstrates practical agentic AI engineering rather than a single prompt call. The system coordinates multiple stages, uses external tools, streams progress to the UI, preserves partial outputs, scores sources, asks the writer to cite evidence, critiques the generated report, and stores completed runs for later review.
Agent Workflow
1. Search: fetches reliable web sources through Tavily.
2. Source scoring: ranks results using position, trusted domains, snippet depth, and recency signals.
3. Scrape: retries the highest-scored sources and falls back to snippets if scraping fails.
4. Draft: generates a Markdown report plus a structured citation-aware report object.
5. Critique: reviews clarity, depth, completeness, flow, and source quality.
6. Persist: saves topic, sources, report, feedback, status, and timestamp to SQLite.
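The source scoring step can be illustrated with a minimal sketch. It is not the project's actual implementation: the `SearchResult` type, the `TRUSTED_DOMAINS` list, the weights, and the recency window are all assumptions chosen to show how the four signals named above (position, trusted domains, snippet depth, recency) might combine into one score.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional
from urllib.parse import urlparse

# Hypothetical allowlist; the project's real trusted-domain list is not shown here.
TRUSTED_DOMAINS = ("nih.gov", "nature.com", "arxiv.org")


@dataclass
class SearchResult:
    url: str
    snippet: str
    position: int                      # 0-based rank returned by the search tool
    published: Optional[date] = None   # None when the result carries no date


def is_trusted(url: str) -> bool:
    """True if the host is an allowlisted domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)


def score_source(result: SearchResult, today: date) -> float:
    """Combine four signals into a score between 0 and 1 (weights are illustrative)."""
    # Position: earlier results score higher, decaying with rank.
    position_score = 1.0 / (1 + result.position)

    # Trusted domain: binary signal from the allowlist.
    trust_score = 1.0 if is_trusted(result.url) else 0.0

    # Snippet depth: longer snippets score higher, capped at 300 characters.
    depth_score = min(len(result.snippet), 300) / 300

    # Recency: full credit within one year, fading to zero at five years.
    # Undated results get a neutral 0.5.
    if result.published is None:
        recency_score = 0.5
    else:
        age_years = max((today - result.published).days, 0) / 365
        recency_score = max(0.0, min(1.0, 1 - (age_years - 1) / 4))

    total = (0.3 * position_score + 0.3 * trust_score
             + 0.2 * depth_score + 0.2 * recency_score)
    return round(total, 3)
```

Sorting results by `score_source` in descending order would then give the scrape step its "highest-scored sources". A weighted sum keeps each signal's contribution easy to inspect and tune.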
Features
- Streaming pipeline updates over POST /api/research/stream.
- Job-based execution with persisted job and step state.
- Step-by-step frontend progress states.
- Citation-aware report prompting.
- Structured report extraction with citation coverage metadata.
- Source quality scores displayed in the UI.
- SQLite research history in research_history.db.
- Retry and fallback strategy for scraping.
- Partial failure handling, so completed stages are not lost if a later stage fails.
- Clean research-console frontend focused on readability.
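The partial failure handling listed above can be sketched as a small stage runner. This is an illustrative sketch, not the code in pipeline.py: the `run_stages` function, the stage signature, and the status strings ("running", "partial", "completed") are assumptions.

```python
from typing import Any, Callable, Dict, List, Tuple

# A stage receives the topic plus the outputs of earlier stages and returns its own output.
StageFn = Callable[[str, Dict[str, Any]], Any]


def run_stages(topic: str, stages: List[Tuple[str, StageFn]]) -> Dict[str, Any]:
    """Run stages in order; if one fails, return everything completed so far."""
    result: Dict[str, Any] = {
        "topic": topic,
        "status": "running",
        "outputs": {},   # stage name -> output of each completed stage
        "steps": [],     # per-step state, suitable for streaming or persisting
    }
    for name, stage in stages:
        try:
            result["outputs"][name] = stage(topic, result["outputs"])
            result["steps"].append({"step": name, "status": "done"})
        except Exception as exc:
            # Record the failure and stop; earlier outputs stay in the result.
            result["steps"].append(
                {"step": name, "status": "failed", "error": str(exc)}
            )
            result["status"] = "partial"
            return result
    result["status"] = "completed"
    return result
```

If a hypothetical draft stage raised an error, the returned dictionary would still contain the search and scrape outputs, so a caller could display or store them instead of discarding the whole run.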
Tech Stack
- FastAPI
- LangChain
- Mistral Small via langchain-mistralai
- Tavily Search
- BeautifulSoup and Requests
- SQLite
- Vanilla HTML, CSS, and JavaScript
Project Structure
app/
main.py FastAPI app, API routes, streaming endpoint
pipeline.py Research workflow and step events
agents.py Mistral writer and critic chains
schemas.py Typed source, scrape, finding, report contracts
tools.py Tavily search, source scoring, web scraping
storage.py SQLite research history
static/
index.html Frontend shell
styles.css Research workspace styling
app.js Streaming UI and result rendering
main.py Uvicorn compatibility shim
pipeline.py CLI compatibility shim
evals/
cases.json Evaluation topics and expectations
run_evals.py Lightweight report quality evaluator
Setup
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
Create a .env file with your keys:
TAVILY_API_KEY=your_tavily_key
MISTRAL_API_KEY=your_mistral_key
MISTRAL_MODEL=mistral-small-latest
Run the Web App
.\.venv\Scripts\python.exe -m uvicorn main:app --reload
Open http://127.0.0.1:8000.
Run the CLI
.\.venv\Scripts\python.exe pipeline.py
API
- GET /api/health checks server status.
- POST /api/research runs the full pipeline and returns JSON.
- POST /api/research/stream streams newline-delimited JSON events as each stage completes.
- POST /api/research/jobs creates a persisted research job.
- GET /api/research/jobs/{job_id} returns job and step state.
- GET /api/research/jobs/{job_id}/stream streams and persists a created job.
- GET /api/research/history lists saved research runs.
- GET /api/research/history/{run_id} returns a saved run.
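A client for the streaming endpoint might read the newline-delimited JSON events as in the sketch below. The event field names used in the usage note are assumptions, since the event schema is not documented here. The `requests` import is deferred so that the parser can be used on its own.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_events(lines: Iterable[bytes]) -> Iterator[Dict[str, Any]]:
    """Yield one decoded event per non-empty NDJSON line."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if line:  # skip blank keep-alive lines
            yield json.loads(line)


def stream_research(
    topic: str, base_url: str = "http://127.0.0.1:8000"
) -> Iterator[Dict[str, Any]]:
    """POST a topic to the streaming endpoint and yield events as they arrive."""
    import requests  # imported lazily; only needed for the network call

    with requests.post(
        f"{base_url}/api/research/stream",
        json={"topic": topic},
        stream=True,   # do not buffer the whole response body
        timeout=300,
    ) as response:
        response.raise_for_status()
        yield from iter_events(response.iter_lines())
```

With the server running, `for event in stream_research("AI in drug discovery"): print(event)` would print each stage event as it completes. Keys such as `step` or `status` inside each event are hypothetical until checked against the actual responses.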
Example request:
{
  "topic": "AI in drug discovery"
}
Evals
After saving at least one run, execute:
.\.venv\Scripts\python.exe evals\run_evals.py
The evaluator checks report sections, citation count, source count, and critic feedback.
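The four checks could look roughly like the sketch below. It does not reproduce run_evals.py: the run dictionary keys (`report`, `sources`, `feedback`), the bracketed `[n]` citation format, and the thresholds are assumptions made for illustration.

```python
import re
from typing import Any, Dict, List


def evaluate_run(
    run: Dict[str, Any],
    required_sections: List[str],
    min_citations: int = 3,
    min_sources: int = 3,
) -> Dict[str, bool]:
    """Return a pass/fail flag for each of the four checks."""
    report = run.get("report", "")
    # Count distinct numeric citations such as [1] or [12] (assumed format).
    citations = set(re.findall(r"\[(\d+)\]", report))
    return {
        # Every expected section title appears somewhere in the report.
        "sections": all(s.lower() in report.lower() for s in required_sections),
        "citations": len(citations) >= min_citations,
        "sources": len(run.get("sources", [])) >= min_sources,
        # Critic feedback must be present and non-blank.
        "feedback": bool(run.get("feedback", "").strip()),
    }
```

A run passes overall when `all(evaluate_run(run, sections).values())` is true. Returning per-check flags instead of a single boolean makes it easier to see which expectation a report missed.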
Future Improvements
- Add user accounts and per-user histories.
- Add automated evals for citation coverage, source relevance, and hallucination risk.
- Add export to PDF or Markdown.
- Add fallback scraping when the selected source fails.
- Add deployment with a live demo URL.