# plan-for-custom-deep-research
awesome—let’s stand up a “DIY Deep Research” pipeline you can own and tune for your domains. below is a **practical, shippable blueprint** with JS-first code you can paste into a repo and iterate. it assumes the OpenAI **Responses API**, plus **your own web/search adapters** (you can swap in any search/crawl stack later).
---
# DIY Deep Research (owner-operated)
## Goals
* **High-quality, citable briefs** from web + docs, with uncertainty handling.
* **Composable orchestration**: planner → searchers → readers → extractors → synthesizer → critic.
* **Domain-aware prompts** tuned for your three tracks and their intersections.
* **Reproducible outputs** (Markdown + JSON sidecar) with **traceable citations**.
---
## Architecture (one process to start)
**Roles (agents)**
1. **Planner** – decomposes the query into sub-questions & evidence plan.
2. **Searcher** – web & catalog queries; returns ranked candidates (URLs, metadata).
3. **Reader** – fetch + sanitize + chunk; extracts facts/claims with per-claim provenance.
4. **Extractor** – turns raw text into structured notes (claims, quotes, stats, definitions).
5. **Synthesizer** – writes the brief; cites every non-trivial claim.
6. **Critic/Verifier** – spot-checks: contradiction, missing perspectives, stale links, overclaiming; pushes revisions.
7. **Formatter** – outputs Markdown report + JSON manifest (citations, confidence, lineage).
8. **Indexer (optional)** – stores notes & embeddings for reuse (GraphRAG/ontology later).
**Data flow (happy path)**
```
user query
-> Planner.plan()
-> Searcher.search() per sub-question
-> Reader.read() + Extractor.extract()
-> Synthesizer.write()
-> Critic.review() -> Synthesizer.revise()
-> Formatter.emit()
-> Indexer.ingest() (optional)
```
**Storage**
* `/runs/<timestamp>/`
  * `report.md`
  * `report.json` (citations, claims, confidence)
  * `notes.ndjson` (per-source notes)
  * `raw/` (source HTML/text snapshots, hashed filenames)
* `cache.sqlite` (URL → normalized_text, fetched_at)
---
## Output contracts
### `report.md` (human-readable)
* Executive summary (bulleted, ≤10 lines)
* Answer sections, organized by sub-question
* Domain intersection callouts (AI × Higher-Ed, AI × Arts, etc.)
* “What we’re confident about” / “Open questions”
* Full citations (numbered, inline like [12])
### `report.json` (machine-readable)
```json
{
"query": "How are top U.S. universities integrating AI in humanities pedagogy?",
"run_id": "2025-09-06T13-04-18Z",
"model": "gpt-4o",
"claims": [
{
"id": "clm_001",
"text": "Several R1 institutions ran AI-assisted close-reading pilots in 2024–25.",
"citation_ids": ["src_03","src_07"],
"confidence": 0.72,
"domain_tags": ["higher-ed","humanities","ai-teaching"],
"risk_flags": ["generalization"]
}
],
"citations": [
{
"id": "src_03",
"url": "https://example.edu/center/ai-humanities-pilot",
"title": "AI & Humanities Pilot 2025",
"author": "Center for Teaching",
"published": "2025-03-11",
"accessed": "2025-09-06",
"hash": "sha256:…"
}
]
}
```
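A tiny contract check keeps `report.md` and `report.json` honest. Here's a minimal sketch (a hypothetical helper, not part of the skeleton below) that verifies every claim's `citation_ids` actually resolve:

```js
// validate-report.js: hypothetical contract check, not part of the core skeleton
import fs from "fs/promises";

export async function validateReport(jsonPath) {
  const report = JSON.parse(await fs.readFile(jsonPath, "utf8"));
  const known = new Set((report.citations ?? []).map(c => c.id));
  const problems = [];
  for (const claim of report.claims ?? []) {
    // every cited source must exist in the citations array
    for (const cid of claim.citation_ids ?? []) {
      if (!known.has(cid)) problems.push(`${claim.id}: unknown citation ${cid}`);
    }
    // confidence should be a probability in [0, 1]
    if (typeof claim.confidence !== "number" || claim.confidence < 0 || claim.confidence > 1) {
      problems.push(`${claim.id}: confidence out of range`);
    }
  }
  return problems; // empty array means the contract holds
}
```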
---
## Domain lenses (prompt adapters)
We’ll **prepend** one of these lenses to the Planner & Synthesizer prompts depending on topic classification (or combine two for intersections; a small combination sketch follows the lens list).
### Lens A — AI research & strategy
* Emphasize: method provenance, benchmarks, scaling laws, org impact, governance.
* Penalize: vendor hype, uncited claims, outdated references.
* Require: dates on findings; a clear distinction between *model capability* and *productized feature*.
### Lens B — Higher Ed & Humanities pedagogy
* Emphasize: assessment design, learning outcomes, academic integrity, accessibility, equity.
* Require: concrete course designs, assignment archetypes, workload/latency tradeoffs, faculty development pathways.
### Lens C — Media/Arts/Multimodal
* Emphasize: creative workflow diagrams, pipeline steps (capture → ingest → edit → render → publish), IP/ethics, performance/latency constraints.
* Require: tool-agnostic descriptions; reproducible patterns; examples.
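For intersection queries, one lightweight option is to concatenate two lens texts instead of maintaining a separate blurb. A sketch (`composeLens` is a hypothetical helper; the `LENSES` map mirrors the one defined in `src/planner.js` below):

```js
// utils/lens.js: hypothetical helper; LENSES mirrors the map in src/planner.js
const LENSES = {
  "ai": "Focus on AI research, strategy, governance, benchmarks.",
  "higher-ed": "Focus on pedagogy, assessment, humanities, equity.",
  "media-arts": "Focus on creative pipelines, rights/ethics, latency."
};

// Combine one or more lens texts into a single prompt preamble.
export function composeLens(keys) {
  const parts = keys.map(k => LENSES[k]).filter(Boolean);
  if (parts.length > 1) {
    parts.push("Explicitly surface intersections and tensions between these framings.");
  }
  return parts.join("\n");
}

// e.g. composeLens(["ai", "higher-ed"]) for an AI × Higher-Ed question
```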
---
## Prompts (starter)
### `prompts/planner.txt`
```
You are a research planner. Given a query, produce:
1) 3–7 sub-questions that, if answered, fully address the query.
2) For each, the evidence types needed (news, peer-reviewed, policy, case study, syllabi, GitHub, videos).
3) Disambiguation notes & high-risk pitfalls (stale claims, vendor hype, ambiguous metrics).
4) A brief plan for triangulation & contradiction checks.
Domain lens:
{{DOMAIN_LENS}}
Query:
{{QUERY}}
Return JSON with keys: sub_questions[], evidence_plan{}, pitfalls[], triangulation[].
```
### `prompts/extractor.txt`
```
You are an evidence extractor. Given source text + metadata, output:
- atomic claims with quoted support (≤30 words per quote),
- the exact span(s) used,
- normalized entities (orgs, people, models, courses),
- dates found, and
- uncertainty markers in the source.
Return NDJSON where each line is a JSON object with:
{claim, quotes[], spans[], entities[], dates[], source_id, confidence}
```
### `prompts/synthesizer.txt`
```
Write a rigorous, neutral report that answers the user query.
Rules:
- Every nontrivial claim must cite [#].
- Separate 'We’re confident' vs 'Open questions'.
- Highlight intersections across domains explicitly.
- Keep hype in check; state limitations.
- Prefer specifics (course patterns, workflow steps, evaluation rubrics).
Domain lens:
{{DOMAIN_LENS}}
Use this JSON notes bundle:
{{NOTES_JSON_CHUNKED}}
Output Markdown only.
```
### `prompts/critic.txt`
```
Act as a critical reviewer.
Find:
- overclaims (claims exceeding source support),
- missing perspectives (e.g., humanities equity angle),
- contradictions across sources,
- stale references (>18 months) if recency is critical.
Return JSON with {issues[], required_fixes[], confidence_adjustments{}}.
```
---
## Repo skeleton
```
diy-deep-research/
  .env
  package.json
  src/
    index.js
    orchestrator.js
    planner.js
    searcher.js
    reader.js
    extractor.js
    synthesizer.js
    critic.js
    formatter.js
    utils/
      cache.js
      chunk.js
      normalize.js
      cite.js
      clock.js
  prompts/
    planner.txt
    extractor.txt
    synthesizer.txt
    critic.txt
  runs/
```
**`package.json`**
```json
{
"name": "diy-deep-research",
"type": "module",
"scripts": {
"start": "node src/index.js",
"run": "node src/index.js --q"
},
"dependencies": {
"openai": "^4.56.0",
"node-html-parser": "^6.1.12",
"undici": "^6.19.8",
"p-limit": "^6.1.0",
"better-sqlite3": "^9.6.0",
"yargs": "^17.7.2",
"uuid": "^9.0.1"
}
}
```
**`.env`**
```
OPENAI_API_KEY=sk-…
OPENAI_MODEL=gpt-4o
MAX_TOKENS=4000
```
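Note: nothing in the minimal modules actually loads `.env`. Either run with Node's built-in flag (`node --env-file=.env src/index.js …`, Node ≥ 20.6) or add `dotenv` as a dependency and put `import "dotenv/config";` at the top of `src/index.js`. `MAX_TOKENS` is also unused as written; if you want a hard cap, pass it into each `responses.create` call (the Responses API parameter is `max_output_tokens`).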
---
## Core modules (minimal versions)
**`src/index.js`**
```js
import yargs from "yargs";
import { hideBin } from "yargs/helpers";
import { runResearch } from "./orchestrator.js";
const argv = yargs(hideBin(process.argv))
.option("q", { type: "string", describe: "query to research" })
.option("lens", { type: "string", default: "higher-ed", choices: ["ai", "higher-ed", "media-arts", "intersection"] })
.demandOption("q")
.help().argv;
runResearch({ query: argv.q, lens: argv.lens }).then(p => {
console.log("✓ report:", p.reportPath);
console.log("✓ manifest:", p.jsonPath);
}).catch(e => {
console.error("run failed:", e);
process.exit(1);
});
```
**`src/orchestrator.js`**
```js
import { plan } from "./planner.js";
import { searchAll } from "./searcher.js";
import { readAll } from "./reader.js";
import { extractAll } from "./extractor.js";
import { synthesize } from "./synthesizer.js";
import { review } from "./critic.js";
import { formatOut } from "./formatter.js";
import { uidRunFolder } from "./utils/clock.js";
export async function runResearch({ query, lens }) {
const run = uidRunFolder();
const planOut = await plan({ query, lens, run });
const hits = await searchAll({ plan: planOut, run });
const sources = await readAll({ hits, run });
const notes = await extractAll({ sources, run, lens });
let draft = await synthesize({ query, lens, notes, run });
const critique = await review({ draft, notes, lens, run });
if (critique?.required_fixes?.length) {
draft = await synthesize({ query, lens, notes, run, critique });
}
const out = await formatOut({ query, lens, draft, notes, run });
return out;
}
```
**`src/planner.js`** (Responses API call; reads `prompts/planner.txt`)
```js
import fs from "fs/promises";
import OpenAI from "openai";
const client = new OpenAI();
export const LENSES = { // exported so synthesizer.js can reuse the lens text
"ai": "Focus on AI research, strategy, governance, benchmarks.",
"higher-ed": "Focus on pedagogy, assessment, humanities, equity.",
"media-arts": "Focus on creative pipelines, rights/ethics, latency.",
"intersection": "Surface intersections among AI, higher-ed, and media arts."
};
export async function plan({ query, lens, run }) {
const tmpl = await fs.readFile("prompts/planner.txt", "utf8");
const prompt = tmpl.replace("{{DOMAIN_LENS}}", LENSES[lens] ?? LENSES["intersection"])
.replace("{{QUERY}}", query);
const resp = await client.responses.create({
model: process.env.OPENAI_MODEL,
input: prompt,
text: { format: { type: "json_object" } } // JSON mode lives under text.format in the Responses API (response_format is the Chat Completions param)
});
const json = JSON.parse(resp.output_text);
await fs.mkdir(`runs/${run}`, { recursive: true });
await fs.writeFile(`runs/${run}/plan.json`, JSON.stringify(json, null, 2));
return json;
}
```
**`src/searcher.js`** (plug your own providers; return ranked hits)
```js
import { normalizeUrl } from "./utils/normalize.js";
// stub: replace with your web/search APIs
async function webSearch(q) {
// return [{url, title, snippet, score}]
return [];
}
export async function searchAll({ plan, run }) {
const results = [];
for (const sq of plan.sub_questions) {
const q = `${sq} site:.edu OR site:.org OR site:.gov OR (peer-reviewed)`;
const hits = await webSearch(q);
hits.forEach(h => results.push({ ...h, url: normalizeUrl(h.url), for: sq }));
}
// dedupe by URL
const seen = new Set(); const dedup = [];
for (const h of results) {
if (seen.has(h.url)) continue;
seen.add(h.url); dedup.push(h);
}
return dedup.slice(0, 60);
}
```
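To make `webSearch` real, the adapter only needs to map whatever your provider returns onto `{url, title, snippet, score}`. A hedged sketch against a hypothetical JSON search endpoint (`SEARCH_API_URL`, `SEARCH_API_KEY`, and the response shape are placeholders, not any specific vendor's API):

```js
// hypothetical provider adapter: swap in your real search API's URL, auth header, and response shape
import { request } from "undici";

async function webSearch(q) {
  const url = `${process.env.SEARCH_API_URL}?q=${encodeURIComponent(q)}`;
  const res = await request(url, {
    headers: { authorization: `Bearer ${process.env.SEARCH_API_KEY}` }
  });
  if (res.statusCode !== 200) return [];
  const data = await res.body.json();
  // assumption: the provider returns { results: [{ url, title, snippet, rank }] }
  return (data.results ?? []).map((r, i) => ({
    url: r.url,
    title: r.title ?? "",
    snippet: r.snippet ?? "",
    score: 1 / ((r.rank ?? i) + 1) // crude rank-to-score mapping
  }));
}
```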
**`src/reader.js`** (fetch + sanitize + cache)
```js
import { request } from "undici";
import fs from "fs/promises";
import crypto from "crypto";
function sha(s){ return crypto.createHash("sha256").update(s).digest("hex"); }
export async function readAll({ hits, run }) {
const out = [];
await fs.mkdir(`runs/${run}/raw`, { recursive: true });
for (const h of hits) {
try {
const res = await request(h.url);
const html = await res.body.text();
const id = sha(h.url);
await fs.writeFile(`runs/${run}/raw/${id}.html`, html);
out.push({ id, ...h, html });
} catch {}
}
return out;
}
```
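`p-limit` is already in `package.json` but unused. A sketch of a bounded-concurrency variant of the fetch loop (8 workers and 15s timeouts are arbitrary defaults):

```js
// bounded-concurrency fetch: a sketch to drop into readAll()
import pLimit from "p-limit";
import { request } from "undici";

const limit = pLimit(8);

async function fetchOne(h) {
  const res = await request(h.url, { headersTimeout: 15_000, bodyTimeout: 15_000 });
  if (res.statusCode >= 400) throw new Error(`HTTP ${res.statusCode} for ${h.url}`);
  return { ...h, html: await res.body.text() };
}

// inside readAll(): run every fetch through the limiter and keep only the successes
// const settled = await Promise.allSettled(hits.map(h => limit(() => fetchOne(h))));
// const fetched = settled.filter(s => s.status === "fulfilled").map(s => s.value);
```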
**`src/extractor.js`**
```js
import OpenAI from "openai";
import fs from "fs/promises";
import { htmlToText } from "./utils/normalize.js";
const client = new OpenAI();
export async function extractAll({ sources, run, lens }) {
const notes = [];
const tmpl = await fs.readFile("prompts/extractor.txt", "utf8");
for (const s of sources) {
const text = htmlToText(s.html).slice(0, 150_000); // guardrails
const prompt = `${tmpl}\n\nSOURCE_META:\n${JSON.stringify({url:s.url,title:s.title},null,2)}\n\nSOURCE_TEXT:\n${text}`;
const resp = await client.responses.create({
model: process.env.OPENAI_MODEL,
input: prompt
});
// each line is JSON (NDJSON style)
const lines = resp.output_text.split("\n").filter(Boolean);
for (const line of lines) {
try { notes.push({ ...JSON.parse(line), source_id: s.id, url: s.url }); } catch {} // carry the url so cite.js can resolve references
}
}
await fs.writeFile(`runs/${run}/notes.ndjson`, notes.map(n=>JSON.stringify(n)).join("\n"));
return notes;
}
```
**`src/synthesizer.js`**
```js
import OpenAI from "openai";
import fs from "fs/promises";
import { LENSES } from "./planner.js"; // reuse the lens descriptions
const client = new OpenAI();
export async function synthesize({ query, lens, notes, run, critique }) {
const tmpl = await fs.readFile("prompts/synthesizer.txt", "utf8");
// chunk notes to fit model context
const notesChunk = JSON.stringify(notes.slice(0, 600)); // TODO: paginate if huge
const lensNote = LENSES[lens] ?? lens; // pass the lens description text into the prompt, not just the key
const prompt = tmpl
.replace("{{DOMAIN_LENS}}", lensNote)
.replace("{{NOTES_JSON_CHUNKED}}", notesChunk)
+ (critique ? `\n\nREVIEW_ISSUES:\n${JSON.stringify(critique, null, 2)}` : "");
const resp = await client.responses.create({
model: process.env.OPENAI_MODEL,
input: prompt
});
const md = resp.output_text;
await fs.writeFile(`runs/${run}/draft.md`, md);
return md;
}
```
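The skeleton lists `utils/chunk.js` but the minimal modules never use it. One way to retire the `// TODO: paginate if huge` is a simple character-budget batcher (the 300k-character budget is a guess; tune per model):

```js
// src/utils/chunk.js: sketch of a character-budget batcher for the notes bundle
export function chunkNotes(notes, maxChars = 300_000) {
  const batches = [];
  let current = [];
  let size = 0;
  for (const note of notes) {
    const len = JSON.stringify(note).length;
    if (size + len > maxChars && current.length) {
      batches.push(current);
      current = [];
      size = 0;
    }
    current.push(note);
    size += len;
  }
  if (current.length) batches.push(current);
  return batches;
}

// usage sketch: call synthesize() once per batch, then run a final pass to merge the partial drafts
```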
**`src/critic.js`**
```js
import OpenAI from "openai";
import fs from "fs/promises";
const client = new OpenAI();
export async function review({ draft, notes, lens, run }) {
const tmpl = await fs.readFile("prompts/critic.txt", "utf8");
const input = `${tmpl}\n\nDRAFT:\n${draft}\n\nNOTES:\n${JSON.stringify(notes.slice(0,500))}`;
const resp = await client.responses.create({
model: process.env.OPENAI_MODEL,
input,
text: { format: { type: "json_object" } } // same text.format JSON mode as planner.js
});
const json = JSON.parse(resp.output_text);
await fs.writeFile(`runs/${run}/critic.json`, JSON.stringify(json, null, 2));
return json;
}
```
**`src/formatter.js`**
```js
import fs from "fs/promises";
import { extractCitations } from "./utils/cite.js";
export async function formatOut({ query, lens, draft, notes, run }) {
const { refs, manifest } = extractCitations(draft, notes);
const report = `${draft}\n\n---\n## References\n` + refs.map((r,i)=>`[${i+1}] ${r.title || r.url} — ${r.url}`).join("\n"); // || not ??: cite.js fills titles with empty strings
const reportPath = `runs/${run}/report.md`;
const jsonPath = `runs/${run}/report.json`;
const json = { query, lens, run_id: run, citations: manifest };
await fs.writeFile(reportPath, report);
await fs.writeFile(jsonPath, JSON.stringify(json, null, 2));
return { reportPath, jsonPath };
}
```
**`src/utils/cite.js`** (very simple placeholder)
```js
export function extractCitations(md, notes){
// find [#] patterns and map to sources present in notes.source_id
// for now, collapse to unique URLs in order of first appearance
const urls = new Map();
for (const n of notes) {
if (!urls.has(n.source_id)) urls.set(n.source_id, n.url || n.source_url || "");
}
const refs = [...urls.values()].map(u => ({ url: u, title: "" }));
const manifest = refs.map((r, idx) => ({ id: `src_${String(idx+1).padStart(2,"0")}`, url: r.url, title: r.title }));
return { refs, manifest };
}
```
**`src/utils/normalize.js`**
```js
import { parse } from "node-html-parser";
export function normalizeUrl(u){ try { return new URL(u).toString(); } catch { return u; } }
export function htmlToText(html){
const root = parse(html);
// remove script/style/nav
root.querySelectorAll("script,style,nav,footer,header").forEach(n=>n.remove());
return root.text.replace(/\s+/g," ").trim();
}
```
**`src/utils/clock.js`**
```js
export function uidRunFolder(){
const d = new Date().toISOString().replace(/[:.]/g,"-");
return d;
}
```
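`utils/cache.js` and `cache.sqlite` appear in the skeleton and storage layout but not above. A minimal version with `better-sqlite3` (already a dependency), assuming a single url → text table:

```js
// src/utils/cache.js: minimal fetch cache; the single-table schema is an assumption
import Database from "better-sqlite3";

const db = new Database("cache.sqlite");
db.exec(`CREATE TABLE IF NOT EXISTS pages (
  url TEXT PRIMARY KEY,
  normalized_text TEXT,
  fetched_at TEXT
)`);

export function getCached(url) {
  return db.prepare("SELECT normalized_text, fetched_at FROM pages WHERE url = ?").get(url);
}

export function putCached(url, normalizedText) {
  db.prepare("INSERT OR REPLACE INTO pages (url, normalized_text, fetched_at) VALUES (?, ?, ?)")
    .run(url, normalizedText, new Date().toISOString());
}
```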
---
## Domain-aware search scaffolds (queries to seed Searcher)
* **AI research & strategy**
  * “site:.org OR site:.edu Transformer ‘evaluation’ 2024..2025 filetype:pdf”
  * “capability eval ‘hallucination penalty’ 2024..2025”
  * “enterprise AI governance rubric case study 2024..2025”
* **Higher-ed & humanities**
  * “site:.edu syllabus ‘AI policy’ humanities 2024..2025”
  * “writing pedagogy ‘LLM’ assignment redesign”
  * “academic integrity generative AI statement 2024..2025”
* **Media/arts/multimodal**
  * “film studies ‘AI video essay’ workflow”
  * “theatre performance AI ‘interactive’ case study”
  * “game dev education ‘procedural generation’ syllabus”
* **Intersections**
  * “studio classroom ‘AI’ multimodal assignment rubric”
  * “critical making AI humanities lab case study”
*(Wire these patterns into `searcher.js` as seed templates; rotate & mix.)*
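One way to wire these in, sketched (the template strings and the `{{SQ}}` convention are illustrative):

```js
// seed query templates per lens: a sketch; {{SQ}} stands in for each sub-question
const SEED_TEMPLATES = {
  "ai": [
    '{{SQ}} evaluation benchmark 2024..2025',
    '{{SQ}} governance rubric case study'
  ],
  "higher-ed": [
    'site:.edu syllabus "AI policy" {{SQ}}',
    '{{SQ}} academic integrity generative AI statement'
  ],
  "media-arts": [
    '{{SQ}} workflow case study',
    'site:.edu {{SQ}} "procedural generation" syllabus'
  ]
};

// in searchAll(): expand each sub-question through its lens's templates, then dedupe as before
function seedQueries(subQuestion, lens) {
  const templates = SEED_TEMPLATES[lens] ?? Object.values(SEED_TEMPLATES).flat();
  return templates.map(t => t.replace("{{SQ}}", subQuestion));
}
```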
---
## Quality controls & latency knobs
* **Penalize confident errors** in prompts (“award partial credit for uncertainty; penalize confident wrong answers”).
* **Contradiction passes** in Critic: if claims A and B conflict, require a resolution or mark the point as an “Open question”.
* **Freshness policy**: for time-sensitive topics (models, policies), down-rank sources older than 18 months unless the topic is theory/history.
* **Diversity constraint**: at least 1 primary source, 1 critical perspective, 1 practitioner example per sub-question.
* **Latency slider**: `max_sources_per_sq` and `extract_depth` to trade speed vs rigor (see the config sketch below).
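A place to hang these knobs, sketched as a single config object the orchestrator could read (only `max_sources_per_sq` and `extract_depth` come from the list above; the rest are illustrative):

```js
// research.config.js: sketch of a knobs object the orchestrator could read
export const CONFIG = {
  max_sources_per_sq: 8,     // latency slider: fewer sources = faster, less rigorous
  extract_depth: "quotes",   // "quotes" | "full": how much of each source to mine
  freshness_months: 18,      // down-rank older sources unless the topic is theory/history
  min_perspectives_per_sq: { // diversity constraint from the list above
    primary: 1,
    critical: 1,
    practitioner: 1
  }
};
```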
---
## Running it
```bash
node src/index.js --q "How should humanities courses redesign research assignments to productively include AI while safeguarding academic integrity?" --lens higher-ed
```
Produces:
```
runs/2025-09-06T13-04-18Z/report.md
runs/2025-09-06T13-04-18Z/report.json
runs/…/raw/*.html
runs/…/notes.ndjson
```
---
## Extensions you’ll probably want (given your stack)
* **Ontology tags** (Palantir-style): add `facet_tags[]` to claims (e.g., governance|assessment|workflow|equity).
* **GraphRAG**: ingest `notes.ndjson` into a small graph (entities ←→ claims ←→ sources) and query across sub-graphs (AI × Higher-Ed × Arts).
* **Airtable/Supabase**: push `claims`, `citations`, and `notes` to tables; link to your Prism app for browsing & cross-run diffing.
* **Source adapters**: JSTOR/CORE/ArXiv/YouTube transcripts, library catalogs, institutional policy hubs; plus a **syllabus scraper**.
* **Evaluator harness**: a small test set of queries with expected “key points” and a rubric (coverage, citation sufficiency, recency, balance).
---
## Minimal evaluator rubric (drop in a JSON file)
```json
{
"criteria": [
{"name": "Coverage", "weight": 0.3, "desc": "Addresses all sub-questions."},
{"name": "Citation Sufficiency", "weight": 0.25, "desc": "Every nontrivial claim cited; primary sources present."},
{"name": "Recency & Relevance", "weight": 0.2},
{"name": "Balance/Perspectives", "weight": 0.15},
{"name": "Clarity/Actionability", "weight": 0.1}
]
}
```
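A first pass at the evaluator harness can just hand the rubric and a finished report back to the model. A sketch (LLM-as-judge; the scoring prompt is improvised, `rubric.json` is the file above):

```js
// eval/score.js: sketch of an LLM-as-judge pass over a finished report
import fs from "fs/promises";
import OpenAI from "openai";

const client = new OpenAI();

export async function scoreReport(reportPath, rubricPath = "rubric.json") {
  const report = await fs.readFile(reportPath, "utf8");
  const rubric = await fs.readFile(rubricPath, "utf8");
  const resp = await client.responses.create({
    model: process.env.OPENAI_MODEL,
    input: `Score this research report against the rubric. For each criterion give a 0-1 score and one sentence of justification, then a weighted total.\n\nRUBRIC:\n${rubric}\n\nREPORT:\n${report}\n\nReturn JSON with keys: scores[{name, score, note}], total.`,
    text: { format: { type: "json_object" } }
  });
  return JSON.parse(resp.output_text);
}
```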
---
## What you’ll customize first
1. **`searcher.js`** – plug in your preferred search/crawl stack and set up domain allowlists (e.g., .edu/.org, society journals).
2. **Prompts** – tighten each lens with your institutional priorities (AI & Civil Discourse, integrity, accessibility).
3. **Formatter** – match your house Markdown & add Harvard-style references if desired.
4. **Cache & rate limits** – add retries/backoff; persist URL fetches.
---
if you want, I can turn this into a ready-to-run repo (with functioning web search adapters you can later swap) or tailor the prompts to a first real query in your pipeline (e.g., “design an AI-inclusive research essay assignment for Expos that resists shortcutting”).