APM

>Agent Skill

@willoscar/citation-verifier



apm::install
$ apm install @willoscar/citation-verifier
apm::skill.md
---
name: citation-verifier
description: |
  Generate and verify BibTeX entries from paper notes, writing `citations/ref.bib` and `citations/verified.jsonl`.
  **Trigger**: citation, BibTeX, ref.bib, verified.jsonl, references, 引用, 参考文献.
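  **Use when**: `papers/paper_notes.jsonl` already exists and you need traceable citations for prose/LaTeX (each with a url/date/title verification record).
  **Skip if**: there are no paper notes yet (or this deliverable does not need citations/references).
  **Network**: automatic verification usually requires network access; without it, record first, then mark entries as needs manual verification.
  **Guardrail**: every BibTeX entry must correspond to a record in `citations/verified.jsonl`; prose may only use citation keys that already exist in `citations/ref.bib`.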
---

# Citation Verifier

Generate `citations/ref.bib` and ensure every entry has a traceable verification record in `citations/verified.jsonl`.

When network access is restricted, prefer a “record now, verify later” workflow: keep URLs/titles consistent and leave a clear verification note.

## Input

- `papers/paper_notes.jsonl`

## Outputs

- `citations/ref.bib`
- `citations/verified.jsonl`

## Workflow (heuristic)

1. Collect `bibkey`, `title`, `url`, `year`, `authors` from `papers/paper_notes.jsonl`.
2. Write/refresh `citations/ref.bib`:
   - Prefer arXiv-style fields when `arxiv_id` / `primary_category` exist (`eprint`, `archivePrefix`, `primaryClass`).
3. Write one verification record per BibTeX entry to `citations/verified.jsonl` with at least:
   - `bibkey`, `title`, `url`, `date`
4. If you cannot verify via network, record a clear `notes` field (e.g., “auto-generated; needs manual verification”) and/or request human confirmation depending on your policy.
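The steps above can be sketched in Python as follows. This is a minimal illustration, not the code in `run.py`: the helper names (`note_to_bibtex`, `note_to_record`, `to_jsonl`), the fixed `@article` entry type, and the reading of `date` as "the day the record was written" are all assumptions made for the example.

```python
import json
from datetime import date


def note_to_bibtex(note: dict) -> str:
    """Render one paper note as a BibTeX entry (illustrative field mapping)."""
    fields = {
        "title": note.get("title", ""),
        "author": " and ".join(note.get("authors", [])),
        "year": str(note.get("year", "")),
        "url": note.get("url", ""),
    }
    if note.get("arxiv_id"):
        # Prefer arXiv-style fields when the note carries an arXiv id.
        fields["eprint"] = note["arxiv_id"]
        fields["archivePrefix"] = "arXiv"
        if note.get("primary_category"):
            fields["primaryClass"] = note["primary_category"]
    # Empty fields are skipped rather than written as `field = {}`.
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    # Assumption: a fixed @article entry type keeps the sketch short.
    return f"@article{{{note['bibkey']},\n{body}\n}}"


def note_to_record(note: dict, status: str, notes: str = "") -> dict:
    """Build the verification record that pairs with the BibTeX entry."""
    return {
        "bibkey": note["bibkey"],
        "title": note.get("title", ""),
        "url": note.get("url", ""),
        # Assumption: `date` is the day this record was written.
        "date": date.today().isoformat(),
        "verification_status": status,
        "notes": notes,
    }


def to_jsonl(records: list) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)
```

Calling both helpers on the same note keeps the one-entry-one-record pairing explicit: each `bibkey` written to `citations/ref.bib` has exactly one line in `citations/verified.jsonl`.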

## Quality checklist

- [ ] Every BibTeX entry has a corresponding `verified.jsonl` record.
- [ ] No missing `url`/`date`/`title` in verification records.
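Both checklist items can be checked mechanically. The sketch below is one possible checker, not part of the skill's script: `check_citations` is a hypothetical name, and the key-extraction regex only handles entries that start on their own line as `@type{key,`, so it is not a full BibTeX parser.

```python
import json
import re

REQUIRED_FIELDS = ("url", "date", "title")
# Captures the key in lines such as "@article{smith2024," (simplified matching).
BIBKEY_RE = re.compile(r"^@\w+\{([^,\s]+),", re.MULTILINE)


def check_citations(bib_text: str, jsonl_text: str) -> list:
    """Return human-readable problems; an empty list means the checklist passes."""
    records = {}
    for line in jsonl_text.splitlines():
        if line.strip():
            rec = json.loads(line)
            records[rec.get("bibkey")] = rec

    problems = []
    for key in BIBKEY_RE.findall(bib_text):
        rec = records.get(key)
        if rec is None:
            problems.append(f"{key}: no verification record")
            continue
        for field in REQUIRED_FIELDS:
            if not rec.get(field):
                problems.append(f"{key}: missing {field}")
    return problems
```

In practice the two arguments would be the contents of `citations/ref.bib` and `citations/verified.jsonl`, read with `pathlib.Path(...).read_text(encoding="utf-8")`.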

## Offline Mode

When network access is restricted, run in offline mode to produce auditable records now, then verify later.

- Generate offline records: run with `--offline`; each record is written with `verification_status: offline_generated`
- Verify later (when network is available): rerun with `--verify-only`

### `verification_status`

- `offline_generated`: record was generated without network verification (needs later verification)
- `verified_online`: URL/title verified successfully by the script
- `verify_failed`: verification was attempted but failed (network error or title mismatch)
- `needs_manual_verification`: missing/ambiguous fields (e.g., empty `url`/`title`)
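One way to read the four statuses is as a small decision procedure. The sketch below is an interpretation under stated assumptions, not the script's actual logic: the function name `choose_status`, the order of the checks, and the title comparison (case-insensitive, whitespace-collapsed) are all hypothetical.

```python
from typing import Optional


def _normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(title.lower().split())


def choose_status(record: dict, offline: bool, fetched_title: Optional[str]) -> str:
    """Pick a verification_status for one record.

    `fetched_title` stands for the title found at the record's URL;
    None means the fetch failed (for example, a network error).
    """
    if not record.get("url") or not record.get("title"):
        return "needs_manual_verification"  # missing/ambiguous fields
    if offline:
        return "offline_generated"  # no network verification attempted
    if fetched_title is None:
        return "verify_failed"  # attempted, but the network call failed
    if _normalize(fetched_title) != _normalize(record["title"]):
        return "verify_failed"  # attempted, but the titles disagree
    return "verified_online"
```

Checking for missing fields first means an incomplete record is never labeled `offline_generated`, which would wrongly suggest that a later network pass alone could fix it.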

## Script

### Quick Start

- `python .codex/skills/citation-verifier/scripts/run.py --help`
- Offline (record now, verify later): `python .codex/skills/citation-verifier/scripts/run.py --workspace <workspace_dir> --offline`

### All Options

- `--offline`: do not attempt network verification; write `verification_status=offline_generated`
- `--verify-only`: verify existing `citations/verified.jsonl` records (does not rewrite BibTeX)
- `--verification-note <text>`: stored in `citations/verified.jsonl` `notes`

### Examples

- Generate BibTeX + offline verification records:
  - `python .codex/skills/citation-verifier/scripts/run.py --workspace <ws> --offline --verification-note "auto-generated; needs manual verification"`
- Later, verify-only (when network is available):
  - `python .codex/skills/citation-verifier/scripts/run.py --workspace <ws> --verify-only`

### Notes

- Minimal requirement for every verification record: `url`, `date`, `title`.
- The script sanitizes stray/unbalanced `{}` in titles to keep `bibtex` parsing robust.
- The script escapes LaTeX special chars in text fields (`& % $ # _`) and rewrites superscript patterns like `X^N` or `X$^N$` as `X\textsuperscript{N}` to keep LaTeX builds stable.
- URLs are kept raw in BibTeX `url` fields (BibTeX styles wrap them with `\url{...}`); `@misc` uses `howpublished=\url{...}`.
- In offline mode, records are *not* truly verified; treat `offline_generated` as a to-do for human/network verification.
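To make the sanitizing notes concrete, here is a rough sketch of that kind of title cleanup. It is not the code from `run.py`: the function names, the regular expressions, and the order of the three passes are assumptions chosen for illustration.

```python
import re

# Matches X^N and X$^N$: one preceding character, an optional "$" pair, then the exponent.
SUPERSCRIPT_RE = re.compile(r"(\w)(\$?)\^([A-Za-z0-9]+)\2")
# Matches LaTeX special characters that are not already escaped with a backslash.
SPECIAL_RE = re.compile(r"(?<!\\)([&%$#_])")


def drop_unbalanced_braces(text: str) -> str:
    """Remove '{' and '}' characters that have no matching partner."""
    out = []
    open_positions = []  # indices in `out` of '{' still waiting for a '}'
    for ch in text:
        if ch == "{":
            open_positions.append(len(out))
            out.append(ch)
        elif ch == "}":
            if open_positions:
                open_positions.pop()
                out.append(ch)
            # otherwise: stray closing brace, drop it
        else:
            out.append(ch)
    for idx in open_positions:
        out[idx] = ""  # unmatched opening brace, drop it
    return "".join(out)


def sanitize_title(title: str) -> str:
    """Balance braces, rewrite superscripts, then escape special characters."""
    text = drop_unbalanced_braces(title)
    # Superscripts go first, while the "$" delimiters are still unescaped.
    text = SUPERSCRIPT_RE.sub(
        lambda m: f"{m.group(1)}\\textsuperscript{{{m.group(3)}}}", text
    )
    return SPECIAL_RE.sub(r"\\\1", text)
```

The pass order matters in this sketch: escaping `$` before the superscript rewrite would turn `X$^N$` into `X\$^N\$` and the pattern would no longer match.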

## Troubleshooting

### Common Issues

#### Issue: Missing `bibkey` / missing `url` in notes

**Symptom**:
- `citations/ref.bib` is missing entries, or `verified.jsonl` has empty `url`/`title`.

**Causes**:
- `papers/paper_notes.jsonl` lacks `bibkey`/`url` fields.

**Solutions**:
- Ensure each core paper note has a stable `bibkey` and a canonical `url`.
- Rerun citation generation after fixing notes.
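Before rerunning, it can help to list exactly which notes are incomplete. A small sketch, assuming one JSON object per line in `papers/paper_notes.jsonl`; `find_incomplete_notes` is a hypothetical helper, not something the skill ships.

```python
import json


def find_incomplete_notes(jsonl_text: str, required=("bibkey", "url")) -> list:
    """Return (line_number, missing_fields) for every note lacking a required field."""
    problems = []
    for lineno, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        note = json.loads(line)
        # A field counts as missing when it is absent or empty.
        missing = [field for field in required if not note.get(field)]
        if missing:
            problems.append((lineno, missing))
    return problems
```

The line numbers point straight at the notes to fix, so the result can be worked through from top to bottom.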

#### Issue: `verification_status=offline_generated`

**Symptom**:
- Records exist but are not truly verified.

**Causes**:
- `--offline` was used, or network verification was unavailable.

**Solutions**:
- When network is available, rerun the script with `--verify-only` to upgrade records.
- Or manually verify and update `citations/verified.jsonl` with notes.
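To see how much verification work is still outstanding, the records can be tallied by status. This is a sketch under the assumption that each line of `citations/verified.jsonl` is a JSON object with `bibkey` and `verification_status` fields; `summarize_statuses` is a hypothetical name.

```python
import json
from collections import Counter


def summarize_statuses(jsonl_text: str, done_status: str = "verified_online"):
    """Return (counts per status, bibkeys that are not yet verified online)."""
    counts = Counter()
    pending = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        rec = json.loads(line)
        # Records without a status are grouped under "unknown" (assumption).
        status = rec.get("verification_status", "unknown")
        counts[status] += 1
        if status != done_status:
            pending.append(rec.get("bibkey"))
    return counts, pending
```

An empty `pending` list means every record has reached `verified_online`; anything else is the to-do list for the next `--verify-only` run or for manual checking.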

### Recovery Checklist

- [ ] Every BibTeX entry has a matching `citations/verified.jsonl` record.
- [ ] Verification records include `url`, `date`, `title`.