---
name: kanji-index
description: Guidelines for maintaining the kanji index feature. Covers kanji ID assignment, index updates, and troubleshooting.
---
# Kanji Index Maintenance
The kanji index allows users to click on any kanji in a dictionary headword to find all other entries containing that same kanji.
## How It Works
1. **Headword kanji** are linked to kanji index pages
2. **Kanji index pages** list all entries containing that kanji
3. **Entry lists** are sorted by reading (hiragana order)
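
The three steps above can be sketched in Python. This is an illustrative sketch, not the code in `build/update_kanji_index.py`: the function name `build_index`, the kanji regular expression, the sample entry `00002_hito`, and the plain code-point sort are all assumptions.

```python
import re
from collections import defaultdict

# Assumption: "kanji" means CJK Unified Ideographs plus Extension A.
KANJI_RE = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")


def build_index(entries):
    """Group entries by each kanji in their headword, sorted by reading."""
    index = defaultdict(list)
    for entry in entries:
        # set() so an entry that repeats a kanji is listed only once
        for kanji in set(KANJI_RE.findall(entry["headword"])):
            index[kanji].append(entry)
    for kanji in index:
        # Code-point order approximates gojuon order for hiragana readings.
        index[kanji].sort(key=lambda e: e["reading"])
    return dict(index)


# Hypothetical sample data in the headword markup used by the entry lists.
entries = [
    {"id": "00002_hito", "headword": "{人|ひと}", "reading": "ひと"},
    {"id": "01234_akunin", "headword": "{悪|あく}{人|にん}", "reading": "あくにん"},
]
index = build_index(entries)
print([e["reading"] for e in index["人"]])  # ['あくにん', 'ひと']
```

Sorting by raw code point places voiced and small kana slightly differently from a strict dictionary collation, so the real build may use a more careful sort key.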
## Directory Structure
```
kanji/
├── kanji_list.json                 # Master list: kanji → kanji_id mapping
├── kanji_extracted.json            # Temporary: extracted kanji needing IDs
├── 00001_jin_hito_person.json      # Entry list for 人
├── 00002_nichi_hi_day.json         # Entry list for 日
└── ...

docs/kanji/
├── 00001_jin_hito_person.html      # HTML page for 人
├── 00002_nichi_hi_day.html         # HTML page for 日
└── ...
```
## Kanji ID Format
Format: `{5-digit}_{onyomi}_{kunyomi}_{gloss}`
- **5-digit**: Sequential number (00001, 00002, ...)
- **onyomi**: Most common on'yomi in romaji (or "none")
- **kunyomi**: Most common kun'yomi in romaji without okurigana (or "none")
- **gloss**: Single English word for primary meaning
### Examples
| Kanji | Kanji ID |
|-------|----------|
| 人 | 00001_jin_hito_person |
| 日 | 00002_nichi_hi_day |
| 大 | 00003_dai_oo_big |
| 畑 | 00004_none_hatake_field |
| 茶 | 00005_cha_none_tea |
### Romaji Rules
- Long vowels: "ou" not "ō" (e.g., 高 → "kou")
- Voiced: "ga", "za", "da", "ba" (e.g., 学 → "gaku")
- No okurigana in kun'yomi (e.g., 高い → "taka", not "takai")
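
The ID format and romaji rules can be checked mechanically. The helper below is a minimal sketch; `make_kanji_id` and the validation pattern are hypothetical names, and the assumption that every part uses only lowercase ASCII letters is inferred from the examples above.

```python
import re

# Assumed pattern: 5 digits, then three lowercase ASCII parts joined by underscores.
KANJI_ID_RE = re.compile(r"\d{5}_[a-z]+_[a-z]+_[a-z]+")


def make_kanji_id(number, onyomi, kunyomi, gloss):
    """Compose a kanji ID; pass None for a missing reading."""
    kanji_id = f"{number:05d}_{onyomi or 'none'}_{kunyomi or 'none'}_{gloss}"
    # fullmatch rejects macrons ("kō"), spaces, and numbers above 99999.
    if not KANJI_ID_RE.fullmatch(kanji_id):
        raise ValueError(f"Malformed kanji ID: {kanji_id!r}")
    return kanji_id


print(make_kanji_id(1, "jin", "hito", "person"))   # 00001_jin_hito_person
print(make_kanji_id(4, None, "hatake", "field"))   # 00004_none_hatake_field
```

The check cannot tell whether okurigana was left in ("takai" passes just as "taka" does), so that rule still needs a human or model judgment.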
## Assigning New Kanji IDs
When new entries introduce kanji not in `kanji_list.json`:
1. **Detect new kanji**:
   ```bash
   python3 build/update_kanji_index.py --check-new
   ```
2. **Assign readings and gloss** using your knowledge:
   - Most common on'yomi (or "none")
   - Most common kun'yomi, without okurigana (or "none")
   - Single-word English gloss
3. **Update kanji_list.json** by adding the new entry under the top-level `kanji` key:
   ```json
   {
     "新": {
       "kanji_id": "00123_shin_atara_new",
       "onyomi": "shin",
       "kunyomi": "atara",
       "gloss": "new"
     }
   }
   ```
4. **Rebuild**:
   ```bash
   python3 build/build_flat.py
   ```
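
Step 3 can also be scripted. The sketch below works on the in-memory structure described under File Formats; `add_kanji` is a hypothetical helper, and two of its behaviours are assumptions rather than documented facts: that the next number is the highest existing number plus one, and that `metadata.total_kanji` should be kept in sync by hand.

```python
def add_kanji(kanji_list, kanji, onyomi, kunyomi, gloss):
    """Add a kanji to the loaded kanji_list data and return its kanji ID."""
    table = kanji_list["kanji"]
    if kanji in table:
        # Already registered: keep the existing ID stable.
        return table[kanji]["kanji_id"]
    # Assumption: next number = highest existing 5-digit prefix + 1.
    highest = max((int(v["kanji_id"][:5]) for v in table.values()), default=0)
    kanji_id = f"{highest + 1:05d}_{onyomi}_{kunyomi}_{gloss}"
    table[kanji] = {
        "kanji_id": kanji_id,
        "onyomi": onyomi,
        "kunyomi": kunyomi,
        "gloss": gloss,
    }
    kanji_list.setdefault("metadata", {})["total_kanji"] = len(table)
    return kanji_id


kanji_list = {
    "metadata": {"total_kanji": 1},
    "kanji": {
        "人": {"kanji_id": "00001_jin_hito_person", "onyomi": "jin",
               "kunyomi": "hito", "gloss": "person"},
    },
}
print(add_kanji(kanji_list, "新", "shin", "atara", "new"))  # 00002_shin_atara_new
```

When writing the result back with `json.dump`, passing `ensure_ascii=False` keeps the kanji keys readable instead of escaping them as `\uXXXX`.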
## Common Tasks
### Check for New Kanji
```bash
python3 build/update_kanji_index.py --check-new
```
### Rebuild All Kanji JSON Files
```bash
python3 build/update_kanji_index.py --rebuild-all
```
### Rebuild Kanji HTML Pages
```bash
python3 build/build_kanji_html.py
```
### Full Site Build (includes kanji)
```bash
python3 build/build_flat.py
```
## Troubleshooting
### "Warning: X kanji need IDs assigned"
New kanji were found in entries. Assign IDs manually:
1. Run `--check-new` to see the full list
2. For each kanji, determine on'yomi, kun'yomi, gloss
3. Add to `kanji/kanji_list.json`
4. Rebuild
### Missing kanji index page
Check that:
1. Kanji is in `kanji/kanji_list.json`
2. JSON file exists: `kanji/{kanji_id}.json`
3. HTML pages were regenerated afterwards (run `python3 build/build_kanji_html.py`)
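
The checks above can be run in one pass. This is a sketch under the directory layout shown earlier; `diagnose` is a hypothetical helper, not part of the build scripts.

```python
import json
from pathlib import Path


def diagnose(kanji, root="."):
    """Return a list of reasons why a kanji might have no index page."""
    root = Path(root)
    list_path = root / "kanji" / "kanji_list.json"
    data = json.loads(list_path.read_text(encoding="utf-8"))
    entry = data.get("kanji", {}).get(kanji)
    if entry is None:
        return [f"{kanji} is not in kanji/kanji_list.json"]
    kanji_id = entry["kanji_id"]
    problems = []
    if not (root / "kanji" / f"{kanji_id}.json").exists():
        problems.append(f"kanji/{kanji_id}.json is missing")
    if not (root / "docs" / "kanji" / f"{kanji_id}.html").exists():
        problems.append(f"docs/kanji/{kanji_id}.html is missing")
    return problems
```

Calling `diagnose("人")` from the repository root returns an empty list when all three checks pass.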
### Kanji link not appearing in headword
Check that:
1. Kanji is in `kanji/kanji_list.json`
2. Entry HTML was rebuilt after the kanji was added (a full site build with `python3 build/build_flat.py` covers this)
### Entry count wrong on kanji page
Rebuild the kanji JSON files, then regenerate the HTML pages:
```bash
python3 build/update_kanji_index.py --rebuild-all
python3 build/build_kanji_html.py
```
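
To confirm where a count is off before or after rebuilding, the recorded `entry_count` can be compared with the actual length of `entries`. A minimal sketch, assuming the individual kanji JSON layout described under File Formats; `count_mismatch` is a hypothetical name.

```python
def count_mismatch(kanji_data):
    """Return (recorded, actual) if entry_count is wrong, otherwise None."""
    recorded = kanji_data["metadata"]["entry_count"]
    actual = len(kanji_data["entries"])
    return None if recorded == actual else (recorded, actual)


# Hypothetical, deliberately inconsistent sample.
sample = {
    "metadata": {"kanji": "人", "entry_count": 245},
    "entries": [{"id": "01234_akunin", "reading": "あくにん"}],
}
print(count_mismatch(sample))  # (245, 1)
```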
## File Formats
### kanji_list.json
```json
{
  "metadata": {
    "description": "Index mapping kanji characters to their kanji index IDs",
    "generated": "2026-01-22T10:30:00Z",
    "total_kanji": 1500
  },
  "kanji": {
    "人": {
      "kanji_id": "00001_jin_hito_person",
      "onyomi": "jin",
      "kunyomi": "hito",
      "gloss": "person"
    }
  }
}
```
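
A few invariants follow from this format and can be checked after manual edits. The sketch below is not part of the build scripts; `check_kanji_list` is a hypothetical helper, and it assumes that a missing reading is stored as the string "none" in the `onyomi` and `kunyomi` fields, as it is in the ID.

```python
def check_kanji_list(data):
    """Return a list of consistency problems found in kanji_list.json data."""
    problems = []
    table = data["kanji"]
    if data["metadata"]["total_kanji"] != len(table):
        problems.append("metadata.total_kanji does not match the number of kanji")
    seen = {}  # 5-digit number -> first kanji that used it
    for kanji, info in table.items():
        number = info["kanji_id"].split("_", 1)[0]
        expected = f"{number}_{info['onyomi']}_{info['kunyomi']}_{info['gloss']}"
        if info["kanji_id"] != expected:
            problems.append(f"{kanji}: kanji_id does not match its fields")
        if number in seen:
            problems.append(f"{kanji}: number {number} already used by {seen[number]}")
        seen.setdefault(number, kanji)
    return problems


data = {
    "metadata": {"total_kanji": 2},
    "kanji": {
        "人": {"kanji_id": "00001_jin_hito_person", "onyomi": "jin",
               "kunyomi": "hito", "gloss": "person"},
        "畑": {"kanji_id": "00004_none_hatake_field", "onyomi": "none",
               "kunyomi": "hatake", "gloss": "field"},
    },
}
print(check_kanji_list(data))  # []
```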
### Individual kanji JSON
```json
{
  "metadata": {
    "kanji": "人",
    "kanji_id": "00001_jin_hito_person",
    "onyomi": "jin",
    "kunyomi": "hito",
    "gloss": "person",
    "entry_count": 245,
    "generated": "2026-01-22T10:30:00Z"
  },
  "entries": [
    {
      "id": "01234_akunin",
      "headword": "{悪|あく}{人|にん}",
      "reading": "あくにん",
      "gloss": "villain, bad person"
    }
  ]
}
```
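
The `headword` field uses `{base|reading}` groups. A minimal sketch of splitting such a string into its surface form and reading follows; `split_headword` is a hypothetical helper, and it assumes that text outside the braces (such as okurigana) is plain kana that belongs to both forms.

```python
import re

# Assumed markup: {base|reading} groups; anything outside braces is kept as-is.
RUBY_RE = re.compile(r"\{([^|{}]+)\|([^|{}]+)\}")


def split_headword(headword):
    """Return (surface, reading) for a headword in {base|reading} markup."""
    surface = RUBY_RE.sub(lambda m: m.group(1), headword)
    reading = RUBY_RE.sub(lambda m: m.group(2), headword)
    return surface, reading


print(split_headword("{悪|あく}{人|にん}"))  # ('悪人', 'あくにん')
print(split_headword("{高|たか}い"))         # ('高い', 'たかい')
```

Comparing the derived reading with the entry's `reading` field is a cheap way to catch markup typos.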
## Design Decisions
### Why invisible links?
- Unstyled kanji links preserve the clean headword appearance
- Users discover the feature through a tooltip
- No visual clutter
### Why romaji in kanji IDs?
- ASCII-safe file names
- Human-readable
- Easy to search and sort
### Why sort by reading?
- Natural Japanese ordering (gojuon)
- Consistent with how dictionaries organize entries
- Helps users find related words