Course Builder

Name: Course Builder
Author: TaewoooPark
Added 5/28/2026
Install via CLI
$ openskills install TaewoooPark/PAIDEIA
Files
SKILL.md
---
name: course-builder
description: Use whenever the user wants to ingest a new course's materials (lecture notes, textbook chapters, HW problems, HW solutions) and build the course-specific knowledge base — patterns.md (recurring solution techniques), coverage.md (HW-to-section map with blind spots), and summary.md (topic tree). Invoked by `/ingest` and `/analyze` slash commands. Designed to be domain-general across math and physics courses (calculus, linear algebra, real/complex analysis, classical mechanics, E&M, thermodynamics, quantum, etc.).
---

# Course Builder

## Overview

This skill turns raw course materials into a structured knowledge base that downstream drilling commands (`/twin`, `/blind`, `/chain`, `/pattern`, `/hwmap`) can query. It is **domain-general** — the same pipeline works for a Linear Algebra course as for a Quantum Mechanics course.

Two-phase pipeline:

```
Phase 1: /ingest
  materials/**/*.pdf  →  converted/**/*.md      (via pdf skill)
  materials/**/*.md   →  converted/**/*.md      (copied as-is)

Phase 2: /analyze
  converted/**/*.md   →  course-index/patterns.md
                         course-index/coverage.md
                         course-index/summary.md
```

## When to load

- User runs `/ingest` or `/analyze`
- User mentions adding new course materials
- User asks "what does this course cover" or "what are the key techniques"
- Downstream commands (`/twin`, `/blind`, `/chain`, `/pattern`, `/hwmap`) need `course-index/` data that doesn't exist yet

## Phase 1: Ingest

### Discovery
Scan `materials/` recursively. Classify each file by path and extension:
- `materials/lectures/*.{pdf,md}`: lecture notes
- `materials/textbook/*.{pdf,md}`: textbook chapters
- `materials/homework/*.{pdf,md}`: HW problem sets (rename for consistency: `hw1.pdf`, `hw2.pdf`, ...)
- `materials/solutions/*.{pdf,md}`: HW solutions (`hw1_sol.pdf`, etc.) or worked examples

Ambiguous location (e.g., a PDF in `materials/` root)? Ask user once to categorize, then remember.
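
The classification rule above can be sketched as a small path check. This is a minimal illustration, not the skill's actual implementation: the function name `classify` and the `"ambiguous"` return value are hypothetical, and it assumes the category is always the first directory under `materials/`.

```python
from pathlib import PurePosixPath
from typing import Optional

# Category directories and extensions named in the Discovery list.
CATEGORIES = {"lectures", "textbook", "homework", "solutions"}
EXTENSIONS = {".pdf", ".md"}


def classify(path: str) -> Optional[str]:
    """Return the category of a file under materials/.

    Returns "ambiguous" (hypothetical marker) when the user has to be asked,
    and None when the file is not a course document at all.
    """
    p = PurePosixPath(path)
    if p.suffix.lower() not in EXTENSIONS:
        return None
    parts = p.parts
    if len(parts) < 2 or parts[0] != "materials":
        return None
    # parts[1] is the category directory unless the file sits in the root.
    if len(parts) >= 3 and parts[1] in CATEGORIES:
        return parts[1]
    return "ambiguous"
```

A file such as `materials/notes.pdf` comes back as `"ambiguous"`, which is the case where the user is asked once and the answer is remembered.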

### Conversion

**All `.pdf` files in `materials/**` go through the vision pipeline.** `pdfplumber` was tried as a fast path and proved unreliable on course materials: even prose-heavy textbook pages silently degrade into word salad when they mix in equations or multi-column figures. Routing everything uniformly through vision is simpler than maintaining per-category heuristics with fallbacks. Full pipeline in `skills/pdf/VISION.md`; the short form:

1. Load `skills/pdf/SKILL.md` and `skills/pdf/VISION.md`.
2. Render each PDF to PNG at `dpi=160` (via `pdf2image`) into `converted/<category>/_pages/<stem>/`.
3. Resize all rendered PNGs to ≤1800 px on the long edge **before** any agent starts reading. This keeps every page under the hard 2000 px per-image limit that applies when many images are read in one context; violating it wastes entire agent runs.
4. Spawn one parallel `general-purpose` agent per PDF. Each agent reads its own pages **sequentially** (not in parallel batches — same dimension limit) and transcribes to clean LaTeX markdown (`$...$` / `$$...$$`). Unreadable symbols get `[?]`.
5. Write `converted/<category>/<stem>.md` with provenance: `<!-- SOURCE: materials/<category>/<stem>.pdf, extracted <YYYY-MM-DD>, method: vision -->`.
6. After all agents finish, delete the `_pages/` scratch dirs.

For each `.md` already in `materials/`: copy to `converted/<category>/<stem>.md` unchanged with a `method: passthrough` provenance comment.
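
Two details of the steps above are easy to get wrong: the resize arithmetic and the provenance line. The sketch below covers only those two. The helper names `fit_long_edge` and `provenance` are hypothetical, and the actual pixel resize (for example with Pillow) is deliberately left out.

```python
from datetime import date
from typing import Optional, Tuple

MAX_LONG_EDGE = 1800  # target from step 3


def fit_long_edge(width: int, height: int,
                  max_edge: int = MAX_LONG_EDGE) -> Tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_edge.

    Images that already fit are returned unchanged; aspect ratio is kept
    up to integer rounding.
    """
    long_edge = max(width, height)
    if long_edge <= max_edge:
        return width, height
    # Integer arithmetic avoids float rounding above max_edge.
    new_w = max(1, width * max_edge // long_edge)
    new_h = max(1, height * max_edge // long_edge)
    return new_w, new_h


def provenance(category: str, stem: str, ext: str, method: str,
               extracted: Optional[date] = None) -> str:
    """Build the provenance comment written at the top of a converted file."""
    day = (extracted or date.today()).isoformat()
    return (f"<!-- SOURCE: materials/{category}/{stem}.{ext}, "
            f"extracted {day}, method: {method} -->")
```

A US Letter page rendered at `dpi=160` is 1360 × 1760 px and passes through unchanged; larger page formats are the ones that get scaled.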

### Idempotence
If `converted/X.md` exists and is newer than source, skip unless user passes `--force`. Log skip count.
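
The skip rule can be sketched as a modification-time comparison. The function name `should_convert` is hypothetical, and treating equal timestamps as stale is an assumption, since the text only says "newer".

```python
import os


def should_convert(source: str, target: str, force: bool = False) -> bool:
    """Return True when the source must be (re)converted.

    Conversion is skipped only when the target exists, is strictly newer
    than the source, and --force was not passed.
    """
    if force or not os.path.exists(target):
        return True
    return os.path.getmtime(target) <= os.path.getmtime(source)
```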

### Output
After ingest completes, print a summary table:

| Category | Converted | Skipped (already done) | Failed |
|---|---|---|---|
| lectures | N | M | F |
| textbook | ... | ... | ... |
| homework | ... | ... | ... |
| solutions | ... | ... | ... |

And (in `INTERFACE_LANG` from `.course-meta`, default `en`): "Next: run `/analyze` to generate the patterns / coverage indexes."

## Phase 2: Analyze

This is the core generalization. Given `converted/**/*.md`, produce three index files.

### `course-index/summary.md`

Topic tree of the course. Structure:
```markdown
# Course Summary

## Scope
Inferred from lecture notes: <one paragraph>.

## Topic tree
- §1 <topic>
  - §1.1 <subtopic> — covered in: lectures/ch01.md, textbook/ch01.md
  - §1.2 ...
- §2 <topic>
  ...

## Difficulty ordering (inferred from lecture progression)
Early → foundational definitions. Middle → core theorems. Late → applications/advanced.
```

**How to build.** Parse section headers (`##`, `###`) from lecture notes, in order. Cross-reference with textbook headers. Use section numbers if present; if not, auto-number by order of appearance.
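
A rough sketch of the header pass, assuming plain ATX headers in the converted markdown. The function name `topic_tree` and both regular expressions are illustrative only; cross-referencing with textbook headers is not shown.

```python
import re
from typing import List, Tuple

HEADER = re.compile(r"^(#{2,3})\s+(.*\S)\s*$")               # "##" or "###" only
NUMBERED = re.compile(r"^§?\s*(\d+(?:\.\d+)*)\.?\s+(.*)$")   # "2.1 Title"


def topic_tree(markdown: str) -> List[Tuple[str, str]]:
    """Return (section number, title) pairs for ##/### headers, in order.

    Existing section numbers are kept; unnumbered headers are auto-numbered
    by order of appearance. Lines inside fenced code blocks are ignored.
    """
    out: List[Tuple[str, str]] = []
    top = sub = 0
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        m = HEADER.match(line)
        if not m:
            continue
        level, title = len(m.group(1)), m.group(2)
        n = NUMBERED.match(title)
        if n:
            number, title = n.group(1), n.group(2)
            pieces = [int(x) for x in number.split(".")]
            top = pieces[0]
            sub = pieces[1] if len(pieces) > 1 else 0
        elif level == 2:
            top, sub = top + 1, 0
            number = str(top)
        else:
            sub += 1
            number = f"{top}.{sub}"
        out.append((f"§{number}", title))
    return out
```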

### `course-index/patterns.md`

Recurring solution techniques extracted from HW solutions and worked examples.

**How to extract.** For each solution (`converted/solutions/*.md` and examples in lecture notes):
1. Identify the "key move" — the step where a reusable technique is applied (e.g., "integration by parts", "change of variable", "Cauchy's integral formula", "Lagrange multipliers", "separation of variables", "Green's function", "diagonalization").
2. Check whether the same move appears in other problems. If it shows up in 2+ problems overall, it's a pattern.
3. Number patterns P1, P2, ... in order of first appearance.

Format each pattern card:
```markdown
### Pk. <short name>

**Recognition signal.** <1-2 lines: what triggers this pattern>

**Move.** <1-3 lines: the operation>

**Appears in.** <HW problem IDs, textbook example numbers>

**Topic.** <§ numbers from summary.md>
```

Target pattern count: 15–30 (too few misses important ones; too many becomes noise). If you find <10, the course is too small or you missed patterns — re-scan. If you find >40, merge similar patterns.
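
The bookkeeping side of extraction (deciding which moves recur and numbering them) can be sketched as below. Identifying the key move in each solution is a judgment step and is not modeled here. The names `number_patterns`, `count_advice`, and `min_problems` are hypothetical; the default of 2 follows the "≥2 times" rule in the domain-general hints, and the input dict is assumed to list problems in course order.

```python
from typing import Dict, List


def number_patterns(moves_by_problem: Dict[str, List[str]],
                    min_problems: int = 2) -> Dict[str, dict]:
    """Map "P1", "P2", ... to recurring moves, in order of first appearance."""
    seen: Dict[str, List[str]] = {}  # move -> problems, insertion-ordered
    for problem, moves in moves_by_problem.items():
        for move in moves:
            problems = seen.setdefault(move, [])
            if problem not in problems:
                problems.append(problem)
    recurring = [(m, ps) for m, ps in seen.items() if len(ps) >= min_problems]
    return {f"P{i}": {"name": m, "appears_in": ps}
            for i, (m, ps) in enumerate(recurring, start=1)}


def count_advice(n_patterns: int) -> str:
    """Apply the pattern-count thresholds from the paragraph above."""
    if n_patterns < 10:
        return "re-scan"
    if n_patterns > 40:
        return "merge"
    return "ok"
```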

### `course-index/coverage.md`

Bidirectional map between HW/example problems and course sections.

**Core premise (do not break).** HW coverage is a **signal of exam probability**, not a completeness metric. The professor has already told you, via HW, where the exam will be drawn from: sections with heavy HW emphasis are where the exam points live. Sections with no HW are unlikely to produce problems worth drilling — they become reference-only.

Structure:
```markdown
## Forward map: problem → sections

| Problem | Primary § | Secondary § | Patterns |
|---|---|---|---|
| HW1-P1 | §2.3 | §2.1 | P1, P3 |
| ...

## Reverse map: section → exam-probability (from HW density)

| § | Title | HW coverage | Exam tier |
|---|---|---|---|
| §2 | ... | HW1-P1, HW2-P3, HW3-P1 | 🔥🔥 Exam-primary |
| §1 | ... | HW1-P2, HW2-P1           | 🔥 Exam-likely |
| §4 | ... | HW3-P5                    | 🟡 Exam-possible |
| §5 | ... | —                         | ⚪ Low-risk (reference only) |
```

Exam tiers (based on HW problem count targeting the section):
- 🔥🔥 **Exam-primary** — 3+ HW instances. Highest exam probability. Drill hardest.
- 🔥 **Exam-likely** — 2 HW instances. High exam probability.
- 🟡 **Exam-possible** — 1 HW instance. Moderate probability; warm-pass review.
- ⚪ **Low-risk** — no HW coverage. Treat as reference; do not spend drill time here unless the user explicitly asks. (Optional asterisk if it falls in a user-declared weak zone — but do not upgrade the exam tier on that basis alone.)

**Do not invert this.** Sections with no HW are NOT "blind spots that the exam will bite" — they are sections the professor chose not to test, by omission. Drilling them steals time from exam-primary sections.
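
The tier thresholds translate directly into a lookup. The sketch below also builds a reverse map from a forward map; counting only each problem's primary § is an assumption made here for illustration, and the names `exam_tier` and `reverse_map` are hypothetical.

```python
from typing import Dict, List, Tuple


def exam_tier(hw_count: int) -> str:
    """Tier label for a section, from the number of HW problems targeting it."""
    if hw_count >= 3:
        return "🔥🔥 Exam-primary"
    if hw_count == 2:
        return "🔥 Exam-likely"
    if hw_count == 1:
        return "🟡 Exam-possible"
    return "⚪ Low-risk"


def reverse_map(forward: Dict[str, str],
                sections: List[str]) -> Dict[str, Tuple[List[str], str]]:
    """Invert problem -> primary § into § -> (problems, tier).

    Every section from summary.md is listed, so sections without HW
    show up as Low-risk instead of disappearing.
    """
    hits: Dict[str, List[str]] = {s: [] for s in sections}
    for problem, section in forward.items():
        hits.setdefault(section, []).append(problem)
    return {s: (ps, exam_tier(len(ps))) for s, ps in hits.items()}
```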

### Summary of analysis output

At end of analyze, print to chat:
- Number of patterns extracted
- Number of sections in summary
- Count of 🔥🔥 / 🔥 / 🟡 / ⚪ sections
- Top 3 **exam-primary** sections and their recommended drills (most HW-dense first)
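
Picking the top exam-primary sections is a filter plus a sort. A minimal sketch, assuming per-section HW counts are already available; the function name `top_exam_primary` is hypothetical.

```python
from typing import Dict, List


def top_exam_primary(hw_counts: Dict[str, int], k: int = 3) -> List[str]:
    """Return up to k exam-primary sections (3+ HW instances), densest first."""
    primary = [(s, n) for s, n in hw_counts.items() if n >= 3]
    # list.sort is stable, so sections with equal counts keep course order.
    primary.sort(key=lambda item: item[1], reverse=True)
    return [s for s, _ in primary[:k]]
```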

## Domain-general hints

When analyzing, watch for common **mathematical** patterns (applicable broadly):
- Integration techniques (substitution, parts, partial fractions, contour)
- Linear algebra moves (diagonalization, Gram-Schmidt, rank-nullity)
- Series manipulations (telescoping, generating functions, asymptotics)
- Induction structures (strong, transfinite, well-ordering)
- Function-space methods (orthogonality, completeness, eigenexpansions)

And common **physics** patterns:
- Conservation laws invocation (energy, momentum, charge, angular momentum)
- Symmetry arguments (Noether, parity, gauge)
- Perturbation theory (regular, singular, Rayleigh-Schrödinger)
- Boundary condition matching (continuity of ψ, ψ', field components)
- Change of reference frame (Galilean, Lorentz, rotating)
- Maxwell-style relations (any variable-swap via second mixed derivative)

These are hints — only add a pattern if it actually appears ≥2 times in the user's solutions.

## Files produced (summary)

After a full ingest + analyze run, the `paideia` directory contains:

```
converted/                    ← all PDFs as MD
course-index/
├── summary.md               ← topic tree
├── patterns.md              ← P1..Pk recognition cards
└── coverage.md              ← HW↔§ map, exam tiers
```

All downstream commands (`/twin`, `/blind`, `/chain`, `/pattern`, `/hwmap`) read from these three index files, not from the raw materials. This makes re-analysis cheap (edit index manually if needed) and keeps commands domain-agnostic.