MiniBook Suite

A dual-LLM pipeline that turns a single idea into a finished, illustrated, character-consistent storybook — automatically.

ComfyUI · SDXL · IPAdapter FaceID · LLM Orchestration · AI-Assisted Development

Anyone can ask an AI to "draw a picture." A storybook is a much harder — and much more human — problem.

A real picture book needs a story that holds together across pages, illustrations where the same character looks like the same character on page 1 and page 20, text laid out so a poem's stanza never gets orphaned, and an art style that feels intentional. Today that means stitching five tools together by hand over a weekend, or hiring an illustrator most people can't afford. MiniBook Suite collapses that into a single prompt — so a parent, a teacher, or an indie author can make something that used to require a studio.

How it works

One idea goes in. Two AIs and a stack of my own Python do the rest. Scroll to watch the pipeline assemble.

  1. One prompt in“A brave kitten in a noir city.”
  2. LLM · The AuthorWrites the full story.
  3. Python · The TypesetterPaginates · merges orphans · detects prose/poetry/chapters.
  4. LLM · The Art DirectorWrites a tailored prompt per page.
  5. SDXL + IPAdapter FaceIDConsistent characters · custom Noir LoRA.
  6. Text overlay + PDFComposites and binds the finished book.
A finished, character-consistent spread.

I built it wrong first. That's the most useful part.

My first attempt was a monolith called StorybookForge — one process trying to do everything at once. It was brittle and impossible to debug: a failure anywhere killed the entire generation. So I made the call to tear it down and rebuild it as a modular pipeline of custom ComfyUI nodes, where each stage does one job and hands clean output to the next.

That decision is the whole project — it's why MiniBook ships and StorybookForge didn't. I work through AI-assisted development: I specify the logic and architecture, direct LLMs to implement it, then test, debug, and iterate on what comes back until it's right. The work split naturally into specialists: an Author LLM writes the story; a custom Python Typesetter I designed and directed paginates it; an Art Director LLM writes a tailored image prompt per page; SDXL + IPAdapter FaceID illustrate it with consistent characters; and a final stage binds everything into a PDF. I orchestrated the LLM work across GPT, Claude, and GLM-4.2 via OpenRouter — routing each task to the model best suited to it.

The hard part wasn't writing code. It was knowing what the code needed to do.

The AI generation makes the impressive demo. But the part I'm proudest of is the Typesetter — the logic that makes the output feel like a book instead of a slideshow. I specified that it should detect whether the input is prose, poetry, or a chaptered story and paginate accordingly; that it needed a forward-merging pass to absorb "orphan" pages so you never get a page with one stray word; that illustrations should place dynamically; and that it had to guard against malformed AI output and fail gracefully instead of crashing. I directed an LLM to implement each of those, then debugged the results — including a regex loop that was misnumbering chapters on empty input. The engineering judgment — knowing what to build and why — is mine. The implementation was AI-assisted. That's how I work, and it's why I can ship.

An accident worth keeping

I never told the system to make a noir book. I gave the Art Director LLM an open artistic framework — "Line art illustration with gentle, broad sweeping ink washes" — used deliberately to steer the reader's eye toward what matters or away from what misdirects. I specified the function; I left the look open. Reading the story's theme ("a burned-out detective piecing together one last case in a rain-soaked city"), the model chose film-noir as the style that fit — and transformed my framework into "high-contrast black and white ink illustration in the style of Frank Miller's Sin City, deep noir shadows, single amber-warm light source accent." The result surprised me, and it taught me something about directing these systems: give a model a clear intent and the right freedom, and it fills the gaps with coherent, often inspired choices. Designing that — the framework and the freedom — is the real creative work.

Why a dancer builds AI pipelines

I spent 12 years as a professional dancer with the National Ballet of Canada. A great performance is precise, repeated structure in service of making someone feel something — hundreds of variables executing flawlessly, no rollback once the curtain rises. I learned the same lesson at a jewellery bench and building Hackintoshes: understand a system deeply, then make it produce something both beautiful and reliable.

I don't write code from scratch — I direct it, and I know what good output looks like because I understand the system I'm building. The tools changed across my life. The craftsmanship didn't.