Style-LoRA — AI Learning Course

§ 01

The lesson

Locking a visual identity across hundreds of generations · data prep, training, deployment

WHEN A STYLE-LORA EARNS ITS WEIGHT

Brand identity, recurring character, distinct illustration aesthetic

A style-LoRA is worth training when you need to generate hundreds of images that share a specific visual identity — a studio's house aesthetic, a comic series' linework, a brand's product photography. The investment (couple hours of GPU time, 20-50 curated images) pays back the moment you'd otherwise be prompt-engineering identical adjectives across every generation.

DATASET CURATION

20-50 images is a sweet spot — consistency matters more than count

Quality > quantity. Pick 20-50 images that genuinely share the style you want to capture. Crop them to a square (or matching aspect ratio). Caption each with a short consistent prefix (a trigger word like "mclndo style") followed by free-form description. The trigger word becomes how you invoke the style at inference time.

TRAINING RECIPE

Diffusers' Dreambooth-LoRA script · ~1 hour on a 24GB GPU

python train_dreambooth_lora_sdxl.py --instance_prompt "mclndo style" --resolution 1024 --learning_rate 1e-4 --max_train_steps 1500 --rank 32

— pure Diffusers flags. (Kohya's sd-scripts is the other common toolchain; its equivalents are --network_dim and --network_alpha — don't mix the two flag sets.) Rank 32 is a balance — higher captures more nuance but risks overfitting. ~1500 steps on 25 images takes about an hour on an A6000 or 4090. Output is a ~150MB .safetensors file.

INFERENCE & DEPLOYMENT

Load the LoRA at runtime, combine with other LoRAs

Loading adapters and triggering them are two separate steps. First load each LoRA with its own strength — say the style LoRA at 0.8 and a watercolor LoRA at 0.6. Then put each adapter's trigger word in the prompt: "a portrait, mclndo style, wtrclr wash". Typing "watercolor LoRA" into a prompt does nothing by itself — the prompt only carries trigger words; the adapters and weights are set in code or in the UI. The Macalinao Studio pattern: one Style-LoRA per brand, one Character-LoRA per persona, combined at runtime.

2026 NOTES

Modern bases, captions, licensing, overfitting

Modern bases: the growth is in FLUX and Qwen-Image LoRA training via ai-toolkit or SimpleTuner; this SDXL recipe's conventions differ. Captions: for T5/LLM-encoder models, natural-language captions work better than token-prefix triggers like "mclndo style". Licensing: training on an artist's or a brand's images raises copyright and trademark questions — secure permission before commercial use. Overfitting check: if every generation reproduces your training compositions (same pose, same crop), the LoRA has burned in composition, not just style — reduce steps, rank, or dataset redundancy.

§ 02

Style-LoRA
training.