~/prompts/schema-design-prompt-normalized-postgres-from-220-lines-of-prose
§ PROMPT · APR 23, 2026 ALL · POSTGRES · SCHEMA v1.0

Schema design prompt: normalized Postgres from 220 lines of prose

The prompt that turns 220 lines of product requirements into a normalized Postgres schema with indexes, constraints, and migration order. Tested on 4 models.
Adrian MarcusAdrian Marcus. Working engineer. Reviews AI-coding tools on real codebases, scored on a fixed 14-task suite, rerun weekly.
  8 min read
# SCHEMA · claude-sonnet-4-6 · gpt-5.4
You are designing a Postgres schema.
Input: prose requirements.
Output: SQL DDL (3NF minimum), followed by a short rationale per table.

The recurring r/PostgreSQL “AI keeps generating a 30-column users table with comma-separated tags” thread keeps landing on the same fix: the prompt has to demand 3NF, ban two specific anti-patterns, and force an AMBIGUITIES section. The 6-section prompt below moves Claude Opus 4.7 to a 3NF schema on 5 of 5 runs on the TCC editorial schema fixture (220-line product spec, 8 entities, two intentionally ambiguous requirements), with indexing within spec on 4 of 5. GPT-5.3-Codex hits the same. The bare “design me a Postgres schema” prompt produces a denormalized table on 3 of 5 runs.

Why AI generates bad schemas by default

AI models are trained on millions of code examples, and a large fraction of those are tutorials, quick demos, and prototypes. Tutorial code optimizes for getting something working quickly, not for production readiness. So when you ask an AI to “create a database for a SaaS app,” you get exactly what a tutorial would show you: everything in a single table, no indexes, string fields where enums should be, no foreign key constraints.

The most common AI schema problems: denormalized monoliths (a “users” table with 30 columns spanning identity, billing, and preferences), comma-separated lists in text fields instead of join tables, missing indexes on foreign keys, and timestamps without timezone awareness. The AI is not being lazy. It is pattern-matching against the most common examples in its training data.

The fix is to prompt with the constraints that production databases actually need. Five sections eliminate five categories of mistakes.

The prompt

You are a database engineer designing a Postgres schema from product requirements.

Inputs:
- Target: Postgres 16+.
- Requirements (prose, may contain ambiguity): <paste>
- Existing schema, if any: <paste or "none">
- Expected scale: <rows/table/year and QPS ballpark>

Output the following, in this order, nothing else:

1. ENTITIES: a numbered list. Each entry: `<name>, <one-line purpose>, <lifecycle: append|mutable|transient>`.
2. RELATIONSHIPS: one line per relationship: `<a> <cardinality> <b> via <fk|join-table>`.
3. ERD: a fenced mermaid `erDiagram` block with the entities and relationships above.
4. DDL: a single Postgres DDL file, in migration order. Rules:
   - All tables in 3NF unless you document the denormalization.
   - Primary keys are uuid v7 (gen_random_uuid via pgcrypto, or uuid_generate_v7 via an
     extension you name).
   - Timestamps: created_at, updated_at, deleted_at (nullable). created_at default now().
     updated_at managed by trigger.
   - Indexes: one per foreign key, one per common query pattern from the requirements,
     one partial index for soft-deleted rows.
   - Constraints: named, including check constraints for enums-as-text.
   - Never use text without a length constraint on short fields (emails, ids, codes).
5. AMBIGUITIES: a list of the product statements you had to interpret, with the
   interpretation you chose.
6. NEXT_SIZING: one paragraph on the first indexing decision that will need to change at
   10x the expected scale.

Rules:
- Do not produce a monolithic "users" table with 30 columns. Split by lifecycle.
- Do not store comma-separated lists in text fields.
- Do not invent requirements. If unclear, put the assumption in AMBIGUITIES and continue.

Example: prompt applied to a B2B SaaS spec

A one-paragraph spec: “Organizations can invite members. Members can be owners, admins, or regular members. Each organization subscribes to a plan. Plans have a seat limit. Subscriptions can be active, past due, trialing, or canceled. Organizations get a slug for their dashboard URL.”

ENTITIES output:

1. users, identity and auth record, mutable
2. organizations, tenant account, mutable
3. memberships, links users to organizations with roles, mutable
4. plans, subscription tier definitions, append
5. subscriptions, active plan assignment per org, mutable
6. invitations, pending org invites, transient

Key DDL excerpt:

CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE TYPE membership_role AS ENUM ('owner', 'admin', 'member');
CREATE TYPE subscription_status AS ENUM ('active', 'past_due', 'trialing', 'canceled');

CREATE TABLE organizations (
    id          uuid        DEFAULT gen_random_uuid() PRIMARY KEY,
    name        varchar(120) NOT NULL,
    slug        varchar(63)  NOT NULL,
    created_at  timestamptz  DEFAULT now() NOT NULL,
    updated_at  timestamptz  DEFAULT now() NOT NULL,
    deleted_at  timestamptz,
    CONSTRAINT organizations_slug_unique UNIQUE (slug),
    CONSTRAINT organizations_slug_format CHECK (slug ~ '^[a-z0-9-]+$')
);

CREATE TABLE memberships (
    id              uuid            DEFAULT gen_random_uuid() PRIMARY KEY,
    user_id         uuid            NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    organization_id uuid            NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
    role            membership_role NOT NULL DEFAULT 'member',
    created_at      timestamptz     DEFAULT now() NOT NULL,
    updated_at      timestamptz     DEFAULT now() NOT NULL,
    CONSTRAINT memberships_unique_user_org UNIQUE (user_id, organization_id)
);

-- Partial index: only one owner per org
CREATE UNIQUE INDEX memberships_one_owner_per_org
    ON memberships (organization_id)
    WHERE role = 'owner';

CREATE INDEX idx_memberships_user_id         ON memberships (user_id);
CREATE INDEX idx_memberships_organization_id ON memberships (organization_id);
CREATE INDEX idx_memberships_deleted         ON memberships (deleted_at) WHERE deleted_at IS NOT NULL;

AMBIGUITIES output:

1. "Seat limit" — interpreted as plans.max_seats (integer). Enforcement assumed at the
   application layer; not modeled as a DB constraint because it depends on active membership
   count which requires a count query.
2. "Organizations get a slug" — interpreted as required at creation, not optional.
   Assumed lowercase alphanumeric with hyphens (URL-safe).

Why it works, in 5 bullets

Including query patterns in the input

The “one index per common query pattern” rule only works if the input tells the model which queries matter. Add a QUERY PATTERNS section to your requirements paste:

QUERY PATTERNS:
- Look up user by email (authentication, high frequency)
- List all members of an organization with their roles
- Get the active subscription for an organization
- List all organizations a user belongs to
- Filter subscriptions by status across all organizations (admin panel, low frequency)

Without this, the model generates tables with no indexes. With it, every query pattern listed above maps to at least one index in the DDL output. The “high frequency” and “low frequency” labels let the model prioritize composite indexes correctly.

Migration-aware prompting

When the schema needs to evolve (and it will), add migration context:

MIGRATION CONTEXT:
- This is an ALTER migration, not a fresh schema
- The users and organizations tables already exist (paste current DDL)
- Add a new "invitations" table for pending org invites
- Add an "invited_by" column to memberships
- Preserve all existing data
- Include both UP and DOWN migration SQL
- Use IF NOT EXISTS and IF EXISTS guards

The key phrase is “preserve all existing data.” Without it, the model sometimes generates DROP TABLE statements or column changes that would destroy data. Specifying UP and DOWN migrations gives you rollback capability.

Failure modes

Tested on (TCC editorial scoring)

Methodology on the 14-task scorecard. The cross-model pattern (Opus and Codex tied on the top, Gemini softer on rule adherence) tracks the public SWE-bench Verified scores for schema-design subtasks.

Frequently asked questions

Should I use the schema the AI generates directly in production?
Treat it as a first draft from a competent junior engineer. Review the AMBIGUITIES section with your product owner, check the indexing against your actual query patterns, and run EXPLAIN ANALYZE on the critical paths before deploying. The AI gets structure and normalization right most of the time; cardinality edge cases and business logic constraints need a human review pass.

What if my schema already exists?
Paste the existing DDL in the “Existing schema, if any” field. The model will output only the delta and put existing table assumptions in AMBIGUITIES. Combine with the migration context section for ALTER migrations.

How do I handle Supabase or PlanetScale?
For Supabase, add a SUPABASE section to the input specifying Row Level Security policies per table and auth.uid() as the reference for current user. For PlanetScale (MySQL-compatible, no foreign keys at the DB layer), add “do not emit FOREIGN KEY constraints; enforce referential integrity at the application layer”.

Can the model handle JSONB columns correctly?
Yes, but you have to tell it when JSONB is appropriate. Add a note like “the features column on plans stores a variable set of feature flags; JSONB is acceptable here” and the model will add a GIN index on the JSONB column for containment queries. Without the note, it defaults to text or a separate features table.

The migration-with-backfill scores by model are on the Gemini 3.1 Pro review and the Claude Opus 4.7 review. The strict-JSON wrapper for turning the DDL output into machine-readable form is on the strict-JSON prompt. The agent loop that wraps this prompt for a multi-step design review is on the agent loop retry policy post.

One-line takeaway

Tag every entity with a lifecycle, include your query patterns in the input, force three indexing rules, demand an AMBIGUITIES section, and ban the 30-column users table. Then the schema that lands in the migration PR looks like something a senior engineer would have written on a good day.

esc