How to Automate Database Migrations with Alembic and SQLAlchemy

Alembic is the standard migration tool for SQLAlchemy projects. You run alembic init, point it at your SQLAlchemy models, and use alembic revision --autogenerate to produce migration scripts. Alembic then applies those scripts in order with alembic upgrade head, giving you repeatable, reviewable schema changes that work identically everywhere your application runs. As of March 2026, the latest stable release is Alembic 1.18.4, which supports SQLAlchemy 2.0 (currently at 2.0.48) and its modern declarative and typed APIs natively.

If you have ever relied on Base.metadata.create_all(engine) to set up your tables, you already know the problem: it creates tables that do not exist but does nothing for tables that already exist and need modification. It cannot add columns, change types, or create indexes on live tables. The moment you have a production database with real data, manual ALTER TABLE statements become your only option - and they come with no rollback plan, no audit trail, and no guarantee that your staging database matches production. Alembic solves this by maintaining a linear history of schema changes, each identified by a revision ID, stored in an alembic_version table in the database itself.
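The create_all() gap is easy to demonstrate. This is a minimal sketch with a hypothetical users table: the table is created once, the model later grows an email column, and a second create_all() silently does nothing because the table already exists.

```python
# Sketch showing that create_all() ignores changes to existing tables.
import sqlalchemy as sa

engine = sa.create_engine("sqlite:///:memory:")

# Initial schema: a users table with just an id.
old_meta = sa.MetaData()
sa.Table("users", old_meta, sa.Column("id", sa.Integer, primary_key=True))
old_meta.create_all(engine)

# Later, the model grows an email column...
new_meta = sa.MetaData()
sa.Table(
    "users", new_meta,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("email", sa.String(255)),
)
# ...but create_all() sees that "users" exists and silently skips it.
new_meta.create_all(engine)

cols = [c["name"] for c in sa.inspect(engine).get_columns("users")]
print(cols)  # the email column is still missing
```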

Setting Up Alembic in a SQLAlchemy Project

The initial setup is straightforward. Start by installing the packages:

pip install alembic sqlalchemy

Or, if you use uv:

uv add alembic sqlalchemy

Alembic pulls in Mako for template rendering and uses your existing SQLAlchemy engine configuration. Initialize the migration environment:

alembic init alembic

This creates an alembic.ini at the project root and an alembic/ directory with several key files:

  • env.py - runtime configuration that connects Alembic to your database and models
  • script.py.mako - the Mako template used to generate new migration scripts
  • versions/ - the directory where migration scripts accumulate over time
  • alembic.ini - the main configuration file

A typical project structure with Alembic alongside a FastAPI or Flask application looks like this:

myproject/
    alembic/
        env.py
        script.py.mako
        versions/
            001_initial_schema.py
            002_add_orders_table.py
    app/
        __init__.py
        models.py
        main.py
    alembic.ini
    pyproject.toml

Configuring the Database Connection

Open alembic.ini and set the database URL:

sqlalchemy.url = postgresql+psycopg://user:pass@localhost/mydb

Hardcoding credentials in a config file is a bad idea for anything beyond local development. A better approach is to read the URL from an environment variable in env.py:

import os

from alembic import context

config = context.config
# Note: set_main_option values pass through configparser interpolation,
# so a literal % in the URL must be escaped as %%.
config.set_main_option(
    "sqlalchemy.url",
    os.environ["DATABASE_URL"]
)

Connecting to Your Models

In env.py, import your declarative base and set the target metadata. This is what enables autogeneration - Alembic compares your Python model definitions against the actual database schema to detect differences:

from app.models import Base

target_metadata = Base.metadata

Generating and Applying the First Migration

With the configuration in place, generate your first migration:

alembic revision --autogenerate -m "initial schema"

This creates a Python file in alembic/versions/ with upgrade() and downgrade() functions containing the detected op.create_table() and op.create_index() calls. Apply it:

alembic upgrade head

A few other commands you will use constantly:

alembic current    # shows the current revision
alembic history    # shows the full migration chain
alembic heads      # shows all branch heads (should be exactly one)

Writing and Reviewing Migration Scripts

Autogeneration is a starting point, not a finished product. It detects a lot - new tables, removed tables, added or removed columns, changed column types (with caveats), added or removed indexes and unique constraints, and added or removed foreign keys. But it has blind spots.

What Autogenerate Cannot Detect

Autogenerate does not detect table renames (it sees a drop and a create), column renames (it sees a drop and an add, which loses data), changes to check constraints, changes to server defaults on existing columns, or PostgreSQL-specific features like ENUM type modifications.

For column renames, write the migration manually:

def upgrade():
    # MySQL additionally requires existing_type= for a rename
    # to emit valid DDL.
    op.alter_column("users", "name", new_column_name="full_name")

def downgrade():
    op.alter_column("users", "full_name", new_column_name="name")

Data Migrations

Because Alembic migrations are Python scripts rather than raw SQL files, you can include data migrations alongside schema changes in the same revision. This is one of Alembic’s advantages over SQL-based tools like Flyway. For example, backfilling a column before adding a NOT NULL constraint:

def upgrade():
    op.add_column("users", sa.Column("role", sa.String(50), nullable=True))
    op.execute("UPDATE users SET role = 'member' WHERE role IS NULL")
    # existing_type is required on MySQL and harmless elsewhere
    op.alter_column("users", "role", existing_type=sa.String(50), nullable=False)

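Instead of a raw SQL string, the backfill can also be expressed with SQLAlchemy Core using a lightweight table stub. The stub declares only the columns the migration touches, so the revision never imports application models, which may have drifted by the time the migration runs. A sketch:

```python
import sqlalchemy as sa

# Lightweight table stub: only the columns this data migration touches,
# independent of the current state of app.models.
users = sa.table(
    "users",
    sa.column("role", sa.String(50)),
)

stmt = (
    sa.update(users)
    .where(users.c.role.is_(None))
    .values(role="member")
)
print(stmt)  # inside a migration you would run op.execute(stmt)
```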
Safe Column Addition for Production

Adding columns to production tables with existing data requires care. The safe pattern is:

  1. Add the column as nullable with a default value
  2. Deploy and backfill the data in application code or a data migration
  3. In a separate, subsequent migration, add the NOT NULL constraint

Never add a NOT NULL column without a default to a table that already contains rows. The migration will fail outright, or worse, it will succeed on your empty dev database and fail in production.

Always Review Before Committing

Always read the upgrade() and downgrade() functions before committing a migration. Test both directions:

alembic upgrade head && alembic downgrade -1 && alembic upgrade head

This round-trip catches issues where the downgrade function does not properly reverse the upgrade - a common source of trouble when you need to roll back in production.

Branching, Merging, and Team Workflows

Each migration has a revision ID and a down_revision pointing to its parent, forming a linked list. When two developers create migrations from the same parent independently, Alembic detects a branch: two revisions sharing the same down_revision.

Resolving Branches

Resolve branches with the merge command:

alembic merge -m "merge feature_x and feature_y" <rev1> <rev2>

This creates a merge migration with down_revision = (<rev1>, <rev2>) and empty upgrade()/downgrade() functions. The merge migration simply tells Alembic that both branches must be applied before anything that comes after.

Preventing Branch Conflicts in CI

A practical prevention strategy is to add a CI check that runs alembic heads and fails if more than one head exists. This forces developers to merge branches before merging their pull request:

#!/bin/bash
# ci-check-migration-heads.sh
set -euo pipefail  # fail the check if alembic itself errors out
HEADS=$(alembic heads | wc -l)
if [ "$HEADS" -gt 1 ]; then
    echo "ERROR: Multiple migration heads detected. Run 'alembic merge' first."
    exit 1
fi

Naming Conventions

Use descriptive messages when creating revisions:

alembic revision --autogenerate -m "add_orders_table"

The revision filename includes the message, making alembic/versions/ browsable and the migration history meaningful in alembic history output. Avoid generic messages like “migration” or “update”.

Stamping Existing Databases

If you are retrofitting Alembic onto a database that was originally created with create_all(), use alembic stamp head. This marks the database as being at the latest revision without running any migrations, which is exactly what you want when the schema already matches your models.

Offline SQL Generation

For environments where the migration tool cannot connect directly to the database, or where a DBA must review raw SQL before it runs:

alembic upgrade head --sql > migration.sql

This generates the SQL statements without executing them. In offline mode you can also scope the output to a specific range of revisions with alembic upgrade <start>:<end> --sql, which is useful when the DBA only needs the delta since the last reviewed release.

Using Alembic with Async SQLAlchemy

If your application uses async SQLAlchemy with asyncpg or another async driver, Alembic supports this through a dedicated async template. Initialize with:

alembic init -t async alembic

This generates an env.py that uses async_engine_from_config and connection.run_sync to bridge the async engine with Alembic’s synchronous migration runner:

import asyncio

from sqlalchemy import pool
from sqlalchemy.ext.asyncio import async_engine_from_config

from alembic import context

def do_run_migrations(connection):
    # Receives a synchronous Connection via run_sync
    context.configure(connection=connection, target_metadata=target_metadata)
    with context.begin_transaction():
        context.run_migrations()

async def run_async_migrations():
    connectable = async_engine_from_config(
        config.get_section(config.config_ini_section, {}),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )

    async with connectable.connect() as connection:
        await connection.run_sync(do_run_migrations)

    await connectable.dispose()

def run_migrations_online():
    asyncio.run(run_async_migrations())

Alembic does not provide an async API directly. The migration operations themselves always run synchronously - the async template just handles creating the database connection asynchronously and then hands a synchronous connection object to the migration context.

Common Pitfalls and Production Safety

Migrations that pass on an empty dev database can fail badly on a production database with millions of rows.

Table Locking

ALTER TABLE on PostgreSQL acquires an ACCESS EXCLUSIVE lock on the entire table. Adding a column to a table with 50 million rows can lock it for minutes, blocking all reads and writes. Set a lock timeout to fail fast instead of blocking your entire application:

def upgrade():
    op.execute("SET lock_timeout = '5s'")
    op.add_column("users", sa.Column("email", sa.Text()))

Index Creation

CREATE INDEX also locks the table. On PostgreSQL, use CREATE INDEX CONCURRENTLY to build the index without holding a lock:

def upgrade():
    # CREATE INDEX CONCURRENTLY cannot run inside a transaction block,
    # so step outside the migration's transaction first.
    with op.get_context().autocommit_block():
        op.create_index(
            "ix_users_email",
            "users",
            ["email"],
            postgresql_concurrently=True,
        )

Concurrent index creation cannot run inside a transaction, which is why the operation is wrapped in op.get_context().autocommit_block(): it commits the migration's current transaction, runs the enclosed operations outside any transaction, and begins a new one afterwards.

Column Drops and Rolling Deployments

During rolling deployments, old application instances still reference the column you just dropped. Always deploy a code change that removes all references to the column first, wait for all old instances to drain, and then drop the column in a subsequent migration. Otherwise, the old application code will throw errors on the missing column during the rollover window.

Server Defaults vs Python Defaults

server_default=sa.text("'active'") creates a database-level default that applies to all inserts, including raw SQL and other applications hitting the same database. default="active" only works within SQLAlchemy’s Python layer. For columns that might be populated outside your application, always use server_default.
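The difference is easy to see with an insert that bypasses SQLAlchemy entirely. In this sketch (hypothetical users table, SQLite for portability), a raw SQL INSERT that never mentions the column still picks up the server-side default:

```python
import sqlalchemy as sa

engine = sa.create_engine("sqlite:///:memory:")
meta = sa.MetaData()
users = sa.Table(
    "users", meta,
    sa.Column("id", sa.Integer, primary_key=True),
    # Database-level default: baked into the DDL, so it applies
    # even to inserts that never go through SQLAlchemy.
    sa.Column("status", sa.String(20), server_default=sa.text("'active'")),
)
meta.create_all(engine)

with engine.begin() as conn:
    # Raw SQL insert that omits status entirely.
    conn.execute(sa.text("INSERT INTO users (id) VALUES (1)"))
    status = conn.execute(sa.text("SELECT status FROM users")).scalar_one()

print(status)
```

With default="active" instead, the same raw insert would have left the column NULL.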

Testing Migrations Properly

Run alembic upgrade head, then alembic downgrade base, then alembic upgrade head again in CI against a real PostgreSQL (or MySQL) instance. Testcontainers or a Docker service in your CI pipeline works well for this. SQLite is not a valid proxy for testing PostgreSQL migrations - the DDL behavior, type system, and locking semantics are fundamentally different.

Transactional DDL

PostgreSQL supports transactional DDL, meaning a failed migration rolls back automatically. MySQL does not - a failed migration leaves the schema in a partial state, requiring manual cleanup. Know your database engine’s behavior and plan accordingly.

Rollback Strategies for Destructive Migrations

Not every migration can be cleanly reversed. Dropping a column destroys data - there is no downgrade() function that can recreate it. For these cases:

  • Take a database snapshot or logical dump before applying destructive migrations in production. This is your actual safety net, not the downgrade() function.
  • Use a two-phase approach: stop writing to the column first, migrate the data elsewhere if needed, then drop the column in a separate migration after confirming the data is no longer required.
  • Rather than a single migration that drops five columns, write five separate migrations. This lets you revert individual changes without affecting the rest.
  • Some downgrades are simply not possible. It is better to write a downgrade() function that raises NotImplementedError with a clear message than to write a broken downgrade that silently corrupts data.
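For the last point, an explicitly irreversible downgrade can look like this sketch (legacy_flag is a hypothetical column name):

```python
# Migration fragment: refuse to pretend this downgrade is possible.
def downgrade():
    raise NotImplementedError(
        "users.legacy_flag was dropped and cannot be recreated; "
        "restore from the pre-migration snapshot instead."
    )
```

Anyone who runs alembic downgrade past this revision gets a loud, descriptive failure instead of a schema that looks right but has lost its data.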

Alembic vs Other Migration Tools

| Feature           | Alembic           | Django Migrations | Flyway      | golang-migrate |
|-------------------|-------------------|-------------------|-------------|----------------|
| Language          | Python            | Python (Django)   | Java/SQL    | Go             |
| Migration format  | Python scripts    | Python scripts    | SQL files   | SQL files      |
| Autogeneration    | Yes               | Yes               | No          | No             |
| ORM integration   | SQLAlchemy        | Django ORM        | None        | None           |
| Data migrations   | Built-in (Python) | Built-in (Python) | SQL only    | SQL only       |
| Async support     | Yes (template)    | No                | N/A         | N/A            |
| Multi-database    | Manual setup      | Built-in routing  | Per-config  | Per-config     |
| Branching/Merging | Built-in          | Built-in          | Linear only | Linear only    |

Choose Alembic if you are already using SQLAlchemy. Choose Django Migrations if you are in a Django project - they are built in and tightly integrated with the Django ORM. Flyway is a strong option for JVM-based projects or teams that prefer writing raw SQL migrations. golang-migrate fills the same role for Go applications.

Integrating Alembic into CI/CD Pipelines

Running migrations as part of your deployment pipeline is the standard approach. A typical flow looks like this:

  1. CI validates migrations by running alembic upgrade head && alembic downgrade base && alembic upgrade head against a disposable test database, and checking that alembic heads returns exactly one head.
  2. Before deploying new application code, run alembic upgrade head against the production database. This ensures the schema is ready before new code tries to use it.
  3. After deployment, run alembic current to confirm the database is at the expected revision.

For containerized deployments, a common pattern is an init container or a pre-start script that runs alembic upgrade head before the main application starts:

#!/bin/bash
# Dockerfile entrypoint script
alembic upgrade head
exec uvicorn app.main:app --host 0.0.0.0 --port 8000

Migrations should be backward-compatible with the currently running application code. Deploy schema changes that add new structures first, then deploy the code that uses them, then deploy migrations that remove old structures. This “expand-contract” pattern prevents downtime during rolling deployments.