Spring Boot/Flyway migration — DB schema versioning
~22 phútSpring Data JPAMiễn phí

Flyway migration — DB schema versioning

Flyway tự động version + apply migration script. Bài này bóc cơ chế Flyway, naming convention V/U/R, autoconfig Boot, schema_history table, rollback strategy, baseline existing DB, và pattern multi-environment migration.

Bài 02 đã chỉ ra ddl-auto=update không an toàn production. Bài 05 mention ddl-auto=validate + Flyway. Bài này bóc cụ thể Flyway: setup, naming, internals, rollback strategy, multi-env.

Schema migration là 1 trong 3 trụ cột production-grade Spring app (cùng với observability + test). Skip Flyway = production data risk.

1. Vì sao schema migration

# Anti-pattern
spring.jpa.hibernate.ddl-auto: update

Hibernate update:

  • Add column missing.
  • Không drop column unused.
  • Không rename column (= drop + add → data loss).
  • Không change column type (Postgres ALTER semantic phức tạp).
  • Race condition: 2 pod startup concurrent → conflict.
  • Inconsistent giữa Hibernate version.

Production cần:

  • Versioned: each schema change = version number, applied once.
  • Repeatable: same script → same schema across env.
  • Rollback-able: ability to undo.
  • Audit: history of changes in DB.
  • Concurrent-safe: 100 pod startup OK.

→ Migration tool. FlywayLiquibase là 2 lựa chọn chính. Boot 3 support cả 2. Khoá này dùng Flyway (đơn giản hơn).

2. Flyway core concepts

2.1 Migration script

src/main/resources/db/migration/
├── V1__init_schema.sql              # versioned migration
├── V2__add_project_priority.sql
├── V3__create_tasks_table.sql
├── V20260415_1430__add_index.sql
├── R__create_views.sql              # repeatable migration
└── U2__rollback_priority.sql        # undo (Flyway Teams only)

Naming convention:

PrefixTypeRun when
V<version>__<desc>.sqlVersionedOnce, in order
R__<desc>.sqlRepeatableWhen checksum changes
U<version>__<desc>.sqlUndoManual rollback (Teams edition)

Version format:

  • V1 simple integer.
  • V1.1 decimal (apply after V1, before V2).
  • V20260415_1430 timestamp (multi-team safe — no version conflict).
  • V001, V002 zero-padded (sortable string).

Recommend timestamp format cho team lớn — tránh conflict khi 2 dev cùng add migration.

2.2 schema_history table

Flyway tự tạo table tracking:

CREATE TABLE flyway_schema_history (
    installed_rank INT NOT NULL,
    version VARCHAR(50),
    description VARCHAR(200) NOT NULL,
    type VARCHAR(20) NOT NULL,
    script VARCHAR(1000) NOT NULL,
    checksum INT,
    installed_by VARCHAR(100) NOT NULL,
    installed_on TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    execution_time INT NOT NULL,
    success BOOLEAN NOT NULL,
    PRIMARY KEY (installed_rank)
);

App startup:

  1. Connect DB.
  2. Read flyway_schema_history.
  3. Find pending migration in db/migration/ not yet applied.
  4. Apply in version order.
  5. Insert row in flyway_schema_history per migration.

Concurrent-safe: Flyway acquire DB lock during migration → 100 pod startup, only 1 applies, others wait.

Idempotent: restart app → check history → no pending → no-op.

3. Setup Boot

3.1 Dependency

<dependency>
    <groupId>org.flywaydb</groupId>
    <artifactId>flyway-core</artifactId>
</dependency>

<!-- Postgres support (Boot 3.4+) -->
<dependency>
    <groupId>org.flywaydb</groupId>
    <artifactId>flyway-database-postgresql</artifactId>
</dependency>

flyway-core là Apache 2.0 (free). flyway-database-* cho specific DB Postgres/Oracle/MySQL — also free for Postgres.

3.2 Configuration

spring:
  flyway:
    enabled: true
    locations: classpath:db/migration              # default
    baseline-on-migrate: true                       # for existing DB
    validate-on-migrate: true                       # checksum verify
    out-of-order: false                             # enforce order

  jpa:
    hibernate:
      ddl-auto: validate                            # combine with Flyway

ddl-auto: validate + Flyway = recommended pattern:

  • Flyway apply migration.
  • Hibernate validate entity match schema → fail fast nếu mismatch.

3.3 First migration

-- V1__init_schema.sql

CREATE TABLE projects (
    id BIGSERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    description VARCHAR(500),
    status VARCHAR(20) NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    CONSTRAINT uk_project_name UNIQUE (name),
    CONSTRAINT chk_project_status CHECK (status IN ('PLANNING', 'ACTIVE', 'DONE', 'ARCHIVED'))
);

CREATE INDEX idx_projects_status ON projects(status);
CREATE INDEX idx_projects_created ON projects(created_at DESC);

App startup → Flyway detect V1, run script, record in history.

3.4 Add column migration

-- V2__add_project_priority.sql

ALTER TABLE projects ADD COLUMN priority VARCHAR(20) NOT NULL DEFAULT 'MEDIUM';

CREATE INDEX idx_projects_priority ON projects(priority);

Best practice:

  • DEFAULT value cho NOT NULL column — backward compat with existing rows.
  • Add index in same migration nếu liên quan.
  • 1 migration = 1 logical change. Avoid multi-purpose migration.

3.5 Multi-statement caveat

-- V3__multi_change.sql

ALTER TABLE projects ADD COLUMN owner_id BIGINT;
ALTER TABLE projects ADD CONSTRAINT fk_owner FOREIGN KEY (owner_id) REFERENCES users(id);
UPDATE projects SET owner_id = 1 WHERE owner_id IS NULL;

Postgres execute trong 1 tx. Fail giữa chừng → rollback all. OK.

MySQL InnoDB không transactional DDL → fail giữa chừng = state corrupt. Use Flyway mixed = false enforce.

4. Rollback strategy

Flyway free không support undo migration. 3 strategy:

4.1 Forward-only (recommend)

Rollback bug = forward migration:

-- V5__add_buggy_column.sql
ALTER TABLE projects ADD COLUMN buggy_field INT;

-- Bug discovered, fix:
-- V6__remove_buggy_column.sql
ALTER TABLE projects DROP COLUMN buggy_field;

Pros:

  • Audit trail clear.
  • Same workflow apply / rollback.
  • Production-safe (no manual undo script).

Cons:

  • Verbose: 2 migration cho rollback.
  • Data loss: drop column = lose data.

Quy tắc: test migration kỹ trên staging trước production.

4.2 Backup + restore (for catastrophic)

# Before deploy
pg_dump prod > backup-2026-04-15.sql

# Deploy fail
psql prod < backup-2026-04-15.sql

Last resort. Used cho schema disaster (rare). Operational overhead.

4.3 Flyway Teams undo (paid)

-- V5__add_priority.sql
ALTER TABLE projects ADD COLUMN priority VARCHAR(20);

-- U5__add_priority.sql      (Teams edition only)
ALTER TABLE projects DROP COLUMN priority;

flyway undo command rollback last applied migration. Useful nhưng cost license.

5. Baseline existing DB

App existing với schema chưa có Flyway. Migrate to Flyway:

spring:
  flyway:
    baseline-on-migrate: true
    baseline-version: 1
    baseline-description: "Existing schema as of 2026-04-15"

Flyway:

  1. Detect flyway_schema_history không có.
  2. Insert baseline row với version 1.
  3. Apply pending migration V2+ on top.

Workflow:

  1. Snapshot current DB schema → V1__baseline.sql (informational).
  2. Add V2+ for future changes.
  3. Deploy với baseline-on-migrate: true.
  4. Existing DB jump to Flyway-managed.

6. Multi-environment pattern

src/main/resources/db/migration/         # default (all env)
├── V1__init_schema.sql
├── V2__add_project.sql

src/main/resources/db/migration/dev/     # dev-only
├── V100__sample_data.sql                # seed dev data

src/main/resources/db/migration/prod/    # prod-only
├── V200__create_admin_user.sql

Configuration:

# application-dev.yml
spring.flyway.locations:
  - classpath:db/migration
  - classpath:db/migration/dev

# application-prod.yml
spring.flyway.locations:
  - classpath:db/migration
  - classpath:db/migration/prod

Common migration apply mọi env. Profile-specific cho seed data, env-specific config.

7. Java-based migration

SQL không đủ → Java migration:

package db.migration;

public class V3__migrate_legacy_format extends BaseJavaMigration {

    @Override
    public void migrate(Context context) throws Exception {
        try (Connection conn = context.getConnection();
             Statement st = conn.createStatement()) {

            ResultSet rs = st.executeQuery("SELECT id, legacy_data FROM projects");
            while (rs.next()) {
                Long id = rs.getLong("id");
                String legacy = rs.getString("legacy_data");
                String migrated = parseAndTransform(legacy);

                try (PreparedStatement update = conn.prepareStatement(
                        "UPDATE projects SET migrated_data = ? WHERE id = ?")) {
                    update.setString(1, migrated);
                    update.setLong(2, id);
                    update.executeUpdate();
                }
            }
        }
    }
}

Use case:

  • Data transformation phức tạp (parse JSON, decrypt).
  • External call (lookup table from API).
  • Multi-step logic.

Discouraged unless really needed — SQL migration easier review/audit.

8. Repeatable migration (R__)

-- R__update_views.sql
DROP VIEW IF EXISTS active_projects;

CREATE VIEW active_projects AS
SELECT id, name, status, created_at
FROM projects
WHERE status IN ('ACTIVE', 'PLANNING');

Flyway compute checksum của file. Change file content → checksum change → re-run.

Use case:

  • Database views.
  • Stored procedures.
  • Functions.

Versioned migration (V) for table/column DDL — applied once. Repeatable migration (R) for views/procedures — applied when changed.

9. CI/CD pipeline

Production deployment workflow:

flowchart LR
    Dev[Dev local]
    Branch["Add V_xxx.sql<br/>commit + PR"]
    CI[CI pipeline]
    Test["Test migration<br/>(create temp DB, apply all)"]
    Staging["Deploy staging<br/>app applies migration"]
    Prod["Deploy prod<br/>app applies migration"]

    Dev --> Branch --> CI --> Test
    Test -->|"OK"| Staging
    Staging -->|"verify"| Prod

CI test migration:

# .github/workflows/ci.yml
jobs:
  test-migration:
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: test
        ports: ["5432:5432"]
    steps:
      - uses: actions/checkout@v4
      - run: |
          mvn flyway:migrate \
            -Dflyway.url=jdbc:postgresql://localhost/test \
            -Dflyway.user=postgres \
            -Dflyway.password=test
      - run: mvn test

CI verify:

  • Migration apply clean (fresh DB).
  • Test pass on migrated schema.
  • Migration not break Hibernate validate.

10. Vận hành production — zero-downtime, runbook, monitoring

Schema migration là 1 trong 3 nguồn outage thường gặp nhất (cùng deploy bug + DB resource). Section này cover quy trình production: zero-downtime patterns, runbook khi migration fail, monitoring drift.

10.1 Zero-downtime — expand-contract pattern

App chạy 100 pod, không thể stop để migrate. Schema change phải backwards compatible giữa old + new code coexist trong rolling deploy.

StepActionCode stateSchema state
1Expand: thêm column nullableOld code unchangedOld + new column
2Deploy code dual-write (write old + new)Old và new coexistOld + new column
3Backfill data old → newBackground jobOld + new + data
4Deploy code read new onlyNew codeOld + new column
5Contract: drop old columnNew codeNew column

Mỗi migration deploy riêng, cách nhau ít nhất 1 release cycle. Skip step → schema mismatch giữa pod blue/green → 500 random.

10.2 DDL lock — bảng nguy hiểm

Postgres DDL acquire ACCESS EXCLUSIVE lock — block mọi query trên table.

DDLPostgres behaviorRisk
ALTER TABLE ADD COLUMN (no default)Instant — chỉ catalogSafe
ALTER TABLE ADD COLUMN ... DEFAULT 0 NOT NULLPostgres 11+ instant; trước rewriteSafe Postgres 11+
CREATE INDEXBlock writesDangerCREATE INDEX CONCURRENTLY
ALTER TABLE ADD CONSTRAINT NOT VALIDInstantSafe — validate sau
UPDATE TABLE SET col = ... (backfill)Lock row, slowRisk — chunk

CONCURRENTLY không thể trong tx → tách migration:

-- V20260415_1__add_index_concurrent.sql
-- flyway:executeInTransaction=false
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_status ON orders (status);

10.3 Failure runbook

Mode 1 — Migration fail tại startup:

Diagnose: SELECT * FROM flyway_schema_history WHERE success = false. Reproduce locally.

Remediate:

  • Dev: flyway repair → fix → retry.
  • Production: manually rollback DDL → DELETE FROM flyway_schema_history WHERE version = '...' → fix script → redeploy. Document trong runbook.

Mode 2 — Schema drift (ai đó chạy SQL manual):

Diagnose: pg_dump --schema-only so với output expected. Tool Atlas, Schemaflow.

Remediate: tạo migration "catch-up" bring schema vào trạng thái expected. Document lý do drift.

Mode 3 — Long migration block startup:

Triệu chứng: pod startup vượt 5 phút, K8s liveness probe fail → restart loop.

Remediate: tách long migration ra job riêng (manual run hoặc Spring Batch). Boot startup chỉ chạy migration ngắn.

10.4 Monitoring + alerts

Boot 3 expose /actuator/flyway. Alert pattern:

- alert: FlywayMigrationFailed
  expr: up{job="app"} == 0 AND flyway_migration_state{state="failed"} > 0
  for: 1m
  labels:
    severity: critical

- alert: FlywayPendingMigration
  expr: flyway_migrations_pending > 0
  for: 30m
  labels:
    severity: warning

10.5 Pre-production safeguards

Checklist trước khi merge migration:

CheckTool / Method
Migration syntax validflyway validate trong CI
Apply trên staging clone of prodRestore prod backup, run migration
Estimate execution timeRun trên prod-sized clone, time it
No DDL block-writes trên large tableCode review checklist
Backup taken before deployAutomated pre-deploy hook
Rollback plan documentedPR template field "Rollback steps"
Both old + new code work với new schemaContract test

Tool: pg-osc, pg_repack cho table 100M+ row online schema change.

11. Pitfall tổng hợp

Nhầm 1: Modify already-applied migration.

-- V5__add_field.sql (already in production)
ALTER TABLE projects ADD COLUMN priority VARCHAR(20);
ALTER TABLE projects ADD COLUMN owner_id BIGINT;     -- ADDED LATER

Flyway checksum mismatch → app fail to start. ✅ Add new V6 migration thay vì modify V5. Never modify applied migration.

Nhầm 2: ddl-auto=update + Flyway. ✅ ddl-auto=validate only. Mix = race condition.

Nhầm 3: Migration không idempotent (use IF NOT EXISTS).

CREATE TABLE projects (...);                          -- fail nếu re-run
CREATE TABLE IF NOT EXISTS projects (...);            -- safe
CREATE INDEX idx_status ON projects(status);          -- fail nếu exists
CREATE INDEX IF NOT EXISTS idx_status ON projects(status);

✅ Defensive — IF NOT EXISTS cho safety re-run.

Nhầm 4: Long-running migration block startup.

-- V10__rebuild_index.sql
DROP INDEX idx_old;
CREATE INDEX idx_new ON huge_table (...);             -- 1 hour on 1B row

✅ Online migration: CREATE INDEX CONCURRENTLY (Postgres). Hoặc separate maintenance window. Module 09 đào sâu.

Nhầm 5: Quên backup trước production migration. ✅ pg_dump snapshot trước mọi migration risky.

Nhầm 6: Naming V1, V2 cho team lớn.

Dev A: V5__feature_a.sql (commit 10:00)
Dev B: V5__feature_b.sql (commit 10:05)
# Conflict — same version

✅ Timestamp: V20260415_1000__feature_a.sql, V20260415_1005__feature_b.sql.

Nhầm 7: Hardcode env-specific value trong migration.

INSERT INTO config VALUES ('admin_email', '[email protected]');

✅ Use Flyway placeholder hoặc separate dev migration.

12. 📚 Deep Dive Spring Reference

📚 Tài liệu chính chủ

13. Tóm tắt

  • Flyway = schema migration tool — versioned, repeatable, audit-ed.
  • 3 type migration: V<version>__ (versioned, once), R__ (repeatable on checksum change), U<version>__ (undo, Teams edition).
  • Boot autoconfig với flyway-core + flyway-database-* dependency. Default location db/migration/.
  • ddl-auto=validate + Flyway = pattern chuẩn production.
  • flyway_schema_history table track applied migrations. Concurrent-safe via DB lock.
  • Forward-only rollback: bug = new forward migration. Free Flyway không support undo.
  • Baseline existing DB với baseline-on-migrate: true migrate legacy schema sang Flyway.
  • Multi-environment: common db/migration/ + per-env subfolder dev/, prod/.
  • Java migration cho complex transform — discouraged unless needed.
  • Repeatable migration (R__) cho views, procedures, functions.
  • CI test migration trên fresh DB trước deploy.
  • Naming timestamp (V20260415_1430__) cho team lớn tránh version conflict.
  • Never modify applied migration — checksum mismatch fail app.
  • IF NOT EXISTS cho idempotent.
  • Long migration: CONCURRENTLY index, separate maintenance window.

14. Tự kiểm tra

Tự kiểm tra
Q1
Vì sao `ddl-auto=update` không phù hợp production? Flyway giải quyết những vấn đề cụ thể nào?

Vấn đề `ddl-auto=update`:

  1. Không version: Hibernate diff entity vs DB tại runtime — không có audit trail "schema này version N".
  2. Không drop / rename: add column OK, drop column / rename không. Schema dirty.
  3. Không change type safely: VARCHAR(50) → VARCHAR(100) ok. INTEGER → BIGINT phức tạp tuỳ DB.
  4. Race condition: 100 pod startup concurrent → multiple ALTER TABLE → conflict.
  5. Inconsistent giữa Hibernate version: behavior thay đổi unpredictably.
  6. Khó rollback: không có concept "undo".
  7. Test khó: schema phụ thuộc Hibernate parsing entity — không reproducible.

Flyway giải quyết:

  1. Versioned migration: mỗi script = 1 version. Apply once, in order, recorded in flyway_schema_history.
  2. Concurrent-safe: Flyway acquire DB lock — 100 pod startup, chỉ 1 apply, rest wait.
  3. Audit trail: flyway_schema_history log every change with timestamp + user.
  4. Reproducible: SQL script = source of truth. Apply same → same result mọi env.
  5. Pre-deploy verify: CI run migration trên fresh DB → catch error trước production.
  6. Forward-only rollback: bug = new migration revert. Audit clear.
  7. Multi-DB support: Flyway abstract dialect — dev local SQLite, prod Postgres.

Pattern combine:

spring:
flyway:
  enabled: true
  locations: classpath:db/migration
  baseline-on-migrate: true
jpa:
  hibernate:
    ddl-auto: validate            # KHONG update — rely Flyway

Workflow:

  1. Dev local add entity field → write Flyway migration.
  2. Apply local Postgres → entity validate ok.
  3. Commit + PR review.
  4. CI: fresh Postgres + apply all migration + run test.
  5. Deploy staging: Flyway apply pending migration on startup.
  6. Verify staging.
  7. Deploy production: same flow.

Production safe + auditable + rollback-friendly.

Q2
Đoạn migration sau có 3 vấn đề. Liệt kê + sửa.
-- V5__update_schema.sql

ALTER TABLE projects ADD COLUMN priority VARCHAR(20);
ALTER TABLE projects ADD COLUMN owner_id BIGINT;
ALTER TABLE projects ADD CONSTRAINT fk_owner FOREIGN KEY (owner_id) REFERENCES users(id);
UPDATE projects SET priority = 'MEDIUM';

CREATE TABLE audit_log (
  id BIGINT PRIMARY KEY,
  action VARCHAR(50),
  timestamp TIMESTAMPTZ
);

INSERT INTO config VALUES ('admin_email', 'admin@local');
  1. Multiple unrelated changes: add column projects + create new table audit_log + insert config — 3 logical change. Fail debug khó.

    Fix: tách thành 3 migration:

    V5__add_project_priority_owner.sql      # column + constraint cho projects
    V6__create_audit_log_table.sql           # table audit_log
    V7__seed_initial_config.sql              # data seed
    1 migration = 1 logical change. Easy review, easy rollback.
  2. Add NOT NULL column với existing rows:
    ALTER TABLE projects ADD COLUMN priority VARCHAR(20);  -- nullable, default null
    UPDATE projects SET priority = 'MEDIUM';                -- backfill
    Workflow OK trong 1 migration nhưng risk nếu fail giữa chừng. Better: NOT NULL với default value:
    ALTER TABLE projects ADD COLUMN priority VARCHAR(20) NOT NULL DEFAULT 'MEDIUM';
    Postgres backfill atomic + immediate. No two-step.
  3. Hardcode env-specific data:
    INSERT INTO config VALUES ('admin_email', 'admin@local');
    Local email apply prod = wrong. Should env-specific.

    Fix: tách dev seed:

    # db/migration/         (chung)
    # db/migration/dev/V100__seed_dev_config.sql
    INSERT INTO config VALUES ('admin_email', '[email protected]');
    
    # db/migration/prod/V100__seed_prod_config.sql
    INSERT INTO config VALUES ('admin_email', '${ADMIN_EMAIL_PLACEHOLDER}');
    
    # application-dev.yml
    spring.flyway.locations:
    - classpath:db/migration
    - classpath:db/migration/dev
    Hoặc dùng Flyway placeholder substitution.

Bonus issues:

  • FK constraint fail nếu users table chưa có: verify migration order (V<5 must create users).
  • audit_log không có index: add CREATE INDEX idx_audit_timestamp ON audit_log(timestamp DESC) cho query log gần đây.
  • id BIGINT PRIMARY KEY without auto-gen: change to BIGSERIAL.

Code đúng (3 migration):

-- V5__add_project_priority_owner.sql
ALTER TABLE projects ADD COLUMN priority VARCHAR(20) NOT NULL DEFAULT 'MEDIUM';
ALTER TABLE projects ADD COLUMN owner_id BIGINT;
ALTER TABLE projects ADD CONSTRAINT fk_owner FOREIGN KEY (owner_id) REFERENCES users(id);
CREATE INDEX idx_projects_priority ON projects(priority);

-- V6__create_audit_log_table.sql
CREATE TABLE audit_log (
  id BIGSERIAL PRIMARY KEY,
  action VARCHAR(50) NOT NULL,
  timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_audit_timestamp ON audit_log(timestamp DESC);

-- db/migration/dev/V100__seed_dev_config.sql
INSERT INTO config VALUES ('admin_email', '[email protected]');
Q3
Team 3 dev cùng add Flyway migration. V5 conflict. Cách tránh?

Vấn đề:

# Dev A: 10:00 commit
db/migration/V5__add_priority.sql

# Dev B: 10:05 commit
db/migration/V5__add_owner.sql

# Merge conflict — same filename

Hoặc tệ hơn: merge có 2 file V5 khác nhau → Flyway error "Duplicate version 5".

3 strategy fix:

1. Timestamp naming (recommend cho team lớn):

V20260415_1000__add_priority.sql
V20260415_1005__add_owner.sql

Pros:

  • No conflict — timestamp unique.
  • Order natural — chronological.
  • Version evident — "this migration from 2026-04-15".

Cons:

  • Filename longer.
  • Out-of-order risk: dev local apply 10:05 first, 10:00 sau → Flyway warn "out of order".

Configuration:

spring:
flyway:
  out-of-order: true     # cho phep apply out of order (DEV only)

2. Sequential numbering với coordination:

Team agree "next available version", communicate qua Slack/Linear:

# Dev A: claim V5
git checkout -b feature-priority
echo "V5__add_priority.sql" > db/migration/V5__add_priority.sql

# Dev B: claim V6 (after A merges hoặc after channel agree)
git checkout -b feature-owner
echo "V6__add_owner.sql" > db/migration/V6__add_owner.sql

Pros: clean sequential. Cons: communication overhead, race condition vẫn có thể.

3. Pre-merge hook check:

# .github/workflows/migration-check.yml
- name: Check duplicate Flyway version
run: |
  files=$(ls src/main/resources/db/migration/V*__*.sql | sed 's/.*V\([0-9]*\)__.*/\1/')
  duplicates=$(echo "$files" | sort | uniq -d)
  if [ -n "$duplicates" ]; then
    echo "Duplicate versions: $duplicates"
    exit 1
  fi

CI fail merge với duplicate version. Force resolve trước merge.

Recommend cho TaskFlow capstone:

  • Solo dev / small team (1-3): sequential V1, V2, V3.
  • Team lớn (5+) hoặc microservices: timestamp.

Workflow merge conflict (sequential):

  1. Dev A merge first → V5 in main.
  2. Dev B rebase: file conflict → rename V5 → V6.
  3. Push, PR, merge.

Manual rebase 30 giây — acceptable cost.

Pattern industry:

  • GitHub, GitLab use sequential.
  • Stripe, Shopify use timestamp.
  • Atlas (Liquibase competitor) auto-detect và sort.
Q4
Production app có 100M row trong table `orders`. Bạn cần add index `idx_orders_status`. Migration `CREATE INDEX idx_orders_status ON orders(status)` block startup 30 phút. Cách giải?

Vấn đề:

  • CREATE INDEX Postgres default acquire SHARE lock — block writes.
  • 30 phút lock → INSERT/UPDATE order block → app downtime.
  • Pod startup hold migration lock → 99 pod khác wait.
  • K8s readiness probe fail → restart loop.

Fix 1 — Postgres `CREATE INDEX CONCURRENTLY`:

-- V10__add_status_index.sql
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);

Postgres `CONCURRENTLY` mode:

  • Không acquire lock writes.
  • Build index trong background — slow hơn (~2x duration).
  • Insert/update vẫn run concurrent.

Caveat Flyway: `CONCURRENTLY` không thể chạy trong transaction. Mặc định Flyway wrap mỗi migration trong tx.

-- Flyway error: "CREATE INDEX CONCURRENTLY cannot run inside a transaction block"

Solution:

spring:
flyway:
  postgresql:
    transactional-lock: false   # Postgres-specific config Flyway 8+

# Hoac dung Flyway placeholder
-- /* @flyway:transactional false */
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);

Fix 2 — Maintenance window:

  1. Schedule maintenance: Saturday 02:00 (low traffic).
  2. Disable writes: read-only mode app.
  3. Run migration manually qua psql:
    psql prod -f V10__add_status_index.sql
  4. Mark Flyway history insert manually:
    INSERT INTO flyway_schema_history (...) VALUES (10, '10', 'add status index', 'SQL', ...);
  5. Deploy app — Flyway thấy V10 đã applied, skip.

Pros: full control. Cons: manual, error-prone.

Fix 3 — Online migration tool:

  • gh-ost (GitHub for MySQL) — schema change online.
  • pg_repack (Postgres) — reorganize table without lock.
  • Liquibase Pro — automated zero-downtime.

Tools build for large-scale schema change. Overhead setup.

Recommend cho TaskFlow:

  • Default Postgres: use `CONCURRENTLY` cho all index creation in production migration.
  • Maintenance window: for major refactor (drop/rename column with data) on huge tables.
  • Plan migration carefully: measure on staging với production-size data.

Ngừa từ đầu:

  • Add index trong V1 init schema — no data, fast.
  • Monitor query performance từ early — add index trước table grow lớn.
  • Use partial index nếu phù hợp: CREATE INDEX ... WHERE status = 'ACTIVE' — smaller, faster build.
Q5
Bạn modify `V5__add_priority.sql` đã apply production. Restart app fail "checksum mismatch". Vì sao? Cách recover?

Vì sao:

Flyway compute MD5 checksum của mỗi migration script lúc apply. Lưu vào flyway_schema_history.checksum:

SELECT version, description, checksum FROM flyway_schema_history;
-- 5  add priority  -1234567890

Restart app:

  1. Flyway scan db/migration/ → V5 file.
  2. Compute checksum hiện tại.
  3. Compare với history — checksum khác (do file modified).
  4. Throw FlywayException: Migration checksum mismatch for migration version 5.
  5. App fail to start.

Vì sao Flyway strict:

  • Modified migration → app A start ok (apply original), app B start fail (load modified).
  • Inconsistent state across env.
  • Audit trail broken — DB schema không match commit history.

Cách recover (3 options):

Option 1 (correct) — Revert + new migration:

# Git revert V5 to original content
git checkout HEAD~1 -- src/main/resources/db/migration/V5__add_priority.sql

# Add new migration with desired change
echo "ALTER TABLE projects ADD COLUMN priority_extra VARCHAR(20);" > V6__add_priority_extra.sql

# Commit + deploy

Pattern: forward-only. Original V5 stay, new V6 add change. Production safe.

Option 2 — Flyway repair (production careful):

# 1. Export current DB schema (backup)
pg_dump prod > backup.sql

# 2. Update flyway_schema_history checksum manually
mvn flyway:repair \
-Dflyway.url=jdbc:postgresql://prod-db/app

# Hoac SQL direct:
UPDATE flyway_schema_history SET checksum = NULL WHERE version = '5';
-- Then Flyway recompute on next run

Recompute checksum để match modified file. **Risky** — DB state có thể không match script content.

Option 3 — Modify history table directly (NEVER recommend):

UPDATE flyway_schema_history SET checksum = <new_checksum> WHERE version = '5';

Hack quick fix. Lose audit. Don't.

Pre-deploy check (CI):

# .github/workflows/ci.yml
- name: Verify migration not modified
run: |
  # Run migration on fresh DB
  mvn flyway:migrate
  # Run on existing DB (production-like)
  mvn flyway:validate -Dflyway.url=staging-db   # fail nếu checksum mismatch

Quy tắc vàng:

  1. Never modify migration đã merge to main.
  2. Modify migration trên feature branch (chưa merge) — OK, applied chỉ local.
  3. Need change after merge → forward migration mới.
  4. Disaster recovery: option 1 hoặc 2 với backup.

Pre-merge checklist:

  • Migration tested local: apply, rollback (manual), reapply.
  • Migration tested staging: real data, Hibernate validate ok.
  • PR review by 2 dev: SQL review + impact assessment.
  • Merge → no modification thereafter.
Q6
Repeatable migration `R__update_views.sql` khác versioned migration `V` thế nào? Use case cụ thể?
AspectVersioned (V)Repeatable (R)
NamingV<version>__<desc>.sqlR__<desc>.sql (no version)
Run whenOnce, in version orderWhen checksum changes
OrderStrict (V1, V2, V3, ...)After all V applied, alphabetical
Modify fileForbidden (checksum mismatch)Encouraged — re-run on change
Use caseSchema change (table, column)Views, procedures, functions

Use case repeatable — DB views:

-- R__create_active_projects_view.sql
DROP VIEW IF EXISTS active_projects;

CREATE VIEW active_projects AS
SELECT
  p.id, p.name, p.status, p.created_at,
  COUNT(t.id) AS task_count,
  SUM(CASE WHEN t.status = 'DONE' THEN 1 ELSE 0 END) AS done_count
FROM projects p
LEFT JOIN tasks t ON t.project_id = p.id
WHERE p.status IN ('ACTIVE', 'PLANNING')
GROUP BY p.id, p.name, p.status, p.created_at;

Workflow:

  1. Initial deploy: R script apply, view created.
  2. Need add column to view: edit R file.
  3. Next deploy: Flyway compute checksum changed → DROP + CREATE.
  4. No new V version needed — same file modified.

Use case stored procedure:

-- R__function_update_audit.sql

CREATE OR REPLACE FUNCTION log_project_change()
RETURNS TRIGGER AS $$
BEGIN
  INSERT INTO audit_log (table_name, operation, project_id, changed_at)
  VALUES ('projects', TG_OP, NEW.id, NOW());
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS trg_project_change ON projects;
CREATE TRIGGER trg_project_change
  AFTER INSERT OR UPDATE OR DELETE ON projects
  FOR EACH ROW EXECUTE FUNCTION log_project_change();

Modify function logic → re-deploy → new function definition.

Lợi ích:

  • Source of truth: view/procedure define trong git, applied automatically.
  • No version proliferation: 100 lần modify view = 100 V migration if not repeatable. R = 1 file.
  • Reload automatic: deploy → checksum compare → re-run if changed.

Caveat:

  • R cần idempotent: DROP IF EXISTS + CREATE hoặc CREATE OR REPLACE.
  • R run AFTER all V applied — order: V1, V2, ..., Vn, R1, R2, R3 (alphabetical).
  • R không change schema_history (no version column).

Pattern enterprise:

  • Schema (tables, columns, indexes) → V migration.
  • Computed views, materialized views → R migration.
  • Stored procedure, function, trigger → R migration.
  • Reference data (seed) → V migration (immutable history).
Q7
App existing 5 năm chưa dùng Flyway. Schema hiện tại 100 table. Migration chuyển sang Flyway thế nào?

Workflow baseline existing DB:

Step 1 — Snapshot current schema:

# Export schema (no data) từ production
pg_dump --schema-only prod > current-schema.sql

Output 100 table DDL, ~5000 dòng.

Step 2 — Tạo V1 baseline migration:

# src/main/resources/db/migration/V1__baseline.sql
# Copy nội dung current-schema.sql

-- This file is informational only
-- Existing schema as of 2026-04-15
-- Flyway baseline mode: DO NOT actually run this on existing DB

CREATE TABLE projects (...);
CREATE TABLE tasks (...);
-- ... 98 more tables

File này document schema cho new env (test, new dev local) — apply từ đầu. Production existing skip qua baseline mechanism.

Step 3 — Configure baseline mode:

spring:
flyway:
  enabled: true
  baseline-on-migrate: true
  baseline-version: 1
  baseline-description: "Existing schema as of 2026-04-15"
  locations: classpath:db/migration

Step 4 — First deploy production:

  1. App start.
  2. Flyway connect DB.
  3. Detect flyway_schema_history không có.
  4. baseline-on-migrate: true → tạo table + insert baseline row:
    INSERT INTO flyway_schema_history (version, description, type, ...)
    VALUES (1, 'Existing schema as of 2026-04-15', 'BASELINE', ...);
  5. Flyway compare: V1 trong files vs V1 trong history (baseline) → V1 marked applied.
  6. No pending migration. App start ok.

Production existing DB chuyển sang Flyway managed — không apply V1 (vì đã exist), nhưng track từ V2+.

Step 5 — Future migration:

# Add new feature: column priority
db/migration/V2__add_project_priority.sql
ALTER TABLE projects ADD COLUMN priority VARCHAR(20) NOT NULL DEFAULT 'MEDIUM';

# Deploy
# Flyway: baseline V1 đã applied, V2 pending → apply V2 → record history

Step 6 — New env (test, fresh dev):

# Empty DB
# Deploy app
# Flyway: no history → no baseline detect → apply V1 (full schema) + V2 + ...

New env apply cả V1 (init schema) lẫn V2+. Eventual consistency.

Caveat — mixed env:

  • Production: V1 baselined, V2+ applied.
  • New dev DB: V1 applied (full DDL), V2+ applied.
  • End state same — schema match.

Pitfall avoid:

  1. V1 không idempotent: mọi CREATE TABLE phải IF NOT EXISTS. Nếu app accidentally run V1 trên existing DB → fail. Defensive.
  2. Schema diff giữa env: production và new env có thể khác micro detail (datetime precision, collation). Snapshot từ production canonical.
  3. Index naming: existing indexes có name auto-generated by ORM. Snapshot include actual names → V1 reproduce.

Verify:

# Test: spin up fresh Postgres, apply all migration
docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=test postgres:16
mvn flyway:migrate -Dflyway.url=jdbc:postgresql://localhost/test

# Compare schema with production
pg_dump --schema-only test > test-schema.sql
diff test-schema.sql current-schema.sql      # nen empty

Identical → baseline + V1 work. Different → fix V1 to match production exactly.

Pattern enterprise migrate to Flyway:

  • Phase 1: V1 baseline + Flyway enabled.
  • Phase 2 (next sprint): start adding V2+ for new changes.
  • Phase 3 (after team comfort): retrofit existing change history if compliance needs.

Bài tiếp theo: Pagination & sorting — Pageable, Sort, projection DTO, cursor scroll

Bài này có giúp bạn hiểu bản chất không?

Bình luận (0)

Đang tải...