
JPA & Hibernate: ORM principles and Spring Data abstraction layers

JPA is a specification, Hibernate is an implementation of it, and Spring Data JPA is an abstraction on top of both. This lesson peels apart the three layers and covers the ORM impedance mismatch, the EntityManager lifecycle, the persistence context, JPQL vs SQL, and why Spring Data can generate an implementation from an interface.

Module 03 built the TaskFlow REST API on an in-memory ConcurrentHashMap. That pattern is fine for a demo but not for production: data is lost on restart, there are no complex queries, and there are no transactions. Module 04 replaces the storage layer with PostgreSQL + JPA + Hibernate + Spring Data JPA.

This first lesson answers the basic questions: what is JPA? What is Hibernate? What is Spring Data JPA? Why three layers? Once that is clear, every @Entity, @OneToMany, and JpaRepository in the following lessons follows logically instead of looking like magic.

1. The problem ORM solves

Plain JDBC code that connects to Postgres and stores a Project:

public Long save(Project project) {
    String sql = "INSERT INTO projects (name, description, status, created_at) VALUES (?, ?, ?, ?) RETURNING id";
    try (Connection conn = dataSource.getConnection();
         PreparedStatement ps = conn.prepareStatement(sql)) {
        ps.setString(1, project.name());
        ps.setString(2, project.description());
        ps.setString(3, project.status().name());
        ps.setTimestamp(4, java.sql.Timestamp.from(project.createdAt()));  // Instant is not a standard JDBC setObject type

        try (ResultSet rs = ps.executeQuery()) {
            if (rs.next()) return rs.getLong(1);
            throw new RuntimeException("INSERT failed");
        }
    } catch (SQLException e) {
        throw new RuntimeException(e);
    }
}

public Optional<Project> findById(Long id) {
    String sql = "SELECT id, name, description, status, created_at FROM projects WHERE id = ?";
    try (Connection conn = dataSource.getConnection();
         PreparedStatement ps = conn.prepareStatement(sql)) {
        ps.setLong(1, id);
        try (ResultSet rs = ps.executeQuery()) {
            if (rs.next()) {
                return Optional.of(new Project(
                    rs.getLong("id"),
                    rs.getString("name"),
                    rs.getString("description"),
                    ProjectStatus.valueOf(rs.getString("status")),
                    rs.getTimestamp("created_at").toInstant()
                ));
            }
            return Optional.empty();
        }
    } catch (SQLException e) {
        throw new RuntimeException(e);
    }
}

Over 40 lines for two methods. An app with 20 entities ends up with 800+ lines of boilerplate. The pain points are clear:

| Pain | Consequence |
|---|---|
| Boilerplate | About 80% of the code sets parameters and maps the ResultSet; none of it is business logic |
| Weak type safety | rs.getString("status"): a typo in the column name becomes a runtime error |
| Connection management | Resources must be closed by hand; forgetting to do so leaks connections |
| N+1 queries | Loading a Project and its tasks in a loop runs 1 Project query + N Task queries |
| Mapping ResultSet → object | The same pattern is repeated for every entity |
| Transaction management | Manual setAutoCommit(false), commit(), rollback(): error-prone |
| Caching | None: every call hits the database again |

ORM (Object-Relational Mapping) is the pattern that addresses this: it maps between the Java object world and the SQL relational world automatically. The developer writes POJOs plus annotations, and the framework generates the SQL.

2. Three layers of abstraction

Persistence in a Spring app is built from three ORM layers stacked on top of each other (with JDBC underneath):

flowchart TB
    App["Application code"]
    SDJ["Spring Data JPA<br/>(Repository abstraction)"]
    JPA["JPA Specification<br/>(jakarta.persistence)"]
    HB["Hibernate<br/>(implementation)"]
    JDBC["JDBC Driver"]
    DB[("PostgreSQL")]

    App -->|"interface OrderRepository extends JpaRepository"| SDJ
    SDJ -->|"EntityManager API"| JPA
    JPA -->|"Hibernate impl"| HB
    HB -->|"PreparedStatement, ResultSet"| JDBC
    JDBC --> DB

    style SDJ fill:#fef3c7
    style JPA fill:#d1fae5
    style HB fill:#fed7aa

| Layer | Kind | Role |
|---|---|---|
| JDBC | Standard Java API | Connection, PreparedStatement, ResultSet: low level |
| JPA | Specification (interfaces) | EntityManager, @Entity, JPQL: defines the ORM contract |
| Hibernate | Implementation | Implements the JPA spec with SQL generation, caching, lazy loading |
| Spring Data JPA | Abstraction | Repository interfaces, derived queries, paging: generates the implementation from method names |

Each layer adds abstraction so that developers write less code.

2.1 JPA Specification

JPA = Jakarta Persistence API (formerly Java Persistence API). The spec was first released in 2006 and defines a standard ORM for Java. It is not an implementation, only a contract.

JPA defines:

  • Annotations: @Entity, @Id, @Column, @OneToMany, @ManyToOne, @JoinColumn, ...
  • API: EntityManager, EntityManagerFactory, Query, TypedQuery.
  • JPQL: a query language that looks like SQL but is object-oriented (SELECT p FROM Project p WHERE p.status = 'ACTIVE').
  • Lifecycle operations: persist, merge, remove, refresh.
  • Persistence context: cache + change tracking.

The JPA spec cannot run on its own because it is only a set of interfaces. It needs an implementation.

2.2 Hibernate: the implementation

Hibernate is by far the most widely used JPA implementation. It executes every JPA annotation and API by:

  • Generating SQL from entity annotations (DDL) and from JPQL queries.
  • Mapping ResultSet rows to entities via reflection.
  • Keeping a persistence context that caches entities and performs dirty checking.
  • Lazy loading through proxies (generated with ByteBuddy in current versions; older versions used Javassist or CGLIB).
  • Caching at L1 (per session) and L2 (across sessions, optional).

Hibernate is an independent project that has existed since 2001, before JPA. JPA 1.0 (2006) standardized ideas from Hibernate, and Hibernate is now a JPA-compliant implementation.

Alternatives: EclipseLink (Eclipse Foundation, derived from Oracle TopLink) and OpenJPA (Apache). Both are rare in practice; the vast majority of Spring apps use Hibernate.

2.3 Spring Data JPA — repository abstraction

Hibernate + JPA still require boilerplate:

// Pure JPA + Hibernate
@PersistenceContext
private EntityManager em;

public List<Project> findActiveByCustomer(String customer) {
    return em.createQuery(
        "SELECT p FROM Project p WHERE p.status = 'ACTIVE' AND p.customer = :customer",
        Project.class
    ).setParameter("customer", customer).getResultList();
}

Spring Data JPA eliminates that as well:

// Spring Data JPA — interface only
public interface ProjectRepository extends JpaRepository<Project, Long> {
    List<Project> findByStatusAndCustomer(ProjectStatus status, String customer);
}

No implementation is written: Spring Data creates a proxy at runtime. The method name findByStatusAndCustomer is parsed into a query equivalent to the JPQL SELECT p FROM Project p WHERE p.status = ?1 AND p.customer = ?2.

Spring Data adds:

  • Repository abstraction: interfaces, no classes.
  • Derived queries: the method name is parsed into a query.
  • @Query annotation: custom JPQL/SQL when a derived query is not enough.
  • Pagination: standard Pageable + Sort.
  • Auditing: @CreatedDate, @LastModifiedDate, @CreatedBy.
  • Specifications: dynamic queries built on the Criteria API.

The three layers stack up: Spring Data JPA uses the JPA API, JPA delegates to Hibernate, and Hibernate generates SQL → JDBC → DB.

3. ORM impedance mismatch

ORM is not perfect. The Java object model and the SQL relational model have an impedance mismatch, with five core differences:

| Java | SQL | How ORM bridges it |
|---|---|---|
| Inheritance (extends) | No equivalent | @Inheritance strategies (SINGLE_TABLE, JOINED, TABLE_PER_CLASS) |
| Identity (==, equals) | Primary key | @Id annotation, equals/hashCode based on the ID |
| Association (object reference) | Foreign key + JOIN | @ManyToOne, @OneToMany, lazy-loading proxies |
| Navigation (order.customer.name) | Multi-step JOIN | Path expressions in JPQL, fetch joins |
| Granularity (objects with nested objects) | Flat columns | @Embedded, @ElementCollection |

Five mismatches mean five sources of subtle bugs. For example:

@Entity
public class Order {
    @Id Long id;
    @ManyToOne Customer customer;
}

// Code: navigation 2 level
String name = order.customer.name();

The ORM must do one of two things:

  1. Generate a SQL JOIN: SELECT o.*, c.* FROM orders o JOIN customers c ON o.customer_id = c.id WHERE o.id = ?.
  2. Or lazy load: load the Order first, then run a second query SELECT * FROM customers WHERE id = ? when order.customer is accessed.

Lazy loading is convenient but causes N+1: loading 100 orders and then accessing each customer runs 1 + 100 queries (Module 04, lesson 04 digs deeper).
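The 1 + N arithmetic can be made concrete with a small sketch that needs no database. CountingDb below is a hypothetical in-memory stand-in that only counts how many "queries" it receives; it is not Hibernate code, and the method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a database that counts executed queries. */
class CountingDb {
    int queryCount = 0;
    private final Map<Long, String> customers = new HashMap<>();   // customerId -> name
    private final Map<Long, Long> orders = new HashMap<>();        // orderId -> customerId

    CountingDb(int orderCount) {
        for (long i = 1; i <= orderCount; i++) {
            customers.put(i, "customer-" + i);
            orders.put(i, i);
        }
    }

    /** Plays the role of SELECT * FROM orders: one query, returns the foreign keys. */
    List<Long> selectOrderCustomerIds() {
        queryCount++;
        return new ArrayList<>(orders.values());
    }

    /** Plays the role of SELECT * FROM customers WHERE id = ?: one query per call. */
    String selectCustomerName(long customerId) {
        queryCount++;
        return customers.get(customerId);
    }

    /** Plays the role of a JOIN between orders and customers: one query in total. */
    List<String> selectOrdersJoinCustomers() {
        queryCount++;
        List<String> names = new ArrayList<>();
        for (Long customerId : orders.values()) {
            names.add(customers.get(customerId));
        }
        return names;
    }
}

class NPlusOneSketch {
    /** Lazy style: one query for the orders, then one more per order for its customer. */
    static int lazyQueryCount(int orderCount) {
        CountingDb db = new CountingDb(orderCount);
        for (Long customerId : db.selectOrderCustomerIds()) {
            db.selectCustomerName(customerId);
        }
        return db.queryCount;
    }

    /** Join style: a single query loads the orders together with their customers. */
    static int joinQueryCount(int orderCount) {
        CountingDb db = new CountingDb(orderCount);
        db.selectOrdersJoinCustomers();
        return db.queryCount;
    }

    public static void main(String[] args) {
        System.out.println("lazy: " + lazyQueryCount(100));   // lazy: 101
        System.out.println("join: " + joinQueryCount(100));   // join: 1
    }
}
```

The gap grows linearly with the number of orders, which is why a fetch join or a batch strategy matters once result sets get large.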

4. EntityManager — JPA core API

EntityManager is the central JPA interface: the analogue of JDBC's Connection, but for the object world.

Five main methods:

@PersistenceContext
private EntityManager em;

// 1. persist: INSERT
Project project = new Project("Mobile App", ProjectStatus.PLANNING);
em.persist(project);
// The SQL INSERT runs at flush/commit

// 2. find: SELECT by ID
Project loaded = em.find(Project.class, 42L);
// SQL: SELECT * FROM projects WHERE id = 42

// 3. merge: copy state onto a managed instance (needed for detached entities)
loaded.setStatus(ProjectStatus.ACTIVE);
em.merge(loaded);   // redundant here: loaded is already managed, dirty checking issues the UPDATE
// The SQL UPDATE runs at flush/commit

// 4. remove: DELETE
em.remove(loaded);
// The SQL DELETE runs at flush/commit

// 5. createQuery — JPQL
List<Project> active = em.createQuery(
    "SELECT p FROM Project p WHERE p.status = :status",
    Project.class
).setParameter("status", ProjectStatus.ACTIVE).getResultList();

Spring Data JPA wraps EntityManager behind JpaRepository. You rarely call em directly; the repository covers about 95% of cases.

4.1 Persistence Context — first-level cache

EntityManager manages one persistence context: a map from ID to entity instance, per transaction. Features:

  • Identity guarantee: within the same tx, em.find(Project.class, 42L) == em.find(Project.class, 42L) (same instance).
  • Dirty checking: changing a field of a managed entity makes the SQL UPDATE run automatically at commit.
  • Cascade: operations on a parent (persist, merge, remove) propagate to associated entities when configured, e.g. CascadeType.ALL.
  • Lazy loading: a proxy loads the related entity on first access.
@Transactional
public void example() {
    Project p1 = em.find(Project.class, 42L);     // SELECT
    p1.setName("New name");                        // mark dirty
    Project p2 = em.find(Project.class, 42L);     // no SELECT: returns p1 from the cache
    System.out.println(p1 == p2);                  // true
    // At tx commit: UPDATE projects SET name = 'New name' WHERE id = 42 (automatic)
}

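By default the persistence context is scoped to the transaction. On commit, the context first flushes all pending changes to the database; on rollback, they are discarded. Either way the context is closed when the tx ends.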
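To see why the second find() issues no SQL and why no explicit save is needed, here is a toy persistence context in plain Java. It is only a sketch of the idea (identity map plus snapshot comparison), not how Hibernate is implemented; ProjectRow and ToyPersistenceContext are hypothetical names, and the "table" is just a map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical mutable entity used only for this sketch. */
class ProjectRow {
    final long id;
    String name;
    ProjectRow(long id, String name) { this.id = id; this.name = name; }
}

/** Toy persistence context: identity map plus snapshot-based dirty checking. */
class ToyPersistenceContext {
    private final Map<Long, String> table;                          // fake "projects" table: id -> name
    private final Map<Long, ProjectRow> managed = new HashMap<>();  // identity map
    private final Map<Long, String> snapshots = new HashMap<>();    // state as loaded
    final List<String> sqlLog = new ArrayList<>();

    ToyPersistenceContext(Map<Long, String> table) { this.table = table; }

    ProjectRow find(long id) {
        ProjectRow cached = managed.get(id);
        if (cached != null) return cached;             // cache hit: no SQL
        sqlLog.add("SELECT id, name FROM projects WHERE id = " + id);
        String name = table.get(id);
        if (name == null) return null;
        ProjectRow loaded = new ProjectRow(id, name);
        managed.put(id, loaded);
        snapshots.put(id, name);
        return loaded;
    }

    /** Compares each managed entity with its snapshot and writes back only what changed. */
    void flush() {
        for (ProjectRow p : managed.values()) {
            if (!p.name.equals(snapshots.get(p.id))) {
                sqlLog.add("UPDATE projects SET name = '" + p.name + "' WHERE id = " + p.id);
                table.put(p.id, p.name);
                snapshots.put(p.id, p.name);
            }
        }
    }
}

class PersistenceContextSketch {
    public static void main(String[] args) {
        Map<Long, String> table = new HashMap<>();
        table.put(42L, "Old name");
        ToyPersistenceContext ctx = new ToyPersistenceContext(table);

        ProjectRow p1 = ctx.find(42L);    // logs a SELECT
        p1.name = "New name";             // no SQL yet
        ProjectRow p2 = ctx.find(42L);    // served from the identity map
        ctx.flush();                      // logs the UPDATE

        System.out.println(p1 == p2);     // true
        System.out.println(ctx.sqlLog);   // one SELECT, then one UPDATE
    }
}
```

Calling flush() a second time logs nothing, because the snapshot was refreshed when the UPDATE was written.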

4.2 Entity lifecycle — 4 state

stateDiagram-v2
    [*] --> Transient
    Transient --> Managed: persist()
    Managed --> Detached: detach() / em close
    Detached --> Managed: merge()
    Managed --> Removed: remove()
    Removed --> [*]: commit

| State | Description |
|---|---|
| Transient | New object, not managed by the EM. new Project(). |
| Managed | Tracked by the EM; changes sync to the DB automatically. After persist()/find()/merge(). |
| Detached | The EM was closed or detach() was called. Changes no longer sync. |
| Removed | em.remove() was called. The DELETE runs at commit. |

State transitions are critical when debugging "why was my change not saved". Most of the time the object is detached (outside a tx), so modifications are not synced.
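The four states and their transitions can be written down as a tiny state machine. This is a simplified sketch that models only the arrows in the diagram; the enum and class names are hypothetical, and real JPA has extra rules (for example, merge() returns a managed copy rather than re-attaching the same instance).

```java
/** The four entity states from the lifecycle diagram. */
enum EntityState { TRANSIENT, MANAGED, DETACHED, REMOVED }

/** Operations named after the EntityManager methods that trigger a transition. */
enum EntityOp { PERSIST, DETACH, MERGE, REMOVE }

class EntityLifecycleSketch {
    /** Returns the next state, or throws if the diagram has no such arrow. */
    static EntityState apply(EntityState from, EntityOp op) {
        switch (from) {
            case TRANSIENT:
                if (op == EntityOp.PERSIST) return EntityState.MANAGED;
                break;
            case MANAGED:
                if (op == EntityOp.DETACH) return EntityState.DETACHED;
                if (op == EntityOp.REMOVE) return EntityState.REMOVED;
                break;
            case DETACHED:
                if (op == EntityOp.MERGE) return EntityState.MANAGED;
                break;
            default:
                break;
        }
        throw new IllegalStateException(op + " is not modelled for state " + from);
    }

    /** Only managed entities are dirty-checked, so only their changes reach the DB. */
    static boolean changesAreSynced(EntityState state) {
        return state == EntityState.MANAGED;
    }

    public static void main(String[] args) {
        EntityState s = EntityState.TRANSIENT;
        s = apply(s, EntityOp.PERSIST);
        System.out.println(s + " synced=" + changesAreSynced(s));   // MANAGED synced=true
        s = apply(s, EntityOp.DETACH);
        System.out.println(s + " synced=" + changesAreSynced(s));   // DETACHED synced=false
        s = apply(s, EntityOp.MERGE);
        System.out.println(s + " synced=" + changesAreSynced(s));   // MANAGED synced=true
    }
}
```

When a change silently fails to persist, asking "which state is this object in right now?" is usually the fastest route to the cause.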

5. JPQL — query language

JPQL = Jakarta Persistence Query Language (originally Java Persistence Query Language). It looks like SQL but queries entities instead of tables:

SELECT p FROM Project p
WHERE p.status = 'ACTIVE' AND p.createdAt > :since
ORDER BY p.name ASC

Notes:

  • Project is the entity class, not the table name.
  • p.status is a field, not a column name.
  • :since is a named parameter.

Hibernate compiles JPQL to SQL, roughly:

SELECT p.id, p.name, p.description, p.status, p.created_at
FROM projects p
WHERE p.status = 'ACTIVE' AND p.created_at > ?
ORDER BY p.name ASC

Fields map to columns through @Column annotations or the default naming strategy (Spring Boot's default converts camelCase to snake_case).

5.1 JPQL vs SQL

JPQL does not replace SQL; it wraps SQL with object semantics. When native SQL is needed:

@Query(value = "SELECT * FROM projects WHERE created_at > ?1", nativeQuery = true)
List<Project> findRecentNative(Instant since);

Use cases for native SQL:

  • DB-specific features (Postgres arrays, JSON columns, full-text search).
  • Performance-critical queries (window functions, CTEs).
  • Complex JOINs that cannot be expressed in JPQL.

JPQL is enough for about 90% of cases.

6. Spring Data Repository: generated implementation

The main magic of Spring Data: a repository interface gets its implementation generated at runtime.

public interface ProjectRepository extends JpaRepository<Project, Long> {
    List<Project> findByStatus(ProjectStatus status);
    Optional<Project> findByName(String name);
    long countByStatus(ProjectStatus status);
    boolean existsByName(String name);
    void deleteByStatus(ProjectStatus status);
}

What Spring does at startup:

sequenceDiagram
    participant Boot as Spring Boot startup
    participant SDR as Spring Data Repository scanner
    participant Proxy as ProjectRepository proxy (JDK dynamic proxy)
    participant Parser as Method name parser
    participant EM as EntityManager
    participant DB as PostgreSQL

    Boot->>SDR: scan @EnableJpaRepositories packages
    SDR->>SDR: find ProjectRepository extends JpaRepository
    SDR->>Proxy: create proxy implementing interface
    Note over Proxy: For each method, register query

    Proxy->>Parser: parse "findByStatus"
    Parser->>Parser: detect property "status"
    Parser->>Parser: build JPQL "SELECT p FROM Project p WHERE p.status = ?1"

    Proxy->>Parser: parse "findByName"
    Parser->>Parser: detect property "name"
    Parser->>Parser: build JPQL ...

    Boot->>Proxy: register as bean

    Note over Boot: At runtime, a service calls the repository
    Boot->>Proxy: findByStatus(ACTIVE)
    Proxy->>EM: createQuery JPQL setParameter
    EM->>DB: SQL query
    DB-->>EM: ResultSet
    EM-->>Proxy: List<Project>
    Proxy-->>Boot: return

Spring Data parses method names according to this grammar:

[query verb] [subject] By [property] [keyword] [property] ...
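The grammar can be exercised with a deliberately small parser. The sketch below handles only find...By methods with equality criteria joined by And/Or; it is an illustration of the idea, not Spring Data's parser, and DerivedQuerySketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class DerivedQuerySketch {
    /** Turns e.g. findByStatusAndCustomer into a JPQL string for the given entity name. */
    static String toJpql(String methodName, String entityName) {
        String prefix = "findBy";
        if (!methodName.startsWith(prefix) || methodName.length() == prefix.length()) {
            throw new IllegalArgumentException("Only findBy<Property>... is handled here: " + methodName);
        }
        String criteria = methodName.substring(prefix.length());

        // Split into alternating tokens: property, connector, property, ...
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < criteria.length(); i++) {
            String connector = connectorAt(criteria, i);
            if (connector != null) {
                parts.add(criteria.substring(start, i));
                parts.add(connector);
                start = i + connector.length();
                i = start;   // the next property needs at least one character
            }
        }
        parts.add(criteria.substring(start));

        StringBuilder jpql = new StringBuilder("SELECT p FROM " + entityName + " p WHERE ");
        int param = 1;
        for (int k = 0; k < parts.size(); k++) {
            String part = parts.get(k);
            if (k % 2 == 0) {
                // Property token: StatusX -> p.statusX = ?n
                jpql.append("p.").append(Character.toLowerCase(part.charAt(0)))
                    .append(part.substring(1)).append(" = ?").append(param++);
            } else {
                jpql.append(' ').append(part.toUpperCase()).append(' ');
            }
        }
        return jpql.toString();
    }

    /** Returns "And" or "Or" if one starts at index i and is followed by an upper-case letter. */
    private static String connectorAt(String s, int i) {
        for (String c : new String[] {"And", "Or"}) {
            int end = i + c.length();
            if (s.startsWith(c, i) && end < s.length() && Character.isUpperCase(s.charAt(end))) {
                return c;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(toJpql("findByStatusAndCustomer", "Project"));
        // SELECT p FROM Project p WHERE p.status = ?1 AND p.customer = ?2
    }
}
```

A property such as OrderStatus is not split, because a connector is only recognized after at least one property character and when followed by an upper-case letter. The real parser additionally validates each property against the entity's attributes, which this sketch does not attempt.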

Lesson 03 digs deeper into derived queries.

7. Spring Data JPA — autoconfiguration

Boot auto-configuration sets everything up:

flowchart TB
    Pull["Pull starter"]
    SBSDJPA["spring-boot-starter-data-jpa"]
    SDJ["spring-data-jpa"]
    HB["hibernate-core"]
    JDBC["spring-jdbc"]
    Tx["spring-tx"]
    SQL["jakarta.persistence-api"]

    Pull --> SBSDJPA
    SBSDJPA --> SDJ
    SBSDJPA --> HB
    SBSDJPA --> JDBC
    SBSDJPA --> Tx
    SBSDJPA --> SQL

spring-boot-starter-data-jpa pulls in (directly or transitively):

  • spring-data-jpa: repository abstraction.
  • hibernate-core: JPA implementation.
  • spring-orm: Spring's ORM integration (transactions, exception translation).
  • spring-jdbc: DataSource support, JDBC transactions.
  • jakarta.persistence-api: JPA annotations.
  • jakarta.transaction-api: JTA API (jakarta.transaction.Transactional). Spring's own @Transactional comes from spring-tx.

Boot auto-configuration activates:

  • DataSourceAutoConfiguration: a HikariDataSource built from spring.datasource.*.
  • HibernateJpaAutoConfiguration: EntityManagerFactory + JpaTransactionManager.
  • JpaRepositoriesAutoConfiguration: scans for interfaces that extend Repository and creates proxies.

Minimal setup:

spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/taskflow
    username: ${DB_USER}
    password: ${DB_PASS}
  jpa:
    hibernate:
      ddl-auto: validate
    properties:
      hibernate:
        format_sql: true
@SpringBootApplication
public class App {
    public static void main(String[] args) {
        SpringApplication.run(App.class, args);
    }
}

@Repository
public interface ProjectRepository extends JpaRepository<Project, Long> { }

That is everything: Boot sets up the connection pool, transactions, the EntityManager, and the repository proxies. Compare that with the roughly 50 lines of XML configuration in the Spring 4 era.

8. ddl-auto — DDL strategy

spring.jpa.hibernate.ddl-auto controls schema generation:

| Value | Behavior |
|---|---|
| none | No action: production safe |
| validate | Verifies that the schema matches the entities and fails on mismatch: recommended for production |
| update | Automatically adds missing columns/tables: dev only |
| create | Drops tables and recreates them: test only |
| create-drop | Same as create, plus drop on shutdown: test only |

Warning: never use create or update in production. Schema changes must go through a migration tool (Flyway/Liquibase, lesson 06).

Pattern in practice:

# application.yml
spring:
  jpa:
    hibernate:
      ddl-auto: validate

# application-dev.yml
spring:
  jpa:
    hibernate:
      ddl-auto: update           # dev convenience

# application-test.yml
spring:
  jpa:
    hibernate:
      ddl-auto: create-drop      # ephemeral schema

9. Running in production: persistence context, ddl-auto, monitoring

The Hibernate persistence context (first-level cache) is the source of truth for dirty checking. Tuning it badly leads to memory bloat or slow flushes. This section covers production tuning.

9.1 ddl-auto strategy in production

Production rule: ddl-auto: validate + Flyway migrations.

| Mode | Use case | Production? |
|---|---|---|
| none | Manual schema control | OK |
| validate | Verify the schema matches the entities | Recommended |
| update | Hibernate modifies the schema automatically | Never: race conditions across pods |
| create | Drop + create | Test only |
| create-drop | Drop on shutdown | Test only |

With validate, startup fails if the entities do not match the DB schema, which catches a missing Flyway migration.

9.2 Hibernate Statistics — diagnostic

spring.jpa.properties.hibernate.generate_statistics: true
logging.level.org.hibernate.stat: INFO

Output at the end of the tx:

Statistics:
  3000000 nanoseconds spent acquiring 1 JDBC connection
  120 nanoseconds spent executing 12 JDBC statements
  4 entities loaded
  0 collections fetched

In production: enable it while diagnosing and turn it off afterwards (overhead is around 5-10%). Alternatively, enable it on a single instance running in debug mode.

Export to Micrometer for dashboards:

management.metrics.enable.hibernate: true

9.3 Persistence context bloat — heap pressure

By default Hibernate holds every entity loaded in a tx in the first-level cache. A long tx with many entities means heap pressure and long GC pauses.

Batch pattern (covered in depth in Module 04, lesson 04):

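for (int i = 0; i < records.size(); i++) {
    em.persist(new Project(records.get(i)));
    if ((i + 1) % 50 == 0) {   // after every 50th entity (i % 50 == 0 would also fire at i = 0)
        em.flush();            // push pending INSERTs to the DB
        em.clear();            // detach all → free heap
    }
}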
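The effect of flushing and clearing after every full batch can be checked with a few lines of arithmetic. The sketch below only counts how many entities the context would hold at its peak; BatchFlushSketch and peakManaged are hypothetical names, and the counter stands in for the real persistence context.

```java
class BatchFlushSketch {
    /** Returns the largest number of entities held in the simulated persistence context. */
    static int peakManaged(int totalRecords, int batchSize) {
        int managed = 0;
        int peak = 0;
        for (int i = 0; i < totalRecords; i++) {
            managed++;                           // stands in for em.persist(...)
            peak = Math.max(peak, managed);
            if ((i + 1) % batchSize == 0) {
                managed = 0;                     // stands in for em.flush(); em.clear();
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println(peakManaged(1000, 50));     // 50
        System.out.println(peakManaged(1000, 2000));   // 1000 (batch never completes, nothing is cleared)
    }
}
```

With a batch size of 50 the context never holds more than 50 entities regardless of the total, which is what keeps heap usage flat during large imports.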

9.4 Failure runbook

Mode 1: LazyInitializationException in production:

  • Cause: an entity is serialized outside a tx (OSIV disabled, see Module 04 lesson 04).
  • Remediate: map to a DTO inside the service.

Mode 2: OptimisticLockException:

  • Cause: two users update the same entity and the version field conflicts.
  • Remediate: @Version annotation + retry logic, or tell the user "data changed, reload".

Mode 3: schema validation fails at startup:

  • Symptom: Boot fails to start with "missing column" or "wrong type".
  • Diagnose: an entity field changed but the Flyway migration has not been applied.
  • Remediate: add the matching migration, redeploy.

Mode 4: persistence context bloat:

  • Symptom: heap usage grows steadily, long GC pauses.
  • Diagnose: a heap dump shows an EntityManager instance holding many entities.
  • Remediate: shorter tx, batch processing with em.clear().

10. Common pitfalls

Mistake 1: believing Spring Data JPA "replaces" SQL. ✅ Spring Data JPA wraps JPA, which wraps SQL. Every query ends up as SQL. Knowing SQL makes debugging queries far faster.

Mistake 2: modifying an entity outside a transaction.

Project p = repo.findById(42L).orElseThrow();
// the tx ends here
p.setStatus(ACTIVE);              // NOT synced to the DB: detached

✅ Modify inside a @Transactional method, or call repo.save(p) explicitly.

Mistake 3: forgetting a transaction around reads.

public int countTasks(Long id) {                 // no @Transactional
    Project p = em.find(Project.class, id);      // p is detached once find() returns
    return p.getTasks().size();                  // LazyInitializationException
}

Lazy loading tasks needs an active persistence context. ✅ Use @Transactional (read-only is fine) on methods that read lazy associations.

Mistake 4: using ddl-auto=update in production. ✅ ddl-auto=validate + Flyway migrations. Production never auto-modifies the schema.

Mistake 5: thinking "JPA is slow". ✅ JPA makes it easy to write slow code (N+1, lazy loading in a loop). Used properly (fetch joins, projections, batching), performance is comparable to JDBC.

Mistake 6: mixing direct EntityManager use with repositories.

@Service
public class S {
    @PersistenceContext EntityManager em;
    @Autowired ProjectRepository repo;
    // two ways to manage entities, easy to confuse
}

✅ Default to repositories for about 95% of cases. Use EntityManager only for complex native queries or bulk operations.

11. 📚 Deep Dive Spring Reference

📚 Primary sources

Books:

  • "Java Persistence with Spring Data and Hibernate", Catalin Tudose, 2023.
  • "High-Performance Java Persistence", Vlad Mihalcea: the reference for JPA/Hibernate performance.
  • "Pro JPA 2 in Java EE 8", Mike Keith.

Tools:

  • IntelliJ "Persistence" tool window — visualize entity diagram.
  • Hibernate Statistics — log SQL count + time per query.
  • Datasource Proxy — log SQL + parameter.

12. Summary

  • Three layers of abstraction: Spring Data JPA → JPA spec → Hibernate impl → JDBC → DB.
  • JPA = Jakarta Persistence API, the spec that defines @Entity, EntityManager, JPQL.
  • Hibernate = the dominant JPA implementation. Generates SQL, lazy-loads through proxies, manages the persistence context.
  • Spring Data JPA = repository abstraction. Generates the implementation at runtime from interface method names.
  • ORM impedance mismatch, five issues: inheritance, identity, association, navigation, granularity.
  • EntityManager core API: persist/find/merge/remove/createQuery. Spring Data wraps it behind JpaRepository.
  • Persistence context = first-level cache + dirty tracking. Scope = transaction.
  • Entity lifecycle, four states: Transient, Managed, Detached, Removed.
  • JPQL queries entities, not tables. Hibernate compiles JPQL to SQL for the configured dialect.
  • Spring Data generates the implementation from the interface by parsing method names: findByStatusAndCustomer → JPQL.
  • Boot auto-configuration: DataSourceAutoConfiguration + HibernateJpaAutoConfiguration + JpaRepositoriesAutoConfiguration; the minimum you configure is spring.datasource.url.
  • ddl-auto: validate in production, update in dev, create-drop in test. Production schema changes go through Flyway.

13. Self-check

Q1
Why do the three layers JPA / Hibernate / Spring Data JPA exist? Can any of them be dropped?

Each layer solves a different problem:

  • JPA spec: standardizes ORM for Java and avoids vendor lock-in. Code that uses EntityManager and @Entity works with Hibernate, EclipseLink, and OpenJPA.
  • Hibernate: implements the spec with SQL generation, lazy loading, and caching. The spec has no runnable code, so an implementation is required.
  • Spring Data JPA: an abstraction on top of JPA. Generates repository implementations from interfaces and removes the EntityManager.createQuery boilerplate.

Which layers can be dropped?

  • Drop Spring Data JPA: possible. Use EntityManager directly, but every query must be hand-written, which is verbose and less productive. Use case: frameworks other than Spring.
  • Drop the JPA spec and use the Hibernate native API: possible. SessionFactory and Session replace EntityManagerFactory and EntityManager. This locks you into Hibernate. Rare: there is no compelling reason in 2026.
  • Drop Hibernate and use EclipseLink: possible, since both are spec compatible. Rare: Hibernate dominates the ecosystem.
  • Drop everything and use JDBC: possible. JdbcTemplate or JdbcClient (Boot 3.2+). Suits simple apps, performance-critical code, or DB-specific features.

Trade-offs:

| Stack | LoC ratio | Performance | Flexibility |
|---|---|---|---|
| Raw JDBC | 10x | Best | Manual |
| JdbcClient (Spring 6.1+) | 3x | Excellent | Explicit SQL |
| Hibernate native | 1.5x | Good | Hibernate-specific |
| JPA + Hibernate | 1.2x | Good | Standard |
| Spring Data JPA | 1x (baseline) | Good (needs tuning) | Highest abstraction |

Default in 2026: Spring Data JPA for standard CRUD + business logic. Drop down to EntityManager/JdbcClient when you need to optimize or write complex queries. A hybrid is fine.

Q2
The following code crashes with LazyInitializationException. Why? How do you fix it?
@Service
public class ProjectService {

  private final ProjectRepository repo;

  public ProjectDto getProjectWithTasks(Long id) {
      Project p = repo.findById(id).orElseThrow();
      return new ProjectDto(
          p.getId(),
          p.getName(),
          p.getTasks().size()           // CRASH here
      );
  }
}

Why it crashes:

  1. repo.findById(id) runs in the repository's implicit tx. That tx ends when findById returns.
  2. The returned Project entity is in the detached state (the tx has ended).
  3. p.getTasks() triggers the lazy-loading proxy (@OneToMany defaults to LAZY).
  4. The lazy proxy needs an open Hibernate session to run its query; there is none, so it throws LazyInitializationException.

Three ways to fix it:

Option 1: @Transactional on the service method (recommended):

@Service
public class ProjectService {

  @Transactional(readOnly = true)
  public ProjectDto getProjectWithTasks(Long id) {
      Project p = repo.findById(id).orElseThrow();
      return new ProjectDto(p.getId(), p.getName(), p.getTasks().size());
  }
}

The tx stays active for the whole method, so lazy loading works. readOnly = true is a hint that lets Hibernate skip dirty checking.

Option 2: fetch join:

@Query("SELECT p FROM Project p LEFT JOIN FETCH p.tasks WHERE p.id = :id")
Optional<Project> findByIdWithTasks(@Param("id") Long id);

// Service
public ProjectDto get(Long id) {
  Project p = repo.findByIdWithTasks(id).orElseThrow();
  return new ProjectDto(p.getId(), p.getName(), p.getTasks().size());
  // Tasks are already loaded: no active session needed
}

A single SQL JOIN loads the Project and its Tasks. No N+1, no LazyInitializationException.

Option 3: query a DTO projection directly:

@Query("SELECT new com.olhub.dto.ProjectDto(p.id, p.name, COUNT(t.id)) " +
     "FROM Project p LEFT JOIN p.tasks t WHERE p.id = :id GROUP BY p.id, p.name")
Optional<ProjectDto> findDtoById(@Param("id") Long id);

Rows map straight from the DB to the DTO (COUNT yields a Long, so the DTO constructor must accept one). No entity is managed, so LazyInitializationException cannot occur. Best performance.

Recommendation for 2026:

  • Read-only display: Option 3 (projection).
  • Operating on the entity (modifying): Option 1 (@Transactional).
  • Single entity + 1-2 associations needed immediately: Option 2 (fetch join).
Q3
Spring Data JPA generates an implementation for findByStatusAndCustomer. What exactly is the mechanism? When does method name parsing fail?

How the implementation is generated:

At startup:

  1. JpaRepositoriesAutoConfiguration scans the packages and finds interfaces that extend JpaRepository.
  2. For each interface (e.g. ProjectRepository):
    • Create a RepositoryFactoryBean.
    • The factory creates a JDK dynamic proxy that implements the interface.
    • The proxy delegates to SimpleJpaRepository (the default impl) for built-in methods (save, findById, ...).
    • The proxy delegates to a custom query handler for derived methods (findByStatus, ...).
  3. For each derived method, PartTreeJpaQuery parses the method name at startup and prepares the query.
  4. The proxy is registered as a Spring bean.
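The JDK dynamic proxy step is the part that makes "interface only" possible, and it can be shown with the standard library alone. The sketch below is a hypothetical miniature: ToyProjectRepository has no implementing class, and the handler records the query it would run instead of calling an EntityManager. It is not Spring Data's code.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical repository interface: no implementing class exists anywhere. */
interface ToyProjectRepository {
    List<String> findByStatus(String status);
    long count();
}

class RepositoryProxySketch {
    /** Creates a proxy whose handler appends the query it would execute to the given log. */
    static ToyProjectRepository create(List<String> queryLog) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (name.equals("count")) {
                // Built-in method: a real proxy would delegate to a default implementation
                queryLog.add("SELECT COUNT(p) FROM Project p");
                return 0L;
            }
            if (name.startsWith("findBy") && name.length() > 6) {
                // Derived method: property name comes from the method name
                String property = Character.toLowerCase(name.charAt(6)) + name.substring(7);
                queryLog.add("SELECT p FROM Project p WHERE p." + property + " = ?1 [" + args[0] + "]");
                return new ArrayList<String>();
            }
            throw new UnsupportedOperationException(name);
        };
        return (ToyProjectRepository) Proxy.newProxyInstance(
                ToyProjectRepository.class.getClassLoader(),
                new Class<?>[] {ToyProjectRepository.class},
                handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        ToyProjectRepository repo = create(log);
        repo.findByStatus("ACTIVE");
        repo.count();
        log.forEach(System.out::println);
    }
}
```

Every call on the interface lands in the single handler, which is where the real framework decides between the default implementation and a derived query.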

Method name parsing:

findByStatusAndCustomer
 ↓
[verb=find] [subject=(empty)] By [criteria=StatusAndCustomer]
 ↓
criteria: Status AND Customer
 ↓
JPQL: SELECT p FROM Project p WHERE p.status = ?1 AND p.customer = ?2

Grammar (simplified):

Method = (find | get | read | query | search | stream | count | exists | delete | remove) [Subject] By Criteria [OrderBy ...]
Criteria = Property [Keyword] { (And | Or) Property [Keyword] }
Keyword = Equals | Like | StartsWith | Between | LessThan | GreaterThan | IsNull | NotNull | In | ...

Examples of valid method names:

findByName(String n)                                  WHERE name = ?
findByStatusIn(List<Status> s)                        WHERE status IN ?
findByCreatedAtBetween(Instant s, Instant e)          WHERE created_at BETWEEN ? AND ?
findByNameContainingIgnoreCase(String n)              WHERE LOWER(name) LIKE LOWER(?)
findTop10ByStatusOrderByCreatedAtDesc(Status s)       LIMIT 10
countByStatus(Status s)                               SELECT COUNT(*)
existsByName(String n)                                 SELECT 1
deleteByStatus(Status s)                               DELETE

When it fails:

  1. The property does not exist:
    findByCustmer(String c)        // typo "Custmer"
    // Startup error: No property 'custmer' found for type 'Project'
    Spring fails at startup: fail-fast.
  2. Complex logic:
    findByCreatedAtBetweenAndStatusInOrSomething(...)  // too complex
    The method name runs to 60 characters and is hard to read. Switch to @Query:
    @Query("SELECT p FROM Project p WHERE p.createdAt BETWEEN :start AND :end AND p.status IN :statuses")
    List<Project> findRecent(...);
  3. Multi-entity joins:
    findByTasksTitleContaining(String t)       // navigate Project → Task → title
    This works but generates a SQL JOIN, so verify performance.
  4. Aggregate functions: derived queries do not support SUM or AVG. Use @Query.

Rule of thumb: keep derived queries under 5 parameters. Beyond that, use @Query with JPQL or a Specification.

Q4
Persistence context = first-level cache. Give examples that illustrate the identity guarantee and dirty checking. What happens if you modify an entity outside a tx?

Identity guarantee:

@Transactional
public void identityExample() {
  Project p1 = repo.findById(42L).orElseThrow();    // SELECT
  Project p2 = repo.findById(42L).orElseThrow();    // no SELECT: cache hit
  System.out.println(p1 == p2);                      // true (same instance)
  System.out.println(p1.hashCode() == p2.hashCode());// true
}

Within the same tx, every findById(42L) returns the same instance. Hibernate maintains a map from entity ID to instance in the persistence context. The second lookup hits the cache and issues no SQL.

Dirty checking:

@Transactional
public void dirtyExample(Long id, String newName) {
  Project p = repo.findById(id).orElseThrow();
  p.setName(newName);          // save() is NOT called
  // Method returns, tx commits
  // Hibernate detects that field 'name' changed → automatic SQL UPDATE
}

Hibernate keeps a snapshot of the entity at load time. At flush (before commit) it compares the current state with the snapshot and generates an UPDATE if anything differs.

repo.save(p) is not needed. Internally, Spring Data's save() checks whether the entity is new: a new entity goes to em.persist(), anything else goes to em.merge(), which for an already-managed entity simply returns the same instance.

Modifying outside a tx, the consequences:

public void outsideTxExample(Long id, String newName) {
  // no @Transactional
  Project p = repo.findById(id).orElseThrow();   // the tx of repo.findById ends here
  p.setName(newName);                              // entity is detached, NOT synced to the DB
  // Method returns: no UPDATE
}

Problems:

  • The change is not persisted: the entity is detached, so dirty checking is not active.
  • Lazy loading fails: accessing a lazy association throws LazyInitializationException.
  • Unstable identity: two findById(42L) calls outside a tx can return two different instances (each call gets a new persistence context).

Fix:

  1. Annotate the service method with @Transactional:
    @Transactional
    public void update(Long id, String newName) {
      Project p = repo.findById(id).orElseThrow();
      p.setName(newName);          // dirty check active, auto UPDATE
    }
  2. Or call save() explicitly at the end:
    public void update(Long id, String newName) {
      Project p = repo.findById(id).orElseThrow();
      p.setName(newName);
      repo.save(p);                // explicit: works with a detached entity
    }

Recommendation: always put @Transactional on service write methods. Read-only methods use @Transactional(readOnly = true) so that Hibernate can optimize (skip dirty checking, flush mode MANUAL).

Q5
A production app runs with `ddl-auto=validate`. You add a new field `Project.priority`. Restart fails with "Schema validation: missing column priority". What is the correct deployment process?

Deploying correctly = the schema migration goes before the code change.

Overall process:

  1. Write a migration script (Flyway):
    -- src/main/resources/db/migration/V2__add_project_priority.sql
    ALTER TABLE projects ADD COLUMN priority VARCHAR(20) NOT NULL DEFAULT 'MEDIUM';
  2. Update the entity:
    @Entity
    public class Project {
      @Enumerated(EnumType.STRING)
      @Column(nullable = false)
      private ProjectPriority priority;
    }
  3. Build the app: Flyway applies V2 at startup, then Hibernate validates that the schema matches.
  4. Deploy to production:
    • Rolling update of the app on K8s.
    • Pod 1 starts: Flyway sees version 1 in the DB and applies V2 (ALTER TABLE). Hibernate validation passes. Tomcat starts.
    • Pods 2-N start: Flyway sees that version 2 is already applied (concurrency-safe via a lock) and skips it. Validation passes. Start.

Why ddl-auto=validate + Flyway:

  • Schema source of truth = migration scripts: versioned and reviewed.
  • Production safe: Hibernate never runs ALTER TABLE on its own, which could destroy data.
  • Recoverable: if a deploy fails, ship a follow-up migration that reverts the change (Flyway's undo migrations are a paid-edition feature; repair only fixes the schema history table).
  • Audit: every schema change is in git history.

Anti-pattern:

# OK for dev, NOT for production
spring.jpa.hibernate.ddl-auto=update

Hibernate update mode:

  • Adds missing columns: works for additions.
  • Never drops columns. A renamed field shows up as a new, empty column while the old column and its data stay behind.
  • Does not modify existing columns or constraints.
  • Behavior differs between Hibernate versions, so results can change unpredictably.
  • Causes races when multiple pods start concurrently.

Production rule: ddl-auto=validate + Flyway/Liquibase. Lesson 06 digs into Flyway setup.

Local dev workflow:

  • ddl-auto=update and skip Flyway: convenient.
  • Or ddl-auto=validate + Flyway everywhere: consistent with production.
  • Recommendation: option 2, to get used to the production workflow early.
Q6
The TaskFlow app from Module 03 uses an in-memory ConcurrentHashMap. When migrating to JPA, does the service layer need to change much? Why?

The service layer needs little change, thanks to the Repository abstraction.

Before (Module 03):

// Repository interface
public interface ProjectRepository {
  Project save(Project p);
  Optional<Project> findById(Long id);
  List<Project> findByStatus(ProjectStatus status);
  boolean existsByName(String name);
  void delete(Long id);
}

// Implementation in-memory
@Repository
public class InMemoryProjectRepository implements ProjectRepository {
  private final Map<Long, Project> data = new ConcurrentHashMap<>();
  private final AtomicLong sequence = new AtomicLong();

  public Project save(Project p) {
      if (p.id() == null) {
          Long id = sequence.incrementAndGet();
          p = new Project(id, p.name(), ...);
      }
      data.put(p.id(), p);
      return p;
  }
  // ...
}

// Service does NOT know the implementation
@Service
public class ProjectService {
  private final ProjectRepository repo;     // depends on the interface

  public Project create(...) {
      if (repo.existsByName(name)) throw ...;
      return repo.save(...);
  }
}

After (Module 04):

// Same interface name, now extending JpaRepository (save, findById, deleteById are inherited)
public interface ProjectRepository extends JpaRepository<Project, Long> {
  List<Project> findByStatus(ProjectStatus status);
  boolean existsByName(String name);
}

// InMemoryProjectRepository is removed: Spring Data generates the implementation automatically

// Service is UNCHANGED
@Service
public class ProjectService {
  private final ProjectRepository repo;     // still depends on the interface

  public Project create(...) {
      if (repo.existsByName(name)) throw ...;
      return repo.save(...);
  }
}

Changes required:

  1. Project domain → JPA entity:
    // Before: record (immutable)
    public record Project(Long id, String name, ...) {}
    
    // After: class with JPA annotations
    @Entity
    @Table(name = "projects")
    public class Project {
      @Id @GeneratedValue(strategy = GenerationType.IDENTITY)
      private Long id;
    
      @Column(nullable = false, unique = true, length = 100)
      private String name;
    
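      // getters, setters, no-arg constructor: required by JPA
    }
    JPA requires a mutable, non-final class with a public or protected no-arg constructor, so a record cannot be an entity. Module 04, lesson 02 goes deeper.
  2. Repository extends JpaRepository: drop the in-memory implementation and change the interface signature. Callers of the old delete(Long id) switch to the inherited deleteById(Long id).
  3. Service: add @Transactional: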
    @Service
    @Transactional(readOnly = true)        // class level: read-only by default
    public class ProjectService {
    
      @Transactional                       // override for write methods
      public Project create(...) { ... }
    
      public Project findById(Long id) { ... }    // inherits readOnly = true
    }
  4. Application properties: add spring.datasource.* + spring.jpa.*.
  5. Flyway migration script: V1__init_schema.sql with CREATE TABLE projects.
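
One reason behind step 1: a JPA provider creates entity instances reflectively through the no-arg constructor and then populates the fields, whereas a record only exposes its canonical constructor. The sketch below shows that difference in plain Java with no JPA on the classpath; ProjectRecord, ProjectEntity, and hasNoArgConstructor are hypothetical names used only for illustration.

```java
// Plain-Java illustration; no JPA dependency is needed to run it.
final class NoArgConstructorDemo {

    // Immutable record: only the canonical (Long, String) constructor exists.
    record ProjectRecord(Long id, String name) {}

    // Mutable class shaped the way JPA expects an entity to be.
    static class ProjectEntity {
        private Long id;
        private String name;

        protected ProjectEntity() {}   // no-arg constructor for the provider

        Long getId() { return id; }
        String getName() { return name; }
        void setName(String name) { this.name = name; }
    }

    // Returns true if the type declares a constructor with no parameters.
    static boolean hasNoArgConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```

Calling hasNoArgConstructor on ProjectEntity.class succeeds, while ProjectRecord.class has no such constructor to find.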

What stays the same in the service:

  • Method signatures stay the same.
  • Business logic stays the same.
  • Tests that mock ProjectRepository still work: Mockito mocks the interface.

This is the power of the Repository pattern: the service depends on the interface rather than on an implementation (dependency inversion), and any implementation that honors the contract can be swapped in (Liskov substitution principle). Migrating storage = swapping the implementation, with no change to business logic.
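
A minimal, Spring-free sketch of that swap follows. The names mirror the lesson, but the code is illustrative only: the Project record is cut down to two fields, and ListBackedProjectRepository is a hypothetical stand-in for the implementation Spring Data would generate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class RepositorySwapDemo {

    record Project(Long id, String name) {}

    // The only thing the service knows about storage.
    interface ProjectRepository {
        Project save(Project p);
        boolean existsByName(String name);
    }

    // Business logic: identical no matter which implementation is injected.
    static final class ProjectService {
        private final ProjectRepository repo;

        ProjectService(ProjectRepository repo) { this.repo = repo; }

        Project create(String name) {
            if (repo.existsByName(name)) {
                throw new IllegalArgumentException("Duplicate project name: " + name);
            }
            return repo.save(new Project(null, name));
        }
    }

    // Module 03 style storage: map plus id sequence.
    static final class InMemoryProjectRepository implements ProjectRepository {
        private final Map<Long, Project> data = new ConcurrentHashMap<>();
        private final AtomicLong sequence = new AtomicLong();

        @Override public Project save(Project p) {
            Project stored = (p.id() == null) ? new Project(sequence.incrementAndGet(), p.name()) : p;
            data.put(stored.id(), stored);
            return stored;
        }
        @Override public boolean existsByName(String name) {
            return data.values().stream().anyMatch(p -> p.name().equals(name));
        }
    }

    // Hypothetical second implementation; it always inserts a new row.
    static final class ListBackedProjectRepository implements ProjectRepository {
        private final List<Project> rows = new ArrayList<>();

        @Override public Project save(Project p) {
            Project stored = new Project((long) rows.size() + 1, p.name());
            rows.add(stored);
            return stored;
        }
        @Override public boolean existsByName(String name) {
            return rows.stream().anyMatch(p -> p.name().equals(name));
        }
    }
}
```

ProjectService compiles once and behaves the same with either repository passed to its constructor, which is exactly what the Module 03 to Module 04 migration relies on.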

Bonus — test infrastructure:

  • Module 03: unit tests run directly against the in-memory repo: fast, no Spring.
  • Module 04: integration tests with Testcontainers + real Postgres. Service tests with a mocked repo still work.
  • 2-tier testing: fast unit tests (mock), slower but realistic integration tests (Testcontainers).
Q7
There are 5 ways to query a DB in the Spring stack: raw JDBC, JdbcTemplate, JdbcClient (Spring 6.1+), JPA EntityManager, Spring Data Repository. When should you use which?
Tool | Abstraction | SQL control | Boilerplate | Use case
Raw JDBC | Lowest | 100% control | 10x | Libraries/frameworks. Rarely in production apps.
JdbcTemplate | Low | Explicit SQL | 3x | Legacy code. SQL-heavy apps before Spring 6.1.
JdbcClient (Spring 6.1+) | Low | Explicit SQL | 2x | Modern alternative to JdbcTemplate. Fluent API.
JPA EntityManager | Mid | JPQL / native SQL | 1.5x | Complex queries, batch updates, cases where a Repository is not enough.
Spring Data Repository | Highest | Derived query / JPQL / SQL | 1x (baseline) | Standard CRUD + business queries. The default in 2026.

Decision tree:

Standard CRUD?
Yes → Spring Data Repository (default)
No → Continue

Complex JPQL/JOIN/aggregate?
Yes → @Query annotation in the Repository, or EntityManager
No → Continue

DB-specific feature (Postgres array, JSON, full-text)?
Yes → @Query nativeQuery=true, or JdbcClient
No → Continue

Bulk update/delete (millions of rows)?
Yes → JdbcClient (raw SQL, no entity overhead)
No → Continue

Performance ultra-critical (microbenchmark, hot path)?
Yes → JdbcClient or raw JDBC
No → Continue

Default → Spring Data Repository
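
The same tree can be written as a single function, which makes the order of the checks explicit. This is only a sketch: QueryToolChooser, the Need record, and the Tool constants are hypothetical names that encode the tree above, not part of any Spring API.

```java
final class QueryToolChooser {

    enum Tool {
        SPRING_DATA_REPOSITORY,
        QUERY_ANNOTATION_OR_ENTITY_MANAGER,
        NATIVE_QUERY_OR_JDBC_CLIENT,
        JDBC_CLIENT,
        JDBC_CLIENT_OR_RAW_JDBC
    }

    // One flag per question in the decision tree.
    record Need(boolean standardCrud, boolean complexJpql, boolean dbSpecificFeature,
                boolean bulkWrite, boolean hotPath) {}

    // Checks run top to bottom; the first "yes" wins, as in the tree.
    static Tool choose(Need need) {
        if (need.standardCrud())      return Tool.SPRING_DATA_REPOSITORY;
        if (need.complexJpql())       return Tool.QUERY_ANNOTATION_OR_ENTITY_MANAGER;
        if (need.dbSpecificFeature()) return Tool.NATIVE_QUERY_OR_JDBC_CLIENT;
        if (need.bulkWrite())         return Tool.JDBC_CLIENT;
        if (need.hotPath())           return Tool.JDBC_CLIENT_OR_RAW_JDBC;
        return Tool.SPRING_DATA_REPOSITORY;   // default
    }
}
```

For example, a need that is both a bulk write and on a hot path resolves to JDBC_CLIENT, because the bulk-write question is asked first.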

Mixing patterns in one app:

@Service
public class OrderService {

  private final OrderRepository repo;          // Spring Data: 90% case
  private final JdbcClient jdbc;                // Modern Spring 6.1+
  private final EntityManager em;               // JPA fallback

  // 90% method: standard CRUD
  public Order create(...) { return repo.save(...); }
  public List<Order> findActive() { return repo.findByStatus(ACTIVE); }

  // 5% of methods: complex query that cannot be expressed through the Repository
  public List<OrderStats> getStats() {
      return jdbc.sql("""
          SELECT status, COUNT(*) AS cnt, SUM(total) AS sum
          FROM orders
          WHERE created_at >= :since
          GROUP BY status
          """)
          .param("since", LocalDate.now().minusDays(30))
          .query(OrderStats.class)
          .list();                               // GROUP BY yields one row per status, so list(), not single()
  }

  // 5% of methods: bulk operation
  @Transactional                                 // executeUpdate() requires an active transaction
  public void archiveOldOrders() {
      em.createQuery("UPDATE Order o SET o.archived = true WHERE o.createdAt < :cutoff")
          .setParameter("cutoff", LocalDate.now().minusYears(2))
          .executeUpdate();
  }
}

This course (TaskFlow): Spring Data Repository by default. Module 04, lesson 03 introduces @Query. Module 09 (Performance) uses JdbcClient for bulk operations.

Rule: start at the highest level (Repository). Drop to a lower layer only when you hit a concrete limitation, not "just in case".

Next lesson: Entity mapping (@Entity, @Id, @GeneratedValue, naming strategy)
