JPA & Hibernate: ORM principles and the Spring Data abstraction layers
JPA is the specification, Hibernate is the implementation, and Spring Data JPA is an abstraction on top of JPA (backed by Hibernate by default). This lesson peels apart the three layers and covers the ORM impedance mismatch, the EntityManager lifecycle, the persistence context, JPQL vs SQL, and why Spring Data can generate an implementation from an interface.
Module 03 built the TaskFlow REST API on an in-memory ConcurrentHashMap. That pattern is fine for a demo but not for production: data is lost on restart, there are no complex queries, and there are no transactions. Module 04 replaces the storage layer with PostgreSQL + JPA + Hibernate + Spring Data JPA.
This first lesson answers the basic questions: what is JPA? What is Hibernate? What is Spring Data JPA? Why three layers? Once that is clear, every annotation in later lessons (@Entity, @OneToMany, JpaRepository) follows logically instead of looking like magic.
1. The problem ORM solves
Plain JDBC code that connects to Postgres and saves a Project:
public Long save(Project project) {
String sql = "INSERT INTO projects (name, description, status, created_at) VALUES (?, ?, ?, ?) RETURNING id";
try (Connection conn = dataSource.getConnection();
PreparedStatement ps = conn.prepareStatement(sql)) {
ps.setString(1, project.name());
ps.setString(2, project.description());
ps.setString(3, project.status().name());
ps.setObject(4, project.createdAt());
try (ResultSet rs = ps.executeQuery()) {
if (rs.next()) return rs.getLong(1);
throw new RuntimeException("INSERT failed");
}
} catch (SQLException e) {
throw new RuntimeException(e);
}
}
public Optional<Project> findById(Long id) {
String sql = "SELECT id, name, description, status, created_at FROM projects WHERE id = ?";
try (Connection conn = dataSource.getConnection();
PreparedStatement ps = conn.prepareStatement(sql)) {
ps.setLong(1, id);
try (ResultSet rs = ps.executeQuery()) {
if (rs.next()) {
return Optional.of(new Project(
rs.getLong("id"),
rs.getString("name"),
rs.getString("description"),
ProjectStatus.valueOf(rs.getString("status")),
rs.getObject("created_at", Instant.class)
));
}
return Optional.empty();
}
} catch (SQLException e) {
throw new RuntimeException(e);
}
}
About 40 lines for two methods. An app with 20 entities ends up with 800+ lines of boilerplate. The pain points are clear:
| Pain | Consequence |
|---|---|
| Boilerplate | Roughly 80% of the code sets parameters and maps the ResultSet; none of it is business logic |
| Weak type safety | rs.getString("status"): a typo in the column name only fails at runtime |
| Connection management | Resources must be closed by hand (or via try-with-resources); forgetting leaks connections |
| N+1 queries | Loading a Project and its tasks in a loop → 1 query for Project + N queries for Task |
| ResultSet → object mapping | The same pattern is repeated for every entity |
| Transaction management | Manual setAutoCommit(false), commit(), rollback(); error-prone |
| Caching | None; every read goes back to the DB |
ORM (Object-Relational Mapping) is the pattern that addresses this: it maps between the Java object world and the SQL relational world automatically. The developer writes POJOs plus annotations, and the framework generates the SQL.
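To make "the framework generates the SQL" concrete, here is a minimal sketch of the idea using only reflection. It is not how Hibernate is implemented; `ToyTable`, `ToyProject`, and `ToyOrm` are hypothetical names invented for this illustration, and the camelCase → snake_case rule is an assumed naming convention.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class ToyOrm {
    // camelCase field name -> snake_case column name (assumed convention)
    static String columnName(String fieldName) {
        return fieldName.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase(Locale.ROOT);
    }

    // Builds a parameterized INSERT purely from class metadata.
    static String insertSql(Class<?> type) {
        ToyTable table = type.getAnnotation(ToyTable.class);
        if (table == null) {
            throw new IllegalArgumentException(type.getName() + " is not mapped");
        }
        List<String> columns = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            columns.add(columnName(f.getName()));
        }
        // Reflection does not guarantee field order, so sort for a stable result.
        Collections.sort(columns);
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table.value()
                + " (" + String.join(", ", columns) + ") VALUES (" + placeholders + ")";
    }

    public static void main(String[] args) {
        // prints: INSERT INTO projects (created_at, name, status) VALUES (?, ?, ?)
        System.out.println(insertSql(ToyProject.class));
    }
}

// Hypothetical stand-in for jakarta.persistence.Table, so the sketch has no dependencies.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ToyTable {
    String value();
}

@ToyTable("projects")
class ToyProject {
    String name;
    String status;
    Instant createdAt;
}
```

The point of the sketch is that the SQL string is derived from the class, so adding a field changes the statement without touching any hand-written JDBC code.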
2. Three layers of abstraction
Java's ORM stack has three layers built on top of each other (all of them sitting above JDBC):
flowchart TB
App["Application code"]
SDJ["Spring Data JPA<br/>(Repository abstraction)"]
JPA["JPA Specification<br/>(jakarta.persistence)"]
HB["Hibernate<br/>(implementation)"]
JDBC["JDBC Driver"]
DB[("PostgreSQL")]
App -->|"interface OrderRepository extends JpaRepository"| SDJ
SDJ -->|"EntityManager API"| JPA
JPA -->|"Hibernate impl"| HB
HB -->|"PreparedStatement, ResultSet"| JDBC
JDBC --> DB
style SDJ fill:#fef3c7
style JPA fill:#d1fae5
style HB fill:#fed7aa
| Layer | Type | Role |
|---|---|---|
| JDBC | Standard Java API | Connection, PreparedStatement, ResultSet: low level |
| JPA | Spec (interfaces) | EntityManager, @Entity, JPQL: defines the ORM contract |
| Hibernate | Implementation | Implements the JPA spec with SQL generation, caching, lazy loading |
| Spring Data JPA | Abstraction | Repository interfaces, derived queries, paging: generates the implementation from method names |
Each layer adds abstraction so the developer writes less code.
2.1 JPA Specification
JPA = Jakarta Persistence API (formerly Java Persistence API). The first version of the spec was released in 2006 and defines a standard ORM for Java. It is a contract, not an implementation.
JPA defines:
- Annotations: @Entity, @Id, @Column, @OneToMany, @ManyToOne, @JoinColumn, ...
- API: EntityManager, EntityManagerFactory, Query, TypedQuery.
- JPQL: an SQL-like but object-oriented query language (SELECT p FROM Project p WHERE p.status = 'ACTIVE').
- Lifecycle: persist, merge, remove, refresh.
- Persistence context: cache + change tracking.
The JPA spec cannot run by itself; it is only a set of interfaces and annotations. It needs an implementation.
2.2 Hibernate: the implementation
Hibernate is by far the most widely used JPA implementation (this lesson puts it above 90% of the market). It executes every JPA annotation and API by:
- Generating SQL from entity annotations (DDL) and from JPQL queries.
- Mapping ResultSet → entity through reflection.
- Keeping entities in a persistence context cache and running dirty checking.
- Lazy loading through proxies (generated with ByteBuddy in current versions).
- Caching at L1 (per session) and L2 (cross-session, optional).
Hibernate is an independent project that has existed since 2001, before JPA. JPA 1.0 (2006) standardized ideas from Hibernate, and Hibernate is now a JPA-compliant implementation.
Alternatives: EclipseLink (Eclipse Foundation, derived from Oracle TopLink) and OpenJPA (Apache). Both are rarely used; nearly all Spring apps run on Hibernate, which is also Spring Boot's default.
2.3 Spring Data JPA — repository abstraction
Hibernate + JPA still require boilerplate:
// Pure JPA + Hibernate
@PersistenceContext
private EntityManager em;
public List<Project> findActiveByCustomer(String customer) {
return em.createQuery(
"SELECT p FROM Project p WHERE p.status = 'ACTIVE' AND p.customer = :customer",
Project.class
).setParameter("customer", customer).getResultList();
}
Spring Data JPA eliminates even that:
// Spring Data JPA — interface only
public interface ProjectRepository extends JpaRepository<Project, Long> {
List<Project> findByStatusAndCustomer(ProjectStatus status, String customer);
}
There is nothing to implement: Spring Data creates a proxy at runtime. The method name findByStatusAndCustomer is parsed into the JPQL SELECT p FROM Project p WHERE p.status = ?1 AND p.customer = ?2.
Spring Data adds:
- Repository abstraction: an interface, not a class.
- Derived queries: the method name is parsed into a query.
- @Query annotation: custom JPQL/SQL when a derived query is not enough.
- Pagination: standard Pageable + Sort.
- Auditing: @CreatedDate, @LastModifiedDate, @CreatedBy.
- Specifications: dynamic queries with the Criteria API.
The three layers stack up: Spring Data JPA uses the JPA API, JPA delegates to Hibernate, and Hibernate generates SQL → JDBC → DB.
3. ORM impedance mismatch
ORM is not perfect. The Java object model and the SQL relational model have an impedance mismatch, with 5 core differences:
| Java | SQL | How ORM handles it |
|---|---|---|
| Inheritance (extends) | None | @Inheritance strategies (SINGLE_TABLE, JOINED, TABLE_PER_CLASS) |
| Identity (==, equals) | Primary key | @Id annotation, equals/hashCode based on ID |
| Association (object reference) | Foreign key + JOIN | @ManyToOne, @OneToMany, lazy loading proxy |
| Navigation (order.customer.name) | Multi-step JOIN | Path expressions in JPQL, fetch joins |
| Granularity (objects with nested objects) | Flat columns | @Embedded, @ElementCollection |
Five mismatches mean five sources of subtle bugs. For example:
@Entity
public class Order {
@Id Long id;
@ManyToOne Customer customer;
}
// Code: navigation 2 level
String name = order.customer.name();
The ORM has to do one of two things:
- Generate an SQL JOIN: SELECT o.*, c.* FROM orders o JOIN customers c ON o.customer_id = c.id WHERE o.id = ?.
- Or lazy load: load the Order first, then run a second query, SELECT * FROM customers WHERE id = ?, when order.customer is accessed.
Lazy loading is convenient but causes N+1: loading 100 orders and accessing each customer runs 1 + 100 queries (Module 04, lesson 04 goes deeper).
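The 1 + 100 arithmetic can be reproduced with a small simulation. This is a sketch under stated assumptions: there is no database, `queryCount` stands in for round trips, and a `Supplier` plays the role of the lazy proxy. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class NPlusOneDemo {
    static int queryCount = 0; // counts simulated round trips to the database

    static final class Customer {
        final long id;
        final String name;
        Customer(long id, String name) { this.id = id; this.name = name; }
    }

    static final class Order {
        final long id;
        private final Supplier<Customer> loader; // stands in for a lazy proxy
        private Customer customer;
        Order(long id, Supplier<Customer> loader) { this.id = id; this.loader = loader; }
        Customer customer() {
            if (customer == null) customer = loader.get(); // first access triggers the load
            return customer;
        }
    }

    // Simulates: SELECT * FROM customers WHERE id = ?
    static Customer selectCustomer(long id) {
        queryCount++;
        return new Customer(id, "customer-" + id);
    }

    // Simulates: SELECT * FROM orders (customers left lazy)
    static List<Order> selectOrdersLazy(int n) {
        queryCount++;
        List<Order> orders = new ArrayList<>();
        for (long id = 1; id <= n; id++) {
            final long customerId = id;
            orders.add(new Order(id, () -> selectCustomer(customerId)));
        }
        return orders;
    }

    // Simulates: SELECT o.*, c.* FROM orders o JOIN customers c ON ...
    static List<Order> selectOrdersJoinFetch(int n) {
        queryCount++;
        List<Order> orders = new ArrayList<>();
        for (long id = 1; id <= n; id++) {
            final Customer c = new Customer(id, "customer-" + id);
            orders.add(new Order(id, () -> c)); // already loaded, no extra query
        }
        return orders;
    }

    static int queriesNeeded(boolean joinFetch, int n) {
        queryCount = 0;
        List<Order> orders = joinFetch ? selectOrdersJoinFetch(n) : selectOrdersLazy(n);
        int totalChars = 0;
        for (Order o : orders) totalChars += o.customer().name.length(); // touch every customer
        return queryCount;
    }

    public static void main(String[] args) {
        System.out.println("lazy: " + queriesNeeded(false, 100));      // lazy: 101
        System.out.println("join fetch: " + queriesNeeded(true, 100)); // join fetch: 1
    }
}
```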
4. EntityManager: the core JPA API
EntityManager is the central interface of JPA, analogous to JDBC's Connection but for the object world.
The 5 main methods:
@PersistenceContext
private EntityManager em;
// 1. persist: INSERT
Project project = new Project("Mobile App", ProjectStatus.PLANNING);
em.persist(project);
// The SQL INSERT runs at flush/commit (with IDENTITY ids it may run immediately)
// 2. find: SELECT by ID
Project loaded = em.find(Project.class, 42L);
// SQL: SELECT * FROM projects WHERE id = 42
// 3. merge: copies the state of a detached entity into the context → UPDATE
// (loaded is already managed here, so dirty checking alone would issue the UPDATE)
loaded.setStatus(ProjectStatus.ACTIVE);
em.merge(loaded);
// The SQL UPDATE runs at flush/commit
// 4. remove: DELETE
em.remove(loaded);
// The SQL DELETE runs at flush/commit
// 5. createQuery — JPQL
List<Project> active = em.createQuery(
"SELECT p FROM Project p WHERE p.status = :status",
Project.class
).setParameter("status", ProjectStatus.ACTIVE).getResultList();
Spring Data JPA wraps EntityManager behind JpaRepository. You rarely call em directly; the repository covers about 95% of cases.
4.1 Persistence Context: first-level cache
An EntityManager manages one persistence context: a Map from ID → entity instance, per transaction. Features:
- Identity guarantee: within the same transaction, em.find(Project.class, 42L) == em.find(Project.class, 42L) (same instance).
- Dirty checking: changing a field on a managed entity → the SQL UPDATE runs automatically at commit.
- Cascade: operations on an entity propagate to its associated entities (when configured, e.g. CascadeType.ALL).
- Lazy loading: a proxy loads the related entity on access.
@Transactional
public void example() {
Project p1 = em.find(Project.class, 42L); // SELECT
p1.setName("New name"); // now dirty
Project p2 = em.find(Project.class, 42L); // no SELECT: returns p1 from the cache
System.out.println(p1 == p2); // true
// At commit: UPDATE projects SET name = 'New name' WHERE id = 42 (automatic)
}
By default the persistence context is transaction-scoped. When the transaction commits, the context first flushes all pending changes to the DB; on rollback they are discarded.
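The identity guarantee and dirty checking described above can be sketched in plain Java. This is a heavily simplified model, not Hibernate's implementation: `ToyPersistenceContext` is a hypothetical class, a `Map` stands in for the projects table, and the UPDATE is built as a display string only (a real ORM uses bind parameters).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PersistenceContextDemo {
    // Mutable entity, as JPA expects.
    static final class Project {
        final long id;
        String name;
        Project(long id, String name) { this.id = id; this.name = name; }
    }

    // Hypothetical, simplified persistence context; `rows` stands in for the table.
    static final class ToyPersistenceContext {
        private final Map<Long, String> rows;
        private final Map<Long, Project> managed = new HashMap<>();  // ID -> instance
        private final Map<Long, String> snapshots = new HashMap<>(); // state at load time
        int selectCount = 0;

        ToyPersistenceContext(Map<Long, String> rows) { this.rows = rows; }

        Project find(long id) {
            Project cached = managed.get(id);
            if (cached != null) return cached; // identity guarantee, no SELECT
            String name = rows.get(id);
            if (name == null) return null;
            selectCount++;                      // simulated SELECT
            Project p = new Project(id, name);
            managed.put(id, p);
            snapshots.put(id, name);
            return p;
        }

        // Dirty checking: compare current state with the snapshot, emit UPDATEs.
        List<String> flush() {
            List<String> sql = new ArrayList<>();
            for (Project p : managed.values()) {
                if (!p.name.equals(snapshots.get(p.id))) {
                    sql.add("UPDATE projects SET name = '" + p.name + "' WHERE id = " + p.id);
                    rows.put(p.id, p.name);
                    snapshots.put(p.id, p.name);
                }
            }
            return sql;
        }
    }

    public static void main(String[] args) {
        Map<Long, String> table = new HashMap<>();
        table.put(42L, "Old name");
        ToyPersistenceContext ctx = new ToyPersistenceContext(table);

        Project p1 = ctx.find(42L);
        p1.name = "New name";
        Project p2 = ctx.find(42L);

        System.out.println(p1 == p2);        // true
        System.out.println(ctx.selectCount); // 1
        System.out.println(ctx.flush());     // [UPDATE projects SET name = 'New name' WHERE id = 42]
    }
}
```

Note that nothing in the calling code asks for an UPDATE; the statement falls out of comparing the entity with its snapshot at flush time.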
4.2 Entity lifecycle — 4 state
stateDiagram-v2
[*] --> Transient
Transient --> Managed: persist()
Managed --> Detached: detach() / em close
Detached --> Managed: merge()
Managed --> Removed: remove()
Removed --> [*]: commit
| State | Description |
|---|---|
| Transient | New object, not managed by the EM. new Project(). |
| Managed | Tracked by the EM; changes are synced to the DB automatically. After persist()/find()/merge(). |
| Detached | The EM was closed or detach() was called. Changes are no longer synced. |
| Removed | em.remove() was called. The DELETE SQL runs at commit. |
State transitions are critical when debugging "why wasn't my change saved". Most of the time the object is detached (outside a transaction), so modifications are not synced.
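The state diagram above can be written down as a small transition table. The sketch below encodes only the transitions shown in this lesson's diagram; the real specification defines more cases (for example, what merge does to a transient instance). `EntityLifecycleDemo`, `State`, and `Op` are hypothetical names.

```java
import java.util.EnumMap;
import java.util.Map;

class EntityLifecycleDemo {
    enum State { TRANSIENT, MANAGED, DETACHED, REMOVED }
    enum Op { PERSIST, DETACH, MERGE, REMOVE }

    // Only the transitions drawn in the diagram above.
    static final Map<State, Map<Op, State>> TRANSITIONS = new EnumMap<>(State.class);
    static {
        TRANSITIONS.put(State.TRANSIENT, Map.of(Op.PERSIST, State.MANAGED));
        TRANSITIONS.put(State.MANAGED, Map.of(Op.DETACH, State.DETACHED, Op.REMOVE, State.REMOVED));
        TRANSITIONS.put(State.DETACHED, Map.of(Op.MERGE, State.MANAGED));
        TRANSITIONS.put(State.REMOVED, Map.of());
    }

    static State apply(State from, Op op) {
        State to = TRANSITIONS.get(from).get(op);
        if (to == null) {
            throw new IllegalStateException(op + " is not modeled from state " + from);
        }
        return to;
    }

    // Changes are synced to the DB only while the entity is managed.
    static boolean changesAreTracked(State state) {
        return state == State.MANAGED;
    }

    public static void main(String[] args) {
        State s = apply(State.TRANSIENT, Op.PERSIST);
        System.out.println(s + " tracked=" + changesAreTracked(s)); // MANAGED tracked=true
        s = apply(s, Op.DETACH);
        System.out.println(s + " tracked=" + changesAreTracked(s)); // DETACHED tracked=false
        s = apply(s, Op.MERGE);
        System.out.println(s + " tracked=" + changesAreTracked(s)); // MANAGED tracked=true
    }
}
```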
5. JPQL — query language
JPQL = Jakarta Persistence Query Language (formerly Java Persistence Query Language). It looks like SQL but queries entities instead of tables:
SELECT p FROM Project p
WHERE p.status = 'ACTIVE' AND p.createdAt > :since
ORDER BY p.name ASC
Notes:
- Project is the entity class, not the table name.
- p.status is a field, not a column name.
- :since is a named parameter.
Hibernate compiles JPQL → SQL:
SELECT p.id, p.name, p.description, p.status, p.created_at
FROM projects p
WHERE p.status = 'ACTIVE' AND p.created_at > ?
ORDER BY p.name ASC
Fields are mapped to columns through the @Column annotation or through the default naming strategy (Spring Boot's default converts camelCase → snake_case).
5.1 JPQL vs SQL
JPQL does not replace SQL; it wraps SQL with object semantics. When native SQL is needed:
@Query(value = "SELECT * FROM projects WHERE created_at > ?1", nativeQuery = true)
List<Project> findRecentNative(Instant since);
Use cases for native SQL:
- DB-specific features (Postgres arrays, JSON columns, full-text search).
- Performance-critical queries (window functions, CTEs).
- Complex JOINs that cannot be expressed in JPQL.
In most cases (about 90% by this lesson's estimate) JPQL is enough.
6. Spring Data Repository — sinh implementation
The main "magic" of Spring Data: a repository interface gets its implementation generated at runtime.
public interface ProjectRepository extends JpaRepository<Project, Long> {
List<Project> findByStatus(ProjectStatus status);
Optional<Project> findByName(String name);
long countByStatus(ProjectStatus status);
boolean existsByName(String name);
void deleteByStatus(ProjectStatus status);
}
What Spring does at startup:
sequenceDiagram
participant Boot as Spring Boot startup
participant SDR as Spring Data Repository scanner
participant Proxy as ProjectRepository proxy (JDK dynamic proxy)
participant Parser as Method name parser
participant EM as EntityManager
participant DB as PostgreSQL
Boot->>SDR: scan @EnableJpaRepositories packages
SDR->>SDR: find ProjectRepository extends JpaRepository
SDR->>Proxy: create proxy implementing interface
Note over Proxy: For each method, register query
Proxy->>Parser: parse "findByStatus"
Parser->>Parser: detect property "status"
Parser->>Parser: build JPQL "SELECT p FROM Project p WHERE p.status = ?1"
Proxy->>Parser: parse "findByName"
Parser->>Parser: detect property "name"
Parser->>Parser: build JPQL ...
Boot->>Proxy: register as bean
Note over Boot: Runtime, a service calls the repository
Boot->>Proxy: findByStatus(ACTIVE)
Proxy->>EM: createQuery JPQL setParameter
EM->>DB: SQL query
DB-->>EM: ResultSet
EM-->>Proxy: List<Project>
Proxy-->>Boot: return
Spring Data parses method names according to this grammar:
[query verb] [subject] By [property] [keyword] [property] ...
Lesson 03 covers derived queries in depth.
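As a rough illustration of the parsing step, the sketch below turns a `findBy...And...` method name into a JPQL string. It is a toy under stated assumptions: it supports only equality predicates joined by `And`, and `DerivedQuerySketch` is a hypothetical class, not Spring Data's parser (which handles many more keywords and validates properties against the entity).

```java
import java.util.Locale;

class DerivedQuerySketch {
    // Supports only "findBy<Prop>[And<Prop>]..." with equality predicates.
    static String toJpql(String entity, String methodName) {
        String prefix = "findBy";
        if (!methodName.startsWith(prefix) || methodName.length() == prefix.length()) {
            throw new IllegalArgumentException("unsupported method name: " + methodName);
        }
        String alias = entity.substring(0, 1).toLowerCase(Locale.ROOT);
        // Split on "And" only where it sits between a lowercase/digit and an uppercase letter.
        String[] properties = methodName.substring(prefix.length())
                .split("(?<=[a-z0-9])And(?=[A-Z])");

        StringBuilder jpql = new StringBuilder()
                .append("SELECT ").append(alias)
                .append(" FROM ").append(entity).append(' ').append(alias)
                .append(" WHERE ");
        for (int i = 0; i < properties.length; i++) {
            if (i > 0) jpql.append(" AND ");
            String property = properties[i].substring(0, 1).toLowerCase(Locale.ROOT)
                    + properties[i].substring(1);
            jpql.append(alias).append('.').append(property).append(" = ?").append(i + 1);
        }
        return jpql.toString();
    }

    public static void main(String[] args) {
        // prints: SELECT p FROM Project p WHERE p.status = ?1 AND p.customer = ?2
        System.out.println(toJpql("Project", "findByStatusAndCustomer"));
    }
}
```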
7. Spring Data JPA — autoconfiguration
Boot auto-configuration sets everything up:
flowchart TB
Pull["Pull starter"]
SBSDJPA["spring-boot-starter-data-jpa"]
SDJ["spring-data-jpa"]
HB["hibernate-core"]
JDBC["spring-jdbc"]
Tx["spring-tx"]
SQL["jakarta.persistence-api"]
Pull --> SBSDJPA
SBSDJPA --> SDJ
SBSDJPA --> HB
SBSDJPA --> JDBC
SBSDJPA --> Tx
SBSDJPA --> SQL
spring-boot-starter-data-jpa pulls in:
- spring-data-jpa: repository abstraction.
- hibernate-core: JPA implementation.
- spring-orm: Spring's ORM integration (transaction, exception translation).
- spring-jdbc: DataSource and JDBC transaction support.
- jakarta.persistence-api: JPA annotations and interfaces.
- jakarta.transaction-api: jakarta.transaction.Transactional (Spring's own @Transactional comes from spring-tx).
Boot auto-configuration activates:
- DataSourceAutoConfiguration: a HikariDataSource built from spring.datasource.*.
- HibernateJpaAutoConfiguration: EntityManagerFactory + JpaTransactionManager.
- JpaRepositoriesAutoConfiguration: scans for repository interfaces (those extending Spring Data's Repository types) and creates proxies.
Minimal setup:
spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/taskflow
    username: ${DB_USER}
    password: ${DB_PASS}
  jpa:
    hibernate:
      ddl-auto: validate
    properties:
      hibernate:
        format_sql: true
@SpringBootApplication
public class App {
public static void main(String[] args) {
SpringApplication.run(App.class, args);
}
}
@Repository
public interface ProjectRepository extends JpaRepository<Project, Long> { }
That is all of it: Boot sets up the connection pool, transactions, EntityManager, and repository proxy on its own. Compare that with the roughly 50 lines of XML config of the Spring 4 era.
8. ddl-auto: DDL strategy
spring.jpa.hibernate.ddl-auto controls schema generation:
| Value | Behavior |
|---|---|
| none | No action; production safe |
| validate | Verifies that the schema matches the entities and fails on mismatch; recommended for production |
| update | Automatically adds missing columns/tables; dev only |
| create | Drops tables and creates them again; test only |
| create-drop | Same as create, plus drop on shutdown; test only |
Warning: never use create or update in production. Schema changes must go through a migration tool (Flyway/Liquibase, lesson 06).
A practical pattern:
# application.yml
spring:
  jpa:
    hibernate:
      ddl-auto: validate
# application-dev.yml
spring:
  jpa:
    hibernate:
      ddl-auto: update # dev convenience
# application-test.yml
spring:
  jpa:
    hibernate:
      ddl-auto: create-drop # ephemeral schema
9. Running in production: persistence context, ddl-auto, monitoring
The Hibernate persistence context (first-level cache) is the source of truth for dirty checking. Tuning it badly leads to memory leaks or slow flushes. This section covers production tuning.
9.1 ddl-auto strategy in production
Production rule: ddl-auto: validate + Flyway migrations.
| Mode | Use case | Production? |
|---|---|---|
| none | Manual schema control | OK |
| validate | Verify that the schema matches the entities | Recommended |
| update | Hibernate auto-modifies the schema | Never: race conditions across multiple pods |
| create | Drop + create | Test only |
| create-drop | Drop on shutdown | Test only |
validate fails startup when an entity does not match the DB schema, which catches bugs where a Flyway migration was missed.
9.2 Hibernate Statistics — diagnostic
spring.jpa.properties.hibernate.generate_statistics: true
logging.level.org.hibernate.stat: INFO
Output at the end of the transaction (when the session closes):
Statistics:
3000000 nanoseconds spent acquiring 1 JDBC connection
120 nanoseconds spent executing 12 JDBC statements
4 entities loaded
0 collections fetched
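In production, turn it on while diagnosing and off afterwards (this lesson estimates the overhead at 5-10%), or enable it on a single instance in debug mode.
To export to a dashboard through Micrometer: with statistics enabled and Actuator on the classpath, Spring Boot can publish Hibernate metrics (in Boot 3 this also needs the hibernate-micrometer module). The property below only keeps meters whose names start with hibernate enabled:
management.metrics.enable.hibernate: true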
9.3 Persistence context bloat — heap pressure
By default Hibernate holds every entity loaded in the transaction in the first-level cache. A long transaction with many entities means heap pressure and long GC pauses.
Batch pattern (Module 04, lesson 04 covers it in depth):
for (int i = 0; i < records.size(); i++) {
    em.persist(new Project(records.get(i)));
    if ((i + 1) % 50 == 0) { // after every 50th entity
        em.flush();
        em.clear(); // detach all → free heap
    }
}
9.4 Failure runbook
Mode 1: LazyInitializationException in production:
- Cause: an entity is serialized outside the transaction (OSIV off; see Module 04, lesson 04).
- Remediate: map to a DTO inside the service.
Mode 2: OptimisticLockException:
- Cause: two users update the same entity and the version field conflicts.
- Remediate: @Version annotation + retry logic, or tell the user "data changed, reload".
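The version check behind Mode 2 can be sketched without JPA. The simulation below is single-threaded and in-memory: `update` mimics the conditional statement `UPDATE ... WHERE id = ? AND version = ?`, and `StaleVersionException` is a hypothetical stand-in for OptimisticLockException. The retry shown here simply reloads the version and writes again, which overwrites the other user's change; a real application would re-apply its business logic to the fresh state or ask the user to reload.

```java
import java.util.HashMap;
import java.util.Map;

class OptimisticLockDemo {
    static final class Row {
        String name;
        long version;
        Row(String name, long version) { this.name = name; this.version = version; }
    }

    // Hypothetical exception playing the role of OptimisticLockException.
    static final class StaleVersionException extends RuntimeException {
        StaleVersionException(String message) { super(message); }
    }

    static final Map<Long, Row> TABLE = new HashMap<>(); // stands in for the projects table

    // Mimics: UPDATE projects SET name = ?, version = version + 1 WHERE id = ? AND version = ?
    static void update(long id, String newName, long expectedVersion) {
        Row row = TABLE.get(id);
        if (row == null) throw new IllegalArgumentException("no row with id " + id);
        if (row.version != expectedVersion) {
            throw new StaleVersionException(
                    "expected version " + expectedVersion + " but found " + row.version);
        }
        row.name = newName;
        row.version++;
    }

    // Retries with a freshly read version; returns the attempt number that succeeded.
    static int renameWithRetry(long id, String newName, long firstSeenVersion, int maxAttempts) {
        long version = firstSeenVersion;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                update(id, newName, version);
                return attempt;
            } catch (StaleVersionException e) {
                version = TABLE.get(id).version; // reload the current version
            }
        }
        throw new StaleVersionException("gave up after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        TABLE.put(42L, new Row("Old name", 0));
        // Users A and B both read version 0. A writes first.
        update(42L, "Name from A", 0);
        // B still holds version 0: first attempt fails, second succeeds.
        System.out.println(renameWithRetry(42L, "Name from B", 0, 3)); // 2
        System.out.println(TABLE.get(42L).version);                    // 2
    }
}
```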
Mode 3: schema validation fails at startup:
- Symptom: Boot fails to start with "missing column" or "wrong type".
- Diagnose: an entity field changed but the Flyway migration has not been applied.
- Remediate: add the matching migration, redeploy.
Mode 4: persistence context bloat:
- Symptom: heap usage grows steadily, long GC pauses.
- Diagnose: heap dump → an EntityManager instance holding many entities.
- Remediate: shorter transactions, batch processing with em.clear().
10. Common pitfalls
❌ Mistake 1: believing Spring Data JPA "replaces" SQL. ✅ Spring Data JPA wraps JPA, which wraps SQL. Every query ends up as SQL. Understanding SQL makes debugging queries much faster.
❌ Mistake 2: modifying an entity outside a transaction.
Project p = repo.findById(42L).orElseThrow();
// the transaction ends here
p.setStatus(ACTIVE); // NOT synced to the DB: detached
✅ Modify inside a @Transactional method, or call repo.save(p) explicitly.
❌ Mistake 3: forgetting a transaction around reads.
public int countTasks(Long id) { // no @Transactional
Project p = em.find(Project.class, id);
return p.getTasks().size(); // LazyInitializationException
}
Lazy loading tasks needs an active persistence context.
✅ Use @Transactional (read-only is fine) on methods that query lazy associations.
❌ Mistake 4: using ddl-auto=update in production.
✅ ddl-auto=validate + Flyway migrations. Production never auto-modifies the schema.
❌ Mistake 5: thinking "JPA is slow". ✅ JPA makes it easy to write slow code (N+1, lazy loading in a loop). Used correctly (fetch joins, projections, batching), performance is comparable to JDBC.
❌ Mistake 6: mixing direct EntityManager use with a Repository.
@Service
public class S {
@PersistenceContext EntityManager em;
@Autowired ProjectRepository repo;
// two ways of managing entities in one class, easy to get confused
}
✅ Default to the Repository for about 95% of cases. Use EntityManager only for complex native queries or bulk operations.
11. 📚 Deep Dive Spring Reference
JPA Spec:
- Jakarta Persistence 3.0 Spec — official spec.
- Java Persistence Wiki — overview.
Hibernate:
- Hibernate ORM User Guide — full reference.
- Hibernate Tutorial — getting started.
Spring Data JPA:
- Spring Data JPA Reference — full guide.
- Spring Boot — JPA — autoconfig.
Books:
- "Java Persistence with Spring Data and Hibernate" — Catalin Tudose 2023.
- "High-Performance Java Persistence", Vlad Mihalcea: the go-to book for JPA/Hibernate performance.
- "Pro JPA 2 in Java EE 8" — Mike Keith.
Source:
- SimpleJpaRepository: the default implementation behind JpaRepository.
- HibernateJpaAutoConfiguration
Tool:
- IntelliJ "Persistence" tool window — visualize entity diagram.
- Hibernate Statistics — log SQL count + time per query.
- Datasource Proxy — log SQL + parameter.
12. Summary
- Three layers of abstraction: Spring Data JPA → JPA spec → Hibernate implementation → JDBC → DB.
- JPA = Jakarta Persistence API, a spec that defines @Entity, EntityManager, JPQL.
- Hibernate = the dominant JPA implementation (over 90% of the market by this lesson's estimate). It generates SQL, lazy loads through proxies, and manages the persistence context.
- Spring Data JPA = repository abstraction. It generates the implementation at runtime from interface method names.
- ORM impedance mismatch has 5 issues: inheritance, identity, association, navigation, granularity.
- EntityManager core API: persist/find/merge/remove/createQuery. Spring Data wraps it behind JpaRepository.
- Persistence context = first-level cache + dirty tracking. Scope = transaction.
- Entity lifecycle has 4 states: Transient → Managed → Detached → Removed.
- JPQL queries entities, not tables. Hibernate compiles JPQL → SQL for the configured dialect.
- Spring Data generates the implementation from an interface by parsing method names: findByStatusAndCustomer → JPQL.
- Boot auto-configuration: DataSourceAutoConfiguration + HibernateJpaAutoConfiguration + JpaRepositoriesAutoConfiguration. The minimum you configure is spring.datasource.url.
- ddl-auto: validate in production, update in dev, create-drop in tests. Production schema changes go through Flyway.
13. Self-check
Q1. Why do the three layers JPA / Hibernate / Spring Data JPA exist? Can any of them be dropped?
Each layer solves a different problem:
- JPA spec: standardizes ORM for Java and avoids vendor lock-in. Code that uses EntityManager and @Entity works with Hibernate, EclipseLink, and OpenJPA.
- Hibernate: implements the spec with SQL generation, lazy loading, caching. The spec has no runnable code, so an implementation is required.
- Spring Data JPA: an abstraction on top of JPA. It generates the repository implementation from an interface and removes the EntityManager.createQuery boilerplate.
Which layers can be dropped?
- Drop Spring Data JPA: possible. Use EntityManager directly, but every query is hand-written, which is verbose and less productive. Use case: a non-Spring framework.
- Drop the JPA spec and use the Hibernate native API: possible. SessionFactory and Session replace EntityManagerFactory and EntityManager. You are locked into Hibernate. Rare; there is no compelling reason in 2026.
- Drop Hibernate and use EclipseLink: possible, since it is spec-compatible. Rare; Hibernate dominates the ecosystem.
- Drop everything and use JDBC: possible. Use JdbcTemplate or JdbcClient (Boot 3.2+). Suitable for simple apps, performance-critical code, or DB-specific features.
Trade-off:
| Stack | LoC ratio | Performance | Flexibility |
|---|---|---|---|
| JDBC raw | 10x | Best | Manual |
| JdbcClient (Spring 6.1+) | 3x | Excellent | SQL explicit |
| Hibernate native | 1.5x | Good | Hibernate-specific |
| JPA + Hibernate | 1.2x | Good | Standard |
| Spring Data JPA | 1x baseline | Good (need tuning) | Highest abstraction |
Default for 2026: Spring Data JPA for standard CRUD + business logic. Drop down to EntityManager/JdbcClient when you need to optimize or write complex queries. A hybrid is fine.
Q2. The following code crashes with LazyInitializationException. Why, and how do you fix it?
@Service
public class ProjectService {
private final ProjectRepository repo;
public ProjectDto getProjectWithTasks(Long id) {
Project p = repo.findById(id).orElseThrow();
return new ProjectDto(
p.getId(),
p.getName(),
p.getTasks().size() // CRASH here
);
}
}
Why it crashes:
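1. repo.findById(id) runs inside the repository's implicit transaction. That transaction ends when findById returns.
2. The returned Project entity is in the detached state (the transaction has ended).
3. p.getTasks().size() triggers lazy loading of the collection (@OneToMany is LAZY by default).
4. The lazy proxy tries to use a Hibernate session to run the query. There is no active session (assuming Open Session in View is off or the call is outside a web request), so it throws LazyInitializationException.
Three ways to fix it:
Option 1: @Transactional on the service method (recommended):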
@Service
public class ProjectService {
@Transactional(readOnly = true)
public ProjectDto getProjectWithTasks(Long id) {
Project p = repo.findById(id).orElseThrow();
return new ProjectDto(p.getId(), p.getName(), p.getTasks().size());
}
}
The transaction stays active for the whole method, so lazy loading works. readOnly = true hints Hibernate to skip dirty checking and optimize.
Option 2: fetch join:
@Query("SELECT p FROM Project p LEFT JOIN FETCH p.tasks WHERE p.id = :id")
Optional<Project> findByIdWithTasks(@Param("id") Long id);
// Service
public ProjectDto get(Long id) {
Project p = repo.findByIdWithTasks(id).orElseThrow();
return new ProjectDto(p.getId(), p.getName(), p.getTasks().size());
// Tasks are already loaded, no active session needed
}
A single SQL JOIN loads Project + Tasks. No N+1, no LazyInitializationException.
Option 3: query a DTO projection directly:
@Query("SELECT new com.olhub.dto.ProjectDto(p.id, p.name, COUNT(t.id)) " +
"FROM Project p LEFT JOIN p.tasks t WHERE p.id = :id GROUP BY p.id, p.name")
Optional<ProjectDto> findDtoById(@Param("id") Long id);
This maps DB rows directly to the DTO. No entity is managed, so a LazyInitializationException is impossible. It also performs best of the three.
Recommendation for 2026:
- Read-only display: option 3 (projection).
- Operating on the entity (modifying it): option 1 (@Transactional).
- A single entity plus 1-2 associations needed right away: option 2 (fetch join).
Q3. Spring Data JPA generates an implementation for findByStatusAndCustomer. What is the concrete mechanism? When does method name parsing fail?
How the implementation is generated:
At startup:
1. JpaRepositoriesAutoConfiguration scans the packages and finds interfaces that extend JpaRepository.
2. For each interface (e.g. ProjectRepository):
   - It creates a repository factory bean (JpaRepositoryFactoryBean).
   - The factory creates a JDK dynamic proxy that implements the interface.
   - The proxy delegates to SimpleJpaRepository (the default implementation) for built-in methods (save, findById, ...).
   - The proxy delegates to a custom query handler for derived methods (findByStatus, ...).
3. For each derived method, PartTreeJpaQuery parses the method name at startup and builds a query template.
4. The proxy is registered as a Spring bean.
Method name parsing:
findByStatusAndCustomer
↓
[verb=find] [delimiter=By] [criteria]
↓
criteria: Status AND Customer
↓
JPQL: SELECT p FROM Project p WHERE p.status = ?1 AND p.customer = ?2
Grammar (simplified):
Method = (find | get | read | query | count | exists | delete) (By | All) Criteria
Criteria = Property [Keyword] [And | Or] Property [Keyword]
Keyword = Equals | Like | StartsWith | Between | LessThan | GreaterThan | IsNull | NotNull | OrderBy ...
Examples of valid method names:
| Method | Generated clause |
|---|---|
| findByName(String n) | WHERE name = ? |
| findByStatusIn(List<Status> s) | WHERE status IN ? |
| findByCreatedAtBetween(Instant s, Instant e) | WHERE created_at BETWEEN ? AND ? |
| findByNameContainingIgnoreCase(String n) | WHERE LOWER(name) LIKE LOWER(?) |
| findTop10ByStatusOrderByCreatedAtDesc(Status s) | LIMIT 10 |
| countByStatus(Status s) | SELECT COUNT(*) |
| existsByName(String n) | SELECT 1 |
| deleteByStatus(Status s) | DELETE |
When it fails or stops being a good fit:
- Property does not exist: Spring fails at startup (fail-fast).
    findByCustmer(String c) // typo "Custmer"
    // Startup error: No property 'custmer' found for type 'Project'
- Complex logic: a 60-character method name is hard to read. Switch to @Query:
    findByCreatedAtBetweenAndStatusInOrSomething(...) // too complex
    @Query("SELECT p FROM Project p WHERE p.createdAt BETWEEN :start AND :end AND p.status IN :statuses")
    List<Project> findRecent(...);
- Multi-entity join: it works, but it generates an SQL JOIN, so verify performance.
    findByTasksTitleContaining(String t) // navigates Project → Task → title
- Aggregate functions: derived queries do not support SUM or AVG. Use @Query.
Rule of thumb: keep derived queries under 5 parameters. Beyond that, use @Query with JPQL or a Specification.
Q4. Persistence context = first-level cache. Give an example that illustrates the identity guarantee and dirty checking. What happens if you modify an entity outside a transaction?
Identity guarantee:
@Transactional
public void identityExample() {
Project p1 = repo.findById(42L).orElseThrow(); // SELECT
Project p2 = repo.findById(42L).orElseThrow(); // no SELECT: cache hit
System.out.println(p1 == p2); // true (same instance)
System.out.println(p1.hashCode() == p2.hashCode());// true
}
Within the same transaction, every findById(42L) returns the same instance. Hibernate maintains a Map from entity ID → instance in the persistence context. The second lookup hits the cache, so no SQL runs.
Dirty checking:
@Transactional
public void dirtyExample(Long id, String newName) {
Project p = repo.findById(id).orElseThrow();
p.setName(newName); // no save() call
// The method returns and the transaction commits
// Hibernate detects that field 'name' changed → automatic SQL UPDATE
}
Hibernate stores a snapshot of the entity at load time. At flush/commit it compares the current state with the snapshot and generates an UPDATE for fields that differ.
There is no need to call repo.save(p). Internally, Spring Data's save() checks whether the entity is new: a new entity goes to em.persist(), anything else goes to em.merge(), which for an already managed entity simply returns that instance.
Modifying outside a transaction, and the consequences:
public void outsideTxExample(Long id, String newName) {
// no @Transactional
Project p = repo.findById(id).orElseThrow(); // the transaction of repo.findById has ended
p.setName(newName); // entity is detached, NOT synced to the DB
// The method returns, no UPDATE
}
Problems:
- The change is not persisted: the entity is detached, so dirty checking is not active.
- Lazy loading fails: accessing a lazy association → LazyInitializationException.
- Unstable identity: two findById(42L) calls outside a transaction can return two different instances (each call gets a new persistence context).
Fix:
- Annotate the service method with @Transactional:
    @Transactional
    public void update(Long id, String newName) {
        Project p = repo.findById(id).orElseThrow();
        p.setName(newName); // dirty checking is active, UPDATE runs automatically
    }
- Or call save() explicitly at the end:
    public void update(Long id, String newName) {
        Project p = repo.findById(id).orElseThrow();
        p.setName(newName);
        repo.save(p); // explicit, works with a detached entity
    }
Recommendation: always put @Transactional on service write methods. Read-only methods use @Transactional(readOnly = true) so Hibernate can optimize (skip dirty checking, flush mode MANUAL).
Q5. A production app runs with `ddl-auto=validate`. You add a new field `Project.priority`. Restart fails with "Schema validation: missing column priority". What is the correct deployment process?
A correct deploy means the schema migration goes ahead of the code change.
Overall process:
1. Write the migration script (Flyway):
    -- src/main/resources/db/migration/V2__add_project_priority.sql
    ALTER TABLE projects ADD COLUMN priority VARCHAR(20) NOT NULL DEFAULT 'MEDIUM';
2. Update the entity:
    @Entity
    public class Project {
        @Enumerated(EnumType.STRING)
        @Column(nullable = false)
        private ProjectPriority priority;
    }
3. Build the app: Flyway applies V2 automatically at startup, then Hibernate validates that the schema matches.
4. Deploy to production:
    - The app is rolled out as a K8s rolling update.
    - Pod 1 starts: Flyway sees version 1 in the DB and applies V2 (ALTER TABLE). Hibernate validation passes. Tomcat starts.
    - Pods 2..N start: Flyway sees that version 2 is already applied (concurrent-safe through a lock) and skips it. Validation passes. The pod starts.
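The "Pod 1 applies V2, pods 2..N skip it" behavior comes down to comparing script versions with the version recorded in the database. The sketch below shows that selection step only. It is not Flyway's implementation: `MigrationPlanSketch` is a hypothetical class, it assumes integer versions in the `V<number>__<description>.sql` pattern used in this lesson, and it ignores locking and checksums.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MigrationPlanSketch {
    // Assumed naming pattern: V<integer>__<description>.sql
    private static final Pattern NAME = Pattern.compile("V(\\d+)__(.+)\\.sql");

    static int versionOf(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a versioned migration: " + fileName);
        }
        return Integer.parseInt(m.group(1));
    }

    // Scripts newer than the version recorded in the DB, in ascending order.
    static List<String> pending(List<String> scripts, int appliedVersion) {
        List<String> result = new ArrayList<>();
        for (String script : scripts) {
            if (versionOf(script) > appliedVersion) result.add(script);
        }
        result.sort(Comparator.comparingInt(MigrationPlanSketch::versionOf));
        return result;
    }

    public static void main(String[] args) {
        List<String> scripts = List.of("V2__add_project_priority.sql", "V1__init_schema.sql");
        System.out.println(pending(scripts, 1)); // [V2__add_project_priority.sql]  (first pod)
        System.out.println(pending(scripts, 2)); // []  (later pods: nothing to apply)
    }
}
```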
Why ddl-auto=validate + Flyway:
- Schema source of truth = migration scripts: versioned, reviewed, reversible.
- Production safe: Hibernate never runs ALTER TABLE on its own, which could destroy data.
- Rollback friendly: if a deploy fails, you can roll the schema back with a compensating migration (Flyway's undo migrations are a paid-edition feature; repair only fixes the schema history table).
- Audit: every schema change is in git history.
Anti-pattern:
# OK for dev, NOT for production
spring.jpa.hibernate.ddl-auto=update
Hibernate update mode:
- Adds missing columns and tables, so additions work.
- Does not drop columns. Does not rename columns: a renamed field gets a new, empty column and the old column (with its data) is left behind.
- Does not alter existing constraints or column types.
- Behavior varies between Hibernate versions, so results can change unpredictably.
- Can race when multiple pods start concurrently.
Production rule: ddl-auto=validate + Flyway/Liquibase. Lesson 06 covers the Flyway setup in depth.
Local dev workflow:
1. ddl-auto=update and skip Flyway: convenient.
2. Or ddl-auto=validate + Flyway everywhere: consistent with production.
Recommendation: option 2, to get used to the production workflow early.
Q6. The TaskFlow app in Module 03 uses an in-memory ConcurrentHashMap. When migrating to JPA, does the service layer need to change much? Why?
The service layer barely changes, thanks to the Repository abstraction.
Before (Module 03):
// Repository interface
public interface ProjectRepository {
Project save(Project p);
Optional<Project> findById(Long id);
List<Project> findByStatus(ProjectStatus status);
boolean existsByName(String name);
void delete(Long id);
}
// Implementation in-memory
@Repository
public class InMemoryProjectRepository implements ProjectRepository {
private final Map<Long, Project> data = new ConcurrentHashMap<>();
private final AtomicLong sequence = new AtomicLong();
public Project save(Project p) {
if (p.id() == null) {
Long id = sequence.incrementAndGet();
p = new Project(id, p.name(), ...);
}
data.put(p.id(), p);
return p;
}
// ...
}
// The service does NOT know the implementation
@Service
public class ProjectService {
private final ProjectRepository repo; // depend interface
public Project create(...) {
if (repo.existsByName(name)) throw ...;
return repo.save(...);
}
}
After (Module 04):
// Same interface name; it now extends JpaRepository
public interface ProjectRepository extends JpaRepository<Project, Long> {
List<Project> findByStatus(ProjectStatus status);
boolean existsByName(String name);
}
// InMemoryProjectRepository is removed: Spring Data generates the implementation
// The service does NOT change
@Service
public class ProjectService {
private final ProjectRepository repo; // van depend interface
public Project create(...) {
if (repo.existsByName(name)) throw ...;
return repo.save(...);
}
}Thay đổi cần làm:
- Project domain → JPA entity: JPA requires a mutable class with a no-arg constructor. Module 04 lesson 02 covers this in depth.
    // Before: record (immutable)
    public record Project(Long id, String name, ...) {}
    // After: class with JPA annotations
    @Entity
    @Table(name = "projects")
    public class Project {
        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;
        @Column(nullable = false, unique = true, length = 100)
        private String name;
        // getters, setters, no-arg constructor (required by JPA)
    }
- Repository extends JpaRepository: remove the in-memory implementation and change the interface signature.
- Service: add @Transactional:
    @Service
    @Transactional(readOnly = true) // class level: read-only by default
    public class ProjectService {
        @Transactional // override for write methods
        public Project create(...) { ... }
        public Project findById(Long id) { ... } // inherits readOnly = true
    }
- Application properties: add spring.datasource.* + spring.jpa.*.
- Flyway migration script: V1__init_schema.sql with CREATE TABLE projects.
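Why a no-arg constructor and mutable fields? One way to picture it: a provider creates an empty instance first and then fills in the fields from the row. The sketch below imitates that with plain reflection. It is a simplified model under that assumption, not how Hibernate is actually implemented, and `HydrationSketch`, `ProjectRecord`, and `ProjectEntity` are hypothetical names.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;

// Simplified model of materializing a row into an object. Real providers such
// as Hibernate are more elaborate; every name here is hypothetical.
final class HydrationSketch {

    // Immutable record: only the canonical (Long, String) constructor exists.
    record ProjectRecord(Long id, String name) {}

    // Mutable class in the shape JPA expects: no-arg constructor, non-final fields.
    static class ProjectEntity {
        private Long id;
        private String name;
        protected ProjectEntity() {}
        Long getId() { return id; }
        String getName() { return name; }
    }

    // Create an empty instance through the no-arg constructor, then set each field.
    static <T> T hydrate(Class<T> type, Long id, String name) {
        try {
            Constructor<T> ctor = type.getDeclaredConstructor(); // fails without a no-arg constructor
            ctor.setAccessible(true);
            T instance = ctor.newInstance();
            setField(instance, "id", id);
            setField(instance, "name", name);
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot hydrate " + type.getSimpleName(), e);
        }
    }

    private static void setField(Object target, String fieldName, Object value)
            throws ReflectiveOperationException {
        Field field = target.getClass().getDeclaredField(fieldName);
        field.setAccessible(true);
        field.set(target, value);
    }

    static boolean hasNoArgConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ProjectEntity entity = hydrate(ProjectEntity.class, 1L, "TaskFlow");
        System.out.println(entity.getId() + " " + entity.getName());
        System.out.println("record has no-arg constructor: " + hasNoArgConstructor(ProjectRecord.class));
    }
}
```

The record fails the first step of this model because it only has its canonical constructor, which is one way to see why the domain type moves from a record to a regular class.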
What does not change in the service code:
- Method signatures stay the same.
- Business logic stays the same.
- Tests that mock ProjectRepository still work, because Mockito mocks the interface.
This is the power of the Repository pattern combined with the Liskov substitution principle. The service depends on the interface, not on the implementation. Migrating storage means swapping the implementation, with no change to business logic.
Bonus: test infrastructure:
- Module 03: unit tests use the in-memory repo directly. Fast, no Spring.
- Module 04: integration tests use Testcontainers + a real Postgres. Service tests that mock the repo still work.
- Two-tier testing: fast unit tests (mock), and slower but realistic integration tests (Testcontainers).
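The fast tier can be sketched without Spring, JPA, or Mockito. The types below are trimmed, hypothetical stand-ins for the lesson's `ProjectRepository` and `ProjectService` (the real signatures are elided in the lesson), and a hand-written stub plays the role a Mockito mock would play.

```java
import java.util.HashSet;
import java.util.Set;

// Trimmed, hypothetical versions of the lesson's types so the sketch
// compiles with the JDK alone.
final class ServiceTestSketch {

    record Project(Long id, String name) {}

    interface ProjectRepository {
        Project save(Project p);
        boolean existsByName(String name);
    }

    // The service depends only on the interface.
    static class ProjectService {
        private final ProjectRepository repo;
        ProjectService(ProjectRepository repo) { this.repo = repo; }

        Project create(String name) {
            if (repo.existsByName(name)) {
                throw new IllegalArgumentException("Project name already exists: " + name);
            }
            return repo.save(new Project(null, name));
        }
    }

    // Hand-written stub standing in for a Mockito mock of the interface.
    static class StubRepository implements ProjectRepository {
        private final Set<String> names = new HashSet<>();
        private long nextId = 1;

        public Project save(Project p) {
            names.add(p.name());
            return new Project(nextId++, p.name());
        }

        public boolean existsByName(String name) { return names.contains(name); }
    }

    // Returns true when create(name) is rejected as a duplicate.
    static boolean rejectsDuplicate(ProjectService service, String name) {
        try {
            service.create(name);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        ProjectService service = new ProjectService(new StubRepository());
        System.out.println(service.create("TaskFlow"));
        System.out.println("duplicate rejected: " + rejectsDuplicate(service, "TaskFlow"));
    }
}
```

Because `ProjectService` only sees the interface, the same checks keep passing whether the repository behind it is a stub, the in-memory map, or the Spring Data implementation.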
Q7. There are 5 ways to query a database in the Spring stack: raw JDBC, JdbcTemplate, JdbcClient (Spring 6.1+), JPA EntityManager, and Spring Data Repository. When should you use which?
| Tool | Abstraction | SQL control | Boilerplate | Use case |
|---|---|---|---|---|
| Raw JDBC | Lowest | 100% control | 10x | Libraries/frameworks. Rarely in production apps. |
| JdbcTemplate | Low | SQL explicit | 3x | Legacy code. SQL-heavy apps before Spring 6.1. |
| JdbcClient (Spring 6.1+) | Low | SQL explicit | 2x | Modern alternative to JdbcTemplate. Fluent API. |
| JPA EntityManager | Mid | JPQL/native SQL | 1.5x | Complex queries, batch updates, when the Repository is not enough. |
| Spring Data Repository | Highest | Derived query/JPQL/SQL | 1x baseline | Standard CRUD + business query. Default 2026. |
Decision tree:
Standard CRUD?
    Yes → Spring Data Repository (default)
    No → Continue
Complex JPQL/JOIN/aggregate?
    Yes → @Query annotation in the Repository, or EntityManager
    No → Continue
DB-specific feature (Postgres array, JSON, full-text)?
    Yes → @Query nativeQuery=true, or JdbcClient
    No → Continue
Bulk update/delete (millions of rows)?
    Yes → JdbcClient (raw SQL, no entity overhead)
    No → Continue
Performance ultra-critical (microbenchmark, hot path)?
    Yes → JdbcClient or raw JDBC
    No → Continue
Default → Spring Data Repository
Mixing patterns in one app:
@Service
public class OrderService {
    private final OrderRepository repo; // Spring Data: 90% of cases
    private final JdbcClient jdbc;      // modern, Spring 6.1+
    private final EntityManager em;     // JPA fallback
    // 90% of methods: standard CRUD
    public Order create(...) { return repo.save(...); }
    public List<Order> findActive() { return repo.findByStatus(ACTIVE); }
    // 5% of methods: complex query that cannot be expressed through the Repository
    // GROUP BY returns one row per status, so collect a list (single() fails on more than one row)
    public List<OrderStats> getStats() {
        return jdbc.sql("""
                SELECT status, COUNT(*) AS cnt, SUM(total) AS sum
                FROM orders
                WHERE created_at >= :since
                GROUP BY status
                """)
            .param("since", LocalDate.now().minusDays(30))
            .query(OrderStats.class)
            .list();
    }
    // 5% of methods: bulk operation
    @Transactional // executeUpdate() requires an active transaction
    public void archiveOldOrders() {
        em.createQuery("UPDATE Order o SET o.archived = true WHERE o.createdAt < :cutoff")
            .setParameter("cutoff", LocalDate.now().minusYears(2))
            .executeUpdate();
    }
}
This course (TaskFlow): Spring Data Repository by default. Module 04 lesson 03 introduces @Query. Module 09 (Performance) will use JdbcClient for bulk operations.
Rule: start at the highest level (Repository). Drop to a lower layer only when you hit a concrete limitation, not "just in case".
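As a compact restatement, the decision tree from this answer can be written as a single function that checks the questions in the same order. `QueryToolChooser`, the `Need` flags, and the `Tool` constants are hypothetical names for illustration and are not part of any Spring API.

```java
// Hypothetical encoding of the decision tree; the enum and flag names are
// illustrative only.
final class QueryToolChooser {

    enum Tool {
        SPRING_DATA_REPOSITORY,
        QUERY_ANNOTATION_OR_ENTITY_MANAGER,
        NATIVE_QUERY_OR_JDBC_CLIENT,
        JDBC_CLIENT,
        JDBC_CLIENT_OR_RAW_JDBC
    }

    // One flag per question in the tree.
    record Need(boolean standardCrud,
                boolean complexJpql,
                boolean dbSpecificFeature,
                boolean bulkWrite,
                boolean hotPath) {}

    // Questions are checked top to bottom; the first "yes" wins.
    static Tool choose(Need need) {
        if (need.standardCrud()) return Tool.SPRING_DATA_REPOSITORY;
        if (need.complexJpql()) return Tool.QUERY_ANNOTATION_OR_ENTITY_MANAGER;
        if (need.dbSpecificFeature()) return Tool.NATIVE_QUERY_OR_JDBC_CLIENT;
        if (need.bulkWrite()) return Tool.JDBC_CLIENT;
        if (need.hotPath()) return Tool.JDBC_CLIENT_OR_RAW_JDBC;
        return Tool.SPRING_DATA_REPOSITORY; // default
    }

    public static void main(String[] args) {
        System.out.println(choose(new Need(true, false, false, false, false)));
        System.out.println(choose(new Need(false, false, false, true, false)));
    }
}
```

The ordering carries the rule: a lower-level tool is only reached after every higher-level option has been ruled out for a concrete reason.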
Next lesson: Entity mapping (@Entity, @Id, @GeneratedValue, naming strategy)