Relationships — @OneToMany, @ManyToOne, lazy vs eager, N+1 problem
JPA relationships map object associations to foreign keys. This lesson covers the 4 association types, owning vs inverse side, the LAZY/EAGER fetch types, the cause of the N+1 problem and 4 fixes (fetch join, EntityGraph, batch size, projection), cascade types, and orphan removal.
Lesson 03 covered repository queries on a single entity. Real applications have relationships: a Project has Tasks, an Order has a Customer, a User has Roles. This lesson covers how JPA maps relationships, the N+1 problem (one of the most damaging performance bugs in JPA), and 4 ways to fix it.
This is the most important lesson in Module 04 for production performance. A single N+1 can make an endpoint up to 100x slower (the example in section 5 goes from 1 query to 101). Once you understand it, Hibernate can perform close to hand-tuned SQL.
1. The 4 association types
flowchart LR
A["@OneToOne<br/>(Order ↔ Invoice)"]
B["@OneToMany<br/>(Project → Tasks)"]
C["@ManyToOne<br/>(Task → Project)"]
D["@ManyToMany<br/>(User ↔ Roles)"]

| Type | Cardinality | DB representation |
|---|---|---|
| @OneToOne | 1 ↔ 1 | FK in one of the two tables (Order.invoice_id) |
| @OneToMany | 1 → N | FK in the N-side table (Task.project_id) |
| @ManyToOne | N → 1 | FK in the N-side table (Task.project_id); this is the owning side |
| @ManyToMany | M ↔ N | Join table (user_roles) |
@OneToMany + @ManyToOne is the most important pair: together they model a parent-child relationship.
2. @ManyToOne — owning side
@Entity
public class Task {
@Id @GeneratedValue Long id;
String title;
@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "project_id", nullable = false)
private Project project;
// ...
}
@Entity
public class Project {
@Id @GeneratedValue Long id;
String name;
@OneToMany(mappedBy = "project", fetch = FetchType.LAZY,
cascade = CascadeType.ALL, orphanRemoval = true)
private List<Task> tasks = new ArrayList<>();
}
DB schema:
CREATE TABLE projects (
id BIGSERIAL PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE tasks (
id BIGSERIAL PRIMARY KEY,
title VARCHAR(200),
project_id BIGINT NOT NULL,
FOREIGN KEY (project_id) REFERENCES projects(id)
);
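Owning side = Task (its table holds the FK project_id). Inverse side = Project, whose mappedBy = "project" points at the field Task.project.
Rule: the side with @JoinColumn is the owning side. The inverse side uses mappedBy so that Hibernate does not map the same relationship a second time (an extra join table or a duplicate column).
2.1 Bidirectional helper methods
To keep both sides in sync: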
@Entity
public class Project {
@OneToMany(mappedBy = "project", cascade = CascadeType.ALL, orphanRemoval = true)
private List<Task> tasks = new ArrayList<>();
public void addTask(Task task) {
tasks.add(task);
task.setProject(this); // keep the owning side (Task.project) in sync
}
public void removeTask(Task task) {
tasks.remove(task); // removal from the collection is what triggers orphan removal
task.setProject(null); // keep the owning side in sync
}
}
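Required pattern: a bidirectional association needs both sides kept in sync. Hibernate writes the FK from the owning side (Task.project) only, so modifying just one side leaves the in-memory state inconsistent and causes runtime bugs (a stale task list, or a task saved without its project_id).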
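The sync rule can be tried without a database. The sketch below uses plain-Java stand-ins for Project and Task (JPA annotations are omitted and fields are simplified; `isConsistent` is a hypothetical helper, not part of JPA) to show what an inconsistent in-memory state looks like.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-ins for the entities; JPA annotations are omitted on purpose.
class BidirectionalSyncDemo {

    static class Task {
        final String title;
        Project project; // owning side in the real mapping (holds project_id)

        Task(String title) {
            this.title = title;
        }
    }

    static class Project {
        final List<Task> tasks = new ArrayList<>(); // inverse side in the real mapping

        void addTask(Task task) {
            tasks.add(task);
            task.project = this; // keep both references in sync
        }

        void removeTask(Task task) {
            tasks.remove(task);
            task.project = null;
        }
    }

    // Hypothetical check: every task in the list must point back at the project.
    static boolean isConsistent(Project project) {
        return project.tasks.stream().allMatch(task -> task.project == project);
    }

    public static void main(String[] args) {
        Project synced = new Project();
        synced.addTask(new Task("Design"));
        System.out.println("helper used: " + isConsistent(synced));

        Project broken = new Project();
        broken.tasks.add(new Task("Develop")); // only one side updated
        System.out.println("list only:   " + isConsistent(broken));
    }
}
```

A check like this is cheap to run as an entity-level unit test, since it needs neither Spring nor a persistence context.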
3. Fetch type — LAZY vs EAGER
The fetch parameter decides when the association is loaded:
| FetchType | Loaded when | Default for |
|---|---|---|
| LAZY | On first use of the association (order.getCustomer().getName()) | @OneToMany, @ManyToMany |
| EAGER | Together with the owning entity | @OneToOne, @ManyToOne |
3.1 LAZY — proxy mechanism
@ManyToOne(fetch = FetchType.LAZY)
private Project project;
Hibernate places a proxy in the project field (generated with Byte Buddy in current Hibernate versions). The SQL loads only the Task:
SELECT * FROM tasks WHERE id = 42;
Accessing task.getProject().getName() triggers a second query:
SELECT * FROM projects WHERE id = ?;
Pros: data you do not need is never loaded. Cons: the N+1 problem (section 5).
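To make the "load on first access" behavior concrete, here is a minimal plain-Java model. It is not how Hibernate implements proxies (the real proxy is a generated subclass of the entity); `LazyRef`, `loadProjectName`, and `queryCount` are hypothetical names used only for this illustration.

```java
import java.util.function.Supplier;

// Simplified model of lazy loading; Hibernate's real proxy is a generated subclass.
class LazyReferenceDemo {

    // Hypothetical holder that runs its loader on first access only.
    static final class LazyRef<T> {
        private final Supplier<T> loader;
        private T value;
        private boolean loaded;

        LazyRef(Supplier<T> loader) {
            this.loader = loader;
        }

        T get() {
            if (!loaded) {
                value = loader.get(); // the "second query" happens here
                loaded = true;
            }
            return value;
        }
    }

    static int queryCount = 0;

    // Stands in for: SELECT * FROM projects WHERE id = ?
    static String loadProjectName(long id) {
        queryCount++;
        return "project-" + id;
    }

    public static void main(String[] args) {
        LazyRef<String> project = new LazyRef<>(() -> loadProjectName(7L));
        System.out.println("queries before access: " + queryCount);
        project.get();
        project.get(); // second access reuses the loaded value
        System.out.println("queries after access:  " + queryCount);
    }
}
```

The point of the model: creating the reference costs nothing, the first access costs one query, and repeating that first access for N different parents is where N+1 comes from.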
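3.2 EAGER: always loaded
@ManyToOne(fetch = FetchType.EAGER)
private Project project;
When the entity is loaded by id (em.find, findById), Hibernate joins the association:
SELECT t.*, p.* FROM tasks t LEFT JOIN projects p ON t.project_id = p.id WHERE t.id = 42;
Pros: simple, no LazyInitializationException. Cons:
- Loads data you may never touch.
- Chained EAGER associations turn a simple query into a join across 5+ tables.
- JPQL queries do not get the join automatically: Hibernate issues an extra SELECT for each EAGER association it still has to load, so EAGER can itself produce N+1.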
3.3 Production recommendation
Always LAZY:
@ManyToOne(fetch = FetchType.LAZY) // override default EAGER
private Project project;
@OneToOne(fetch = FetchType.LAZY) // override default EAGER
private Invoice invoice;
@OneToMany(mappedBy = "project") // already LAZY by default
private List<Task> tasks;
@ManyToMany // already LAZY
private List<Role> roles;
Use LAZY for every association. Fetch what you need with JOIN FETCH or @EntityGraph (section 5).
4. Cascade — operation propagation
@OneToMany(mappedBy = "project",
cascade = {CascadeType.PERSIST, CascadeType.MERGE, CascadeType.REMOVE},
orphanRemoval = true)
private List<Task> tasks = new ArrayList<>();
CascadeType propagates an operation from the owning entity to the association:
| Type | Propagates |
|---|---|
| PERSIST | em.persist(project) also persists tasks that are not saved yet |
| MERGE | em.merge(project) also merges tasks |
| REMOVE | em.remove(project) also removes tasks |
| REFRESH | em.refresh(project) also reloads tasks |
| DETACH | em.detach(project) also detaches tasks |
| ALL | All of the above |
orphanRemoval = true: removing a task from the list also deletes it from the DB.
project.getTasks().remove(task);
// On commit: DELETE FROM tasks WHERE id = ?
4.1 When CascadeType.ALL is OK
@OneToMany parent-child with strict ownership: a Project owns its Tasks, and a Task does not exist on its own.
// The project does not exist yet
Project project = new Project("Mobile App");
project.addTask(new Task("Design")); // Task is transient
project.addTask(new Task("Develop"));
repo.save(project);
// Cascade PERSIST: Hibernate also persists the 2 tasks
// SQL: INSERT projects, INSERT tasks (2 rows)
4.2 When not to use ALL
@ManyToOne: a Task does not own its Project. Cascading REMOVE from Task to Project is dangerous.
@ManyToOne(cascade = CascadeType.REMOVE) // WRONG: deleting a task also deletes its project
private Project project;
✅ Never cascade from child to parent. Cascade only from parent to child.
5. N+1 problem: the most dangerous bug
5.1 The problem
// Method of a @Service class; repo is the Project repository
@Transactional(readOnly = true)
public List<ProjectDto> listAll() {
List<Project> projects = repo.findAll(); // 1 query
return projects.stream()
.map(p -> new ProjectDto(
p.getId(),
p.getName(),
p.getTasks().size() // LAZY load → 1 query per project!
))
.toList();
}
SQL trace:
SELECT * FROM projects; -- 1 query, 100 rows
SELECT * FROM tasks WHERE project_id = 1; -- 100 queries
SELECT * FROM tasks WHERE project_id = 2;
...
SELECT * FROM tasks WHERE project_id = 100;
101 queries for 100 projects. This is the N+1 problem (1 + N queries).
Performance:
- 100 projects: 101 queries × 50 ms latency ≈ 5 seconds.
- 1000 projects: 1001 queries × 50 ms ≈ 50 seconds, enough to time out the UI.
The bug is subtle: the code works in dev (10 rows) and fails in production (10K rows).
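The arithmetic behind those figures is simple enough to script. This sketch is a back-of-the-envelope cost model only: it assumes queries run one after another and that the network round trip (50 ms here, the figure used in this section) dominates the cost.

```java
// Back-of-the-envelope cost model for N+1; not a benchmark.
class NPlusOneCost {

    // 1 query for the parents + 1 lazy-load query per parent.
    static long queryCount(int parents) {
        return 1L + parents;
    }

    // Assumes sequential queries and that round-trip latency dominates.
    static long totalLatencyMillis(int parents, long roundTripMillis) {
        return queryCount(parents) * roundTripMillis;
    }

    public static void main(String[] args) {
        for (int parents : new int[] {10, 100, 1000}) {
            System.out.println(parents + " projects -> " + queryCount(parents)
                    + " queries, " + totalLatencyMillis(parents, 50) + " ms");
        }
    }
}
```

Because the cost grows linearly with the number of parent rows, a data set 100 times larger than the dev data set makes the endpoint roughly 100 times slower.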
5.2 Detecting N+1
Three ways:
1. Log SQL:
logging:
level:
org.hibernate.SQL: DEBUG
Count the queries logged per request and an N+1 shows up immediately.
2. Hibernate Statistics:
spring.jpa.properties.hibernate.generate_statistics: true
logging.level.org.hibernate.stat: DEBUG
Statistics:
90000003 nanoseconds spent executing 101 JDBC statements;
3. Datasource Proxy / p6spy:
These libraries log every JDBC call. With the caller's stack trace included, they pinpoint the exact line of code that causes the N+1.
5.3 Fix 1 — JOIN FETCH
@Query("SELECT p FROM Project p LEFT JOIN FETCH p.tasks WHERE p.id = :id")
Optional<Project> findByIdWithTasks(@Param("id") Long id);
@Query("SELECT DISTINCT p FROM Project p LEFT JOIN FETCH p.tasks")
List<Project> findAllWithTasks();
SQL:
SELECT p.*, t.*
FROM projects p LEFT JOIN tasks t ON t.project_id = p.id;
1 query — no N+1.
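DISTINCT: avoids duplicate Project references when a project has several tasks. Hibernate deduplicates the parent entities in memory after reading the rows (Hibernate 6 does this for fetch joins even without DISTINCT).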
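The join returns one row per (project, task) pair, so the same project appears on several rows. The sketch below is an in-memory model of collapsing those rows back into one entry per project. It is not Hibernate's actual code; `Row` and `groupTasksByProject` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory model of what happens to the rows of a LEFT JOIN FETCH.
class JoinRowDedupDemo {

    // One SQL result row; taskId is null when the project has no tasks (LEFT JOIN).
    record Row(long projectId, Long taskId) {}

    // Collapses the joined rows into one entry per project, keeping row order.
    static Map<Long, List<Long>> groupTasksByProject(List<Row> rows) {
        Map<Long, List<Long>> tasksByProject = new LinkedHashMap<>();
        for (Row row : rows) {
            List<Long> tasks = tasksByProject.computeIfAbsent(row.projectId(), id -> new ArrayList<>());
            if (row.taskId() != null) {
                tasks.add(row.taskId());
            }
        }
        return tasksByProject;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(new Row(1, 10L), new Row(1, 11L), new Row(2, null));
        System.out.println(rows.size() + " rows -> " + groupTasksByProject(rows).size() + " projects");
    }
}
```

The model also shows the cost side of JOIN FETCH: the parent columns are transferred once per child row even though only one parent object is kept.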
5.4 Fix 2 — @EntityGraph
@EntityGraph(attributePaths = {"tasks"})
List<Project> findAll();
@EntityGraph(attributePaths = {"tasks", "tasks.assignee"}) // nested
List<Project> findByStatus(ProjectStatus status);
Equivalent to JOIN FETCH but cleaner: the fetch plan is declared outside @Query.
@NamedEntityGraph(
name = "Project.withTasks",
attributeNodes = @NamedAttributeNode("tasks")
)
@Entity
public class Project { ... }
@EntityGraph("Project.withTasks")
List<Project> findByStatus(ProjectStatus status);
Reuse named graph across multiple methods.
5.5 Fix 3 — Batch size
spring.jpa.properties.hibernate.default_batch_fetch_size: 50
Or per association:
@OneToMany(mappedBy = "project")
@BatchSize(size = 50)
private List<Task> tasks;
Hibernate batches the lazy loads: instead of 100 single-parent queries it issues queries with IN:
-- Original: 100 queries
SELECT * FROM tasks WHERE project_id IN (?, ?, ?, ..., ?); -- batch of 50 IDs
-- 100 projects → 2 batched queries (50 + 50)
Pros: no code change, config only. Cons: still 1 + ⌈N / batch size⌉ queries (3 for 100 projects) instead of the single query of JOIN FETCH.
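The query count under batch fetching follows from splitting the parent ids into groups. This is a simplified model: the actual grouping is decided by Hibernate and can differ between versions. `partition` and `totalQueries` are hypothetical helpers.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of batch fetching; the real grouping is decided by Hibernate.
class BatchFetchDemo {

    // Splits parent ids into groups, one IN (...) query per group.
    static List<List<Long>> partition(List<Long> ids, int batchSize) {
        List<List<Long>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    // 1 query for the parents + one batched query per group of parents.
    static int totalQueries(int parents, int batchSize) {
        return 1 + (parents + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long id = 1; id <= 100; id++) {
            ids.add(id);
        }
        System.out.println(partition(ids, 50).size() + " batched queries, "
                + totalQueries(100, 50) + " queries in total");
    }
}
```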
5.6 Fix 4 — DTO projection
@Query("""
SELECT new com.olhub.dto.ProjectSummary(p.id, p.name, COUNT(t.id))
FROM Project p LEFT JOIN p.tasks t
GROUP BY p.id, p.name
""")
List<ProjectSummary> findSummaries();
A single SQL statement with GROUP BY counts the tasks per project. No entity is loaded, so there is no N+1.
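For readers less familiar with GROUP BY over a LEFT JOIN, the following plain-Java sketch computes the same result in memory: one summary per project, with a count of 0 for projects that have no tasks. The record shapes are assumptions for illustration and only mirror the constructor arguments used in the query.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory equivalent of: SELECT p.id, p.name, COUNT(t.id) ... LEFT JOIN ... GROUP BY
class ProjectSummaryDemo {

    record Project(long id, String name) {}
    record Task(long id, long projectId) {}
    record ProjectSummary(long id, String name, long taskCount) {}

    static List<ProjectSummary> summarize(List<Project> projects, List<Task> tasks) {
        // COUNT(t.id) per project_id
        Map<Long, Long> counts = new HashMap<>();
        for (Task task : tasks) {
            counts.merge(task.projectId(), 1L, Long::sum);
        }
        // LEFT JOIN semantics: projects without tasks still appear, with count 0
        List<ProjectSummary> result = new ArrayList<>();
        for (Project project : projects) {
            result.add(new ProjectSummary(project.id(), project.name(),
                    counts.getOrDefault(project.id(), 0L)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Project> projects = List.of(new Project(1, "Mobile App"), new Project(2, "Website"));
        List<Task> tasks = List.of(new Task(10, 1), new Task(11, 1));
        summarize(projects, tasks).forEach(System.out::println);
    }
}
```

In the real query the database does this work and sends back one small row per project, which is why the projection is cheaper than hydrating entities.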
5.7 Comparison
| Fix | When |
|---|---|
| JOIN FETCH | Cần full entity với association loaded |
| @EntityGraph | Same as JOIN FETCH, declarative |
| Batch size | Quick win without refactoring, global config |
| DTO projection | Read-only display, performance critical |
Default choice (2026): DTO projection for list endpoints, JOIN FETCH for loading a single entity.
6. MultipleBagFetchException
@Query("SELECT p FROM Project p LEFT JOIN FETCH p.tasks LEFT JOIN FETCH p.contributors WHERE p.id = :id")
Project findFullById(@Param("id") Long id);
This throws MultipleBagFetchException. Hibernate does not allow JOIN FETCH on 2 collections at once when both are Lists (a "bag" type).
Why: joining 2 collections produces a Cartesian product, so the result has N × M rows per parent (the duplication is multiplicative), and with bags Hibernate cannot tell real duplicates from join artifacts.
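A small row-count model makes the N × M effect tangible. It assumes one parent with two collections and LEFT JOIN semantics (an empty collection still yields one row); the method names are hypothetical.

```java
// Row-count model for fetching two collections of one parent.
class FetchJoinRowCount {

    // A LEFT JOIN keeps the parent row even when a collection is empty, hence max(n, 1).
    static long joinedRows(long tasks, long contributors) {
        return Math.max(tasks, 1) * Math.max(contributors, 1);
    }

    // Two separate queries, each joining one collection.
    static long separateRows(long tasks, long contributors) {
        return Math.max(tasks, 1) + Math.max(contributors, 1);
    }

    public static void main(String[] args) {
        System.out.println("one query, two joins: " + joinedRows(10, 5) + " rows");
        System.out.println("two queries:          " + separateRows(10, 5) + " rows");
    }
}
```

With 10 tasks and 5 contributors the gap is 50 rows vs 15; with 1000 and 500 it is 500,000 vs 1,500, which is why splitting the fetch into 2 queries scales better.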
Three ways to fix it:
- Change one List to a Set:
@OneToMany Set<Task> tasks; // a Set is not a bag
- Split into 2 queries:
@EntityGraph(attributePaths = "tasks") Project findByIdWithTasks(Long id);
@EntityGraph(attributePaths = "contributors") Project findByIdWithContributors(Long id);
- Use a DTO query with separate aggregation.
7. @OneToOne patterns
7.1 Shared primary key
@Entity
public class Order {
@Id @GeneratedValue Long id;
@OneToOne(mappedBy = "order", cascade = CascadeType.ALL)
private Invoice invoice;
}
@Entity
public class Invoice {
@Id Long id; // SAME as Order.id
@MapsId // tells Hibernate to copy ID from Order
@OneToOne
@JoinColumn(name = "id")
private Order order;
}
Pros: no extra column, FK = PK. Cons: lock-step relationship.
7.2 FK pattern (more common)
@Entity
public class Order {
@OneToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "invoice_id")
private Invoice invoice;
}
With @OneToOne the FK usually lives in the owning entity's table.
7.3 LAZY pitfall with @OneToOne
On the inverse side (mappedBy) of a @OneToOne, the entity's own table has no FK column to inspect, so Hibernate has to query the DB to learn whether an invoice exists before it can choose between null and a proxy. The association therefore loads eagerly even when it is declared LAZY.
Workaround: bytecode enhancement (Hibernate 5+). Alternatively, avoid @OneToOne with mappedBy and use @ManyToOne plus a unique constraint.
8. @ManyToMany pattern
@Entity
public class User {
@ManyToMany(fetch = FetchType.LAZY)
@JoinTable(
name = "user_roles",
joinColumns = @JoinColumn(name = "user_id"),
inverseJoinColumns = @JoinColumn(name = "role_id")
)
private Set<Role> roles = new HashSet<>();
}
@Entity
public class Role {
@ManyToMany(mappedBy = "roles")
private Set<User> users = new HashSet<>();
}
Join table user_roles (user_id, role_id).
Best practice: when the link needs extra fields, convert it into 2 @OneToMany associations through an intermediate entity:
@Entity
public class UserRole {
@Id @GeneratedValue Long id;
@ManyToOne User user;
@ManyToOne Role role;
Instant grantedAt;
String grantedBy;
}
A plain @ManyToMany cannot carry extra fields. Convert early if you suspect you will need them.
9. Production operations: N+1 detection in CI, batch processing, read replicas
The sections above covered the N+1 problem and its 4 fixes. This section covers production practice: detecting N+1 in CI/CD, the lazy-loading anti-pattern at scale, batch processing, and read replica routing.
9.1 N+1 detection in CI/CD: fail the build automatically
In production, N+1 bugs slip in through PRs when the reviewer does not spot them. Ways to enforce a limit in tests:
Option 1: datasource-proxy with a query count assertion:
@SpringBootTest
class ProjectServiceTest {
@Autowired ProjectService service;
@Test
@Transactional
void listProjects_executes_at_most_2_queries() {
QueryCountHolder.clear();
List<ProjectDto> projects = service.listAll();
QueryCount count = QueryCountHolder.getGrandTotal();
assertThat(count.getSelect()).isLessThanOrEqualTo(2); // bound assertion
}
}
A PR that introduces N+1 fails the test before merge. The test is worth the investment: N+1 is silent in dev (small data) and fatal in production.
Option 2: Hibernate Statistics threshold:
# application-test.yml
spring.jpa.properties.hibernate.generate_statistics: true
@Autowired EntityManagerFactory emf;
@AfterEach
void assertNoNPlusOne() {
Statistics stats = emf.unwrap(SessionFactory.class).getStatistics();
long entityFetch = stats.getEntityFetchCount();
long collectionFetch = stats.getCollectionFetchCount();
stats.clear(); // reset so the next test starts from zero
// The threshold depends on the test scope
assertThat(entityFetch + collectionFetch).isLessThan(5);
}
Option 3: Hypersistence Optimizer (a paid library by Vlad Mihalcea) detects ORM anti-patterns automatically and integrates with JUnit:
@Test
void noNPlusOne() {
HypersistenceOptimizer optimizer = new HypersistenceOptimizer(...);
List<Event> events = optimizer.getEvents();
assertThat(events).isEmpty();
}
Recommendation: options 1 + 2 for projects without a budget, option 3 for enterprise teams that have one.
9.2 Production monitoring — query patterns + N+1 alert
Metric stack:
spring.jpa.properties.hibernate.generate_statistics: true
management:
metrics:
enable:
hibernate: true
Hibernate metrics exported through Micrometer (names as registered by Micrometer's HibernateMetrics binder; confirm them on /actuator/metrics for your version):
| Metric | Description |
|---|---|
| hibernate.query.executions | Total query count |
| hibernate.entities.loads | Entity load count |
| hibernate.collections.fetches | Collection lazy-load count |
| hibernate.second.level.cache.requests | 2nd-level cache requests, tagged result=hit/miss (derive the hit ratio from these) |
| hibernate.flushes | Flush count (high = a lot of dirty checking) |
N+1 alert in production:
- alert: HibernateNPlusOne
  expr: |
    sum(rate(hibernate_collections_fetches_total[5m]))
    /
    sum(rate(http_server_requests_seconds_count[5m]))
    > 10
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "Possible N+1: collection fetches per request high"
Logic: more than 10 collection fetches per request suggests an N+1 on a hot code path. Start the investigation with the code paths deployed most recently.
9.3 The lazy-loading anti-pattern at scale
The concrete problem: LazyInitializationException when an entity is serialized outside a transaction.
@GetMapping("/projects/{id}")
public Project get(@PathVariable Long id) {
return projectRepo.findById(id).orElseThrow();
// The transaction ends as soon as findById returns
// Serializing the entity touches a lazy field → LazyInitializationException
}
Three fixes, each with a trade-off:
Option 1: Open Session in View (OSIV). Spring Boot enables it by default, but it is an anti-pattern in production:
- It can hold a DB connection for the whole request (even during JSON serialization), which exhausts the connection pool.
- It hides N+1 in the view layer, where it is hard to detect.
- Disable it with spring.jpa.open-in-view: false.
Option 2: map to a DTO inside the service (recommended):
@Transactional(readOnly = true)
public ProjectDto getDto(Long id) {
Project p = repo.findById(id).orElseThrow();
return new ProjectDto(p.getId(), p.getName(),
p.getTasks().stream().map(TaskDto::from).toList()); // load lazy in tx
}
The transaction spans only the service method, so the connection is released early. The DTO is immutable and contains no lazy proxies.
Option 3: @EntityGraph to load eagerly when needed, so findById returns the Project with its associations:
@EntityGraph(attributePaths = {"tasks", "tasks.assignee"})
Optional<Project> findById(Long id);
Enterprise pattern: set spring.jpa.open-in-view: false globally and make developers hydrate DTOs in the service. Lazy-loading bugs then surface on the developer's machine instead of in production.
# application.yml — production safe default
spring:
jpa:
open-in-view: false
This switch can break a legacy app. Migrate gradually and fix LazyInitializationException endpoint by endpoint.
9.4 Batch processing pattern: avoid N+1 and memory blow-up
Anti-pattern: importing 100k rows:
@Transactional
public void importAll(List<Row> rows) {
for (Row r : rows) {
Project p = new Project(r.name());
p.addTask(new Task(r.taskTitle()));
repo.save(p);
}
// Hibernate keeps 100k entities in the 1st-level cache → heap OOM
// The tx holds a connection for 30+ minutes → pool exhausted
}
Correct pattern: batch + flush + clear:
@PersistenceContext EntityManager em;
@Transactional // em.persist requires an active transaction
public void importAll(List<Row> rows) {
int batchSize = 50;
for (int i = 0; i < rows.size(); i++) {
Project p = new Project(rows.get(i).name());
em.persist(p);
if ((i + 1) % batchSize == 0) { // flush after every 50 persisted entities
em.flush();
em.clear(); // detach all entities → free heap
}
}
em.flush();
em.clear();
}
Hibernate batch insert config:
spring:
jpa:
properties:
hibernate:
jdbc:
batch_size: 50
order_inserts: true
order_updates: true
batch_versioned_data: true
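Inserting 100k rows one statement at a time costs 100k DB round trips. With a batch size of 50 that drops to 2k round trips, 50x fewer. Note that Hibernate disables JDBC insert batching for entities with IDENTITY id generation, so batched inserts need a sequence-based generator.
Caveat: the transaction is still long. The enterprise pattern is to split the work into a chunk-based job: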
public void importInChunks(List<Row> rows) {
int chunkSize = 1000;
for (int i = 0; i < rows.size(); i += chunkSize) {
List<Row> chunk = rows.subList(i, Math.min(i + chunkSize, rows.size()));
self.importChunk(chunk); // self = injected proxy of this bean, so @Transactional applies; 1 chunk = 1 tx
}
}
@Transactional
public void importChunk(List<Row> chunk) {
chunk.forEach(r -> em.persist(new Project(r.name())));
// the tx ends when the method returns and the connection is released
}
Each chunk runs in one short transaction, so the connection pool is not held. Spring Batch (Module 09) goes deeper into this pattern.
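The chunk boundaries and the round-trip arithmetic can be checked in isolation. The sketch below leaves out persistence entirely; `chunks` and `roundTrips` are hypothetical helpers that mirror the subList loop and the batch-size figures from this section.

```java
import java.util.ArrayList;
import java.util.List;

// Chunk boundaries and round-trip arithmetic, without any persistence code.
class ChunkingDemo {

    // Same subList loop as the chunked import: the last chunk may be shorter.
    static <T> List<List<T>> chunks(List<T> rows, int chunkSize) {
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += chunkSize) {
            result.add(rows.subList(i, Math.min(i + chunkSize, rows.size())));
        }
        return result;
    }

    // Number of JDBC round trips when statements are sent in batches.
    static long roundTrips(long rows, int batchSize) {
        return (rows + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            rows.add(i);
        }
        System.out.println(chunks(rows, 1000).size() + " chunks (= transactions)");
        System.out.println(roundTrips(100_000, 50) + " round trips with batch size 50");
    }
}
```

Chunk size and JDBC batch size are independent knobs: the first bounds transaction length and memory, the second bounds network round trips.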
9.5 Failure runbook — query performance
Mode 1: slow query (endpoint P99 exceeds the SLA):
Diagnose:
- Temporarily enable spring.jpa.show-sql: true and org.hibernate.SQL: DEBUG.
- Capture the slow query and run EXPLAIN ANALYZE in Postgres.
- Check for a missing index or a full table scan.
Remediate: add an index, refactor the query, use a DTO projection.
Mode 2: connection pool exhausted (see Module 04, lesson 05):
Symptom: hikaricp.connections.pending stays above 0. Usual cause: long transactions plus N+1, so each transaction holds a connection for 5+ seconds.
Remediate: fix the N+1 (4 options in section 5) and add a transaction timeout with @Transactional(timeout = 5).
Mode 3: MultipleBagFetchException at runtime:
Symptom: HTTP 500 with "cannot simultaneously fetch multiple bags".
Diagnose: a query applies JOIN FETCH to 2 List collections.
Remediate: covered in section 6. Change one List to a Set, or split into 2 EntityGraph queries.
Mode 4: cascade gone wrong (a delete removes more than expected):
Diagnose: cascade = ALL on a @ManyToOne (anti-pattern), so deleting a child cascades up to the parent.
Remediate: restrict cascade to parent → child. Audit the cascade configuration of every entity. Add CI tests that verify delete behavior.
Mode 5: LazyInitializationException in production:
Diagnose: an entity is serialized outside a transaction (e.g. a controller returns the raw entity while OSIV is off).
Remediate: map to a DTO in the service. Enabling OSIV temporarily is a stopgap, not a long-term fix.
9.6 Enterprise pattern: read replica routing with CQRS
Apps with a high read:write ratio (10:1, 100:1) can route read-only transactions to a replica:
@Configuration
public class RoutingDataSourceConfig {
@Bean
@Primary
public DataSource routingDataSource(
@Qualifier("primary") DataSource primary,
@Qualifier("replica") DataSource replica) {
AbstractRoutingDataSource routing = new AbstractRoutingDataSource() {
@Override
protected Object determineCurrentLookupKey() {
return TransactionSynchronizationManager
.isCurrentTransactionReadOnly() ? "replica" : "primary";
}
};
routing.setTargetDataSources(Map.of("primary", primary, "replica", replica));
routing.setDefaultTargetDataSource(primary);
return new LazyConnectionDataSourceProxy(routing);
}
}
LazyConnectionDataSourceProxy matters here: it defers acquiring the connection until the first statement runs, by which point the transaction's readOnly flag is set and the router can pick the right target.
CQRS service split:
@Service
@Transactional(readOnly = true)
public class ProjectQueryService {
public Page<ProjectDto> list(Pageable p) { ... } // route → replica
}
@Service
@Transactional
public class ProjectCommandService {
public Project create(...) { ... } // route → primary
}
CQRS pattern: the command service writes to the primary, the query service reads from the replica. Module 12 goes deeper into CQRS and event sourcing.
Replication lag warning: the replica trails the primary, typically by 10-100 ms and by more under load. A read on the replica may not see data that was just written on the primary. For strict consistency (read-after-write), force the read onto the primary by using a plain @Transactional (the default, not readOnly).
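The read-after-write hazard can be reproduced with a toy model. Here two in-memory maps stand in for the primary and the replica, and replication only happens when `replicate()` is called; all names are hypothetical and nothing in this sketch touches a real DataSource.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model of replication lag: the replica only catches up when replicate() runs.
class ReplicationLagDemo {

    static final Map<Long, String> primary = new HashMap<>();
    static final Map<Long, String> replica = new HashMap<>();

    // Writes always go to the primary.
    static void write(long id, String name) {
        primary.put(id, name);
    }

    // Stands in for asynchronous replication finishing some time later.
    static void replicate() {
        replica.putAll(primary);
    }

    // Same routing rule as the lookup key: read-only transactions read from the replica.
    static Optional<String> read(long id, boolean readOnlyTx) {
        Map<Long, String> source = readOnlyTx ? replica : primary;
        return Optional.ofNullable(source.get(id));
    }

    public static void main(String[] args) {
        write(1L, "Mobile App");
        System.out.println("replica before catch-up: " + read(1L, true));
        System.out.println("primary:                 " + read(1L, false));
        replicate();
        System.out.println("replica after catch-up:  " + read(1L, true));
    }
}
```

Any flow that writes and then immediately reads back (for example a create followed by a redirect to the detail page) is a candidate for reading from the primary.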
9.7 Schema evolution with relationships
In production, adding or changing a relationship has to stay backwards compatible during a rolling deploy:
| Change | Strategy |
|---|---|
| Add a new @OneToMany | Add the FK column as nullable, backfill gradually, then deploy the code |
| Convert @ManyToOne to @ManyToMany | Create a separate join table, migrate gradually, deprecate the old FK |
| Change LAZY to EAGER | Test thoroughly: it can trigger N+1 on code paths that do not expect eager loading |
| Add cascade = CascadeType.ALL | Roll out gradually, test the exact cascade scope |
| Drop a relationship | Remove code references first, drop the column later (separate migration) |
Enterprise pattern: schema migrations must be reversible. Every Flyway migration has a rollback script. Module 04 lesson 06 goes deeper into Flyway.
10. Common pitfalls
❌ Mistake 1: leaving @ManyToOne at its default EAGER.
✅ Override to LAZY everywhere.
❌ Mistake 2: N+1 goes undetected. ✅ Enable production-ready SQL logging from day 1.
❌ Mistake 3: cascade ALL on @ManyToOne.
✅ Cascade only from parent to child. @ManyToOne normally has no cascade.
❌ Mistake 4: forgetting the bidirectional helper methods.
✅ addTask/removeTask keep both sides in sync.
❌ Mistake 5: JOIN FETCH on 2 collections. ✅ It throws MultipleBagFetchException. Use a Set or 2 queries.
❌ Mistake 6: @ManyToMany for a relationship that has metadata.
✅ Convert early to an intermediate entity.
❌ Mistake 7: modifying an entity outside @Transactional.
✅ Service methods are always @Transactional.
11. 📚 Deep Dive Spring Reference
Hibernate:
Vlad Mihalcea (essential):
- N+1 query problem
- JOIN FETCH vs EntityGraph
- MultipleBagFetchException
- The best way to map @ManyToMany
- Cascade types
Tool:
- IntelliJ "JPA Console" — preview JPQL → SQL.
- Datasource Proxy: logs SQL with the caller.
- Hypersistence Optimizer — paid lib detect anti-pattern auto.
12. Summary
- 4 associations: @OneToOne, @OneToMany, @ManyToOne, @ManyToMany.
- Owning vs inverse side: the side with @JoinColumn is the owning side. The inverse side uses mappedBy.
- Bidirectional associations need helper methods that sync both sides: addTask/removeTask.
- Fetch type: LAZY is the default for @OneToMany/@ManyToMany. EAGER is the default for @ManyToOne/@OneToOne; always override it to LAZY.
- N+1 problem: lazy loading inside a loop → 1 + N queries and poor performance.
- 4 fixes for N+1: JOIN FETCH (@Query), @EntityGraph, batch size, DTO projection.
- Cascade type: ALL for parent-child with strict ownership. Never cascade from child to parent.
- orphanRemoval = true: removing from the list → DELETE in the DB.
- MultipleBagFetchException: JOIN FETCH on 2 List collections fails. Fix: a Set or 2 queries.
- @OneToOne LAZY: the inverse side (mappedBy) is forced to load eagerly. Use the FK pattern or bytecode enhancement.
- Plain @ManyToMany: convert to an intermediate entity when you need metadata.
13. Self-check
Q1. How many SQL queries does the following code issue? Why?
@Transactional(readOnly = true) // method of a @Service class
public List<ProjectDto> listProjects() {
List<Project> projects = repo.findAll(); // 100 projects
return projects.stream()
.map(p -> new ProjectDto(p.getId(), p.getName(), p.getTasks().size()))
.toList();
}
101 queries: the classic N+1 problem.
Analysis:
- repo.findAll() → 1 SQL: SELECT * FROM projects returns 100 rows.
- The loop visits 100 projects, and each p.getTasks().size() triggers a lazy load:
  - Project 1: SELECT * FROM tasks WHERE project_id = 1 (1 SQL).
  - Project 2: SELECT * FROM tasks WHERE project_id = 2 (1 SQL).
  - ... 100 SQL in total.
- Total: 1 + 100 = 101 queries.
Performance impact:
- Local DB, ping <1 ms: 101 × 1 ms ≈ 101 ms. Slow but acceptable.
- Cloud DB, ping 50 ms: 101 × 50 ms ≈ 5 seconds. The UI times out.
- 10,000 projects → 10,001 queries × 50 ms ≈ 500 seconds.
Cleanest fix: DTO projection (single SQL):
public record ProjectDto(Long id, String name, long taskCount) {}
@Query("""
SELECT new com.olhub.dto.ProjectDto(p.id, p.name, COUNT(t.id))
FROM Project p LEFT JOIN p.tasks t
GROUP BY p.id, p.name
""")
List<ProjectDto> findAllSummaries();
SQL: 1 query with GROUP BY:
SELECT p.id, p.name, COUNT(t.id) AS task_count
FROM projects p LEFT JOIN tasks t ON t.project_id = p.id
GROUP BY p.id, p.name;
Performance: 1 query × 50 ms = 50 ms, roughly 100x faster than the 101-query version.
Alternative: JOIN FETCH (loads full entities):
@Query("SELECT DISTINCT p FROM Project p LEFT JOIN FETCH p.tasks")
List<Project> findAllWithTasks();
// Service
public List<ProjectDto> list() {
return repo.findAllWithTasks().stream()
.map(p -> new ProjectDto(p.getId(), p.getName(), p.getTasks().size()))
.toList();
}
1 query, but it returns one row per (project, task) pair and hydrates every Task just to call size(). Far better than 101 queries, heavier than the DTO projection. Use case: you need to modify the tasks after loading.
Recommendation for list endpoints: DTO projection. For a single entity with full data: JOIN FETCH.
Q2. Compare the 4 fixes for N+1: JOIN FETCH, @EntityGraph, batch_size, DTO projection. When do you pick which?
Comparison:
| Approach | SQL count | Fetch full entity | Refactor cost | Best for |
|---|---|---|---|---|
| JOIN FETCH | 1 (with JOIN) | ✅ | Med (write @Query) | Single entity load + modify |
| @EntityGraph | 1 (with JOIN) | ✅ | Low (annotation only) | Same as JOIN FETCH, declarative |
| batch_size config | 1 + ⌈N / batch⌉ (batched IN) | ✅ | Zero (config only) | Quick win without refactoring |
| DTO projection | 1 (no JOIN entity) | ❌ (subset) | Med (DTO + JPQL) | Read-only, performance critical |
Decision tree:
Need to modify the entity after loading?
  Yes → JOIN FETCH or @EntityGraph
  No → DTO projection (best performance)
Need every field of the entity?
  Yes → JOIN FETCH
  No → DTO projection (subset of fields)
Can the query be refactored?
  No, the legacy app is too large → batch_size config (quick win)
  Yes → JOIN FETCH / @EntityGraph
Many methods sharing the same fetch plan?
  Yes → @EntityGraph (reuse via @NamedEntityGraph)
  No → JOIN FETCH inline in @Query
Sample code for each:
// 1. JOIN FETCH
@Query("SELECT DISTINCT p FROM Project p LEFT JOIN FETCH p.tasks WHERE p.id = :id")
Optional<Project> findByIdWithTasks(@Param("id") Long id);
// 2. @EntityGraph
@EntityGraph(attributePaths = {"tasks", "tasks.assignee"})
Optional<Project> findById(Long id);
// 3. batch_size — config global
spring.jpa.properties.hibernate.default_batch_fetch_size: 50
// Per-entity
@OneToMany(mappedBy = "project")
@BatchSize(size = 50)
private List<Task> tasks;
// 4. DTO projection
@Query("SELECT new com.olhub.dto.ProjectSummary(p.id, p.name, COUNT(t.id)) " +
"FROM Project p LEFT JOIN p.tasks t GROUP BY p.id, p.name")
List<ProjectSummary> findSummaries();
Performance comparison for 100 projects × 10 tasks (illustrative figures, assuming 50 ms per round trip):
- N+1 (no fix): 1 + 100 = 101 queries × 50 ms = 5050 ms.
- batch_size=50: 1 + 2 = 3 queries × 50 ms = 150 ms.
- JOIN FETCH: 1 query × 100 ms = 100 ms (the joined result is larger).
- @EntityGraph: same as JOIN FETCH (100 ms).
- DTO projection: 1 GROUP BY query × 80 ms = 80 ms (a count, no entities).
Recommended pattern (2026) for TaskFlow:
- List endpoint (displaying 100+ items): DTO projection.
- Single GET that needs associations: @EntityGraph (clean).
- Modify scenario: JOIN FETCH on a single entity.
- Legacy app that is hard to refactor: batch_size + monitoring.
Q3. Does JPA default to LAZY or EAGER for each association? What is recommended in production?
| Annotation | JPA spec default | Recommended in production |
|---|---|---|
| @OneToOne | EAGER | LAZY (override) |
| @ManyToOne | EAGER | LAZY (override) |
| @OneToMany | LAZY | LAZY (keep) |
| @ManyToMany | LAZY | LAZY (keep) |
Why the spec defaults to EAGER for *ToOne:
- A single record: loading 1 extra row is negligible.
- It avoids LazyInitializationException outside the session.
- The default dates back to JPA 1.0 (2006), a pre-microservice era when convenience weighed more than performance.
Why production code overrides to LAZY:
- Chained EAGER: Order → Customer (eager) → Account (eager) → ... → 5+ JOINs for one simple query.
- Unneeded loads: an Order list does not need Customer details. Eager loading wastes bandwidth.
- Memory: 100 entities with eager associations can mean 1000+ objects on the heap.
- Predictability: an explicit fetch on a LAZY association beats an implicit one.
Override pattern:
@Entity
public class Order {
@ManyToOne(fetch = FetchType.LAZY) // override default EAGER
@JoinColumn(name = "customer_id")
private Customer customer;
@OneToOne(fetch = FetchType.LAZY) // override default EAGER
@JoinColumn(name = "invoice_id")
private Invoice invoice;
@OneToMany(mappedBy = "order") // already LAZY
private List<OrderItem> items;
}
Rules for working with LAZY:
- Override every @ManyToOne and @OneToOne to LAZY.
- Service methods are always @Transactional so that lazy loading works.
- Fetch explicitly with JOIN FETCH / @EntityGraph when you need an association.
- Detect N+1 with production-ready SQL logging.
Lazy gotcha: `@OneToOne` with mappedBy is forced to load eagerly:
@Entity
public class Order {
@OneToOne(fetch = FetchType.LAZY, mappedBy = "order")
Invoice invoice; // still loaded eagerly: the mappedBy side needs a null check
}
Hibernate has to query the DB to check whether an invoice exists before it can decide between null and a proxy. This cannot be avoided with this mapping; the workarounds are bytecode enhancement (Hibernate 5+) or redesigning the relationship (FK pattern).
Q4. Bidirectional `@OneToMany` + `@ManyToOne`: why are the helper methods `addTask`/`removeTask` needed? Give an example of a bug when the sides are not synced.
Why the helper is needed:
Bidirectional = 2 references that represent 1 relationship. Modifying one side does not update the other. The state becomes inconsistent and causes runtime bugs.
Example bug without sync:
@Entity
public class Project {
@OneToMany(mappedBy = "project", cascade = CascadeType.ALL, orphanRemoval = true)
private List<Task> tasks = new ArrayList<>();
public List<Task> getTasks() { return tasks; }
}
@Entity
public class Task {
@ManyToOne
@JoinColumn(name = "project_id")
private Project project;
}
// Service code
@Transactional
public Task createTask(Long projectId, String title) {
Project project = projectRepo.findById(projectId).orElseThrow();
Task task = new Task(title);
task.setProject(project); // line 1: sets the owning side
project.getTasks().add(task); // line 2: sets the inverse side; easy to forget
// Bug 1: forget line 2 → Project.tasks does NOT contain the new task in the current tx
// → the service returns a Project DTO with the old task list → the user does not see the new task
// Bug 2: forget line 1 → Task.project = null → INSERT task with project_id NULL → constraint violation
return taskRepo.save(task);
}
Fix with helper methods:
@Entity
public class Project {
@OneToMany(mappedBy = "project", cascade = CascadeType.ALL, orphanRemoval = true)
private List<Task> tasks = new ArrayList<>();
// Helper sync 2 side
public void addTask(Task task) {
tasks.add(task);
task.setProject(this); // keep the owning side in sync
}
public void removeTask(Task task) {
tasks.remove(task); // removal from the collection triggers orphan removal
task.setProject(null); // keep the owning side in sync
}
}
// Service code clean
@Transactional
public Task createTask(Long projectId, String title) {
Project project = projectRepo.findById(projectId).orElseThrow();
Task task = new Task(title);
project.addTask(task); // 1 line, 2 side sync
return taskRepo.save(task);
}
Benefits of the helper:
- Encapsulates the sync logic: change it in one place and every caller gets it.
- Prevents bugs: developers do not have to remember to sync both sides, because the method enforces it.
- Test friendly: verify the behavior of the method without inspecting internal state.
Does JPA cascade sync the sides automatically? No:
// Project has cascade = ALL + orphanRemoval = true
@Transactional
public void example() {
Project project = projectRepo.findById(projectId).orElseThrow();
Task task = new Task(title);
project.getTasks().add(task); // modifies only the inverse side
task.setProject(project); // STILL required: the owning side supplies project_id
// On commit:
// - Cascade PERSIST: INSERT task with project_id (set by setProject)
// - OK
}
Cascade does not sync the two sides; it only propagates operations. Syncing stays manual.
Rule: every bidirectional entity has helper methods. Entity-level tests (outside Spring) verify the sync behavior.
Q5. The query `@Query("SELECT p FROM Project p LEFT JOIN FETCH p.tasks LEFT JOIN FETCH p.contributors")` throws `MultipleBagFetchException`. Why? How do you fix it?
Why:
A "bag" in Hibernate is an unordered List (a List without @OrderColumn). @OneToMany List<Task> tasks and @OneToMany List<User> contributors are both bags.
JOIN FETCH on 2 bags at once → a Cartesian product in SQL:
SELECT p.*, t.*, c.*
FROM projects p
LEFT JOIN tasks t ON t.project_id = p.id
LEFT JOIN contributors c ON c.project_id = p.id;
-- A project with 10 tasks + 5 contributors → 10 × 5 = 50 rows returned
-- Hibernate cannot deduplicate a bag-bag combination
-- Throws: org.hibernate.loader.MultipleBagFetchException
Hibernate fails on purpose so that the developer is not surprised later:
- Result rows inflate: 10 tasks + 5 contributors → 50 rows, and memory use explodes with large entities.
- Hibernate cannot guess the correct deduplication strategy.
3 ways to fix it:
Fix 1: change 1 collection to a Set:
@Entity
public class Project {
@OneToMany(mappedBy = "project")
private Set<Task> tasks = new HashSet<>(); // Set, not List
@OneToMany(mappedBy = "project")
private List<User> contributors = new ArrayList<>(); // List
}
// Query work — 1 bag + 1 set OK
@Query("SELECT DISTINCT p FROM Project p LEFT JOIN FETCH p.tasks LEFT JOIN FETCH p.contributors")
Project findById(Long id);Set không phải bag → Hibernate cho phép. Nhưng Cartesian product vẫn xảy ra trong DB → result row count vẫn 50. Performance impact.
Fix 2: split into two queries (recommended)
// Spring Data derives the query from "ById"; the text between "find" and "By" is descriptive only
@EntityGraph(attributePaths = "tasks")
Optional<Project> findWithTasksById(Long id);
@EntityGraph(attributePaths = "contributors")
Optional<Project> findWithContributorsById(Long id);
// Service layer
@Transactional(readOnly = true)
public ProjectDetailDto getDetail(Long id) {
    Project p1 = repo.findWithTasksById(id).orElseThrow();
    Project p2 = repo.findWithContributorsById(id).orElseThrow();
    // Note: inside one transaction both calls return the same managed instance;
    // the second query only initializes contributors on it
    return new ProjectDetailDto(
        p1.getId(),
        p1.getName(),
        p1.getTasks().stream().map(TaskDto::from).toList(),
        p2.getContributors().stream().map(UserDto::from).toList()
    );
}
Two SQL statements: one JOIN on tasks, one JOIN on contributors. No Cartesian product. Total rows: 10 + 5 = 15.
Fix 3: DTO projection (best performance)
public record ProjectDetailDto(
    Long id, String name,
    List<TaskDto> tasks,
    List<UserDto> contributors
) {}
// A single custom @Query with SELECT new gets complicated here because there are two collections,
// so run two DTO queries and assemble the result in the service
@Query("SELECT new com.olhub.dto.TaskDto(t.id, t.title) FROM Task t WHERE t.project.id = :id")
List<TaskDto> findTasksByProjectId(@Param("id") Long id);
@Query("SELECT new com.olhub.dto.UserDto(u.id, u.name) FROM Project p JOIN p.contributors u WHERE p.id = :id")
List<UserDto> findContributorsByProjectId(@Param("id") Long id);
Best performance: only the columns you need are selected.
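The "two DTO queries, then assemble in the service" step of Fix 3 might look like the sketch below. It is plain Java with an in-memory fake instead of Spring Data, so it runs on its own. `ProjectQueries`, `ProjectSummary`, `findSummaryById`, `InMemoryQueries`, and the sample data are all assumptions introduced for illustration; only the two `find...ByProjectId` method names come from the repository code above.

```java
import java.util.List;
import java.util.Optional;

class ProjectionDemo {
    public static void main(String[] args) {
        ProjectDetailService service = new ProjectDetailService(new InMemoryQueries());
        System.out.println(service.getDetail(1L));
    }
}

record TaskDto(Long id, String title) {}
record UserDto(Long id, String name) {}
record ProjectSummary(Long id, String name) {}
record ProjectDetailDto(Long id, String name, List<TaskDto> tasks, List<UserDto> contributors) {}

// Hypothetical port; in a real app these would be Spring Data repository methods.
interface ProjectQueries {
    Optional<ProjectSummary> findSummaryById(Long id); // assumed extra query for id + name
    List<TaskDto> findTasksByProjectId(Long id);
    List<UserDto> findContributorsByProjectId(Long id);
}

class ProjectDetailService {
    private final ProjectQueries queries;

    ProjectDetailService(ProjectQueries queries) { this.queries = queries; }

    // Each query returns flat DTO rows, so no query joins two collections.
    ProjectDetailDto getDetail(Long id) {
        ProjectSummary summary = queries.findSummaryById(id)
                .orElseThrow(() -> new IllegalArgumentException("No project with id " + id));
        return new ProjectDetailDto(
                summary.id(),
                summary.name(),
                queries.findTasksByProjectId(id),
                queries.findContributorsByProjectId(id));
    }
}

// In-memory fake used only to make the sketch runnable.
class InMemoryQueries implements ProjectQueries {
    public Optional<ProjectSummary> findSummaryById(Long id) {
        return id == 1L ? Optional.of(new ProjectSummary(1L, "Mobile App")) : Optional.empty();
    }

    public List<TaskDto> findTasksByProjectId(Long id) {
        return id == 1L ? List.of(new TaskDto(10L, "Design"), new TaskDto(11L, "Build")) : List.of();
    }

    public List<UserDto> findContributorsByProjectId(Long id) {
        return id == 1L ? List.of(new UserDto(20L, "An")) : List.of();
    }
}
```

The trade-off is one extra round trip per collection in exchange for never loading managed entities, which is usually a good deal on read-only endpoints.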
Recommendation: Fix 2 (two queries with EntityGraph) for most cases; Fix 3 for high-traffic endpoints.
Q6. Cascade ALL vs. the default (no cascade): when should you use which? What cascade makes sense on `@ManyToOne`?
Cascade rule: propagate from parent to child only.
| Relationship | Sensible cascade | Why |
|---|---|---|
| `@OneToMany` (parent → child) | ALL + orphanRemoval = true | Project owns Tasks. Deleting a Project deletes its Tasks. |
| `@ManyToOne` (child → parent) | None (the default) | Task does not own Project. Deleting a Task should not delete the Project. |
| `@OneToOne` (one owner) | ALL on the owner side | Order owns Invoice, the same pattern as @OneToMany. |
| `@ManyToMany` | None (PERSIST/MERGE only if needed) | User and Role are independent. Deleting a User does not delete Roles. |
With @ManyToOne, do NOT cascade:
@Entity
public class Task {
    @ManyToOne(fetch = FetchType.LAZY) // no cascade
    @JoinColumn(name = "project_id", nullable = false)
    private Project project;
}
Why not:
- cascade = CascadeType.REMOVE on @ManyToOne is a disaster:
taskRepo.delete(task); // delete one task
// CASCADE: also deletes the project!
// → every OTHER task still references it → constraint violation or orphaned rows
- cascade = ALL includes REMOVE, so it is just as dangerous.
- Semantics: a Task exists only while its Project exists. The reverse is not true.
When cascade on @ManyToOne is reasonable:
- CascadeType.PERSIST: saving a Task with a new Project (not yet saved) cascades the persist to the Project first.
Task task = new Task("Design");
task.setProject(new Project("Mobile App")); // project is transient
taskRepo.save(task);
// Without cascade: throws TransientPropertyValueException
// With CascadeType.PERSIST: the project is saved first, then the task
Rarely used: normally you create the Project first, then associate the Task.
Safe defaults:
// @ManyToOne: no cascade
@ManyToOne(fetch = FetchType.LAZY)
private Project project;
// @OneToMany parent: cascade ALL + orphanRemoval
@OneToMany(mappedBy = "project", cascade = CascadeType.ALL, orphanRemoval = true)
private List<Task> tasks;
orphanRemoval semantics:
// Without orphanRemoval = true
project.getTasks().remove(task);
// tasks.project_id stays set and the row remains in the DB → an "orphan" task with a stale project_id
// With orphanRemoval = true
project.getTasks().remove(task);
// → DELETE FROM tasks WHERE id = ?
// the task row is gone from the DB
orphanRemoval differs from CascadeType.REMOVE:
- orphanRemoval: triggers when the child is detached from the association (removed from the list).
- cascade REMOVE: triggers when the parent is removed.
The two are usually used together (ALL + orphanRemoval) for strict parent-child ownership.
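The two triggers can be contrasted with a toy in-memory model. This is not Hibernate and does not reproduce its flush algorithm; `ToyParent`, its fields, and its methods are hypothetical and only mimic the observable outcome described above (which rows survive).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class OrphanRemovalDemo {
    public static void main(String[] args) {
        ToyParent plain = new ToyParent(false, true, "Design", "Build");
        plain.tasks.remove("Design");
        plain.flush();
        System.out.println("without orphanRemoval: " + plain.taskRows);

        ToyParent strict = new ToyParent(true, true, "Design", "Build");
        strict.tasks.remove("Design");
        strict.flush();
        System.out.println("with orphanRemoval:    " + strict.taskRows);

        strict.removeParent();
        System.out.println("after parent removal:  " + strict.taskRows);
    }
}

// Toy stand-in for one parent, its in-memory collection, and its child rows.
class ToyParent {
    final boolean orphanRemoval;
    final boolean cascadeRemove;
    final List<String> tasks = new ArrayList<>();        // the in-memory collection
    final Set<String> taskRows = new LinkedHashSet<>();  // child rows "in the database"
    boolean parentRowExists = true;

    ToyParent(boolean orphanRemoval, boolean cascadeRemove, String... titles) {
        this.orphanRemoval = orphanRemoval;
        this.cascadeRemove = cascadeRemove;
        for (String title : titles) {
            tasks.add(title);
            taskRows.add(title);
        }
    }

    // Mimics a flush: with orphanRemoval, rows no longer in the collection are deleted.
    void flush() {
        if (orphanRemoval) {
            taskRows.retainAll(tasks);
        }
    }

    // Mimics removing the parent: with cascade REMOVE, the child rows go first.
    void removeParent() {
        if (cascadeRemove) {
            taskRows.clear();
        } else if (!taskRows.isEmpty()) {
            throw new IllegalStateException("FK violation: child rows still reference the parent");
        }
        parentRowExists = false;
    }
}
```

In the model, removing an element from the collection deletes a row only when `orphanRemoval` is on, while deleting the parent clears the child rows only when `cascadeRemove` is on, which is why the two settings are usually enabled together.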
Next lesson: Transactions (@Transactional, propagation, rollback rules)