Pagination & sorting — Pageable, Sort, projection DTO, cursor scroll
Pageable + Sort là API chuẩn cho list endpoint. Bài này bóc cơ chế OFFSET/LIMIT, performance impact deep page, alternatives Slice và keyset pagination, sort multi-field, projection DTO cho perf, validation Pageable, và pattern Spring Data Web.
Module 03 đã introduce Pageable cho REST endpoint. Module 04 này đào sâu DB integration: SQL generated, performance impact OFFSET deep pagination, 3 alternatives (Slice, KeysetPagination, Window), pattern projection DTO cho perf, validation Pageable input.
Pagination quan trọng cho production — list 1M record trả về full = OOM + slow. Implement đúng = scale tới 100M+ record.
1. Pageable basic
public interface ProjectRepository extends JpaRepository<Project, Long> {
Page<Project> findAll(Pageable pageable);
Page<Project> findByStatus(ProjectStatus status, Pageable pageable);
}
@RestController
public class ProjectController {
@GetMapping("/projects")
public Page<ProjectDto> list(
@RequestParam(required = false) ProjectStatus status,
Pageable pageable // auto-bind from query string
) {
Page<Project> page = status == null
? repo.findAll(pageable)
: repo.findByStatus(status, pageable);
return page.map(ProjectDto::from);
}
}
Request: GET /projects?status=ACTIVE&page=0&size=20&sort=createdAt,desc.
Spring Data Web bind:
page=0→ page index (0-based).size=20→ results per page.sort=createdAt,desc→ sort field + direction.
Pageable resolve qua PageableHandlerMethodArgumentResolver.
2. Generated SQL
-- Query 1: data
SELECT p.* FROM projects p
WHERE p.status = ? -- 'ACTIVE'
ORDER BY p.created_at DESC
LIMIT 20 OFFSET 0;
-- Query 2: total count
SELECT COUNT(*) FROM projects p
WHERE p.status = ?; -- 'ACTIVE'
Spring wrap result trong Page<Project>:
public interface Page<T> extends Slice<T> {
int getTotalPages();
long getTotalElements();
boolean isFirst();
boolean isLast();
boolean hasNext();
boolean hasPrevious();
int getNumber();
int getNumberOfElements();
int getSize();
Pageable getPageable();
List<T> getContent();
}
JSON response:
{
"content": [...],
"pageable": {"pageNumber": 0, "pageSize": 20, "sort": ...},
"totalElements": 1000,
"totalPages": 50,
"first": true,
"last": false,
"size": 20,
"number": 0,
"numberOfElements": 20,
"empty": false
}
3. Performance — OFFSET deep page
OFFSET pagination scale linearly với page number:
-- Page 0
SELECT * FROM projects ORDER BY created_at DESC LIMIT 20 OFFSET 0;
-- Time: 5ms
-- Page 100
SELECT * FROM projects ORDER BY created_at DESC LIMIT 20 OFFSET 2000;
-- Time: 50ms (Postgres skip 2000 rows first)
-- Page 1000
SELECT * FROM projects ORDER BY created_at DESC LIMIT 20 OFFSET 20000;
-- Time: 500ms
-- Page 10000
SELECT * FROM projects ORDER BY created_at DESC LIMIT 20 OFFSET 200000;
-- Time: 5000ms — UI timeout
Vì sao slow: Postgres phải scan + sort 20000 row trước khi apply OFFSET 20000.
Optimize 1 — index:
CREATE INDEX idx_projects_created ON projects(created_at DESC);
Index scan thay sequential. Time linear vẫn nhưng faster (~10x).
Optimize 2 — limit max page:
@GetMapping("/projects")
public Page<ProjectDto> list(
@RequestParam(defaultValue = "0") @Min(0) @Max(99) int page, // limit 100 page
@RequestParam(defaultValue = "20") @Min(1) @Max(100) int size,
Sort sort
) { ... }
Force user use search/filter thay scroll deep.
Optimize 3 — keyset pagination (section 5).
4. Slice — bỏ COUNT
Slice không count total — chỉ check "có next page".
Slice<Project> findByStatus(ProjectStatus status, Pageable pageable);
Mechanism: Hibernate fetch size + 1 row. Nếu có row thứ size+1 → hasNext = true.
-- Slice query (size=20)
SELECT * FROM projects WHERE status = ? ORDER BY ... LIMIT 21 OFFSET 0;
-- 1 query only — no COUNT
Performance: ~50% faster cho table lớn (skip COUNT(*)).
JSON:
{
"content": [...],
"pageable": ...,
"first": true,
"last": false,
"hasNext": true,
"size": 20,
"number": 0,
"numberOfElements": 20
}
Use case:
- Infinite scroll UI (Twitter, Instagram).
- Mobile app không hiển thị "page X of Y".
- Performance critical — bypass COUNT.
Trade-off: client không biết total. UX show "Load more" button thay "Page X of Y".
5. Keyset pagination — cursor-based
OFFSET fragile với data thay đổi:
Time T1: list page=0, size=20 → return row 1-20
Time T2: insert new row (now ID 1)
Time T3: list page=1, size=20 → expect row 21-40, but get 20-39 (1 row shifted)
Keyset pagination (cursor-based) dùng "last seen value":
public interface ProjectRepository extends JpaRepository<Project, Long> {
@Query("""
SELECT p FROM Project p
WHERE p.createdAt < :cursor
ORDER BY p.createdAt DESC
""")
List<Project> findNextPage(@Param("cursor") Instant cursor, Pageable pageable);
}
// Service
public List<ProjectDto> nextPage(Instant lastCreatedAt, int size) {
Pageable limit = PageRequest.of(0, size);
return repo.findNextPage(lastCreatedAt, limit).stream()
.map(ProjectDto::from)
.toList();
}
// First page: cursor = Instant.now() (max value)
// Next page: cursor = lastReturnedRow.createdAt
SQL:
-- Page 1
SELECT * FROM projects WHERE created_at < '9999-01-01' ORDER BY created_at DESC LIMIT 20;
-- Page 2 (cursor = lastRow.createdAt = '2026-04-15 10:00:00')
SELECT * FROM projects WHERE created_at < '2026-04-15 10:00:00' ORDER BY created_at DESC LIMIT 20;
-- Page 3 (cursor = next lastRow.createdAt)
SELECT * FROM projects WHERE created_at < '...' ORDER BY created_at DESC LIMIT 20;
Performance: O(log N) với index — same speed mọi page.
Pros:
- Constant performance mọi page.
- Stable: insert mới không shift.
- Real-time friendly: news feed, log viewer.
Cons:
- No "page X of Y": chỉ next/previous.
- No jump page: không skip directly to page 50.
- Sort key unique: nếu nhiều row cùng
createdAt→ cần composite cursor (createdAt + id).
5.1 Spring Data Window/ScrollPosition (3.1+)
Spring Data 3.1 thêm Window API native:
public interface ProjectRepository extends JpaRepository<Project, Long> {
Window<Project> findByStatus(ProjectStatus status, ScrollPosition position);
}
// First page
Window<Project> first = repo.findByStatus(ACTIVE, ScrollPosition.offset());
// Next page
ScrollPosition next = first.positionAt(first.size() - 1);
Window<Project> page2 = repo.findByStatus(ACTIVE, next);
ScrollPosition:
OffsetScrollPosition— OFFSET-based.KeysetScrollPosition— keyset-based.
Native API support both modes — switch implementation dễ.
6. Sort — multi-field
// Single field
Sort sort = Sort.by("createdAt").descending();
// Multi-field
Sort sort = Sort.by(
Sort.Order.desc("priority"),
Sort.Order.asc("name")
);
// In Pageable
Pageable page = PageRequest.of(0, 20, Sort.by("createdAt").descending());
Request URL: ?sort=priority,desc&sort=name,asc.
Spring bind multiple sort fields automatically.
6.1 Sort by nested property
// Sort by Project.owner.name (relationship)
Sort sort = Sort.by("owner.name");
// JPQL: ORDER BY p.owner.name
Hibernate JOIN owner table for sort.
6.2 Custom sort whitelist
User-controlled sort risky:
?sort=password,desc # leak password order info
?sort=expensive_compute_field,asc # slow query
Whitelist allowed:
private static final Set<String> ALLOWED = Set.of("createdAt", "name", "priority");
@GetMapping("/projects")
public Page<ProjectDto> list(Pageable pageable) {
pageable.getSort().forEach(order -> {
if (!ALLOWED.contains(order.getProperty())) {
throw new ValidationException("Cannot sort by: " + order.getProperty());
}
});
return service.list(pageable);
}
Hoặc dùng @SortDefault:
@GetMapping("/projects")
public Page<ProjectDto> list(
@SortDefault.SortDefaults({
@SortDefault(sort = "createdAt", direction = Sort.Direction.DESC)
})
Pageable pageable
) { ... }
7. Validation Pageable
spring:
data:
web:
pageable:
default-page-size: 20
max-page-size: 100 # enforce server-side
one-indexed-parameters: false # 0-based default
max-page-size Spring enforce — request size=10000 → cap to 100. No exception, silent cap.
Manual validation:
@GetMapping("/projects")
public Page<ProjectDto> list(
@PageableDefault(size = 20)
@Validated Pageable pageable
) {
if (pageable.getPageSize() > 100) {
throw new ValidationException("Page size max 100");
}
return service.list(pageable);
}
8. Projection DTO + Pageable
Module 04 bài 03 đã giới thiệu projection. Combine với Pageable:
@Query("""
SELECT new com.olhub.dto.ProjectSummary(p.id, p.name, p.status, COUNT(t.id))
FROM Project p LEFT JOIN p.tasks t
WHERE p.status = :status
GROUP BY p.id, p.name, p.status
""",
countQuery = """
SELECT COUNT(p) FROM Project p WHERE p.status = :status
""")
Page<ProjectSummary> findSummariesByStatus(@Param("status") ProjectStatus status, Pageable page);
countQuery quan trọng — default count query có thể join unnecessary table cho count. Custom optimal.
Performance:
- Default: load full Project entity → map to DTO. 10 column from DB, mapping overhead.
- Projection: SELECT 4 column directly. 50% less network data + faster.
For list endpoint hot path: always use DTO projection.
9. Pattern thực tế
9.1 Filter + sort + page combined
@RestController
public class ProjectController {
@GetMapping("/projects")
public Page<ProjectSummary> list(
@RequestParam(required = false) ProjectStatus status,
@RequestParam(required = false) @DateTimeFormat(iso = ISO.DATE) LocalDate from,
@RequestParam(required = false) @DateTimeFormat(iso = ISO.DATE) LocalDate to,
@PageableDefault(size = 20) @SortDefault("createdAt") Pageable pageable
) {
return service.search(new ProjectFilter(status, from, to), pageable);
}
}
@Service
public class ProjectService {
public Page<ProjectSummary> search(ProjectFilter filter, Pageable pageable) {
// Use Specification cho dynamic filter
Specification<Project> spec = ProjectSpecs.matching(filter);
return repo.findAll(spec, pageable).map(ProjectSummary::from);
}
}
9.2 Slice cho infinite scroll mobile
@GetMapping("/feed")
public Slice<ProjectDto> feed(@PageableDefault(size = 10) Pageable pageable) {
return repo.findRecent(pageable).map(ProjectDto::from);
}
Mobile UI render "Load more" button. No total count needed.
9.3 Keyset cho realtime activity feed
@GetMapping("/activities")
public ScrollResult<Activity> activities(
@RequestParam(required = false) Long lastSeenId,
@RequestParam(defaultValue = "20") @Max(50) int size
) {
ScrollPosition position = lastSeenId == null
? ScrollPosition.offset()
: ScrollPosition.forward(Map.of("id", lastSeenId));
Window<Activity> window = repo.scrollOrderByCreatedAtDesc(position, Limit.of(size));
return new ScrollResult<>(
window.getContent(),
window.hasNext() ? window.positionAt(window.size() - 1) : null
);
}
public record ScrollResult<T>(List<T> items, ScrollPosition nextCursor) {}
Client:
- First request: no cursor.
- Each response:
nextCursorreturned. - Next request: send cursor.
UX: news feed style — never out of date even khi data shift.
10. Vận hành production — DoS protection, performance tuning, monitoring
Pagination endpoint thường là entry point bị scrape / abuse. Section này cover security limits, performance tuning, monitoring deep page anomaly.
10.1 Security limits — DoS qua pagination
Anti-pattern user gửi ?size=999999:
# application.yml — Spring Data Web global config
spring:
data:
web:
pageable:
max-page-size: 100 # cap silent (no error, just truncate)
default-page-size: 20
one-indexed-parameters: false # 0-indexed standard
prefix: "" # ?page=&size=
Hoặc per-endpoint enforce với validation:
@GetMapping("/orders")
public Page<OrderDto> list(
@RequestParam(defaultValue = "0") @Min(0) @Max(10000) int page,
@RequestParam(defaultValue = "20") @Min(1) @Max(100) int size,
@RequestParam(defaultValue = "createdAt,desc") String sort
) { ... }
@Max(10000) cho page ngăn deep page DoS — page 10000 với OFFSET = 200k rows scan.
10.2 Sort field whitelist — security + performance
User gửi ?sort=password,desc → leak data (sort phơi bày null vs non-null) hoặc ?sort=non_indexed_column → full table scan.
private static final Set<String> ALLOWED_SORT = Set.of(
"createdAt", "updatedAt", "name", "status", "priority"
);
@GetMapping("/orders")
public Page<OrderDto> list(@RequestParam(required = false) String sort, Pageable p) {
if (sort != null && !ALLOWED_SORT.contains(sort.split(",")[0])) {
throw new InvalidSortFieldException(sort);
}
return service.list(p);
}
Hoặc map qua DTO sort:
public enum OrderSort {
CREATED_AT("createdAt"),
NAME("name");
final String column;
OrderSort(String c) { this.column = c; }
}
@GetMapping("/orders")
public Page<OrderDto> list(@RequestParam(defaultValue = "CREATED_AT") OrderSort sortBy, ...)
Type-safe — Spring convert enum, invalid value → 400.
10.3 Performance — OFFSET deep page anomaly
Production traffic 99% page 1-3 (UX shallow). Spike page vượt 100 → bot scraping hoặc abuse.
Monitoring metric:
@RestControllerAdvice
public class PaginationMetrics {
private final MeterRegistry registry;
@ModelAttribute
public void trackPagination(@Valid Pageable pageable, HttpServletRequest req) {
registry.summary("pagination.page",
"endpoint", req.getRequestURI()
).record(pageable.getPageNumber());
}
}
Alert deep page anomaly:
- alert: DeepPageScrapingSuspect
expr: histogram_quantile(0.95, pagination_page_bucket) > 100
for: 10m
labels:
severity: warning
P95 page vượt 100 → likely scraping. Investigate → rate limit / CAPTCHA.
10.4 Migration sang keyset pagination
Production app vượt 100k row + frequent insert → migrate OFFSET sang keyset:
// Old API — page-based
GET /orders?page=5&size=20
{ "content": [...], "totalPages": 1000, "page": 5 }
// New API — cursor-based
GET /orders?cursor=eyJpZCI6MTIzfQ&size=20
{ "content": [...], "nextCursor": "eyJpZCI6MTQzfQ" }
Migration strategy:
- Add new endpoint
/orders/v2với cursor-based. - Frontend mobile dùng v2 (infinite scroll natural).
- Frontend admin giữ v1 (Page X of Y vẫn cần).
- Deprecate v1 nếu không còn use case.
Encode cursor base64 JSON {"id": 123, "createdAt": "..."} để client opaque, server decode.
10.5 Failure runbook — pagination
Mode 1 — Slow query deep page:
Triệu chứng: ?page=500&size=100 chạy 30s.
Diagnose: EXPLAIN ANALYZE SELECT ... LIMIT 100 OFFSET 50000 → Postgres scan 50k row trước trả 100.
Remediate: migrate keyset hoặc cap max-page-size strict.
Mode 2 — Inconsistent results across pages:
Triệu chứng: cùng item xuất hiện ở page 1 và page 2 (hoặc bỏ sót).
Diagnose: sort không stable. ORDER BY status không unique → page boundary không xác định.
Remediate: add tie-breaker ORDER BY status, id. Always có unique field cuối sort.
Mode 3 — count() query timeout:
Triệu chứng: list endpoint trả 504, log "count query took 30s".
Diagnose: COUNT trên table lớn vượt timeout. JOIN trong COUNT thừa.
Remediate:
- Switch sang
Slice(skip count). - Custom
countQueryđơn giản, không JOIN unnecessary. - Approximate count với
pg_class.reltuplescho table lớn.
@Query(value = "SELECT o FROM Order o WHERE ...",
countQuery = "SELECT COUNT(o) FROM Order o WHERE ...") // chi WHERE, no JOIN
Page<Order> findFiltered(...);
11. Pitfall tổng hợp
❌ Nhầm 1: OFFSET deep pagination cho large table. ✅ Keyset pagination cho table 100K+ row + frequent inserts.
❌ Nhầm 2: Quên countQuery cho complex @Query.
✅ Provide countQuery để skip JOIN unnecessary cho count.
❌ Nhầm 3: No max size validation.
✅ max-page-size global hoặc @Max(100) on parameter.
❌ Nhầm 4: Sort by user input không whitelist. ✅ Whitelist allowed columns. Reject suspicious.
❌ Nhầm 5: Page entity, map DTO trong service.
Page<Project> page = repo.findAll(p); // load full entity
return page.map(ProjectDto::from); // map in memory
✅ DTO projection trực tiếp via @Query SELECT new ....
❌ Nhầm 6: Bỏ sort default → unstable order across page.
✅ Always specify sort. Default createdAt DESC hoặc id ASC.
❌ Nhầm 7: Page với findAll() (no Pageable).
List<Project> all = repo.findAll(); // could be millions
✅ Always Pageable hoặc Stream cho large dataset.
12. 📚 Deep Dive Spring Reference
Spring Data:
- Spring Data — Pagination
- Spring Data — Limit and Sort
- Spring Data — Scrolling Large Result Sets
- Spring Data Web Support
Pattern:
- Markus Winand — Use the Index, Luke! — keyset pagination chính chủ.
- Vlad Mihalcea — Postgres pagination
Performance:
13. Tóm tắt
Page<T>= content + total count + metadata (totalPages, hasNext, ...). 2 SQL: data + COUNT.Slice<T>= content + hasNext only (no count). 1 SQL với LIMIT size+1.- OFFSET pagination: linear performance — slow cho deep page (page 1000+).
- Keyset pagination (Spring Data 3.1+ Window): O(log N) constant, stable với insert.
- Spring Data Web auto-bind
?page=&size=&sort=→Pageable. - Multi-field sort:
?sort=priority,desc&sort=name,asc. - Whitelist sort columns — security: prevent leak/slow query via user input.
max-page-sizeenforce server-side — default 100. Cap silent (no exception).- DTO projection + Pageable: best performance. Custom
countQueryđể optimize. - Filter + sort + page combined: Specification + Pageable.
- Pattern UX: Page (totalPages displayed), Slice (infinite scroll), Keyset (realtime feed).
14. Tự kiểm tra
Q1So sánh Page, Slice, Keyset pagination. Khi nào pick cái nào?▸
| Type | SQL count | Total info | Performance | Stable insert | Use case |
|---|---|---|---|---|---|
| Page | 2 (data + COUNT) | totalElements, totalPages | O(N) deep page | ❌ | Admin table với "Page X of Y" |
| Slice | 1 (data + 1) | hasNext only | O(N) deep page | ❌ | Mobile infinite scroll |
| Keyset / Window | 1 | hasNext + cursor | O(log N) constant | ✅ | Realtime feed, log viewer |
Decision tree:
Show "Page X of Y" + jump to page?
Yes → Page (need totalElements)
No → Continue
Table size?
Small (<10k rows) → Page or Slice OK
Medium (10k-1M) → Slice (skip COUNT) hoặc keyset
Large (>1M) → Keyset only
Data churn rate (insert frequency)?
Low (<1/min) → OFFSET-based OK
High → Keyset to avoid duplicate / shift
UI pattern?
Numbered pages → Page
"Load more" button → Slice or Keyset
Realtime feed → KeysetCode 3 cách:
// Page — admin
@GetMapping("/admin/projects")
public Page<ProjectDto> adminList(Pageable pageable) {
return repo.findAll(pageable).map(ProjectDto::from);
}
// Slice — mobile infinite scroll
@GetMapping("/feed")
public Slice<ProjectDto> feed(@PageableDefault(size = 10) Pageable pageable) {
return repo.findRecent(pageable).map(ProjectDto::from);
}
// Keyset — realtime activity
@GetMapping("/activities")
public ScrollResult<Activity> activities(
@RequestParam(required = false) Long lastSeenId,
@RequestParam(defaultValue = "20") int size
) {
ScrollPosition pos = lastSeenId == null
? ScrollPosition.offset()
: ScrollPosition.forward(Map.of("id", lastSeenId));
Window<Activity> window = repo.scrollOrderByCreatedAtDesc(pos, Limit.of(size));
return new ScrollResult<>(
window.getContent(),
window.hasNext() ? window.positionAt(window.size() - 1) : null
);
}TaskFlow capstone choice:
- List projects (admin): Page — show totals.
- Tasks within project: Slice — typically only browse forward.
- Activity log (Module 12): Keyset — realtime stream.
Q2Production app có table 1M projects. User request page 1000 (offset 20000) → 5 giây response. Optimize?▸
Vấn đề:
SELECT * FROM projects ORDER BY created_at DESC LIMIT 20 OFFSET 20000;
-- Postgres scan 20020 rows, sort, return last 20
-- O(N) — N = OFFSET + LIMIT4 fix theo độ tốn công:
Fix 1 — Index (cheap, immediate):
CREATE INDEX idx_projects_created_desc ON projects(created_at DESC);Postgres index scan thay sequential. ~10x faster nhưng vẫn O(N).
Fix 2 — Limit max page (prevent abuse):
@GetMapping("/projects")
public Page<ProjectDto> list(
@RequestParam @Min(0) @Max(99) int page, // max page 100
@RequestParam @Min(1) @Max(100) int size,
@SortDefault("createdAt") Pageable pageable
) { ... }Force user use search/filter cho data deep. UX rationale: nobody actually wants page 1000.
Fix 3 — Slice (skip COUNT):
Slice<Project> findByStatus(ProjectStatus status, Pageable pageable);Improve only marginally — main cost vẫn là OFFSET. Helps cho mobile UI nhưng không giải quyết deep page.
Fix 4 — Keyset pagination (best long-term):
public interface ProjectRepository extends JpaRepository<Project, Long> {
Window<Project> scrollOrderByCreatedAtDesc(ScrollPosition position, Limit limit);
}
@GetMapping("/projects")
public ScrollResult<ProjectDto> list(
@RequestParam(required = false) Instant cursor,
@RequestParam(defaultValue = "20") @Max(100) int size
) {
ScrollPosition pos = cursor == null
? ScrollPosition.offset()
: ScrollPosition.forward(Map.of("createdAt", cursor));
Window<Project> w = repo.scrollOrderByCreatedAtDesc(pos, Limit.of(size));
return new ScrollResult<>(
w.getContent().stream().map(ProjectDto::from).toList(),
w.hasNext() ? ((Project) w.getContent().get(w.size()-1)).getCreatedAt() : null
);
}SQL keyset:
-- First page (no cursor)
SELECT * FROM projects ORDER BY created_at DESC LIMIT 20;
-- Page 2 (cursor = lastRow.createdAt)
SELECT * FROM projects WHERE created_at < ? ORDER BY created_at DESC LIMIT 20;
-- Page 1000 (cursor = ...)
SELECT * FROM projects WHERE created_at < ? ORDER BY created_at DESC LIMIT 20;
-- ALL pages: O(log N) — index lookup, không scan offsetPerformance:
- Page 1: 5ms
- Page 100: 5ms
- Page 1000: 5ms — constant!
Trade-off keyset:
- Cons: không jump page direct. UX shift sang "Load more" button.
- Pros: constant performance, stable với insert (no shift), real-time friendly.
Recommend:
- Page (offset) admin: max 100 page enforced. Acceptable for small data.
- Keyset cho user-facing: news feed, activity, dashboard.
- Combo: offset cho first 10 page, keyset cho deeper. Hybrid pattern.
Q3Đoạn `@Query` với projection + Pageable. Có 2 vấn đề performance. Liệt kê + fix.@Query("""
SELECT p FROM Project p
LEFT JOIN p.tasks t
WHERE p.status = :status
GROUP BY p.id
""")
Page<Project> findByStatus(@Param("status") ProjectStatus status, Pageable pageable);
▸
@Query("""
SELECT p FROM Project p
LEFT JOIN p.tasks t
WHERE p.status = :status
GROUP BY p.id
""")
Page<Project> findByStatus(@Param("status") ProjectStatus status, Pageable pageable);- Load full entity thay DTO projection: select all 10+ columns of Project. Mapping overhead. Memory wasted.
Fix:
Select 4 fields directly. Map to DTO at SQL level. ~50% less network data.public record ProjectSummary(Long id, String name, ProjectStatus status, long taskCount) {} @Query(""" SELECT new com.olhub.dto.ProjectSummary(p.id, p.name, p.status, COUNT(t.id)) FROM Project p LEFT JOIN p.tasks t WHERE p.status = :status GROUP BY p.id, p.name, p.status """) Page<ProjectSummary> findSummariesByStatus(@Param("status") ProjectStatus status, Pageable pageable); - Default countQuery: JOIN tasks unnecessary cho count:JOIN slow cho count — count chỉ cần Project rows.
-- Default countQuery (Spring Data auto-derive) SELECT COUNT(p) FROM Project p LEFT JOIN p.tasks t -- JOIN tasks just for count! WHERE p.status = :statusFix: custom
countQuery:Count query no JOIN — fast.@Query(value = """ SELECT new com.olhub.dto.ProjectSummary(p.id, p.name, p.status, COUNT(t.id)) FROM Project p LEFT JOIN p.tasks t WHERE p.status = :status GROUP BY p.id, p.name, p.status """, countQuery = """ SELECT COUNT(p) FROM Project p WHERE p.status = :status """) Page<ProjectSummary> findSummariesByStatus(@Param("status") ProjectStatus status, Pageable pageable);
SQL generated correct:
-- Data query
SELECT p.id, p.name, p.status, COUNT(t.id) AS task_count
FROM projects p LEFT JOIN tasks t ON t.project_id = p.id
WHERE p.status = ?
GROUP BY p.id, p.name, p.status
ORDER BY p.created_at DESC
LIMIT 20 OFFSET 0;
-- Count query (custom)
SELECT COUNT(p.id)
FROM projects p
WHERE p.status = ?;
-- No JOIN — fast cho big tableBonus optimization:
- Index hỗ trợ filter + sort:Composite index. Postgres dùng index scan cho cả WHERE + ORDER BY.
CREATE INDEX idx_projects_status_created ON projects(status, created_at DESC); - Cache count cho hot filter: populate Redis với
COUNT(*)per status, refresh every 5min. Trade staleness for performance.
Performance benchmark (1M project, 10 task/project):
- Original (full entity, default count): 2-3s.
- DTO projection + custom count: 50-100ms.
- + Composite index: 20-30ms.
10-100x speedup cho list endpoint. Worth refactor effort.
Q4User request `?sort=password,desc`. Risk security? Cách prevent?▸
Risk security:
- Information leak: sort by password reveal length distribution. Combined với count → estimate weak passwords.
- Sort by sensitive column: SSN, credit card, salary — leak through observation timing/order.
- Slow column DOS: sort by computed/large column (TEXT, JSONB) → expensive query → DOS.
- Internal column expose:
?sort=internal_admin_flagreveal table structure.
Fix — whitelist allowed sort columns:
@RestController
public class ProjectController {
private static final Set<String> ALLOWED_SORT = Set.of("name", "status", "createdAt", "priority");
@GetMapping("/projects")
public Page<ProjectDto> list(
@RequestParam(required = false) ProjectStatus status,
@PageableDefault(size = 20) @SortDefault("createdAt") Pageable pageable
) {
validateSort(pageable.getSort());
return service.list(status, pageable);
}
private void validateSort(Sort sort) {
sort.forEach(order -> {
if (!ALLOWED_SORT.contains(order.getProperty())) {
throw new ValidationException(
"Sort by '" + order.getProperty() + "' not allowed. " +
"Allowed: " + String.join(", ", ALLOWED_SORT)
);
}
});
}
}Cleaner — custom Pageable resolver:
@Component
public class SafePageableArgumentResolver extends PageableHandlerMethodArgumentResolver {
private static final Map<String, Set<String>> ALLOWED_PER_ENTITY = Map.of(
"/projects", Set.of("name", "status", "createdAt", "priority"),
"/tasks", Set.of("title", "status", "priority", "dueDate")
);
@Override
public Pageable resolveArgument(MethodParameter parameter, ModelAndViewContainer mav,
NativeWebRequest webRequest, WebDataBinderFactory binder) {
Pageable pageable = super.resolveArgument(parameter, mav, webRequest, binder);
Sort filtered = filterAllowed(pageable.getSort(),
(String) webRequest.getNativeRequest(HttpServletRequest.class).getRequestURI());
return PageRequest.of(pageable.getPageNumber(), pageable.getPageSize(), filtered);
}
}Alternative — DTO field projection guard:
Project DTO không expose `password` field → entity sort by password vẫn select password vào response → leak.
Layer defense:
- DTO không có sensitive field.
- Sort whitelist còn restrict đi.
Test security:
@Test
void invalidSortField_returns400() {
mockMvc.perform(get("/projects?sort=password,desc"))
.andExpect(status().isBadRequest())
.andExpect(jsonPath("$.title").value("Validation failed"));
}Pattern enterprise:
- API gateway / BFF (Backend-for-Frontend) layer validate sort param.
- Internal service trust pre-validated input.
- Audit log: log every sort/filter param for compliance.
Q5App có 100M activity rows. UI realtime feed. Choose pagination approach + implement.▸
Choose: Keyset pagination với Window/ScrollPosition.
Vì sao không pick alternatives:
- Page: COUNT(*) trên 100M rows = 5-30s. Unusable.
- Slice: bỏ COUNT nhưng OFFSET vẫn O(N). Page 1000 với offset 20000 vẫn slow.
- Keyset: O(log N) mọi page. Constant performance. Stable với insert (realtime feed cần).
Implementation:
// Entity
@Entity
public class Activity {
@Id @GeneratedValue Long id;
@Column(nullable = false)
private Instant createdAt;
@Column(nullable = false, length = 50)
private String type;
@Column(columnDefinition = "TEXT")
private String payload;
}
// Repository
public interface ActivityRepository extends JpaRepository<Activity, Long> {
Window<Activity> findFirstByOrderByCreatedAtDescIdDesc(Limit limit, ScrollPosition position);
}
// Service
@Service
@Transactional(readOnly = true)
public class ActivityService {
private final ActivityRepository repo;
public ScrollResult<ActivityDto> feed(ActivityCursor cursor, int size) {
ScrollPosition position = cursor == null
? ScrollPosition.offset()
: ScrollPosition.forward(Map.of(
"createdAt", cursor.createdAt(),
"id", cursor.id()
));
Window<Activity> window = repo.findFirstByOrderByCreatedAtDescIdDesc(
Limit.of(size + 1),
position
);
boolean hasNext = window.size() > size;
List<Activity> items = hasNext ? window.getContent().subList(0, size) : window.getContent();
ActivityCursor nextCursor = hasNext
? new ActivityCursor(items.get(items.size() - 1).getCreatedAt(), items.get(items.size() - 1).getId())
: null;
return new ScrollResult<>(
items.stream().map(ActivityDto::from).toList(),
nextCursor
);
}
}
public record ActivityCursor(Instant createdAt, Long id) {}
public record ScrollResult<T>(List<T> items, ActivityCursor nextCursor) {}
// Controller
@RestController
@RequestMapping("/api/activities")
public class ActivityController {
@GetMapping
public ScrollResult<ActivityDto> feed(
@RequestParam(required = false) Instant cursorCreatedAt,
@RequestParam(required = false) Long cursorId,
@RequestParam(defaultValue = "20") @Min(1) @Max(100) int size
) {
ActivityCursor cursor = (cursorCreatedAt != null && cursorId != null)
? new ActivityCursor(cursorCreatedAt, cursorId)
: null;
return service.feed(cursor, size);
}
}SQL generated:
-- First request (no cursor)
SELECT * FROM activities
ORDER BY created_at DESC, id DESC
LIMIT 21;
-- Subsequent (cursor = (createdAt='2026-04-15 10:00:00', id=12345))
SELECT * FROM activities
WHERE (created_at, id) < ('2026-04-15 10:00:00', 12345)
ORDER BY created_at DESC, id DESC
LIMIT 21;
-- Composite cursor handle ties (multiple rows same createdAt)Index critical:
-- Index match composite cursor
CREATE INDEX idx_activities_created_id ON activities(created_at DESC, id DESC);
-- Postgres index scan on (created_at, id) → O(log N) lookupUI workflow:
1. GET /activities → return 20 items + nextCursor
2. User scrolls down, app load more
3. GET /activities?cursorCreatedAt=...&cursorId=... → next 20
4. RepeatPerformance:
- Any cursor position: ~5ms (index lookup).
- 100M row → 5M page → all sub-second.
Caveat:
- No "page 100" jump — UX must adapt to "load more" pattern.
- Cursor format internal — version it (Base64 encode JSON cho future change).
- Test with realistic data volume — index quality matters.
Q6Spring Boot config `spring.data.web.pageable.max-page-size: 100`. User request `?size=10000`. Response? Cách verify?▸
Spring silently cap to 100. No exception thrown.
Behavior:
Request: GET /projects?size=10000
↓
PageableHandlerMethodArgumentResolver
↓
Detect size > max-page-size
↓
Set pageable.size = 100 (cap)
↓
Continue normal flow
Response:
{
"content": [...], // 100 items (not 10000)
"size": 100, // capped
"number": 0,
...
}Vì sao silent cap (default behavior):
- UX: client get reasonable response thay error.
- Backwards compat: old client với hardcoded high size still work.
Vấn đề: client không biết sizes capped. Pagination logic (`size × pageNum = expected total seen`) sai.
Cách verify in test:
@Test
void requestOversizedPage_isCappedTo100() throws Exception {
mockMvc.perform(get("/projects?size=10000"))
.andExpect(status().isOk())
.andExpect(jsonPath("$.size").value(100)); // verify cap
}Alternative — explicit error (preferred for API explicit):
@RestController
public class ProjectController {
@GetMapping("/projects")
public Page<ProjectDto> list(
@RequestParam(defaultValue = "0") @Min(0) int page,
@RequestParam(defaultValue = "20") @Min(1) @Max(100) int size,
@RequestParam(defaultValue = "createdAt,desc") String sort
) {
// @Max validation throw exception → 400 Bad Request
Pageable pageable = PageRequest.of(page, size, parseSort(sort));
return service.list(pageable).map(ProjectDto::from);
}
}
@Validated
public class ProjectController { ... }Now request `?size=10000` → 400 with violation message. Client knows explicitly.
Caveat — hybrid:
spring:
data:
web:
pageable:
max-page-size: 100 # safety net global
default-page-size: 20
# + manual @Max(100) per endpoint
# Both: explicit validation + safety netVerify config active:
# Actuator endpoint
GET /actuator/configprops?prefix=spring.data.web
{
"prefix": "spring.data.web",
"properties": {
"pageable.max-page-size": 100,
"pageable.default-page-size": 20,
...
}
}Pattern enterprise:
- Global max-page-size = safety net (100).
- Per-endpoint @Max(50) cho hot path (smaller).
- API documentation explicit: "max page size: 100, default: 20".
- Rate limiting per-endpoint thay phase max-page extreme.
Bài tiếp theo: Mini-challenge — TaskFlow v2 Postgres + JPA capstone
Bài này có giúp bạn hiểu bản chất không?
Bình luận (0)
Đang tải...