Offset vs Cursor Pagination

Compare offset-based and cursor-based pagination strategies, analyzing database seeks, performance scaling, and page drift anomalies.

IntermediateAPI DesignChapter: API Design12 min read

What is Pagination?

Pagination is the technique of partitioning a massive database query result set into discrete, manageable blocks (pages) before returning them to the client.

If an application database table contains millions of rows, returning the entire dataset in a single HTTP response will trigger Out-Of-Memory (OOM) crashes, choke network bandwidth, and crash client browsers.

Real-World Analogy

Imagine reading a massive 1000-page encyclopedia.

  • Offset pagination is like someone instructing you: "Go read the entries starting on page 800." Because you do not have bookmarks, you must open the encyclopedia to page 1 and turn every page one-by-one until you reach page 800. If someone slides a new page into page 200 before you finish, the page numbers shift, and you will end up reading the same page twice or skipping an entry entirely.
  • Cursor pagination is like using a physical bookmark ribbon. You read up to page 50, place the ribbon there, and when you return, you open directly to the bookmark and read from page 51, regardless of how many new pages were inserted or deleted earlier in the book.

Offset Pagination: LIMIT & OFFSET

Offset-based pagination is the traditional pagination strategy. It relies on two numeric parameters: LIMIT (the page size) and OFFSET (the number of rows to skip).

A typical client SQL request looks like:

sql
SELECT * FROM users ORDER BY created_at DESC LIMIT 10 OFFSET 90000;

The Database Scan Mechanism

To execute a LIMIT M OFFSET N statement, the database storage engine cannot simply jump directly to row N. Instead, it must:

  1. Access the index or table heap file.
  2. Read and discard N rows sequentially.
  3. Keep the next M rows and return them to the client.

This sequential scan results in O(N) execution scaling, where each subsequent page request becomes slower as the client requests deeper pages.


The Performance Degradation of Offset

As the offset value N grows into hundreds of thousands, query performance degrades significantly.

Even if the sorting column (e.g. created_at) is indexed, the database planner must traverse the index leaf pointers sequentially to count up to the offset value, consuming massive CPU and disk I/O cycles.

In production environments, deep offset paging queries often trigger database thread pool starvation and application timeouts.


Page Drift Anomalies

In addition to poor performance, offset pagination suffers from page drift (inconsistent results) caused by concurrent write events:

Insertion Shift (Duplicates)

  1. Client fetches Page 1 (LIMIT 5 OFFSET 0) containing items 1 to 5.
  2. An admin inserts a new item 0 at the beginning of the table. All existing rows shift down by one index position (item 5 shifts to index position 5).
  3. Client requests Page 2 (LIMIT 5 OFFSET 5). The database skips indices 0 to 4 (which now includes items 0 to 4) and returns item 5 (which was already read on Page 1). The client sees a duplicate item.

Deletion Shift (Skipped Items)

  1. Client fetches Page 1 (LIMIT 5 OFFSET 0) containing items 1 to 5.
  2. An admin deletes item 2. All items shift up by one index position (item 6 shifts to index position 4).
  3. Client requests Page 2 (LIMIT 5 OFFSET 5). The database skips indices 0 to 4, which now contain items 1, 3, 4, 5, 6. The query returns starting from item 7. Item 6 is skipped and never displayed to the user.

Cursor Pagination: Seeking Directly

Cursor-based pagination resolves performance and consistency issues by using an immutable, unique, and sequential column (like a primary key ID or UUID combined with a timestamp) as a pointer.

Instead of passing an abstract page offset, the client requests data relative to the last seen record from the previous page:

sql
SELECT * FROM users WHERE id > 90000 ORDER BY id LIMIT 10;
Offset Pagination vs. Cursor Pagination Database Seek Offset-Based: LIMIT 10 OFFSET 50 Scanned & Discarded (50 Rows) O(N) CPU & Disk Reads Returned: Rows 51 to 60 Cursor-Based: WHERE ID > 50 LIMIT 10 Skipped Entirely (0 Reads) O(log N) Seek Returned: Rows 51 to 60 Total Scanned: 60 Records Total Scanned: 10 Records

The Index Seek Advantage

Because the query filter uses a strict inequality comparison (WHERE id > 90000) on an indexed column, the database performs an O(log N) B-Tree index seek. It traverses direct parent-to-child nodes to find the node pointing to id=90000 in microseconds, then scans exactly the next LIMIT rows sequentially from that leaf.

This guarantees constant-time page lookup speeds regardless of query depth.


Statelessness and State Propagation

Cursor-based pagination enforces clean stateless REST standards by embedding state indicators inside server response bodies.

Instead of forcing the client application to calculate offset numbers dynamically, the server generates the URL for the next page of data, encoding the last record's ID into the query string:

json
{
  "data": [
    {"id": 90009, "name": "User_90009"},
    {"id": 90010, "name": "User_90010"}
  ],
  "paging": {
    "next": "/api/users?limit=10&cursor=90010"
  }
}

This self-describing response model isolates pagination execution details on the server. If the cursor structure changes later (e.g. switching from integer IDs to encrypted tokens), client code remains unchanged.


Real-World Limits of Cursors

While cursor pagination is faster and more reliable, it introduces trade-offs:

  • No Random Page Jumps: Under cursor pagination, a client cannot jump directly to an arbitrary page (e.g. "Go directly to Page 42"). To display Page 42, the client must traverse pages 1 through 41 to obtain the cursor value for page 42. This makes cursors ideal for infinite scrolls (like social media feeds) but unsuitable for tabular search panels that require jumps.
  • Complex Multi-Column Sorts: If sorting by multiple columns (e.g. sorting by name, then by updated timestamp), the cursor must combine both values (e.g. encoding name and timestamp into a single base64 string). The database query must then construct compound inequality clauses (e.g. WHERE (name > last_name) OR (name = last_name AND timestamp > last_timestamp)), which requires complex composite index designs.

Further Reading

Code Examples

Core Literature References

SQL Pagination: An Index-Friendly Approach

by Torsten Grust — Database Query Engineering, pp. 1-12

View source