Paging (keyset/offset-free)

Paging SQL query results with limit ... offset ... is simple, but can cause problems.

The larger the offset (or the higher the page number), the more rows have to be loaded - queries will become slower and slower. And if other items get added/deleted between page loads, paging can skip over or repeat items confusingly.

Offset-based paging avoids all these problems. Conceptually, instead of requesting the next 10 items after a row number, your query gets the next ten items after the last item of the previous page.

databaseci makes a keyset-paged query almost as easy as just writing a vanilla unpaged query.

For offset-free, multi-column, multi-direction keyset paging, you just need a couple of extra parameters.

A worked example

Assuming the following table is already created:

create table names(
    id int serial primary key,
    name text unique
);

...let's insert some sample rows:

db.insert(
    'names',
    [
        dict(name='Alex'),
        dict(name='Bob'),
        dict(name='Charlie'),
        dict(name='Dirk'),
        dict(name='Emily')
    ],
    upsert_on='name'
)

And assume we have the following vanilla unpaged query which we would like to query a page at a time.

select * from names

We want the rows in name order, 2 items at a time (a page size of 2).

We can look through the pages as follows.

bookmark = None

while True:
    page = db.q(
        'select * from names',
        paging=dict(
            order_by='names desc, id',
            per_page=2,
            bookmark=bookmark
        )
    )

    print(page)

    if not page.paging.has_next:
        break

    bookmark = page.paging.next

The keyset paging golden rule

The specified ordering must include a set of columns that are unique per row. The easiest way to do this (at least if querying only a single table) is to include the primary key column(s) of the table in the order_by.

This is because the values of the ordering columns for each row are used for before/after comparisons, and an equal comparison will result in an indeterminate order, resulting in unexpected results.

What is a "bookmark"?

With offset paging, the position of a page can be defined by one thing: Its page number.

With a keyset-paged page, the page is position defined by the row that page comes after, if we are moving forward through the dataset.

However, we can also move backwards through the dataset, in which case the page is defined by the row that the page is before.

Thus, a page "bookmark" comprises two things:

DatabaseCI structures these bookmarks as a Tuple[Tuple, bool], for example:

('Emily', 5), True

In the above example, this is the second page moving backwards through the dataset. The equivalent page moving forwards would be:

('Bob', 2), False

The first page is always:

None, False

And the last page is always:

None, True

Checking if there's another page

To find if there's a next page (or previous page if going backwards), the paging logic queries for 1 more item than actually needed - for example, 11 items if the page size is 10.

If the 11 items are returned, there must be at least one more row after the current page (this "excess" row is then discarded from the actual page rows returned).

If a not-None bookmark is used, we assume there's a previous page.

The exact same logic is used "going backwards", except in reverse.

The .paging object

When you pass the paging= parameter to .q(, the returned rows will have a .paging attribute set, with the bookmarks and other useful information: