Configuring snapshots

Choose the rows you want

There are two ways to specify rows you want:

Or you can use both at the same time to get a union of the two criteria.

Handling relationships

Rows in other tables relating to the rows you've chosen are always included. DatabaseCI does the tricky work for you of figuring out all the dependencies of foreign keys and including them.

For instance, if your transactions table includes a customer_id referencing the customer table, those customers will also get included automatically.

DatabaseCI can also follow following foreign keys backwards - as might be necessary to handle a many-to-many relationship. See below.

Zero rows by default

If you don't specify a selection criteria for a particular table, no rows will be selected for that table - this is to keep the size of your subsets small by default.

Of course, rows chosen because because they are related to other rows you've chosen will still be selected (see immediately above).

Config layout

The hierarchy of table_config is schemas->[schema_name]->[table_name], as in the following example:

api_key: c72cc5a97000d49bb33abd48ac730cc2

real_db_url: postgres://
copy_db_url: postgres:///subsetted

      where: 'ordered_at > current_date - 7'

      random_sample: 10

Random sampling

This is simply percentage of rows from 0 to 100 - "10" for a million row table will result in (approximately) 10 000 rows being chosen for the snapshot.

Rows referenced by those 10 000 rows will be included too.

To leave that table empty, just enter 0. And to select every row no matter what, enter 100.

    sample_percent: 10

where clause

To include specific rows, just include the where criteria for those rows, for instance to make a snapshot that includes only users named Alex, enter the following:

    where: "firstname = 'Alex'"

Backwards/many-to-many relationships

Sometimes you may be interested in following foreign keys "backwards". A common scenario might involve a many-to-many relationship table.

For instance, a product and order table linked by a product_order many-to-many relationship table.

Foreign keys "flow" outwards from such a table, but you likely want to choose particular orders or products, and follow those relationships "inwards" from one base (order/customer) table towards the relationship table, then back "outwards" to the other table.

Such relationships aren't followed by default as they can lead to an explosion in the quantity of selected rows - but to enable them for a table, simply specify backwards: true on the table containing the relevant foreign key column, like so:

    sample_percent: 5

    backwards: true

With this information, you should be able to tailor your configuration to get exactly subsets of exactly the shapes and sizes you're interested in.

Further info: Some tips, tricks and limitations.

Give feedback