Source and Ref Functions

Sources

Sources are descriptions of the databases and tables that already exist in the Analytical Platform. Once those tables are defined as sources in dbt, your models can select from them using the source() function, which also records where the data came from and so helps define the lineage of your data. To see which sources have been defined, look in the ./mojap_derived_tables/models/sources/ directory. The example below uses the source() function to select from the contact table in the delius_prod database:

model_a.sql

select * from {{ source("delius_prod", "contact") }} limit 10

Adding a new source

If a database is not yet defined as a source, it must be added before you can select from it with source(). Follow the steps below:

  • Create a new branch off main.
  • Add the source database name exactly as it appears in AWS Athena to the list in scripts/source_database_name.txt. Ensure it is in alphabetical order for readability.
  • Commit and push the changes, then raise a pull request.

The update-source workflow is scheduled to run weekly. When it runs, it generates the YAML code for all sources listed, creating any new files and updating existing ones, and then raises a PR to merge these changes into main. Once this PR is merged, the source is available. If you need a source urgently, follow the steps above and then contact the Data Modelling team at #ask-data-modelling.
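
Once the generated PR has been merged, the new source can be selected from in the same way as any existing one. As a minimal sketch, assuming a hypothetical database my_new_database containing a table my_table (both names are placeholders, not real sources):

```sql
-- Hypothetical example: my_new_database and my_table are placeholder names.
-- Replace them with the database name you added to
-- scripts/source_database_name.txt and a table that exists in that database.
select * from {{ source("my_new_database", "my_table") }} limit 10
```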

The ref function

The most important function in dbt is ref(); it is impossible to build even moderately complex models without it. ref() is how you reference one model from within another, because models are typically built to be 'stacked' on top of one another. dbt then uses these references between models to build the dependency graph automatically. Here is how this looks in practice:

model_b.sql

select contact_id from {{ ref("model_a") }}
where mojap_current_record = true
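
Because model_b itself selects from model_a, a further model can be stacked on top of model_b in the same way. The sketch below assumes a hypothetical third model, model_c.sql, which is not part of the examples above. From the ref() calls alone, dbt would infer that model_a must be built before model_b, and model_b before model_c:

model_c.sql

```sql
-- Hypothetical model stacked on model_b, which is itself built on model_a.
-- It counts the current contact records returned by model_b.
select count(*) as current_contact_count
from {{ ref("model_b") }}
```

No build order is declared anywhere in this sketch; the dependency chain comes entirely from the ref() calls.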

This page was last reviewed on 7 August 2023. It needs to be reviewed again on 7 August 2024 by the page owner #ask-data-modelling.