Analytical Platform User Guide
Work with git in JupyterLab
There is no git interface built into JupyterLab. You should use the
This page was last reviewed on 30 January 2023. The page owner was due to review it again by 30 January 2024, so the content might be out of date.