Tools

The MoJ Analytical Platform comes with various tools including:

Control panel

The main entry point to the Analytical Platform.

RStudio

A development environment for writing R code and R Shiny apps.

JupyterLab

A development environment for writing Python code.

Data Discovery

The data engineering team maintain a number of databases on the Analytical Platform (curated databases). The best way to find out about these is to use the Data Discovery tool.

Upload File Data

A web application to upload data (.csv, .json, .jsonl) to the MoJ Analytical Platform in a standardised way.

Upload Microservices Data

Tools for uploading and refreshing data from microservices to the MoJ Analytical Platform in a standardised way.

Airflow

A tool for scheduling and monitoring workflows.

Create a Derived Table

A tool for creating persistent derived tables in Athena.

Python packages

The data engineering team maintain a number of Python packages to help with data manipulation and with interfacing with data using our preferred services. We consider the following Python packages the most useful:

pydbtools

The standard package for querying MoJAP Athena databases, with useful features including temporary table creation.

mojap-arrow-pd-parser

A package for ensuring type conformance when reading data with Arrow or pandas.

mojap-metadata

MoJAP-defined metadata that works with other packages (including arrow-pd-parser) to ensure type conformance, plus a number of schema converters.

dataengineeringutils3

A collection of useful utilities for interacting with AWS.

athena_tools

A user-friendly way of making small, persistent ad hoc databases. It is in its alpha release, so please report any problems.

mojap-aws-tools-demo

A repo containing some helpful guides on how to use some of the above packages. You can also ask for help with these on #ask-data-engineering.
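
As a rough illustration of the pydbtools entry above, the sketch below creates a temporary table and then reads it back into a pandas DataFrame. It assumes pydbtools is installed and that you are running somewhere with Athena access (for example, JupyterLab on the Analytical Platform). The database, table and temp table names are hypothetical placeholders, and the helper functions are illustrative only, not part of pydbtools.

```python
def temp_table_sql(database: str, table: str, limit: int = 100) -> str:
    """Build the SELECT statement used to populate the temp table."""
    return f"SELECT * FROM {database}.{table} LIMIT {limit}"


def preview_via_temp_table(database: str, table: str, temp_name: str = "preview"):
    """Create a temp table from a source table, then read it into a DataFrame."""
    # Imported lazily: pydbtools must be installed and needs AWS credentials.
    import pydbtools as pydb

    # Temp tables live in a per-user database referred to as __temp__.
    pydb.create_temp_table(temp_table_sql(database, table), table_name=temp_name)
    return pydb.read_sql_query(f"SELECT * FROM __temp__.{temp_name}")


# Hypothetical usage (names are placeholders, not real databases):
# df = preview_via_temp_table("my_database", "my_table")
```

The mojap-aws-tools-demo repo remains the better place for maintained, worked examples.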

R packages

The data engineering team maintain the following R package:

dbtools

A package for accessing Athena databases from the Analytical Platform.

The Analytical Platform community maintain the following R packages, which avoid the need to use Python in R projects:

Rdbtools

A native R package for accessing Athena databases from the Analytical Platform.

Rs3tools

A native R package for accessing AWS S3 from the Analytical Platform. It is mostly compatible with the legacy s3tools package.

This page was last reviewed on 8 December 2022. It needs to be reviewed again on 8 December 2023 by the page owner #analytical-platform-support.