Data documentation
Use this page to document the data used in the project.
We recommend using pandera to document data structures where possible.
As a minimal documentation process, we suggest:
Use pandera's infer_schema function to create the simplest schema for each of your tables, containing the inferred types and column names.
Store the schema of each table as a dedicated .json file in a schemas folder at the root of the package.
Enrich the schemas by hand throughout the project.
Please also document:
the data provider:
where did you get this data?
who is responsible for providing it?
when did you get it?
the data life-cycle:
refresh frequency
Storage location
Data outputs