flowchart LR
  A(Author) -->|Request| B(RDM Team)
  B -->|Drafts| A
  B -->|"Illustration (pdf)"| C(Typesetter)
  C ---> D("Printed<br>Book Series")
  B --->|"Illustration (pdf)"| E("Open Research<br>Data Platform")
  B --->|"Dataset<br>(csv, geojson)"| E
  B --->|Metadata| E
  B --->|"Code (R)"| F(sgb-figures)
  B --->|"Dataset (csv)"| F
  B --->|"Metadata (json)"| F
  F <-.-> E
  style B fill:#3a1e3e,color:#fff,stroke:#3a1e3e
  style D fill:#86bbd8,stroke:#86bbd8
  style E fill:#86bbd8,stroke:#86bbd8
  style F fill:#ffe880
  click B href "https://dokumentation.stadtgeschichtebasel.ch/team.html" "Research Data Management Team"
  click D href "https://emono.unibas.ch/stadtgeschichtebasel/" "Open Access Version"
  click E href "https://forschung.stadtgeschichtebasel.ch/" "Research Data Platform"
  click F href "https://github.com/stadt-geschichte-basel/sgb-figures/" "sgb-figures GitHub Repository"
Figure 1: Workflow
Repository Structure, Software and Data Model
This repository stores data and R code used by the Team for Research Data Management and Public History of the Stadt.Geschichte.Basel project to create figures published in the nine-volume book series.
To support open research with FAIR data, the RDM team developed a research data platform (Mähr, Görlich, and Twente 2024) ensuring open, long-term access to sources and research data regarding the history of Basel. The platform facilitates access to the data behind the publication and features rich metadata annotation (cf. Data Model).
Using raw data provided by the individual authors of Stadt.Geschichte.Basel, the RDM team created maps, diagrams, and other types of visualizations (Mähr 2022) for publication in the print and online (OA) versions of the book series. The sgb-figures repository stores code and data for charts and diagrams only (referred to as plots or figures). For other types of data visualization in the context of Stadt.Geschichte.Basel, refer to the extensive RDM Documentation (in German).
Visualization Workflow
The starting point for each plot is the data that the authors of Stadt.Geschichte.Basel provided to the RDM team. Using R (cf. Software), we tidy the data, assemble multiple sources into a single dataset for visualization, write that dataset to a csv file, and store information about it in an accompanying json metadata file structured according to the W3C standard for tabular data and metadata on the web (W3C 2022).
Data stewards in Stadt.Geschichte.Basel’s RDM team then create visualizations for the book series, often in multiple iterations and in close cooperation with the researchers (Twente and Mähr 2025; Münch et al. 2023). Dedicated design principles and color schemes are implemented to ensure a common visual identity across all products. The finalized products are then processed for print production and long-term archival (Figure 1).
The generated pdf file for each plot, the corresponding legend, and the csv dataset files are uploaded to the Research Data Platform. The R scripts used to clean the data and generate the plot are not uploaded there; instead, the figures’ metadata links to them in the sgb-figures repository on GitHub.
Repository Structure
The datasets and the R scripts that generate the plots are accessible in this repository, which provides a reproducible R environment to facilitate further work with the code. The repository is based on the Open Research Template and manages research data following the best practices outlined in The Turing Way. This structured approach includes automated release management, integrated archiving with Zenodo, structured documentation via Quarto, and long-term accessibility through GitHub Pages (Mähr and Twente 2025).
R scripts, built plots, processed data and metadata files are sorted according to the following folder structure:
build/ – helper scripts used to build the plots
data/ – data files
docs/ – documentation for the data and the repository
output/ – generated PDF files
src/ – source code for data processing and building plots
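The folder layout above can be sketched as a small path helper. This is only an illustration: the function figure_paths() is hypothetical and not part of the repository, and it assumes that the file-name patterns of the abb01313 example used throughout this document (described under Data Processing and Plotting) apply to every figure. The volume placeholder <n> is passed through unchanged.

```r
# Hypothetical helper: derive the files belonging to one figure
# from its ID and volume, following the abb01313 example.
figure_paths <- function(figure_id, volume) {
  band <- paste0("Band", volume)
  list(
    clean_script = file.path("src", figure_id, paste0(figure_id, "_clean.R")),
    plot_script  = file.path("src", figure_id, paste0(figure_id, "_plot.R")),
    raw_data     = file.path("data", "raw", band, figure_id,
                             paste0(figure_id, "_Data_raw.xlsx")),
    clean_data   = file.path("data", "clean", band, figure_id,
                             paste0(figure_id, "_3_Data.csv")),
    metadata     = file.path("data", "clean", band, figure_id,
                             paste0(figure_id, "_3_Data.csv-metadata.json")),
    plot_pdf     = file.path("output", band, figure_id,
                             paste0(figure_id, "_1_Plot.pdf")),
    legend_pdf   = file.path("output", band, figure_id,
                             paste0(figure_id, "_2_Legende.pdf"))
  )
}

# The volume is kept as the placeholder used in this documentation
paths <- figure_paths("01313", volume = "<n>")
print(paths$raw_data)
print(paths$legend_pdf)
```

In the repository itself, paths relative to the project root would more likely be resolved with the here package mentioned under Software; file.path() is used in this sketch only to keep it free of dependencies.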
Build Workflow
From the raw data as input to the published figure, a number of objects play a role in the build process. The environment is illustrated here with a made-up example figure abb01313 published in Stadt.Geschichte.Basel volume <n>.
Software
All plots are produced using R. In addition to ggplot2 (Wickham 2016) and other parts of the tidyverse, this project uses several packages for data processing and visualization, including here (Müller and Bryan 2020) and renv (Ushey and Wickham 2025) for creating a reproducible environment as well as csvwr (Gower 2022) for writing metadata files.
Code, data and documentation are checked into version control and stored in a GitHub repository. Formatting is handled by prettier (Long 2025) and, for R code, by styler (Müller and Walthert 2024); R code is additionally linted with lintr (Hester et al. 2025). The R environment including all necessary packages can be restored with the renv.lock file [1]. The documentation is rendered with Quarto (Allaire et al. 2022) and hosted on GitHub Pages.
Data Processing
The first step of the workflow is to process the raw data into an annotated dataset that is ready both for publication and for use as input for a figure (Figure 2). The script in src/01313/01313_clean.R loads the raw data file from data/raw/Band<n>/01313/01313_Data_raw.xlsx into the R environment as data01313, processes the dataset (reformatting columns, transforming absolute into relative values etc.) and exports the cleaned data to data/clean/Band<n>/01313/01313_3_Data.csv. Additionally, a metadata list object meta01313 is created in R and written to data/clean/Band<n>/01313/01313_3_Data.csv-metadata.json (see Data Model).
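The cleaning step can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the project's actual script: the raw values and column names are invented, the data frame raw01313 stands in for the xlsx file, files are written to a temporary directory, and the metadata is written with jsonlite instead of the csvwr package the project uses. The metadata covers only a small subset of what a W3C tabular metadata file can contain.

```r
library(jsonlite)

# Invented raw data standing in for 01313_Data_raw.xlsx
raw01313 <- data.frame(
  year = c(1900L, 1910L, 1920L),
  category_a = c(120, 150, 90),
  category_b = c(80, 50, 210)
)

# Transform absolute values into relative shares per year
totals <- raw01313$category_a + raw01313$category_b
data01313 <- data.frame(
  year = raw01313$year,
  share_a = round(raw01313$category_a / totals, 3),
  share_b = round(raw01313$category_b / totals, 3)
)

# Export the cleaned dataset (temporary directory instead of data/clean/)
out_dir <- file.path(tempdir(), "01313")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
csv_path <- file.path(out_dir, "01313_3_Data.csv")
write.csv(data01313, csv_path, row.names = FALSE)

# Describe the csv file in a metadata list: one entry per column
meta01313 <- list(
  `@context` = "http://www.w3.org/ns/csvw",
  url = basename(csv_path),
  tableSchema = list(
    columns = lapply(names(data01313), function(col) {
      list(
        name = col,
        titles = col,
        datatype = if (is.integer(data01313[[col]])) "integer" else "number"
      )
    })
  )
)

# Write the metadata next to the csv as <file>.csv-metadata.json
meta_path <- paste0(csv_path, "-metadata.json")
write_json(meta01313, meta_path, auto_unbox = TRUE, pretty = TRUE)
```

Keeping the metadata as a plain list until the final write mirrors the meta01313 object described above and makes it easy to add further fields before export.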
flowchart LR
  clean_script <--->|load data| rawdata([01313_Data_raw.xlsx])
  clean_script([01313_clean.R]) -->|clean data| data01313(data01313)
  clean_script -->|annotate<br>data| meta01313(meta01313)
  data01313 -->|export data| csv01313([01313_3_Data.csv])
  meta01313 -->|export<br>metadata| json01313([01313_3_Data.csv-metadata.json])
  style csv01313 fill:#ffe880
  style json01313 fill:#fff3e0
  style clean_script fill:#86bbd8,stroke:#86bbd8
  style data01313 fill:#c0dceb,stroke:#c0dceb
  style meta01313 fill:#c0dceb,stroke:#c0dceb
Plotting
After the dataset is processed, it can be used for building a plot (Figure 3). This is done by executing the script in src/01313/01313_plot.R. The plot script first sources the cleaning script, ensuring that the plot is drawn from up-to-date files. The data is then loaded into R as data01313 again. If necessary, further transformations are applied (e.g. creating custom labels, aggregating columns, manipulating data for better readability etc.). The plot object plot01313 is created using ggplot2. For technical reasons, plot and legend must be exported to separate files [2]. To this end, a separate_legend object is created in R using ggpubr (Kassambara 2025). Both objects are then saved as pdf files 01313_1_Plot.pdf and 01313_2_Legende.pdf to output/Band<n>/01313.
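A rough sketch of the plotting step is given below. It is an assumption about how such a script might look rather than the project's code: the data values, chart type, theme and output sizes are invented, output goes to a temporary directory, and the choice of ggpubr's get_legend() and as_ggplot() is a guess at how the separate_legend object could be produced. Whether the repository removes the legend from the main plot before export is likewise an assumption.

```r
library(ggplot2)
library(ggpubr)

# Invented long-format data standing in for the cleaned dataset
data01313 <- data.frame(
  year = rep(c(1900, 1910, 1920), times = 2),
  category = rep(c("A", "B"), each = 3),
  share = c(0.6, 0.75, 0.3, 0.4, 0.25, 0.7)
)

# Build the plot object
plot01313 <- ggplot(data01313, aes(x = year, y = share, fill = category)) +
  geom_col() +
  theme_minimal()

# Extract the legend and wrap it so it can be saved like a plot
separate_legend <- as_ggplot(get_legend(plot01313))

# Temporary directory instead of output/Band<n>/01313
out_dir <- file.path(tempdir(), "output", "01313")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Save the plot without its legend, and the legend on its own
ggsave(
  file.path(out_dir, "01313_1_Plot.pdf"),
  plot01313 + theme(legend.position = "none"),
  width = 16, height = 10, units = "cm"
)
ggsave(
  file.path(out_dir, "01313_2_Legende.pdf"),
  separate_legend,
  width = 4, height = 4, units = "cm"
)
```

Because the legend is extracted from the same plot object, both files stay consistent with each other when the data or the color mapping changes.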
When using the scripts in this repository, the plots are not shipped with the project’s signature font family but with a generic system font.
For the print edition of Stadt.Geschichte.Basel, the RDM team lightly post-processed the figures in Adobe Illustrator before delivering them to the typesetter, incorporating further technical requirements, last-minute changes by authors etc.
flowchart LR
  plot_script([01313_plot.R]) <--> |source to<br>load data| clean_script([01313_clean.R])
  plot_script --> |transform data| data01313(data01313)
  data01313 -->|create plot<br>object| plot01313(plot01313)
  plot01313 --->|export plot| export01313([01313_1_Plot.pdf])
  plot01313 -->|extract legend| legend01313(separate_legend)
  legend01313 -->|export legend| exportlegend([01313_2_Legende.pdf])
  plot_script ----->|build preview| infopage([01313.qmd])
  style clean_script fill:#86bbd8,stroke:#86bbd8
  style plot_script fill:#86bbd8,stroke:#86bbd8
  style data01313 fill:#c0dceb,stroke:#c0dceb
  style plot01313 fill:#c0dceb,stroke:#c0dceb
  style legend01313 fill:#c0dceb,stroke:#c0dceb
  style export01313 fill:#ffe880
  style exportlegend fill:#ffe880
  style infopage fill:#fff3e0
In addition to the plot itself, the 01313_plot.R script builds a separate qmd file in docs/plots/. This page is rendered when deploying the sgb-figures repository to GitHub Pages and provides previews of plot and data. The Quarto file contains a preview rendering of the figure itself, a table displaying the dataset used for creating the plot, as well as selected metadata on the dataset parsed from the json file. Metadata describing the plot (media object) itself is only available on the Research Data Platform.
Users can take advantage of a range of npm scripts that make it easier to build the plots directly from the command line. Running npm run list prints an overview of all available plots with their corresponding media IDs. Using such an ID, the plot, metadata and qmd files can be generated with npm run plot <ID>. A full list of all available npm scripts is available in the README.
Data Model
Metadata for research data of Stadt.Geschichte.Basel is provided according to a data model developed by the Stadt.Geschichte.Basel Research Data Management Team to meet the requirements of the wide range of sources used in the project. The model (and the subsequent annotation process) follows the Manual for Creating Non-Discriminatory Metadata for Historical Sources and Research Data (Mähr and Schnegg 2024).
For the data in sgb-figures, the model was slightly adapted to align with the requirements of publishing tabular data. To this end, recommendations as outlined in the W3C standard for tabular data and metadata on the web (W3C 2022) were consulted and implemented using the csvwr R package.
Metadata for the csv datasets used for creating the figures is stored in a separate json file. Again using the example metadata object abb01313, Figure 4 illustrates how this annotation integrates with the Stadt.Geschichte.Basel data model used for the research data platform. In this example, abb01313 has one child media object m01313. m01313 is a sgb-figures plot built using the dataset 01313_3, which in turn is described by the metadata file 01313_3_Data.csv-metadata.json.
Each json metadata file annotates one csv dataset. Each dataset is used for one figure, but a figure may be built taking multiple datasets as input. Each figure is represented as one media object on the Stadt.Geschichte.Basel Research Data Platform. This (child) media object is then part of a parent metadata object, alongside zero or more other child media objects. If a parent metadata object has more than one child media object, their id values (as well as the id values of the corresponding datasets and metadata files) are numbered consecutively (m01313_1, m01313_2, etc.).
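The numbering rule for child media objects can be expressed as a short helper. The function child_media_ids() is hypothetical and not part of the repository; it assumes that the relation between the parent ID abb01313 and the child ID m01313 shown in the example (prefix abb replaced by m) holds for all objects.

```r
# Hypothetical helper: derive child media IDs from a parent metadata ID.
# One child keeps the plain ID; several children are numbered consecutively.
child_media_ids <- function(parent_id, n_children) {
  stopifnot(grepl("^abb[0-9]+$", parent_id), n_children >= 1)
  base <- sub("^abb", "m", parent_id)
  if (n_children == 1) {
    return(base)
  }
  paste0(base, "_", seq_len(n_children))
}

print(child_media_ids("abb01313", 1))
print(child_media_ids("abb01313", 2))
```

With a single child the helper returns the unnumbered ID, matching the m01313 example; with two children it returns the consecutively numbered IDs m01313_1 and m01313_2.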
classDiagram
  direction LR
  class metadata["metadata json<br><i>describing csv file</i>"]
  class metadata2["metadata object<br><i>describing media object(s)</i>"]
  class media["media object"]
  class csv["csv dataset"]
  metadata "1" --> "1" csv
  csv "1" --> "*" media
  media "m" <-- "n" metadata2
  class csv {
    id (01313_3)
  }
  class metadata {
    id (01313_3_Data.csv-metadata)
    url (01313_3_Data.csv)
    [columns] (title, datatype, description)
    media_id (m01313_3)
    [isPartOf]
    title
    description
    [creator] (incl. ORCID)
    [contributor] (incl. ORCID)
    publisher
    date (EDTF)
    coverage
    type
    format
    source
    language (ISO 639-2 code)
    [relation]
    rights
    license
    modified (ISO 8601)
    bibliographicCitation
  }
  %%| no label for namespaces, see https://github.com/mermaid-js/mermaid-live-editor/issues/1452
  namespace sgb_datamodel {
    class media {
      id (m01313)
      title
      [subject;subject] (keywords from GenderOpen Index)
      description
      [abstract] (alt attribute for alternative text)
      [creator] (incl. link to Wikidata)
      [publisher] (incl. link to Wikidata)
      date
      temporal
      type
      format
      extent
      [source] (Source and catalogue link)
      language (ISO 639-2 code)
      [relation] (internal links to other items, link to GitHub, further information)
      rights
      license
    }
    class metadata2 {
      id (abb01313)
      title
      [subject;subject]
      description
      temporal
      [isPartOf;isPartOf] (Data DOIs)
    }
  }
  style csv fill:#F7CB45,stroke:#777
  style metadata fill:#fff3e0,stroke:#777
  style media fill:#FFFFFF,stroke:#777,color:#3A1E3E
  click media href "https://dokumentation.stadtgeschichtebasel.ch/products/coding/plattform/#datenmodell" "Main Data Model Documentation"
  click metadata2 href "https://dokumentation.stadtgeschichtebasel.ch/products/coding/plattform/#datenmodell" "Main Data Model Documentation"
References
Footnotes
1. Run npm run setup for a shortcut that will install R dependencies.
2. For technical reasons, the actual plot and the corresponding legend are often stored in separate PDF files during the book production process. These two files are represented as one object programmatically (plot01313) and are collectively referred to as one object in this context.