Migrating our Data Policies to GitHub

by | Apr 30, 2022 | Data Policy, News

The Tetiaroa Data Policy and the Generic Place-based Data Policy that evolved from it were both originally drafted in Google Docs. That made it easy for many people to edit, comment, and view, but it masked the evolution of the Policy, and tracking that evolution is one key aspect of the FAIR Island Project. It also wasn’t open by design: if you didn’t have the Google Doc link, you couldn’t find it. We want to understand how the policy evolves and track the changes in a visible way. We also want to be able to see how versions of the Policy are adopted and transformed. To help us achieve those goals, I created two repositories, one for the Tetiaroa Data Policy and one for the Generic Place-based Data Policy, in our FAIR Island GitHub organization.

Getting Started

As a first step, I cloned the Tetiaroa Data Policy repo and, using RStudio’s visual editor, created an R Markdown (RMD) file. I went through the Tetiaroa Google Doc version history and copied the earliest version of the policy into the RMD file. I then saved the file in RStudio, committed the change locally with the date of the version in the commit title, and pushed to GitHub. I repeated this process for each later version in the Google Doc history: copy the version into the RMD file, commit with its date, and push to our repo. In the history for the RMD file on GitHub you can see each of these versions, and clicking on a version shows what changed.
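For anyone who would rather script this loop than click through RStudio, here is a minimal sketch in Python. It is not what I actually ran. The function names, the file name policy.Rmd, and the commit title wording are all made up for illustration, and it assumes git is installed and the repo has already been cloned.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def commit_commands(rmd_name: str, version_date: str) -> list[list[str]]:
    """Build the git commands that record one historical version.

    The commit title carries the date of the Google Doc version,
    mirroring the manual process described above.
    """
    return [
        ["git", "add", rmd_name],
        ["git", "commit", "-m", f"Policy version from {version_date}"],
        ["git", "push"],
    ]


def replay_version(repo: Path, rmd_name: str, text: str, version_date: str) -> None:
    """Overwrite the RMD file with one version's text, then commit and push."""
    (repo / rmd_name).write_text(text, encoding="utf-8")
    for cmd in commit_commands(rmd_name, version_date):
        # check=True stops the replay as soon as any git step fails
        subprocess.run(cmd, cwd=repo, check=True)
```

Calling replay_version once per version, oldest first, would rebuild the history one commit at a time. The text of each version still has to be copied out of the Google Doc history by hand.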

On February 24, 2021 we copied (or ‘forked’) the Tetiaroa Policy into a new Google Doc and modified it to be a Generic Place-based Data Policy that could be a template for others. When I got to the Feb 24, 2021 version of the Tetiaroa policy, I created a copy of the Tetiaroa repo called Generic Place-based Data Policy. I then repeated the same iterative process, copying each revision from the Google Doc history for this second policy and committing it with the date of the revision, so that we can also track changes to the Generic policy. We published the Generic Policy to Zenodo in November 2021, and I have included the DOI on that GitHub repository. Moving forward, we would like to use the Zenodo/GitHub integration to create releases of the data policies.

As a last step, we used GitHub Pages to create human-readable websites for each policy: the Tetiaroa Data Policy and the Generic Place-based Data Policy.

Tracking Changes

We also took advantage of GitHub Issues to begin tracking feedback. The Generic Policy recently had an informal review, and I migrated the comments from that review into individual issues linked to lines of the RMD file.

Then, as these items are addressed, the commit message for each change links to the issue and the issue is closed.
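GitHub closes an issue automatically when a commit that reaches the default branch mentions it with a closing keyword such as "Closes", "Fixes" or "Resolves" followed by the issue number. The sketch below shows what such a commit message can look like and how to check which issues it would close. The helper names, the summary text, and issue number 7 are hypothetical, and the pattern only covers the basic "keyword #number" form.

```python
from __future__ import annotations

import re

# Closing keywords GitHub recognizes: close/closes/closed,
# fix/fixes/fixed, resolve/resolves/resolved.
CLOSING = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)",
    re.IGNORECASE,
)


def closing_message(summary: str, issue_number: int) -> str:
    """Return a commit message whose body references the issue it addresses."""
    return f"{summary}\n\nCloses #{issue_number}"


def issues_closed_by(message: str) -> list[int]:
    """List the issue numbers that a commit message refers to with a closing keyword."""
    return [int(number) for number in CLOSING.findall(message)]


if __name__ == "__main__":
    message = closing_message("Clarify embargo wording", 7)
    print(message)
    print(issues_closed_by(message))
```

A plain mention such as "see #7" links the commit to the issue without closing it, which is useful when a change only partly addresses a review comment.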

Pros & Cons

There are a lot of positives about the way we are using GitHub to manage the data policies: we can see the different versions, we can freeze releases of the policy, and we can take advantage of the integration with Zenodo. The downside is that our team will need to learn this new workflow, and there is a learning curve associated with using GitHub at all. We will see how this evolves and whether this is the best place for the policies to live.