Overview

Analyses of data covered by an NIH Data Use Certification (DUC) often produce derived data and other files that are also subject to the DUC. The Scientific Computing team can configure regulated storage for DUC-governed data, giving researchers a place to keep each piece of their analysis compliant with their data stewardship plan. If you have read the article about our regulated storage resource and are interested in using it, please visit the Data Governance team’s page about NIH Repository Data Access to learn more.

PROOF is configured to store the intermediate data files generated during WDL execution in a compliant location. If a user has access to data on regulated storage, PROOF automatically creates a directory at /fh/regulated/[PI name]/temp/user/[username]/cromwell-scratch/ to hold the intermediate files produced while running a workflow.
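
To make the path template concrete, here is a minimal sketch of how the bracketed placeholders fill in. The function name and the example values are hypothetical, and the exact form PROOF uses for the PI name and username is an assumption; check the server tab for the real directory.

```python
from pathlib import PurePosixPath

# Root of regulated storage, as given in the path template above.
REGULATED_ROOT = PurePosixPath("/fh/regulated")


def expected_scratch_dir(pi_name: str, username: str) -> PurePosixPath:
    """Fill in /fh/regulated/[PI name]/temp/user/[username]/cromwell-scratch.

    Hypothetical helper for illustration; it is not part of PROOF.
    """
    for label, value in (("pi_name", pi_name), ("username", username)):
        # Each placeholder must be a single, ordinary path component.
        if not value or "/" in value or value in (".", ".."):
            raise ValueError(f"{label} must be a single path component: {value!r}")
    return REGULATED_ROOT / pi_name / "temp" / "user" / username / "cromwell-scratch"


# Placeholder values, not real accounts.
print(expected_scratch_dir("example_pi", "example_user"))
# prints /fh/regulated/example_pi/temp/user/example_user/cromwell-scratch
```

The sketch only builds the path string; it does not create the directory or check permissions, which PROOF handles itself.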

How to Start a PROOF Server with Regulated Storage

In the PROOF Server tab, clicking “Start a PROOF Server” displays a checkbox with the option to use regulated data.

Check the box and click “Start.”

If you already have a PROOF server running, you will need to stop it and then start a new regulated server.

Once the server starts, there will be several indications that you are using a regulated PROOF server:

  1. You will receive an email confirming that you started a regulated server.
  2. A badge that says “Using regulated data” will appear in the navigation bar on every page.
  3. On the server tab, the scratch directory will be a sub-folder of /fh/regulated/.
  4. On the server tab, “Use regulated data” under the troubleshooting heading will be TRUE.
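
The third indication above can also be checked in a script. The following is a minimal sketch, assuming you have copied the scratch directory shown on the server tab into a string; the function name is hypothetical and this is not a PROOF feature.

```python
from pathlib import PurePosixPath

# Leading path components of regulated storage: "/", "fh", "regulated".
REGULATED_PARTS = PurePosixPath("/fh/regulated").parts


def is_under_regulated(scratch_dir: str) -> bool:
    """Return True if scratch_dir is a sub-folder of /fh/regulated/."""
    parts = PurePosixPath(scratch_dir).parts
    # Refuse paths with ".." so a path cannot climb back out of the root.
    if ".." in parts:
        return False
    # Must be strictly below the root, and must match it component by component.
    return (
        len(parts) > len(REGULATED_PARTS)
        and parts[: len(REGULATED_PARTS)] == REGULATED_PARTS
    )


# Placeholder path, not a real lab directory.
print(is_under_regulated("/fh/regulated/example_pi/temp/user/example_user/cromwell-scratch"))
# prints True
```

Comparing path components rather than string prefixes avoids treating a sibling directory such as /fh/regulated_other/ as regulated storage.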

Important Considerations When Using a Regulated PROOF Server