Saving Scientific Data in the Cloud – Biggest Hurdles to Overcome

Before we enter the age of fully automated labs where all devices are connected and compatible, there are several hurdles to overcome. Switching to an electronic lab notebook (ELN) often raises questions about saving data in the cloud. Labs have to discuss and decide on three aspects in particular:

  1. Transition to Cloud

Cloud data storage brings many benefits, such as remote accessibility, easier integration with other data systems, easier maintenance and updating, and automated backup. However, storing data in the cloud also raises data security concerns. Regulatory requirements for scientific data storage differ depending on your location and your line of work, so cloud providers offer different storage options to match the requirements their clients need to follow. In most cases it is enough to have the data stored in a cloud with encryption and private access. Luckily, this has already become the standard with virtually every cloud provider.

However, if your line of work involves patient data, for example, you are obliged to store the data in a compliant manner, as defined by regulations such as HIPAA. Some electronic lab notebook providers already offer hosting on validated systems that satisfy these regulatory needs. In short, the transition of science to the cloud is already happening. As cloud security standards rise and the actual location of the cloud servers becomes more transparent, in my opinion more and more scientists will decide to store their data in the cloud.

  2. Data ownership

As soon as you put your data in the cloud, you are exposing it to third parties, even if only to the electronic lab notebook provider and the actual cloud storage provider. It is therefore important that the terms of use of your service provider clearly state that you are not giving away any rights to or ownership of your data. If you are a researcher working for a research organization, it is very likely that all your research is ultimately owned by your employer. This means that the institution should have the right to decide who can access the data and who can manage it. The design of a digital notebook therefore has to distinguish two kinds of entities: Users, who upload, manage and monitor the data, and Organizations, which own the data and decide how it is stored and accessed. The Organization has to be able to gain access to your research at any point and, in case you leave the organization, has to have the power to restrict your access to the data.
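The split between Users and Organizations can be pictured with a minimal Python sketch. It is only an illustration of the idea, not the design of any particular notebook product; the class names `Organization` and `Record` and the `revoke` method are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Organization:
    """Owns all records and decides who may access them (hypothetical model)."""
    name: str
    # Names of users whose access the organization has revoked.
    revoked: set = field(default_factory=set)

    def revoke(self, user_name: str) -> None:
        self.revoked.add(user_name)


@dataclass
class Record:
    """A notebook entry: created by a user, owned by the organization."""
    title: str
    author: str
    owner: Organization

    def can_access(self, user_name: str) -> bool:
        # The author keeps access only as long as the organization allows it.
        return user_name == self.author and user_name not in self.owner.revoked


org = Organization("Example Institute")
rec = Record("PCR run 12", author="alice", owner=org)
before = rec.can_access("alice")
org.revoke("alice")  # e.g. alice leaves the organization
after = rec.can_access("alice")
```

The record stays with the organization in both cases; only the user's access changes.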

  3. Data sharing

Once everyone starts to store data in the cloud, scientists will be able to exchange their data and results much more efficiently. This has the potential to drastically reduce the number of experiments being performed, because data that is appropriately annotated while it is being generated is easier to find and reuse. Of course, not all data can be shared with everyone. Even if you are working in the same organization, you are usually not permitted to see the data of every project. Some data can only be shared with certain people working in a closed group. The scientist who creates and uploads the content of an electronic lab notebook has to have full control over who sees, edits or signs specific data at all times.
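As a rough illustration of this kind of per-entry control, the Python sketch below lets only the author of an entry decide who may view, edit or sign it. The names `SharedEntry`, `Action`, `grant` and `allowed` are assumptions made for this example and do not describe any specific ELN.

```python
from enum import Enum


class Action(Enum):
    VIEW = "view"
    EDIT = "edit"
    SIGN = "sign"


class SharedEntry:
    """Notebook entry whose author grants per-person rights (hypothetical)."""

    def __init__(self, author: str):
        self.author = author
        # Maps a user name to the set of actions that user may perform.
        self._grants = {}

    def grant(self, granted_by: str, user: str, *actions: Action) -> None:
        if granted_by != self.author:
            raise PermissionError("only the author may change sharing")
        self._grants.setdefault(user, set()).update(actions)

    def allowed(self, user: str, action: Action) -> bool:
        # The author can always do everything; others need an explicit grant.
        return user == self.author or action in self._grants.get(user, set())


entry = SharedEntry(author="alice")
entry.grant("alice", "bob", Action.VIEW)
```

Here `bob` can view the entry but cannot edit or sign it, and nobody except `alice` can change that.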

By Klemen Zupancic, PhD

by Wouter de Jong, posted on 11 October 2016

Thanks for the great outline here. I do think you are missing one of the biggest hurdles for cloud solutions, one that is particularly important for ELN solutions: the handling of large data sets. When you want to store and link large data sets (e.g. genomics, proteomics and imaging data) in a traceable manner in an ELN cloud solution, it is difficult to offer a satisfactory solution to end-users.
We have addressed this issue in the ELN eLabJournal by offering a hybrid cloud solution. Both application and database are hosted in the cloud, but file storage is done on a local network server of the institute or company. In this way, the audit trail and file versions are still tracked, but the files are stored on the local server and transferred only over the local network, which solves the issues that arise when you want to link a large data set.
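
The general idea of such a hybrid setup can be sketched in a few lines of Python: only metadata about each file version goes to the cloud-side record, while the file itself never leaves local storage. This is an assumption-laden illustration, not how eLabJournal is implemented; `cloud_audit_trail` and `register_version` are hypothetical names, and a plain list stands in for the cloud database.

```python
import hashlib
import tempfile
from pathlib import Path

# Stand-in for the cloud-hosted database: it only holds metadata.
cloud_audit_trail = []


def register_version(local_file: Path, user: str) -> dict:
    """Record a file version in the 'cloud' without uploading its contents."""
    # A real system would hash very large files in chunks rather than at once.
    digest = hashlib.sha256(local_file.read_bytes()).hexdigest()
    entry = {
        "version": len(cloud_audit_trail) + 1,
        "user": user,
        "path": str(local_file),  # location on the local network server
        "sha256": digest,         # fingerprint that makes the file traceable
        "size_bytes": local_file.stat().st_size,
    }
    cloud_audit_trail.append(entry)
    return entry


# Usage: a temporary directory plays the role of the institute's file server.
with tempfile.TemporaryDirectory() as server_dir:
    data_file = Path(server_dir) / "reads.fastq"
    data_file.write_bytes(b"ACGT" * 1000)
    v1 = register_version(data_file, "alice")
    data_file.write_bytes(b"ACGT" * 1001)
    v2 = register_version(data_file, "alice")
```

Each change to the file produces a new entry with a different fingerprint, so versions remain traceable even though only a few hundred bytes of metadata are sent per version.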