Synthetic Biologists Outline Framework for Assessing Data Hazards

New Data Hazard labels proposed by University of Bristol researchers aim to mitigate risks in synthetic biology
July 9, 2024

Researchers at the University of Bristol have highlighted significant hazards posed by data-centric methods in engineering biology. By identifying the risks in how data is used, their research aims to make future synthetic biology work safer.

The potential misuse of data-centric approaches in synthetic biology poses serious threats. Because data science tools are widely accessible, malicious actors could use them to create harmful biological agents for bioterrorism or to intentionally disrupt ecosystems. The study, published in the journal Synthetic Biology, proposes new Data Hazard labels to describe data-related risks in synthetic biology.

Data Hazards in Synthetic Biology

The study outlines several key risks:

  • Uncertain Accuracy of Source Data: The reliability of the underlying data is unknown, potentially leading to erroneous results or biases.
  • Uncertain Completeness of Source Data: Incomplete data with missing values can cause skewed outcomes.
  • Integration of Incompatible Data: Combining different data types or data sources that may not be compatible.
  • Capable of Ecological Harm: This technology could cause extensive ecological damage, even when used correctly.
  • Potential Experimental Hazard: Translating technology into experimental practice requires stringent safety measures.

Kieren Sharma, co-author and PhD student working in AI for cellular modeling at the School of Engineering Mathematics and Technology, emphasized the transformative potential of combining artificial intelligence with synthetic biology. “We’re entering a transformative era where artificial intelligence and synthetic biology converge to revolutionize biological engineering, accelerating the discovery of novel compounds, from life-saving pharmaceuticals to sustainable biofuels.”

However, Sharma also highlighted the risks uncovered in their study. “Our study has uncovered potential risks associated with the specific types of data being used to train the latest systems biology models. For instance, inconsistencies in measurements from complex and dynamic living organisms and privacy concerns that could compromise the safety of next-generation models trained on human genome data.”

Building a Vocabulary for Data Hazards

The work builds on the Data Hazards project (datahazards.com), which aims to establish a clear vocabulary for the potential hazards in data science research. The new labels extend that vocabulary to risks specific to synthetic biology.

Dr. Nina Di Cara, co-author and co-lead of the Data Hazards project from the School of Psychological Science, explained the importance of a common language. “Having a clear vocabulary of hazards makes it easier for researchers to proactively consider the risks of their work and implement mitigating actions. It also facilitates communication among people from different fields who may use varied terminology to discuss the same issues.”

The Importance of Interdisciplinary Collaboration

Interdisciplinary collaboration is vital for developing such a shared vocabulary. Dr. Daniel Lawson, Director of the Jean Golding Institute and Associate Professor in Data Science at the School of Mathematics, noted: “As datasets grow in magnitude and ambition, increasingly sophisticated algorithms are developed to gain new insights. This complexity necessitates a collaborative approach to identifying and preventing downstream harms.”

Dr. Thomas Gorochowski, senior author and Associate Professor of Biological Engineering at the School of Biological Sciences, added: “Data science is set to revolutionize how we engineer biology to harness its unique capabilities to tackle global challenges, from sustainable production of materials and fuels to the development of innovative therapeutics. The extensions developed by our team will help bioengineers consider and discuss risks around data-centric approaches to their research and help ensure the huge benefits of bio-based solutions are realized safely.”

The work, a collaboration between researchers at the Bristol Centre for Engineering Biology (BrisEngBio) and the Jean Golding Institute for Data Intensive Research, highlights the dual potential of synthetic biology. While the benefits are immense, the associated risks cannot be ignored. By addressing these risks proactively, the research community can better harness the power of synthetic biology for a safer and more sustainable future.
