Research by Griffith University and the Hefei Institute of Material Science of the Chinese Academy of Sciences suggests that deep mutational scanning combined with high-throughput sequencing can, on its own, be used to infer highly accurate structures of proteins and RNAs (ribonucleic acids) without expensive equipment.

Understanding the structures of proteins and RNA would help to better explain their role in many diseases, including cancer. Armed with this knowledge, researchers can develop new tools to speed up the discovery of new drugs to fight cancer.

Professor Yaoqi Zhou, Principal Research Leader at Griffith’s Institute for Glycomics, led the research which has been published in Nucleic Acids Research.

“Knowing the intrinsic structure of a machine is the first step in understanding how it works, as well as figuring out how to fix it when something goes wrong. This is true not only for large machinery we deal with in everyday life, but also for molecular machines such as RNA and proteins in our bodies,” explained Professor Zhou.

“Currently, we can only determine these molecular structures by using sophisticated tools such as X-ray diffraction, Nuclear Magnetic Resonance, and cryo-electron microscopy. However, these methods are costly and time-consuming and sometimes ineffective for many RNA and proteins. It’s becoming increasingly evident that alternative, inexpensive methods are urgently needed.”

It is well established that protein or RNA structures can be inferred accurately when the molecules belong to a large family of biomolecules that evolved from a common ancestor and is found across different species. This is because members of the same family share essentially the same structure and function, even though their sequences of amino acids or nucleotides differ.

“By examining all these sequences, one can discover structural ‘neighbors’, because double mutations of structural neighbors have to evolve concurrently in order to maintain structural and functional stability.”

“For example, if two amino acid residues of a protein are in contact with each other and one of the two mutates into a larger residue during evolution, the other must be mutated to a smaller residue in order for the protein to continue working appropriately. Such co-variation information, which is extractable from statistical analysis, provides a distance restraint between the two amino acid residues. Given a sufficient number of distance restraints, one can figure out the structure computationally,” continued Professor Zhou.
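The co-variation idea described above can be illustrated with a small sketch. It scores every pair of columns in a sequence alignment by mutual information, one common statistic for detecting positions that change together. This is not the method used in the study; the toy alignment and the function names are invented for illustration, and real analyses use far larger alignments and more refined statistics.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column(alignment, i):
    """Return the residues found at position i across all aligned sequences."""
    return [seq[i] for seq in alignment]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    counts_i = Counter(column(alignment, i))
    counts_j = Counter(column(alignment, j))
    counts_ij = Counter(zip(column(alignment, i), column(alignment, j)))
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


def rank_pairs(alignment):
    """Score every pair of positions and sort from most to least co-varying."""
    length = len(alignment[0])
    scores = {
        (i, j): mutual_information(alignment, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy RNA alignment (invented for illustration only).
# Positions 0 and 4 always form a Watson-Crick pair, positions 1 and 3
# are conserved, and position 2 varies independently of the others.
TOY_ALIGNMENT = [
    "GAAUC", "CAAUG", "AAAUU", "UAAUA",
    "GACUC", "CACUG", "AACUU", "UACUA",
]

ranked = rank_pairs(TOY_ALIGNMENT)
top_pair, top_score = ranked[0]
print(top_pair, top_score)
```

In this toy case the pair of positions that mutate together stands out from all others, which is the kind of signal that would be turned into a distance restraint. With only a handful of sequences the statistic becomes unreliable, which is why family size matters.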

“However, most protein families are not large enough. Some may not evolve long enough, others may be unique to a certain species, and still others are not yet covered by limited sequencing of existing species. A small family will not have sufficient statistics to extract co-variation information. The question then is this: can one artificially generate enough homologous sequences for structure inference?”

Professor Zhou’s collaborative research tested whether artificially generated functional ribozyme sequences could be used to infer the base-pairing structure of RNA. The team found that nearly all base pairs could be recovered this way. These base pairs are expected to serve as structural restraints for 3D structure inference.
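One simplified way to picture how mutational scanning data might point to a base pair is to compare the measured activity of each double mutant with what would be expected if its two mutations acted independently. The sketch below assumes a multiplicative expectation, and all mutation names and activity values are hypothetical. It is a generic illustration of that comparison, not the scoring scheme used in the published work.

```python
def activity_deviation(single_activity, double_activity):
    """Deviation of each double mutant from an independent (multiplicative) expectation.

    Activities are relative to wild type (wild type = 1.0).
    """
    scores = {}
    for (first, second), observed in double_activity.items():
        expected = single_activity[first] * single_activity[second]
        scores[(first, second)] = observed - expected
    return scores


# Hypothetical relative activities, invented for illustration only.
SINGLE = {"G5C": 0.1, "C20G": 0.1, "A12U": 0.5}
DOUBLE = {
    ("G5C", "C20G"): 0.9,    # compensatory: a G-C pair becomes a C-G pair
    ("G5C", "A12U"): 0.05,   # no rescue: close to the independent expectation
    ("C20G", "A12U"): 0.05,
}

scores = activity_deviation(SINGLE, DOUBLE)
best_pair = max(scores, key=scores.get)
print(best_pair, round(scores[best_pair], 2))
```

A large positive deviation means the second mutation rescues the damage done by the first, which is what would be expected if the two positions pair with each other.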

With further improvement and refinement, this method could open a new door for structural biologists investigating structures that are difficult to determine using existing, time-consuming and expensive techniques.

Commenting on the research, Professor Mark von Itzstein AO, Director of the Institute for Glycomics, said, “We are delighted by this research outcome, which highlights the depth of our basic science and our multidisciplinary approach to drug and vaccine discovery.”

The research “Accurate inference of the full base-pairing structure of RNA by deep mutational scanning and covariation-induced deviation of activity” has been published in Nucleic Acids Research.