Google Search Engine Software goes 'Chemistry'

What if medical and environmental researchers could harness the power of internet search engine software – something like Google’s PageRank software that provides a detailed picture of all the links between the trillion-plus webpages on the internet – to create better drugs and environmental clean-up agents? All that these researchers would have to do is turn molecules into webpages, and turn the bonds that exist between molecules into hyperlinks. Then we could monitor the interactions between molecules just as easily as Google can monitor the network of links and webpages on the internet in order to provide you with the best website when you Google-Search “National Geographic” or “What is a molecule?”

Image: Chemistry lessons with Google

Sound a bit sci-fi? Not anymore. Aurora Clark, an associate professor of chemistry at Washington State University, and her graduate students have adapted Google’s PageRank software to model how molecules are shaped and organized in a fluid network – in your glass of water, for example. The new software, called moleculaRnetworks, uses Google’s PageRank mathematical algorithm “to determine molecular shapes and chemical reactions without the expense, logistics and occasional danger of lab experiments.” (WSU) With this new software, for example, environmental researchers could design better clean-up chemical agents used to remove heavy metals including lead, uranium, and plutonium from nuclear waste or groundwater – without ever handling these dangerous substances in the lab.

One of Clark’s graduate students, Barbara Mooney, who had extensive experience in computer programming, first brought PageRank to Clark’s attention. “We asked, what is the software’s maximum impact and utility? After doing some tests, we realized that it was actually very efficient and very good at mapping the hydrogen bond connectivity within liquids like water,” Clark said.

Hydrogen-bonding occurs between water molecules, and is ubiquitous in nature. From keeping your ice cubes afloat in your iced tea, to keeping the strands of your DNA together, to being responsible for the high boiling point of water, hydrogen-bonding is everywhere. This is exactly why Clark and her colleagues want to model the interactions between water molecules – because a large majority of chemical reactions in nature take place in the presence of water.

“The PageRank algorithm is really based on connectivity,” Clark said, “and we can tweak the software to include any interaction that we want…” But how much ‘tweaking’ does it take for software that usually ranks webpages on the internet to serve as a modeling system for links between molecules?

The PageRank algorithm itself is actually very simple, Clark explains. “Really it’s all about getting the data to be in the appropriate form to be read into the PageRank algorithm,” Clark said. “That actually takes the largest amount of time,” she said. PageRank is designed for analyzing websites: “You have to redefine the variables that are in the mathematical expression of the algorithm such that molecules become webpages and hydrogen bonds become hyperlinks.”
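To make the "molecules become webpages, hydrogen bonds become hyperlinks" idea concrete, here is a minimal sketch of the standard PageRank power iteration run on a toy hydrogen-bond network. It is not the moleculaRnetworks code. The four molecule names, the donor-to-acceptor direction of each link, and the damping factor of 0.85 are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a directed graph given as {node: [targets]}."""
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes with no outgoing links is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            for t in targets:
                # Each node shares its rank equally among the nodes it links to.
                new[t] += damping * rank[v] / len(targets)
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical toy network: each water molecule "links" to the molecules
# it donates a hydrogen bond to (donor -> acceptor is an assumed convention).
hbonds = {
    "w1": ["w2"],
    "w2": ["w3"],
    "w3": ["w2"],
    "w4": ["w2", "w3"],
}

ranks = pagerank(hbonds)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, round(ranks[name], 4))
```

In this toy network the molecule that accepts the most hydrogen bonds ends up with the highest score, much as a heavily linked webpage would. As Clark notes, the real effort lies in producing the `hbonds` table from simulation data in the first place.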

Clark and her students designed an entire suite of software that harnesses the PageRank analysis along with a host of other statistical analyses for molecular modeling. Clark explains that moleculaRnetworks can analyze molecular interactions on both the large scale – the macro-scale that we normally think of and can see with our naked eyes – as well as on the ‘zoomed in’ scale of atoms and molecules.

“Our program can do a macroscopic analysis of an entire water network – 2,000 to 3,000 water molecules – at once. When we do that, we get a fingerprint of the hydrogen bonding pattern within the whole thing.” The fingerprint for a particular water network is unique, Clark explains, and it changes when a foreign substance, for example a heavy metal ion or other water contaminant, is introduced into the network. “Now that’s useful,” Clark said, “but I think it’s even more useful to zoom in to the molecular scale, and say ok, how are the waters that are immediately around the ion organized?” Clark and her students can PageRank only the water molecules that are closest to a heavy metal ion introduced into the simulation, getting a better picture of the shape that water molecules form around a particular ion in a particular condition.
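The "zoom in" step can be pictured as a filter applied before ranking: keep only the waters near the ion, and only the hydrogen bonds among them. The sketch below shows one plausible way to do that with a simple distance cutoff. The function name, the coordinates, and the 3.0 cutoff are hypothetical and are not taken from the published software.

```python
import math


def first_shell_subnetwork(ion_xyz, water_xyz, hbonds, cutoff):
    """Keep only waters within `cutoff` of the ion and the bonds among them."""
    shell = {
        name
        for name, xyz in water_xyz.items()
        if math.dist(ion_xyz, xyz) <= cutoff
    }
    # Drop any hydrogen bond whose acceptor lies outside the selected shell.
    return {
        donor: [acc for acc in hbonds.get(donor, []) if acc in shell]
        for donor in sorted(shell)
    }


# Hypothetical coordinates (arbitrary length units) with the ion at the origin.
ion = (0.0, 0.0, 0.0)
waters = {
    "w1": (2.4, 0.0, 0.0),
    "w2": (0.0, 2.5, 0.0),
    "w3": (0.0, 0.0, 4.5),   # too far away to count as a nearest neighbour
    "w4": (-2.3, 0.0, 0.0),
}
hbonds = {"w1": ["w2"], "w2": ["w3"], "w3": ["w1"], "w4": ["w1"]}

shell_network = first_shell_subnetwork(ion, waters, hbonds, cutoff=3.0)
print(shell_network)
```

The resulting smaller network can then be ranked in the same way as the full one, so the scores describe only the waters immediately around the ion.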

Google’s PageRank software has been adapted to determine the way molecules are shaped, organized, and combined (credit: WSU)

Currently, using traditional experimental methods, researchers can only get the ‘average picture’ of the water structures around a heavy metal ion, for example. “We can reproduce that average picture of water geometry with our simulations,” Clark said, “but we are able to go one step beyond what is possible experimentally.” Clark and her students can tell, with their simulations, what percentage of the time water molecules are forming various distinct shapes around an ion, and from there how these different geometries impact the chemical reactivity of the system. This is where the real power of the simulation kicks in.
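The step from an "average picture" to "what percentage of the time" is, at its simplest, a tally over simulation snapshots. The sketch below assumes each snapshot has already been given a geometry label by some earlier analysis. The labels and the eight-frame sequence are invented for illustration and are not results from the study.

```python
from collections import Counter


def geometry_fractions(frame_labels):
    """Return the fraction of frames spent in each labelled geometry."""
    if not frame_labels:
        raise ValueError("need at least one frame")
    counts = Counter(frame_labels)
    total = len(frame_labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical geometry labels, one per simulation snapshot.
frames = [
    "octahedral", "octahedral", "square pyramidal", "octahedral",
    "octahedral", "square pyramidal", "trigonal bipyramidal", "octahedral",
]

fractions = geometry_fractions(frames)
for label, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {frac:.1%}")
```

An experiment that reports only the average would blur these three shapes together; the per-frame tally keeps them separate.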

Clark and her students are using moleculaRnetworks to characterize how strongly water organizes around a solute such as a heavy metal, and how that organization affects the ability of clean-up agents to bind to that contaminating metal and remove it from the water. “By having this PageRank algorithm that tells us how water is organizing and rearranging dynamically around the ion [heavy metal], we can say what that water’s impact is upon the reactivity of that metal,” Clark said. “For example, we could say, ok well, the waters are really really stuck around this ion and they aren’t moving, so this ion isn’t going to react with a chemical agent that is meant to pull that ion out of water,” she explained. This information allows the team to design better clean-up agents for removing contaminants from water. All without leaving the computer screen.

If the PageRank algorithm tells the team that water molecules have a really strong affinity for uranium in a nuclear waste zone, for example, and the water molecules are organized very strongly around the uranium ion, then researchers will have to use a clean-up agent that wants to be around the uranium ion more than water does. Otherwise the clean-up agent will never displace the waters around the ion.

The moleculaRnetworks software is precise enough to sort weak vs. strong hydrogen bonds, and even to sort bonds pointing in different directions.

“It’s a very nice complement to the experimental data, by us looking at hydrogen bonding [via computer simulation],” Clark said.

As it turns out, Google’s PageRank software can do a whole lot more than sort and rank webpages. Enter the era of computer science revolutionizing traditional laboratory experiments.

Citation: B. L. Mooney, L. R. Corrales, A. E. Clark, J. Comput. Chem. 2012, 33, 853-860. DOI: 10.1002/jcc.22917