Molecular Database Systems / Chemical Structure Search (AURAmol)


Introduction

Cybula has developed a powerful range of tools, based on the AURA high-performance pattern recognition technology, for searching databases containing three-dimensional (3D) views of complex small molecules (on the order of 60 atoms). The AURAmol technology allows a user to search for molecules that have a similar shape to a particular description. The basic technology underpinning this task can be applied to a wide variety of problems. Extensions to the basic method allow the system to take into account local properties of the molecules. More details of the system are available on the on-line demonstration web site.

Screen shot

The image below shows a typical screen shot from a demonstration front-end desktop system. 200 molecules from the NCI open database have been taught to the system. The query 'netropsin' has been entered, and 17 potential matches have been found in 0.32 seconds (on an 800 MHz PC), as displayed in the list. Two example matches are displayed along with the query.

AURA MOL Screen shot

Principal features of the technology are:

  • The technology allows very large databases of molecules (> 100,000) to be searched.
  • Input of new molecules to the database is quick.
  • The methods can be run on anything from desktop workstations to supercomputers, matching the needs of the user.
  • The technology uses the full 3D structure and surface properties of the molecules.
  • The methods used within the technology are published, so they can be fully understood.

The technology consists of a set of C++ functions built on top of the AURA CMM (correlation matrix memory) library used in many of our systems. Typical use is through UNIX/Linux-based programs. A simple Java-based demonstration system is available to show the operation of the system.

Outline function

The AURAmol system describes the surface of the molecule by a set of points. These points are joined in a graph that is then used to search the database of molecules. The nodes in the graph contain attributes that describe the local properties of the molecule at that point. The match engine is composed of a number of CMMs working together through a constraint propagation process. The constraint update procedure has been developed specifically to support CMM based systems, and efficiently searches large databases for potential matches. The results of the process are then supplied to the user with a measure of the similarity to each molecule returned.
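To make the role of a CMM in the match engine more concrete, the following is a minimal sketch of a binary correlation matrix memory, assuming that "CMM" here refers to that structure. It is not the AURA library or its API: the class `BinaryCmm`, its methods and the helper `makeToyMemory` are hypothetical names chosen for illustration, and the six-bit input patterns are made-up stand-ins for the attributes held at graph nodes. The sketch shows only the store and recall steps; the graph construction and the constraint propagation between CMMs are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative only: these names are hypothetical and are not part of the
// AURA or AURAmol libraries.
class BinaryCmm {
public:
    BinaryCmm(std::size_t inputBits, std::size_t outputBits)
        : inputBits_(inputBits),
          outputBits_(outputBits),
          weights_(inputBits * outputBits, false) {}

    // Teach one association: set weight (i, j) wherever input bit i and
    // output bit j are both set (a logical OR of the outer product).
    void train(const std::vector<bool>& input, const std::vector<bool>& output) {
        assert(input.size() == inputBits_ && output.size() == outputBits_);
        for (std::size_t i = 0; i < inputBits_; ++i) {
            if (!input[i]) continue;
            for (std::size_t j = 0; j < outputBits_; ++j) {
                if (output[j]) weights_[i * outputBits_ + j] = true;
            }
        }
    }

    // Recall, step 1: for each output bit, count the set weights in the
    // rows selected by the set bits of the query.
    std::vector<int> scores(const std::vector<bool>& query) const {
        assert(query.size() == inputBits_);
        std::vector<int> sums(outputBits_, 0);
        for (std::size_t i = 0; i < inputBits_; ++i) {
            if (!query[i]) continue;
            for (std::size_t j = 0; j < outputBits_; ++j) {
                if (weights_[i * outputBits_ + j]) ++sums[j];
            }
        }
        return sums;
    }

    // Recall, step 2: keep the output bits whose score reaches the threshold.
    // A threshold below the number of set query bits permits partial matches.
    std::vector<bool> recall(const std::vector<bool>& query, int threshold) const {
        const std::vector<int> sums = scores(query);
        std::vector<bool> out(outputBits_, false);
        for (std::size_t j = 0; j < outputBits_; ++j) {
            out[j] = sums[j] >= threshold;
        }
        return out;
    }

private:
    std::size_t inputBits_;
    std::size_t outputBits_;
    std::vector<bool> weights_;  // row-major, inputBits_ x outputBits_
};

// Toy memory with three stored items, one output bit per item.
// The input patterns are invented for this example.
BinaryCmm makeToyMemory() {
    BinaryCmm cmm(6, 3);
    cmm.train({true, true, false, false, true, false}, {true, false, false});
    cmm.train({false, true, true, false, false, true}, {false, true, false});
    cmm.train({true, false, false, true, true, false}, {false, false, true});
    return cmm;
}
```

Querying the toy memory with the first stored pattern gives scores of 3, 1 and 2 for the three items. A threshold of 3 returns only the exact match, while a threshold of 2 also admits the third item as a partial match, which is the kind of graded result a similarity measure can be derived from.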

Technical details

The AURAmol system has been developed in the Computer Science Department at the University of York, in association with GSK and Evotec. The technology is described in detail in a number of papers on the Advanced Computer Architectures web site.

Use of the technology

The AURAmol system may be embedded into many applications as well as run as a web service. The system exists as a C++ library and runs on Linux, NT and Windows. It can be run on anything from small PCs to supercomputers. The technology has been licensed to Cybula Ltd. and is now available for incorporation into your systems.

