Kiwis make molehills from mountains of information

STEPHEN BALLANTYNE

8 April 2004

The Mole boils text down to relevant words and phrases, attracting interest from businesses and terrorism fighters. Stephen Ballantyne finds out more

Computerised analysis of text is one of the longest-running research threads in information technology: it was what the Colossus machine and its kin were engaged in at Bletchley Park during World War II.

Later, in the 1950s and '60s, the US Air Force subsidised Noam Chomsky's pioneering research into linguistic structures. The USAF wanted to be able to translate Russian into English electronically, but the computers of the day weren't really up to the job.



EXTRACTING MEANING: From left, Roy Anderson, Liz Swanston and Ted Thomas. Photo: BARRY DURRANT

Now though, after decades of further research and the development of much better hardware, linguistic analysis is good enough to be a practical tool.

As an example, consider The Mole, a powerful new analysis tool from Wellington-based software developer and marketer Hyperbolex.

"We've been working on commercialising The Mole for the past three years," Hyperbolex director of business development Ted Thomas said, "although the underlying technology was developed from work done over 15 years by Roy Anderson. He developed it as a byproduct of PhD work he was doing in linguistics and information science at the University of Otago. He was working with unstructured documents, looking for ways to extract meaning from the relationships between words and sentences."

For human beings, that's no big task, but to automate it so a computer can do it very quickly, and in bulk, is another story altogether.

The Mole's claim to fame is that it can burrow through huge quantities of unstructured text as fast as your computer can process them, extracting "meaning" of interest to humans by analysing the underlying structure of the text.

Even demonstrated on a laptop computer, it is notably fast and flexible.

"Fundamentally it's an engine that can deal with information from the web, or from PDF files or Word files, or straight text, or any of a range of file formats," Mr Thomas said.

"Under the bonnet is an analysis engine able to index everything on a hard drive as well as engage with the web. It can also be deployed in an enterprise-wide environment so that users can engage with information across the organisation. And we can license the components so they can be embedded into other technologies."

Mr Thomas opens a 20-page legal document in Word, and The Mole plug-in immediately sifts the display down to what it determines to be the most relevant 20% of sentences. It can also display the most significant words and phrases with remarkable acuity.

Selecting keywords allows the user to penetrate further into the significance of the document, with the selected words guiding the program towards the most significant content. "The Mole doesn't change any text; it simply gives a representation of it based on its construction. It makes molehills out of mountains of information."
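The article does not describe how The Mole actually ranks sentences, and its engine is patent pending, so the following is only a rough sketch of the general idea using a much simpler, generic technique: score each sentence by the average frequency of its words, keep the top fraction, and let user-chosen keywords weigh more heavily. The function names, the tiny stopword list and the boost factor of 3 are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption; real systems use fuller lists).
STOPWORDS = {"the", "a", "an", "of", "and", "or", "to", "in", "is", "it", "that",
             "this", "for", "on", "with", "as", "by", "be", "are", "was"}


def split_sentences(text):
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    # Lowercase word tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def top_sentences(text, fraction=0.2, keywords=()):
    """Return the highest-scoring `fraction` of sentences, in document order.

    Generic word-frequency scoring, NOT The Mole's method. Words listed in
    `keywords` count three times as much (an arbitrary illustrative boost).
    """
    sentences = split_sentences(text)
    if not sentences:
        return []
    freq = Counter(content_words(text))
    boosted = {k.lower() for k in keywords}

    def score(sentence):
        tokens = content_words(sentence)
        if not tokens:
            return 0.0
        total = sum(freq[t] * (3 if t in boosted else 1) for t in tokens)
        return total / len(tokens)  # average, so long sentences are not favoured

    keep = max(1, math.ceil(len(sentences) * fraction))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)[:keep]
    return [sentences[i] for i in sorted(ranked)]
```

Calling `top_sentences(document_text)` returns roughly a fifth of the sentences, while `top_sentences(document_text, keywords=["liability"])` steers the selection towards sentences mentioning the chosen word. The text is never altered; sentences are only selected.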

With a background in commercialising banking software, Mr Thomas and fellow director Liz Swanston were perhaps not the obvious choice when Roy Anderson sought backing for the software technology he had developed.

"We had been doing some due diligence work for venture capital funders when we came across Roy's work," Mr Thomas said. "We thought it deserved further exploration to see how best it could be commercialised and wound up branding and packaging it into a product that could be taken to market."

Already The Mole has found customers in the legal and government sectors in New Zealand, and the team is now looking overseas. "The real opportunity for it is offshore, in association with other technologies," Mr Thomas said.

Indeed, overseas response to The Mole has been more significant than local.

With some help from NZ Trade and Enterprise, Hyperbolex formed an association with ThisQuarter, a software business development specialist in Austin, Texas, which "did an assessment for us to determine our readiness to go into the US market and to see whether potential customers would find value in The Mole. The feedback they got from that was tremendous."

As well as the American interest, The Mole is attracting attention from Britain, where Imprimatur Capital is providing further funding for its development.

"The network it brings with it, both in terms of introductions to business opportunities and in market research to identify technologies today that will meet the requirements of the future, is very sophisticated," Ms Swanston said.

Although The Mole works well with today's fast PCs and large hard drives, the underlying software could also be run on more powerful systems, particularly (in these security-conscious times) those operated by government intelligence agencies. Without being too specific, Mr Thomas and Ms Swanston acknowledge there is interest among overseas agencies anxious to "join the dots" as quickly as possible, especially in the fight against terrorism.

Even in the more routine world of commerce, The Mole can have valuable security uses. "You could use it for content sniffing in emails," Ms Swanston said. "If you had confidential documents you didn't want circulated outside your organisation, The Mole's ability to work on patterns could allow it to detect if an attempt to send portions of the document was being made."
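How The Mole matches patterns is not explained here, but the kind of check Ms Swanston describes can be sketched with a common, generic technique: break both the confidential document and the outgoing email into overlapping runs of words ("shingles") and count how many runs they share. The function names, the five-word run length and the threshold of three shared runs are assumptions chosen for illustration, not details of the product.

```python
import re


def shingles(text, n=5):
    # Set of every run of n consecutive lowercase word tokens in the text.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_runs(confidential, outgoing, n=5):
    # Number of distinct n-word runs that appear in both texts.
    return len(shingles(confidential, n) & shingles(outgoing, n))


def looks_like_leak(confidential, outgoing, n=5, min_shared=3):
    # Flag the outgoing text if it shares at least `min_shared` runs
    # with the confidential document (illustrative threshold).
    return shared_runs(confidential, outgoing, n) >= min_shared
```

Because the comparison works on short overlapping runs rather than whole documents, a pasted paragraph is flagged even when it is surrounded by unrelated text, while ordinary emails that merely share a few common words are not.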

About Hyperbolex

Hyperbolex develops and licenses The Mole, a content intelligence technology used in the discovery of information from unstructured data sources. The Mole uses a powerful (international patent pending) content analysis technology to provide abstracts of information based on a user's area of interest. The Mole's unique ability to unlock the value within unstructured data and automate processes around this information differentiates it from its competitors.

You can find out more information about Hyperbolex and The Mole by contacting us at www.themolesite.com or email .