
Convera launches RetrievalWare® 8 Knowledge Discovery Platform
Categorisation and dynamic classification software
delivers new end-user capabilities for search, navigation and discovery
In October, Convera announced the commercial
availability of RetrievalWare 8, a knowledge discovery platform to help
organisations automate their knowledge management and discovery processes.
RetrievalWare 8 integrates Convera’s enterprise
search capabilities with a new dynamic classification methodology, providing
a single integrated platform that can categorise, organise and deliver
enterprise content, regardless of format, language or storage location.
The new classification software enables end users to
dynamically personalise their view onto the available data and so provide
immediate support for innovation and problem solving challenges. According
to Graham Charlesworth, VP & GM of European Operations at Convera,
“Legacy categorisation systems based, for example, on naïve
Bayesian technology have failed to support innovators as they tend to
impose the same monolithic taxonomy view onto all users. We don’t
know of any legacy Bayesian-based installations that have successfully
maintained workable categorisation accuracy beyond a few hundred taxonomic
nodes. Yet our enterprise scale customers are demanding a level of granularity
in automated categorisation, which requires thousands of taxonomic nodes.
We are going to deliver against that with RetrievalWare 8, where previous
generations of technology have failed.”
As an aid to fast start-up, Convera will be supplying
a range of pre-seeded, highly granular taxonomies off-the-shelf, covering
more than a dozen industry domains. These will be editable by customers
and system integrators so that precise customer needs can be met.
According to Mark Walter of The Seybold Report: “RetrievalWare
raises the bar in categorisation. Convera has made impressive strides
in its categorisation and classification techniques. There are new tools
for importing, creating and editing taxonomies, a new categorisation engine
and impressive display options for end users. This is the first system
we've seen that lets users mix and match taxonomies in their search results,
even to the point of showing search results classified in a table along
two user-definable axes at once, like showing me the search hits by topic
and geography.”
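The two-axis display Walter describes can be pictured as a cross-tabulation of search hits. What follows is a minimal sketch in Python, not Convera's implementation: the Hit record, its topic and geography fields, and the sample titles are all hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One search result tagged with a category from each of two taxonomies (hypothetical)."""
    title: str
    topic: str
    geography: str


def cross_tabulate(hits, row_axis, col_axis):
    """Group hit titles into cells keyed by (row category, column category)."""
    table = defaultdict(list)
    for hit in hits:
        key = (getattr(hit, row_axis), getattr(hit, col_axis))
        table[key].append(hit.title)
    return dict(table)


# Illustrative data only; not taken from any real system.
hits = [
    Hit("Pipeline survey", "Energy", "Europe"),
    Hit("Refinery audit", "Energy", "Asia"),
    Hit("Bank filing", "Finance", "Europe"),
    Hit("Grid report", "Energy", "Europe"),
]

by_topic_and_geography = cross_tabulate(hits, "topic", "geography")
```

Because the axis names are passed in as arguments, the same hits could be re-tabulated along any other pair of taxonomies the user chooses.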
Convera’s Domain Cartridges
Finding the right information quickly and efficiently is critical to problem
solving and innovation in any business. This is particularly true for
knowledge workers and information analysts accessing domain or subject-specific
content such as scientific journals, engineering reports, financial analysis,
industry specific news, industry websites, and intelligence reports. For
these professionals, the cost of not finding and extracting ‘critical’
knowledge from the vast sea of available information can result in millions
of dollars in both lost productivity and lost business opportunities.
For some military and intelligence applications, missing
that ‘critical’ bit of information could even result in catastrophic
loss.
A significant challenge in making the most of an information
retrieval system is that each industry or field of endeavour uses its
own unique terminology and concepts to communicate and express ideas.
This unique terminology is not only found in the source content being
searched over, but is also used by professionals to form their ‘queries’
while searching or organising their information. Understanding and leveraging
the terminology, concepts, and relationships between concepts for a particular
industry or field is vital to mission critical information retrieval.
Convera’s Domain Cartridges provide out-of-the-box
domain specific semantic networks to improve precision and relevancy of
search, retrieval and categorisation when used with RetrievalWare. These
semantic networks are stored in a flexible cartridge format and contain
thousands of domain specific concepts and terms linked together based
on their relationships to one another. When an end-user performs a search,
RetrievalWare will use the Domain Cartridge to expand the query terms
to retrieve relevant documents based on not only the query terms, but
also related domain-specific concepts found in the Domain Cartridge.
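The query expansion described above can be sketched roughly as follows. This is an illustrative Python outline only, assuming the semantic network can be treated as a simple mapping from a term to its related terms; the function names, the max_hops parameter and the tiny sample network are hypothetical and do not reflect the actual cartridge format.

```python
def expand_query(terms, network, max_hops=1):
    """Return the query terms plus related concepts within max_hops links."""
    expanded = {t.lower() for t in terms}
    frontier = set(expanded)
    for _ in range(max_hops):
        neighbours = set()
        for term in frontier:
            neighbours.update(network.get(term, ()))
        # Only terms not seen before are followed on the next hop.
        frontier = neighbours - expanded
        expanded |= frontier
    return expanded


def search(documents, terms, network):
    """Return documents containing any original or expanded term (naive substring match)."""
    expanded = expand_query(terms, network)
    return [doc for doc in documents if any(t in doc.lower() for t in expanded)]


# Hypothetical mini-network; a real cartridge would hold thousands of linked concepts.
network = {
    "brain ischemia": {"ischemic encephalopathy", "nervous system diseases"},
    "ischemic encephalopathy": {"brain ischemia"},
    "mustard gas": {"sulfur mustard", "yperite"},
}

documents = [
    "Outcomes in ischemic encephalopathy",
    "Sulfur mustard exposure report",
    "Quarterly earnings",
]

matches = search(documents, ["Brain Ischemia"], network)
```

In this sketch the first document is returned even though it never contains the words of the original query, which is the effect the expansion is meant to achieve.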
Convera’s pre-packaged Domain Cartridges seamlessly
plug into RetrievalWare® and extend Convera’s existing set of
semantic network cartridges that support a wide range of languages. Each
Domain Cartridge can be used in combination with other industry cartridges
or with customised cartridges that include your business specific vocabulary
and concepts.
Professionals in each industry or specialised field
have their own unique set of vocabulary and concepts to communicate and
express ideas. Understanding the terminology, concepts, and relationships
between them is vital to mission critical information used in those industries
or fields.
The Convera Domain Cartridges provide an out-of-the-box solution for businesses
to leverage domain specific semantic networks to improve precision and
recall of search and categorisation. Each domain specific cartridge can
be used in combination with Convera’s other industry and language
cartridges or with customised cartridges that include your business specific
vocabulary and concepts.
The primary sources used to populate the cartridges
cover a number of principal domains subdivided into several thousand specialised
domains. The content for these sources has been gathered over many years
by terminologists working with external language specialists and field
specialists. The content has been reviewed and filtered to ensure quality
coverage of a technically oriented nature in domains of interest to professionals,
technical staff and researchers. The accumulated data would span roughly
3,000 reference books if grouped together.
Example of domains and sub-domains
The following list details the domains and sub-domains of Convera's MeSH
(Medical Subject Headings) Domain Cartridge. There are 37,600 concepts
represented by approximately 142,700 terms and expressions distributed
across the following domains:
Anatomy
|_ Body Regions
|_ Musculoskeletal System
|_ Digestive System
|_ Respiratory System
|_ Urogenital System
|_ Endocrine System
|_ Cardiovascular System
|_ Nervous System
|_ Sense Organs
|_ Tissues
|_ Cells
|_ Fluids and Secretions
|_ Animal Structures
|_ Stomatognathic System
|_ Hemic and Immune Systems
|_ Embryonic Structures
|_ Integumentary System
Organisms
|_ Invertebrates
|_ Vertebrates
|_ Bacteria
|_ Viruses
|_ Algae and Fungi
|_ Plants
|_ Archaea
Diseases
|_ Bacterial Infections and Mycoses
|_ Virus Diseases
|_ Parasitic Diseases
|_ Neoplasms
|_ Musculoskeletal Diseases
|_ Digestive System Diseases
|_ Stomatognathic Diseases
|_ Respiratory Tract Diseases
|_ Otorhinolaryngologic Diseases
|_ Nervous System Diseases
|_ Eye Diseases
|_ Urologic and Male Genital Diseases
|_ Female Genital Diseases and Pregnancy Complications
|_ Cardiovascular Diseases
|_ Hemic and Lymphatic Diseases
|_ Neonatal Diseases and Abnormalities
|_ Skin and Connective Tissue Diseases
|_ Nutritional and Metabolic Diseases
|_ Endocrine Diseases
|_ Immunologic Diseases
|_ Disorders of Environmental Origin
|_ Animal Diseases
|_ Pathological Conditions, Signs and Symptoms
Chemicals and Drugs
|_ Inorganic Chemicals
|_ Organic Chemicals
|_ Heterocyclic Compounds
|_ Polycyclic Hydrocarbons
|_ Environmental Pollutants, Noxae, and Pesticides
|_ Hormones, Hormone Substitutes, and Hormone Antagonists
|_ Reproductive Control Agents
|_ Enzymes, Coenzymes, and Enzyme Inhibitors
|_ Carbohydrates and Hypoglycemic Agents
|_ Lipids and Antilipemic Agents
|_ Growth Substances, Pigments, and Vitamins
|_ Amino Acids, Peptides, and Proteins
|_ Nucleic Acids, Nucleotides, and Nucleosides
|_ Neurotransmitters and Neurotransmitter Agents
|_ Central Nervous System Agents
|_ Peripheral Nervous System Agents
|_ Anti-Inflammatory Agents, Antirheumatic Agents, and Inflammation Mediators
|_ Cardiovascular Agents
|_ Hematologic, Gastrointestinal, and Renal Agents
|_ Anti-Infective Agents
|_ Anti-Allergic and Respiratory System Agents
|_ Antineoplastic and Immunosuppressive Agents
|_ Dermatologic Agents
|_ Immunologic and Biological Factors
|_ Biomedical and Dental Materials
|_ Specialty Chemicals and Products
|_ Chemical Actions and Uses
Analytical, Diagnostic and Therapeutic Techniques and Equipment
|_ Diagnosis
|_ Therapeutics
|_ Anesthesia and Analgesia
|_ Surgical Procedures, Operative
|_ Investigative Techniques
|_ Dentistry
|_ Equipment and Supplies
Psychiatry and Psychology
|_ Behavior and Behavior Mechanisms
|_ Psychological Phenomena and Processes
|_ Mental Disorders
|_ Behavioral Disciplines and Activities
Biological Sciences
|_ Biological Sciences
|_ Health Occupations
|_ Environment and Public Health
|_ Biological Phenomena, Cell Phenomena, and Immunity
|_ Genetic Processes
|_ Biochemical Phenomena, Metabolism, and Nutrition
|_ Physiological Processes
|_ Reproductive and Urinary Physiology
|_ Circulatory and Respiratory Physiology
|_ Digestive, Oral, and Skin Physiology
|_ Musculoskeletal, Neural, and Ocular Physiology
|_ Chemical and Pharmacologic Phenomena
|_ Genetic Phenomena
|_ Genetic Structures
Physical Sciences
|_ Physical Sciences
Anthropology, Education, Sociology and Social Phenomena
|_ Social Sciences
|_ Education
|_ Human Activities
Technology and Food and Beverages
|_ Technology, Industry, and Agriculture
|_ Food and Beverages
Humanities
|_ Humanities
Information Science
|_ Information Science
Persons
|_ Persons
Health Care
|_ Population Characteristics
|_ Health Care Facilities, Manpower, and Services
|_ Health Care Economics and Organizations
|_ Health Services Administration
|_ Health Care Quality, Access, and Evaluation
Geographic Locations
|_ Geographic Locations
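For readers who want to work with an outline like the one above programmatically, the following is a minimal Python sketch that reads it into a mapping from each domain to its sub-domains. It assumes only the layout convention used in this article (a bare line names a domain, a line starting with "|_" names a sub-domain); the function name is hypothetical and the short sample is an excerpt, not the full cartridge.

```python
def parse_outline(lines):
    """Map each top-level domain to the list of sub-domains listed beneath it."""
    taxonomy = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("|_"):
            if current is None:
                raise ValueError("sub-domain listed before any domain: " + line)
            taxonomy[current].append(line[2:].strip())
        else:
            current = line
            taxonomy[current] = []
    return taxonomy


# Short excerpt in the same layout as the listing above.
outline = [
    "Organisms",
    "|_ Invertebrates",
    "|_ Vertebrates",
    "Persons",
    "|_ Persons",
]

taxonomy = parse_outline(outline)
```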
Sample Expansions
Although it is not possible to give an exhaustive sampling of the terminology
included in the MeSH Domain Cartridge, the following samples illustrate
the diversity and potential richness of the cartridge.
For example, within the domains shown in quotation marks we would find
the following terms:
“Nervous System Diseases” > brain ischemia
English synonyms automatically searched
Ischemic Encephalopathy
Ischemic Encephalopathies
Ischemia, Brain
Encephalopathy, Ischemic
Brain Ischemias
“Heterocyclic Compounds” > Methotrimeprazine
English synonyms automatically searched
Levopromazine
Levomepromazine
Levomeprazin
“Enzymes, Coenzymes, and Enzyme Inhibitors” > micrococcal nuclease
English synonyms automatically searched
Nuclease, Thermostable
TNase
Thermonuclease
Thermostable Nuclease
Staphylococcal Nuclease
Nuclease, Staphylococcal
Micrococcal Nuclease
Nuclease, Micrococcal
“Organic Chemicals” > mustard gas
English synonyms automatically searched
Yperite
Yellow Cross Liquid
Mustard, Sulfur
Sulfur Mustard
Mustard gas
Sulfide, Dichlorodiethyl
Dichlorodiethyl Sulfide
Sulfide, Di-2-chloroethyl
Di-2-chloroethyl Sulfide
Bis(beta-chloroethyl) Sulfide
Convera claims to offer the most extensive, scalable
and intelligent search and categorisation system available today. The
MeSH Domain Cartridge is just one example of many industry-specific cartridges
available from Convera.
Other industry-specific cartridges include:
Biology, Chemistry
Computers
Electronics
Finance
Food Science
Geography
Geology
Health Sciences
Information Science
Law
Mathematics
MeSH (Medical Subject Headings)
Military
Petroleum, Natural Gas & Petrochemicals
Pharmacology
Physics
Plastics
Rubber
Telecommunications
Users
Convera has certified a number of partners into its new Taxonomy Developer
Certification Program. These include Access Innovations, IBM and Veridian.
Early adopters of RetrievalWare 8 include the UK Ministry of Defence (MOD),
offices within the US Department of Defense, US Department of Energy and
the US Navy. Initial interest is also strong within the pharmaceutical
and financial communities.
The US Federal Bureau of Investigation (FBI) is also
using Convera RetrievalWare. See case study below.

FBI selects Convera for new FBI investigative data warehouse
RetrievalWare® to increase information sharing
among law enforcement, intelligence and homeland security communities
Convera announced in October that the US Federal Bureau of Investigation
has selected Convera's RetrievalWare as a search and categorisation platform
within the Agency's new Investigative Data Warehouse. The initial value
of the deployment of Convera's software is approximately $1.5 million.
After the events of September 11th, the FBI created a sophisticated Secure
Collaborative Operational Prototype Environment (SCOPE) with a counter-terrorism
and intelligence data repository. RetrievalWare was selected by the FBI
for the repository to improve the sharing of intelligence information
and collaboration across multiple government agencies, enhancing the government's
ability to prevent terrorist attacks. RetrievalWare will work with other
tools to help FBI analysts identify critical pieces of intelligence within
the massive information repository that they use to drive investigative
and intelligence activities. Specific RetrievalWare capabilities required
by the FBI for the project include extensive security options, real-time
message profiling, breadth of language support, multimedia search, scalability
and powerful new dynamic classification capabilities.
Information sharing among intelligence agencies - essential to national
security - will be bolstered by RetrievalWare's ability to cut through
enormous amounts of data to find minute details agents need to respond
to possible homeland security threats. RetrievalWare will also ensure
FBI agents can search authorised information in other agency databases,
in addition to the FBI's own data repository.
The new Investigative Data Warehouse system will provide a Web-based,
collaborative environment for hundreds of agents who will eventually analyse
over one billion text, video, audio and image files. Using RetrievalWare,
agents can compare and contrast relevant information and find missing
links by securely accessing the Agency's Investigative Data Warehouse.
Search and classification requirements
RetrievalWare met the FBI's stringent search and classification requirements
that included:
Dynamic Classification
RetrievalWare will be tailored for the FBI's document classification system
to meet the agency's specific user requirements. RetrievalWare's dynamic
classification improves search and discovery quality by presenting search
results in personalised views enabled by visual discovery techniques that
reduce the time required to find and share knowledge. This gives the FBI
freedom and flexibility to dynamically organise and view essential information
assets, providing more efficient information exploration and problem solving
capabilities. With dynamic classification, agents will define personalised
criteria for information of interest and be notified when relevant information
enters the data warehouse.
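The notification behaviour described above, in which agents define criteria once and are alerted as matching material arrives, resembles a standing query. The following Python sketch shows the general idea under simple assumptions; the Profile structure, the all-terms-must-appear matching rule and the sample data are hypothetical and are not drawn from RetrievalWare.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """A user's standing criteria for information of interest (hypothetical structure)."""
    owner: str
    required_terms: set
    matches: list = field(default_factory=list)


def on_document_arrival(doc_id, text, profiles):
    """Check a newly arrived document against every profile; return owners to notify."""
    words = set(text.lower().split())
    notified = []
    for profile in profiles:
        # A profile matches when all of its required terms occur in the document.
        if profile.required_terms <= words:
            profile.matches.append(doc_id)
            notified.append(profile.owner)
    return notified


# Illustrative profiles and document only.
profiles = [
    Profile("analyst_a", {"shipping", "manifest"}),
    Profile("analyst_b", {"passport"}),
]

notified = on_document_arrival(
    "doc-1", "Revised shipping manifest received from port", profiles
)
```

A production system would match on concepts rather than exact words, but the flow is the same: profiles are stored ahead of time and evaluated as each document enters the repository.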
Language Breadth
The FBI will also use RetrievalWare to search in nearly 50 languages,
including European, Asian and Middle Eastern languages. Convera's advanced
concept search technology will be utilised to search many of the languages.
RetrievalWare's cross-lingual cartridges will offer agents the option
of asking a question in one language and receiving the answer in another
- a unique and useful feature for the FBI. For example, an agent could
construct a search query in English and receive results in French or German.
Security
Essential for the FBI's sensitive and worldwide operations, RetrievalWare
offers complete document level security, as well as cross-repository and
cross-platform security.
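Document-level security of this kind is commonly pictured as a filter applied to each hit before results are shown. The Python sketch below illustrates that general pattern only; the allowed_groups field, the repository labels and the group names are hypothetical and say nothing about how RetrievalWare enforces access.

```python
def visible_results(results, user_groups):
    """Keep only hits whose access list shares at least one group with the user."""
    return [r for r in results if r["allowed_groups"] & user_groups]


# Illustrative hits drawn from two hypothetical repositories.
results = [
    {"id": "a", "repository": "repo1", "allowed_groups": {"analysts"}},
    {"id": "b", "repository": "repo2", "allowed_groups": {"supervisors"}},
    {"id": "c", "repository": "repo2", "allowed_groups": {"analysts", "supervisors"}},
]

visible = visible_results(results, {"analysts"})
visible_ids = [r["id"] for r in visible]
```

The check is made per document rather than per repository, so a user can see some items from a source while others from the same source remain hidden.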
Multimedia Search (text, image, video, paper and other)
RetrievalWare will allow agents to search a variety
of multimedia assets including, for example, surveillance videotapes;
forensic reports such as blood, fingerprint and DNA; automated case files;
credit card transactions; terrorist watch lists; wiretaps; bank records;
and even local law enforcement arrest reports.
Industry-Specific Taxonomies
A pivotal agency protecting national security, the FBI has its own 'language'
for operations around the world. Convera's industry-specific taxonomy
and semantic network cartridges help ensure thousands of FBI mission-specific
concepts and terms are used to optimise search, discovery accuracy, relevancy
and personalisation for the Agency.
Convera. Tel: + 44 1344 781800; fax: + 44 1344 781801;
www.convera.com; e-mail: info@convera.co.uk

IM@T Online November 2003
