7-AppliedDOF


Transcript 7-AppliedDOF

Applied Differential Ontology Framework
Bringing the knowledge of concepts to Information
Assurance and Cyber Security
Using FARES **
How we have done this and why
Presentation by Dr Peter Stephenson and
Paul Stephen Prueitt, PhD
Draft version 1.0 April 2, 2005
** FARES is the name of an Information Assurance product available from Center for Digital Forensics.
Ontology Tutorial 7, copyright, Paul S Prueitt 2005
Goal: extend some of the functionality of individual intelligence to the group
Of course this goal requires political and cultural activity. This kind of activity is
dependent on there being a level of education in the natural science related
to behavior, computation and neuroscience. Since 1993, the BCNGroup has
focused on defining relevant educational, political and cultural dependencies.
We make a principled observation that semantic technology has not been as
enabling as one might have assumed.
BCNGroup scientists suggest that the reason for this absence of performance
is the Artificial Intelligence (AI) polemic. This polemic falsely characterized
human cognition as a computation, and created the mythology that a computer
program will reason based on an active perception of reality.
If we look beyond the AI polemic we see that natural science understands a
great deal about individual intelligence and about group intelligence. In the
Differential Ontology Framework, this understanding is operationalized in a
software architecture that relies on humans to make judgments and on the
computer to create categories of patterns based on real time data.
The Fundamental Diagram
Scientific Origins:
J. J. Gibson (late 1950s)
Ecological Physics, Evolutionary
Psychology, Cognitive Engineering,
and other literatures
Does a human Community of Practice (CoP) have a perceptual, cognitive
and/or action system?
Depends:
Some groups within the State Department (yes)
Some groups at NIST, NSF, DARPA, etc. (yes)
Some groups in the Academy (yes)
Other groups in these same organizations (no, not at all)
Knowledge Management community (No, not really)
Computer Security and Information Assurance community (No, not really)
Iraqi Sunni community in Iraq in March 2005 (this might be forming)
[Diagram from Prueitt, 2003: the seven-step AIPM with an RDBMS. The first two steps are missing, so the diagram is not complete.]
The measurement/instrumentation task
First two steps
in the AIPM
Measurement is part of the “semantic
extraction” task, and is accomplished with a
known set of techniques:
• Latent semantic technologies
• Some sort of n-gram measurement,
with encoding into hash tables or an internal
ontology representation (CCM and
NdCore, perhaps AeroText and
Convera’s process ontology, Orbs,
Hilbert encoding, CoreTalk/Cubicon)
• Stochastic and neural/genetic
architectures
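One of the listed techniques, n-gram measurement with encoding into a hash table, can be sketched minimally. This is an illustrative assumption, not the FARES implementation; the log strings and the trigram size are invented for the example.

```python
from collections import Counter

def ngram_measure(text, n=3):
    """Measure character n-grams in a text and encode the counts
    in a hash table (here a Python Counter)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

# Two log-like strings share n-grams roughly in proportion to their similarity.
a = ngram_measure("denied connection from 10.0.0.5")
b = ngram_measure("denied connection from 10.0.0.9")
shared = sum((a & b).values())  # size of the multiset intersection
```

The hash-table encoding is what makes the measurement cheap to compare across large volumes of event data.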
Differential Ontology Framework
Applications:
• Increase the executive decision-making capacity and the degree of
cognitive capability available to a human community of practice,
such as a group in the US State Department or a group in the US Treasury.
• Social groups interested in citizen watchdog activities, or other civil
activities, can have this same technology.
• Business entities will be able to use this software to develop a greater
understanding of Risks to the business.
For technical descriptions see tutorials 1 – 7. (from [email protected])
Development steps, for DOF Beta Site
Development of an ontology with an editor like Protégé
Concepts related to Threats, Vulnerabilities, Impacts and Inter-domain communications are
specified but the set of concepts about Risks is not.
Domain expert Peter Stephenson used the methods of “Descriptive Enumeration” (DE) and
community polling to develop the set of concepts, properties and relationships.
Peter’s role here is to represent what he knows about these realities without being
concerned about computable inference or ontology representation standards.
He used Protégé as a typewriter to write out concepts and to specify relationships and properties.
It is a very creative process.
The modular DOF architecture
Three levels (upper, middle, and scoped ontology individuals) are used.
The top level has a higher-level abstraction for each of the core concepts that are in each of the five
middle-level ontologies. Initially these middle-level ontologies were developed manually for
Threats, Vulnerabilities, Impacts, and Inter-domain communication channels. The exercise at
this Beta site will demonstrate how to automate the development of a set of Risk concepts
through the measurement of event log data.
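The three levels can be sketched as simple data structures. All concept names and the event fields below are illustrative assumptions; the actual FARES ontologies are not reproduced here.

```python
# Upper level: one abstraction per core concept area.
upper = {"Threat", "Vulnerability", "Impact", "Channel", "Risk"}

# Middle level: hand-built ontologies of concrete concepts,
# keyed by the upper-level abstraction they fall under.
middle = {
    "Threat": ["covert-channel", "insider-misuse"],
    "Vulnerability": ["open-port", "weak-policy"],
}

def scope(concept, event):
    """A scoped ontology individual: a middle-level concept
    bound to a concrete measured event."""
    return {"concept": concept, "event": event}

individual = scope("covert-channel", {"src": "10.0.0.5", "dst_port": 443})
```

The point of the structure is that the scoped individuals are generated from data, while the upper and middle levels are authored (or, for Risks, to be derived by measurement).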
Goal: An ontology over Risks is to be developed as a consequence of a measurement process
over some data set.
It is important to see that, in theory, any one of the five “upper level ontologies” can be
deleted and rebuilt using a data source, the other four, and the process we are prototyping.
Of course, one discovers what one discovers, and human tacit knowledge is involved in any
of these Human-centric Information Production (HIP) processes, since a human in the loop is core to DOF use.
How does one judge the results? An “arm-chair” evaluation is used, whereby knowledgeable
individuals look at how and why the various steps are done and make a subjective evaluation
of the results. We also have a mapping between the Risk evaluation ontology and a
numerical value with quantitative metrics. This mapping provides an informed measure of
Risk that can be converted to a financial and legal statement.
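A minimal sketch of such a mapping from risk-evaluation concepts to a numerical value follows. The concept names, weights, and combination rule are all assumptions for illustration; the actual FARES/TRA mapping is not specified in these slides.

```python
# Hypothetical per-concept weights; not the actual FARES metrics.
CONCEPT_WEIGHTS = {
    "covert-channel": 0.9,
    "policy-violation": 0.6,
    "benign-anomaly": 0.1,
}

def risk_score(observed_concepts):
    """Combine per-concept weights into one informed risk measure,
    here as the complement of the product of per-concept 'safety'."""
    safety = 1.0
    for concept in observed_concepts:
        safety *= 1.0 - CONCEPT_WEIGHTS.get(concept, 0.0)
    return 1.0 - safety

score = risk_score(["covert-channel", "benign-anomaly"])
```

A single number of this kind is what can then be restated in financial or legal terms.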
[Diagram: the DOF processing pipeline. Inputs: log data (src address/port, dst address/port, protocol), organizational groups, organizational group IP address ranges, and security policy domains. Orb analysis feeds the Differential Ontology Framework, which produces scoped ontology individuals (possible risks). Middle-level inputs: vulnerabilities, threats, and impacts at the policy domain level; the inter-domain communications CPNet model; inter-domain communications channel behavior; and out-of-band or covert communications channel behavior. Human-centric Information Production analysis produces the Risk Profile.]
[Diagram legend: top and middle ontology; scoped ontology; human expert.]
The Fundamental Diagram
DOF grounds the Fundamental Diagram
with correspondence to several
levels of event observation
First level: Data Instance
Example: Customs manifest data
e_i → w_i / s_i
The event is measured (by humans or algorithms) in a report having both relational
database type “structured” data and weakly structured free form human language text.
Example: Cyber Security or Information Assurance data
e_i → co-occurrence patterns
The event is measured (by algorithms) and expressed as a record in a log file.
In both cases, a FARES or modified FARES product establishes the ontology resources for a longer-term
“True Risk Analysis” (TRA) process. DataRenewal Inc will start marketing both the FARES
product and the TRA product in April 2005.
The Fundamental Diagram
Second level: Concept Instance
Instance aggregation: the “collapse” of many instances into a category.
Example: the concept of “two-ness” allows one to talk about any instance of two things.
This aggregation of instances into categories bypasses many scalability
problems (in practice, the scalability issue never comes up).
The aggregation process is called “semantic extraction” of instances into Subject Matter
Indicators (SMIs) that reference “concepts”. These concepts provide context for any
specific data instance.
There are several classes of patents on semantic extraction; all of them are useful within
DOF, and none is perfect with respect to always being right.
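The collapse of instances into categories can be sketched as follows. The signature function, the event fields, and the choice to mask the source address are illustrative assumptions, not any patented extraction method.

```python
from collections import defaultdict

def signature(event):
    """Collapse a log event into a category signature by masking
    the instance-specific field (here, the source address)."""
    return (event["dst_port"], event["protocol"])

def aggregate(events):
    """Group many event instances under few category signatures."""
    categories = defaultdict(list)
    for event in events:
        categories[signature(event)].append(event)
    return categories

events = [
    {"src": "10.0.0.5", "dst_port": 22, "protocol": "tcp"},
    {"src": "10.0.0.9", "dst_port": 22, "protocol": "tcp"},
    {"src": "10.0.0.7", "dst_port": 53, "protocol": "udp"},
]
cats = aggregate(events)  # three instances collapse into two categories
```

Because downstream work is done per category rather than per instance, the volume of raw data stops driving the cost of analysis, which is the scalability bypass described above.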
Matching Subject Matter Indicators to concepts
SMIs are found using algorithms.
• The algorithms are complex and require expert use; however, good work
produces a computational filter that is used to profile the SMI and thus allow parsing
programs to identify SMIs in new sources of data.
• SMIs always produce a conjecture that a concept is present.
• Once the conjecture is examined by a human, the concept’s “neighborhood” in the
explicit ontology can be reproduced as the basis for a small scoped ontology individual.
Concepts are expressed through a process of human descriptive enumeration and iterative
refinement.
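The steps above can be sketched as a token-overlap filter. The SMI profiles, the token sets, and the threshold are hypothetical stand-ins for the real (more complex) matching algorithms; the only property preserved is that every match is a conjecture, not a conclusion.

```python
# Hypothetical SMI profiles: each Subject Matter Indicator is a small
# set of tokens whose co-occurrence conjectures that a concept is present.
SMI_PROFILES = {
    "covert-channel": {"tunnel", "icmp", "payload"},
    "port-scan": {"sweep", "ports", "sequential"},
}

def conjecture_concepts(text, threshold=2):
    """Return the concepts conjectured to be present in new data;
    each hit is only a conjecture, to be examined by a human analyst."""
    tokens = set(text.lower().split())
    return [concept for concept, profile in SMI_PROFILES.items()
            if len(profile & tokens) >= threshold]

hits = conjecture_concepts("icmp tunnel detected with anomalous payload")
```

Once a human confirms a hit, the matched concept's ontology neighborhood would seed the scoped ontology individual.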
In the FARES Beta site, Threats, Vulnerabilities, Impacts and Inter-domain communications are
separate middle DOF ontologies, each having about 40 concepts. These ontologies also have
relationships, attributes and properties, and some subsumption (subconcept) relationships.
However, they are designed for subsetting rather than for use as the basis for “inference”.
Because we do not use the Ontology Inference Layer in OWL, we convert the OWL-formatted
information into Ontology referential base (Orb) encoded information.
We also have an Orb representation of the SMIs.
Thus a common representational standard exists between the SMIs and the set of explicitly
defined concepts.
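A minimal sketch of such an encoding, assuming an Orb is essentially a hash table keyed so that any concept's neighborhood can be pulled out directly. The triple statements and the inverse-link convention are illustrative assumptions, not the actual Orb format.

```python
from collections import defaultdict

def orb_encode(triples):
    """Encode (subject, relation, object) concept statements into a
    referential base: a hash table from each term to its neighborhood."""
    orb = defaultdict(set)
    for s, r, o in triples:
        orb[s].add((r, o))
        orb[o].add(("inverse-" + r, s))  # keep lookups symmetric
    return orb

# Toy statements standing in for OWL-exported concept relationships.
triples = [
    ("covert-channel", "is-a", "threat"),
    ("covert-channel", "exploits", "open-port"),
]
orb = orb_encode(triples)
neighborhood = orb["covert-channel"]
```

Encoding both the explicit concepts and the SMIs this way is what gives the two a common representational standard.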
Ontology referential base encoding within the DOF