Posted by & filed under Data Management.

The world of CBITS is two intersecting domains – Behavioral Intervention theory and web technology. Ontologies are one strategy to manage the multi-faceted body of knowledge, models, designs, data and code that researchers at CBITS encounter every day.

What Are Ontologies?

An ontology is a “domain model,” or a model of some particular area of knowledge such as anatomy, robotics, or farming. The goal of an ontology is to capture how humans understand the parts of a domain (such as bones, muscles, and organs) and the relationships between them (“muscles move bones,” “organs regulate the system”). If the ontology is sufficiently well described, an AI system could, in theory, generate its own relationships from the model. For human users, ontologies can serve as knowledge management systems.

To promote integration with software, ontologies have a well-defined data model that ensures syntactic coherence. If a human enters knowledge that is well formed but untrue, a machine can still parse it from a grammatical perspective, so the model still compiles. This data model, generally hidden from the users of graphical tools like Protégé, facilitates “expert knowledge capture.” A doctor with an excellent command of anatomy does not need to understand computer science to create an anatomical ontology.

The Protégé set of tools (http://protege.stanford.edu/) is used to create, edit, and visualize domain models. Protégé is expressive enough to support any domain. At CBITS, we have begun to explore the role of ontology in aiding the link between two seemingly disparate domains: preventative behavioral medicine and software technology.

Emphasis on Standards

The desire to create standards fuels the ontological approach. Although it is often enjoyable to debate the precise meaning and construction of higher-level concepts, these debates are time-consuming and often unresolved. In medicine, where one hopes to eventually provide benefits to patients, or in technology, where a software tool must eventually be built, there can be an advantage to minimizing the length and recurrence of such debates. Ontologies capture discussion in digital form while allowing revision. This helps to create the consensus needed to move from abstract discussion to practical implementation.

This becomes even more relevant in cross-domain discussion, where the same word can have drastically different meanings. By clearly describing the meaning and relationships of terms within a specific domain, ontologies give researchers who work in two different areas – such as psychological behavior and technology – a guide to ensure that domain concepts are not confused during project planning and discussion.

At CBITS, the word “intent” can mean a general intention of a patient to perform a behavior, a way to tell an Android application what to do next, a specific intention of a person working in CBITS, or be used in a colloquial phrase such as “for all intents and purposes” without much particular meaning attached to it. In situations like these, even the natural ability of humans to manage ambiguity begins to break down. A computer might be more confused.

By helping to capture broad concepts and anchor them in domain-specific interpretations, ontologies can save time, save cost, and even resolve confusion among practitioners from different domains who work together. The logical structure of an ontology is extremely general – the next section will show some concrete examples of the power of this logical generality.

A Frames Ontology for Intellicare

“Intellicare” is a set of Behavioral Intervention Tools (BITs) currently in development by CBITS. Their clinical aim is to target anxiety and depression by addressing different behaviors that can improve or impede progress in these areas. The research goal of the Intellicare project is to create a “recommendation engine” that will suggest mini-apps to users based on their response to other apps. To create Intellicare, clinical researchers and technologists collaborated to reconcile models of clinical behavior with technology resources. The results of these collaborations were stored in an Excel spreadsheet.

To create a more flexible domain knowledge store, and to aid future design and evaluation of a recommendation engine, I used this Excel documentation to create an ontology of Intellicare.

The following example demonstrates that, to capture expert knowledge in ontologies, experts do not necessarily have to understand the details of the technology, or even ontologies themselves. With sufficient semi-structured documentation, an ontologist can create a preliminary ontology that faithfully represents expert ideas. Further collaboration can then refine the ontology, and in time the ontology can itself grow.

dsf

The classes in the Intellicare ontology can be seen in the left panel – these describe all the general features of Intellicare, including tech, behavior and clinical aspects.

instance_tech_interaction

Frames is an Object-Oriented model. Classes have instances which are annotated using “slots.” Slots, or attributes/properties/fields, are defined with respect to the class. Above, we describe a technology interaction called “record audio.”
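The class, instance, and slot vocabulary can be made concrete with a minimal Python sketch. This is not Protégé's implementation; the names `FrameClass` and `FrameInstance` and the slot names `description` and `uses_microphone` are assumptions chosen for illustration. Only the class/instance/slot idea and the “record audio” example come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class FrameClass:
    """A Frames-style class: a name plus the slots its instances may fill."""
    name: str
    slots: dict = field(default_factory=dict)  # slot name -> expected Python type

    def add_slot(self, slot_name, value_type):
        # Slots are defined with respect to the class, not the instance.
        self.slots[slot_name] = value_type

    def create_instance(self, instance_name, **values):
        """Create an instance, rejecting slots the class does not define."""
        for slot_name, value in values.items():
            if slot_name not in self.slots:
                raise KeyError(f"{self.name} has no slot named {slot_name!r}")
            if not isinstance(value, self.slots[slot_name]):
                raise TypeError(f"slot {slot_name!r} has the wrong value type")
        return FrameInstance(instance_name, self, dict(values))


@dataclass
class FrameInstance:
    """An instance of a FrameClass, annotated with slot values."""
    name: str
    frame_class: FrameClass
    values: dict


# Hypothetical slots for a technology interaction class.
tech_interaction = FrameClass("Tech_Interaction")
tech_interaction.add_slot("description", str)
tech_interaction.add_slot("uses_microphone", bool)

record_audio = tech_interaction.create_instance(
    "record audio",
    description="Capture a short voice recording",
    uses_microphone=True,
)
print(record_audio.name, record_audio.values)
```

Because slots belong to the class, an instance cannot carry an attribute the class never declared. That is the property that makes Frames models straightforward to port to other object-oriented systems.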

slumber_instance

“Slumber Time” is an instance of “Mini_App,” and above it is described in relation to clinical goals.

Companion Ontologies

To keep ontologies compact within large domain spaces, I create a variety of ontologies for a domain rather than attempt to link all concepts in one file. This facilitates more user-friendly knowledge browsing.

Creating organizationally internal Frames-based ontologies also allows ontologists to represent similar concepts from different perspectives. This is helpful when “reverse-engineering” software tools that have an undocumented, but implicit domain model. To demonstrate, I reverse-engineered an ontology for “Slumber Time” by browsing the application on an Android tablet and documenting each observed part.

slumber_instance

Above, a set of instances of checklist items.

Designing with Ontology: Cows are Ruminants

While creating an app called “Cows are Ruminants,” a tool for addressing rumination, I re-used significant amounts of software code from “Slumber Time.” While the content of Ruminants and Slumber Time differs, from a technology perspective they share many attributes, such as slideshows, a method to create a SQLite database, and questionnaires for users. The “Cows are Ruminants” ontology can be compared to the Slumber Time ontology to begin, retroactively, to make generalizations that may speed up the development of future Intellicare apps.

In its unfinished state, the ontology provides an easy way to collaborate with the clinical lead. The ontology is structured enough to guide discussion, demonstrates clearly where there are gaps in understanding, and allows for precise and rapid specification of tool aspects such as the database structure or the relationship between two different surveys in the tool.

ruminants

Above, a clinician-defined database schema that I translated into an Android content provider. The ontology permits structured discussion and serves as documentation.

Frames vs OWL

What is the difference between Frames ontologies and those based on OWL, and what guides the choice of one over another? For the preceding examples, the Frames model was used. I explain this choice in the following section.

Intellicare: Closed World, Frames, Specific within Domain

Frames ontologies follow the “closed world” assumption, which means that nothing is true in the domain unless it is specified by the ontology to be true. This closed-world assumption is limiting from the perspective of spontaneous knowledge generation by machines. It is, however, extremely helpful when porting ontologies into other technology domains, such as databases or software tools, because this porting is best achieved when ambiguity is limited. The sacrifice of expression is made up for by the savings in time, and by the possibility of expanding model implementations into richly developed open-source ecosystems. An example of how this can work will be presented in a later section on the Django Python framework.

The domain of Intellicare can be understood to be quite limited. It is a specific project, conceptualized by specific people, which will produce a specific suite of software tools. This specificity means that the Frames model does not do too much to hamper human expression, and the closed-world assumption allows an opt-in approach to defining concepts, which drastically reduces the amount of time required to model the domain. As an example, the Intellicare ontology above was translated from a spreadsheet in three hours of work. The two app ontologies took about an hour each. Comparatively, some OWL ontologies take years to build.

Behavior: Open World, OWL, Universal within Domain

OWL ontologies use the “open-world” assumption, which means that a statement missing from the ontology is treated as unknown rather than false, and that anything in the domain that can be inferred from the ontology can be assumed to be true. This is useful when an ontology is intended to capture a very general domain – such as the domain of human emotion, or anatomy, or robotic decision making. The “open-world” assumption demands much more precision and time in ontological definition. Since anything that can be inferred from the ontology can be considered to be true, if machines are going to use the ontology, it is best to make sure that the ontology is as well specified as possible.
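The practical difference between the two assumptions shows up when a statement is absent from the model. The sketch below is a deliberately simplified illustration, not how Protégé or an OWL reasoner works internally: the triple format, the function names, and the example facts are all assumptions made for the example, and no inference is performed.

```python
# Hypothetical facts stored as (subject, relation, object) triples.
facts = {
    ("Slumber Time", "is_a", "Mini_App"),
}


def closed_world_holds(known_facts, statement):
    """Closed world: anything not stated in the model is false."""
    return statement in known_facts


def open_world_holds(known_facts, statement):
    """Open world: anything not stated is unknown (None), not false."""
    return True if statement in known_facts else None


stated = ("Slumber Time", "is_a", "Mini_App")
unstated = ("Slumber Time", "uses", "Slideshow")

print(closed_world_holds(facts, stated), closed_world_holds(facts, unstated))
print(open_world_holds(facts, stated), open_world_holds(facts, unstated))
```

A definite `False` maps cleanly onto a database row that either exists or does not, which is part of why a closed-world model is easier to port into software. The third “unknown” state is what lets an open-world model stay general, at the cost of more careful specification.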

A computer has no problem conflating a logical truth with an empirical truth – this is, for example, how Amazon.com might decide that you are very interested in wireless keyboards and recommend hundreds of models to you a year after you’ve purchased the single one that you needed. Logically, you were at one point interested in keyboards, evidenced by your search for one on the site, but the system does not have sufficient capacity to understand that your interest has been sated. To avoid these issues, ontologists using OWL ontologies use precise detail when creating their ontologies. The advantage is that when done well, an OWL ontology of, say, human emotion can be considered relatively universal, and integrated across many other domains. The cost comes in time, with some OWL ontologies requiring teams of collaborators and years to complete.

Although our current work at CBITS focuses on Frames ontologies to capture concepts internal to CBITS, ontological work done in OWL will be invaluable in helping to integrate broader domain concepts (especially behavioral ones) into our specific tools. They will also aid our group when collaborating with other teams that use ontologies, or might like to, or are simply interested in reading ours. The exact reconciliation of these two methods remains to be developed. The open-source ethos of ontologies helps promote sharing and collaboration. This ethos has always been integral to productive research.

Next Steps for an Ontology of Intellicare

Inter-Ontology Integration

One way to reduce “double-work” when developing ontologies is to integrate different ontologies as much as possible. Ontologies are easily composed and decomposed into larger or smaller domain models, and this can save a great deal of time, as concepts do not need to be continually re-worked.

prompt

The “Prompt” plugin is used to create new ontologies from old ones through composition and decomposition. Above, behavioral theory concepts are being added to the “Cows are Ruminants” app ontology.

Tool Development, Design and Reconciliation through Ontologies

As the suite of CBITS ontologies grows, they can be integrated into software tool development at earlier and earlier stages. Ontologies can be used to define the structure and clinical aims of a tool, refine the model, compare the tool to other tools already created to identify reusable code and components, and inform the integration of a tool into the overall Intellicare model. In time, as other software tools are pulled in, a general “BIT Framework” can be developed.

app_comparison

Above, shared structural components between two Intellicare apps – “Slumber Time” and “Cows are Ruminants.”
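The comparison shown above can be approximated outside Protégé once each app ontology is reduced to a set of component names. The sketch below assumes that reduction has already been done by hand. The shared components (slideshows, a SQLite database, questionnaires) and the Slumber Time checklist are mentioned in this post; the identifier spellings and `Survey_Link` are illustrative guesses rather than the actual class names in the ontologies.

```python
# Component names per app ontology (spellings are assumptions).
slumber_time = {"Slideshow", "Questionnaire", "SQLite_Database", "Checklist"}
cows_are_ruminants = {"Slideshow", "Questionnaire", "SQLite_Database", "Survey_Link"}


def compare_components(first, second):
    """Split two component sets into shared and app-specific parts."""
    return {
        "shared": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }


comparison = compare_components(slumber_time, cows_are_ruminants)
for label, names in comparison.items():
    print(label, names)
```

The `shared` list is a starting point for deciding which code is worth factoring out for future apps.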

Once CBITS clinicians, researchers and technologists alike become comfortable with the use of ontologies, these models can help facilitate collaboration and create universal, reusable and persistent documentation of the many discussions that precede tool development. Ontologies can help identify redundancies, allowing teams to be more productive and do more work that focuses on new problems rather than rehashing old solutions.

Expert Knowledge Capture

Ontologies aid “expert knowledge” capture. The Intellicare ontology, translated from an Excel document, provides one example. An ontology which acts as a data dictionary for the Purple Robot database is another. Using a plugin provided by Protégé, I imported the entire database schema for the Purple Robot sensor database into a Frames ontology. I sat down with one of the database experts and questioned him on the ontology, documenting choices made in the creation of the database.

datadtic

The result is a data dictionary for the current Purple Robot database that did not exist before and took approximately four man-hours to complete. Anyone who has inherited a data set to work with that has no data dictionary attached can appreciate our effort here.

Use Queries to Identify Gaps

Protégé has a query language that allows users to sift through the model based on attributes.

emptyquery

The query, defined on the left, reveals the result, on the right, that one Intellicare app has the proximal goal of “Identify and Change thoughts.”
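The same kind of attribute-based query can be sketched in plain Python. This is not Protégé's query language. The instance records, the app names “App A” and “App B,” and the goals other than “Identify and Change thoughts” are hypothetical placeholders, since the real instances live in the ontology file.

```python
# Hypothetical instances of a Mini_App class, each with a multi-valued slot.
instances = [
    {"name": "App A", "class": "Mini_App",
     "proximal_goal": ["Identify and Change thoughts"]},
    {"name": "App B", "class": "Mini_App",
     "proximal_goal": ["Improve sleep routine"]},
]


def query(records, class_name, slot, value):
    """Return names of instances of class_name whose slot contains value."""
    return [
        record["name"]
        for record in records
        if record["class"] == class_name and value in record.get(slot, [])
    ]


matches = query(instances, "Mini_App", "proximal_goal", "Identify and Change thoughts")
gaps = query(instances, "Mini_App", "proximal_goal", "Increase physical activity")

print(matches)  # apps that address the goal
print(gaps)     # an empty list marks a goal no app currently addresses
```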

fullqwuery

When a query returns no results, or unwanted ones, this illustrates a “gap” in the model, or perhaps in the reality of current tools, which, now identified, can be addressed.

Future Work in Ontology

Documentation / “Frames Style Guide”

For ontologies to be productive at CBITS, more documentation is required. A style guide will be needed to guide ontology development. The flexibility of the tools promotes creativity, but guidelines will ensure that different ontologies can interact easily together.

Internal Tool that Serves Collaborative Frames Ontologies

A way to store ontologies for collective browsing or collaboration can also be developed – currently, our ontologies live on local machines and change tracking is not available.

A precedent for this exists in Collaborative Protégé (http://protegewiki.stanford.edu/wiki/Collaborative_Protege). This client may serve our needs “out of the box,” but must be evaluated from an efficacy and a security standpoint. Fortunately, because it is an open-source project, we have the capacity to modify the tool if necessary – but due to resource constraints this may or may not be a good idea.

Translations between OWL Instances and Frames Instances

Translating between OWL and Frames remains one of the trickier problems that we are considering. It is desirable as a way to integrate the high-level work of researchers around the world into our more specific domains of individual software tools. Below, useful work has been captured in an ontology of emotion, and stored in OWL format:

grief grief2

The full ontology can be found at .

Port Domain Models onto Web Technologies

One exciting possibility of ontologies results from their highly structured data model. Since ontologies are “object-oriented,” the models can be ported into other kinds of object-oriented systems. In the example below, a script written in Java reads a Frames ontology and generates a set of Django models. These models may then be passed through the Django ORM and migrated into a relational database.

django

Above on the left, the Protégé data model. On the right, the appropriate translation into a Python-based Django model element. A script written in Java uses regex to comb through a Protégé class file and generate the Django model.
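The translation described above can be sketched in a few lines. The original script is written in Java and is not reproduced here; this is a Python approximation of the same regex approach. The class fragment is a simplified, made-up example in the style of a Protégé Frames class file, and the slot names, the type-to-field mapping, and the `to_django_model` helper are all assumptions. The sketch emits Django model source code as text, so it runs without Django installed.

```python
import re

# Hypothetical, simplified class definition in the style of a Protégé Frames
# class file. Class and slot names are invented for illustration.
PONT_SOURCE = """
(defclass Mini_App
    (is-a USER)
    (role concrete)
    (single-slot app_name
        (type STRING)
        (create-accessor read-write))
    (single-slot session_count
        (type INTEGER)
        (create-accessor read-write)))
"""

# Assumed mapping from Frames slot types to Django field declarations.
FIELD_FOR_TYPE = {
    "STRING": "models.CharField(max_length=255)",
    "INTEGER": "models.IntegerField()",
    "FLOAT": "models.FloatField()",
}

CLASS_RE = re.compile(r"\(defclass\s+(\w+)")
SLOT_RE = re.compile(r"\(single-slot\s+(\w+)\s+\(type\s+(\w+)\)")


def to_django_model(pont_text):
    """Return Django model source code for the first class in pont_text."""
    class_name = CLASS_RE.search(pont_text).group(1)
    lines = [f"class {class_name}(models.Model):"]
    for slot_name, slot_type in SLOT_RE.findall(pont_text):
        # Unrecognized slot types fall back to a free-text field.
        field = FIELD_FOR_TYPE.get(slot_type, "models.TextField()")
        lines.append(f"    {slot_name} = {field}")
    if len(lines) == 1:
        lines.append("    pass")  # a class with no slots still needs a body
    return "\n".join(lines)


generated = to_django_model(PONT_SOURCE)
print(generated)
```

A regex pass like this only handles flat, single-valued slots. Multi-valued slots and references to other classes would need relationship fields, and a real class file has more facets than this fragment shows.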

Any object-oriented framework can be given the same treatment, creating increased coupling between expert knowledge capture and software tools. With capabilities like these, experts in domains outside technology can define tools, and technologists have to do less work to translate these definitions into the tools themselves, and can create more accurate translations.

Integration with Code Repositories

One day ontologies could be directly connected to code repositories. CBITS ontologies could be delivered to collaborators who may then define ontologies of their own BITs by copying, composing, and decomposing our domain models. These models, if linked to scripts that connect the ontologies to other software frameworks, can generate the base code for software tools. These models can also be connected directly to code repositories to import entire modules of code.

The intervention of technologists will still be needed to glue parts together, but through iteration, this intervention will become easier and easier. The net result is less work, less cost, and more patient benefits as BITs can be generated from semi-generic concepts into extremely specific, but functional tools. Additionally, research resources will be freed to continue innovating on the BIT model and incorporating new technologies as they appear.

The literal nature of computers can come in handy – the computer does not care if it is generating a database of patient information for an HIV study, or a study of depression, as long as the data model looks the same. Fortunately for us, the data model often does.