
Knowledge representation and reuse on the WWW:
conventions, RDF extensions and simple notations

Philippe Martin and Peter Eklund
Griffith University, School of Information Technology, PMB 50 Gold Coast MC, QLD 9726 Australia
Tel: +61 7 5594 8271; Fax: +61 7 5594 8066; E-mail: {philippe.martin,p.eklund}

The Resource Description Framework (RDF) provides a low-level general model to describe relationships between objects. Its ultimate goal is to permit the representation, combination and processing of most kinds of metadata or knowledge from Web-accessible documents or databases. Metadata resources can then be processed for information retrieval, filtering, personalizing, aggregation or management. However, apart from simple metadata, representing knowledge with the low-level RDF model and its verbose syntax in XML (RDF/XML) is difficult. The process implies many choices in the representation of common logical/semantic notions, e.g. those related to variables, sets, negation and contexts. If an agent (person or software) does not follow the same conventions as another, their representations will hardly be comparable, mergeable and retrievable. The same conclusion applies to the ontological choices (and to a lesser extent the lexical choices) an RDF/XML author is faced with.
In this paper we synthesize lexical, logical and ontological conventions from various knowledge representation projects, and when necessary, extend RDF/XML so that additional constructs can be represented. We also propose two high-level and expressive textual notations - Frame-CG and Formalised English - to allow users to rapidly represent knowledge. These notations may then be automatically translated into RDF/XML. Such conventions and notations seem to us a necessary step to permit knowledge representation and reuse by a large number of people and thus support the "semantic Web" [Berners-Lee et al., 1999] for which RDF was created.


Data/Metadata/Knowledge Representation/Retrieval/Reuse, RDF, Ontology, Conventions.

Table of contents (Note: this article is formatted for a good display on A4 paper by Netscape).
1. Introduction
2. General conventions
2.1. Lexical normalisation
2.2. Structural and semantic normalisation
3. Terms and notations for logical cases
3.1. Simple graphs: individuals and existential quantifiers
3.2. Contexts
3.3. Collections and intervals
3.4. Universal quantifiers, type definitions, typical relations
3.5. Second-order statements
4. Terms and conventions for ontological cases
4.1. Some general categories
4.2. Descriptions
4.3. Characteristics and measures
4.4. Situations, processes and temporal entities
4.5. Miscellaneous
5. Conclusion

1. Introduction

The Resource Description Framework [RDF] can be characterized as a simple frame model [RDFModel]. It is sufficiently low-level and general for most other knowledge representation models to be translated into it. However, since it is low-level, there are many possible ways to perform such translations. Translations into a general but higher-level model such as KIF [KIF] or Ontolingua (KIF plus constructs for frames) involve fewer choices. Ontolingua also supports a library of precisely defined top-level concepts. To a lesser extent, the KIF lexical conventions also reduce the choices that a KIF/Ontolingua user has to make. The fewer choices involved during knowledge representation, the lower the chance that the same concept will be represented differently, and therefore the more likely the representations are to be comparable, retrievable and mergeable. (Note: in this article, "representations" with an 's' refers to results of a knowledge representation process).

As acknowledged in [Berners-Lee et al., 1999], "work is needed to define common terms for [extending RDF to the power of knowledge representation systems, e.g. the handling of quantification and inference]". Conventions are also needed for the use of these formal terms (concept/relation type names). This should be done as soon as possible to prevent the proliferation of different representations for similar constructs. A usual concern is that an increase in expressiveness will lead to a model and a language too complex to handle efficiently. In actual fact, RDF and RDF/XML will barely have to be modified. Like other XML agents, the RDF/XML analyzers will exploit some of the new terms and ignore others. Indeed, the more terms an analyzer takes into account, the more accurate and flexible and the less efficient it is. Some applications may require the exploitation of rules, sets, negation and contexts, whereas a search engine can do a good enough job by exploiting simple structure matching techniques and allowing users or other agents to further interpret and refine results using terms or parts of structures not exploited by the search engine (e.g. structures representing the context of the central information). In other words, complexity is in the analyzers, not in the languages. [Berners-Lee, 1998] further illustrates the differences between the semantic Web and centralized knowledge representation systems.

RDF/XML, a serialization syntax of RDF in XML, has been defined [RDFModel]. This syntax enables the reuse of some XML software but an analyzer dedicated to RDF is still required. As noted in [RDFModel], "it is important to understand that this XML syntax is only one possible syntax for RDF and that alternate ways to represent the same RDF data model may emerge". As a matter of fact, RDF/XML is so verbose that it is difficult to directly represent or update a reasonable amount of knowledge in this format. More intuitive textual or graphical notations are needed for Web users to read or write knowledge and directly insert it within documents in addition to (or in place of) sentences or graphics. Conversions between such notations and RDF/XML may ensure greater reuse of this knowledge. We have designed two simple but expressive textual notations - Frame-CG (FCG) and Formalised English (FE) - to enable and encourage Web users to represent their information in a machine-readable way.

In Section 2, we propose lexical, structural and semantic general conventions, synthesizing conventions from various knowledge representation projects. In Section 3, we propose ways to apply and extend RDF/XML to various logical cases of knowledge representation. For each case, we also show how the use of FCG and FE eases the representation task. In Section 4, we propose ways to represent various knowledge representation ontological cases. We only use FCG in the examples of Section 4 since the translation of FCG into RDF/XML and FE is shown in Section 3.

2. General conventions

This article is not only about RDF but knowledge representations in general. However, to refer to components we assume the representations can be translated into a general directed graph model such as RDF and we use the more common terminology of Conceptual Graphs [CGs] where a "concept" refers to a node ("resource" in RDF; this may represent one object or one object collection, which may be named or not), a "relation" refers to a relationship between concepts (a "property" in RDF), and a "type" ("class" in RDF) refers to a formal term representing a kind of concept or relation. Inside an ontology ("schema" in RDF), the types may be organized by generalization relations ("subclass" relations in RDF) or other kinds of relations defining or restricting their uses (e.g. exclusion relations and signatures of relations).

2.1. Lexical normalisation

2.1.1. InterCap style for identifiers

Identifiers in RDF/XML must be legal XML names. The "InterCap style" has also been adopted for property names (i.e. relation names) in [RDFModel, Appendix C], e.g. subClassOf. In [RDFSchema, Section 1.2.2], this convention is extended to names of object classes but their first letter must also be capitalized, e.g. rdfs:Resource and rdfs:ConstraintProperty. These two naming conventions are also adopted by the Meta Content Framework Using XML [MCF/XML].

2.1.2. High-level lexical facilities

Higher-level languages or query interfaces can provide lexical facilities for the user. For instance, an analyzer for FCG and FE should be able to automatically convert identifiers that include uppercase letters, dashes or underscores into the InterCap style, as well as exploit user-defined aliases. Such analyzers should also accept queries or representations that use undeclared type names (e.g. common words) when the relevant type names can be automatically guessed via the structural and semantic constraints in the queries or representations and the ontologies they are based upon. When different interpretations are possible, the user should be alerted to make a choice. This last facility, detailed in [Martin & Eklund, 1999], is particularly interesting when the exploited ontologies reuse a natural language lexical database such as WordNet [WN]: it spares the user the complex and tedious work of declaring and organizing all the terms. This facility (along with high-level notations and interfaces) seems an essential step to encourage Web (human) users to build knowledge representations. It should also be noted that the XML name space specification should allow the exploitation of CGI servers (or any other way of storing and accessing large databases such as WordNet) instead of static files.
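
As an illustration of the first of these facilities, the conversion of identifiers into the InterCap style could be sketched as follows in Python. This is only a minimal sketch under stated assumptions: the function names `to_intercap` and `normalise`, the alias table and the treatment of all-uppercase words are hypothetical choices made for this example and do not describe an existing FCG or FE analyzer.

```python
import re

def to_intercap(identifier, is_type=False):
    """Convert an identifier to InterCap style.

    Relation names keep a lowercase first letter (e.g. subClassOf);
    concept type names get an uppercase first letter (e.g. HomePage).
    """
    # Split on dashes, underscores and white space, dropping empty pieces.
    words = [w for w in re.split(r"[-_\s]+", identifier) if w]
    if not words:
        return ""
    parts = []
    for w in words:
        # Assumption: an all-uppercase word such as HOME is lowercased
        # after its first letter; mixed-case words are left untouched.
        rest = w[1:].lower() if w.isupper() else w[1:]
        parts.append(w[0].upper() + rest)
    joined = "".join(parts)
    first = joined[0].upper() if is_type else joined[0].lower()
    return first + joined[1:]

def normalise(identifier, aliases=None, is_type=False):
    """Apply a user-defined alias (if any), then the InterCap conversion."""
    aliases = aliases or {}
    return to_intercap(aliases.get(identifier, identifier), is_type)
```

For instance, `to_intercap("sub_class_of")` yields `subClassOf` and `to_intercap("constraint-property", is_type=True)` yields `ConstraintProperty`.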

Similar ideas for the exploitation of lexical databases such as WordNet are developed in [Guarino et al., 1999] and implemented in Ontoseek for searching yellow pages and product catalogs. The authors show that structured content representations coupled with linguistic ontologies increase both recall and precision of content-based retrieval. Search is performed via graph matching and exploits generalization relations between WordNet types referred to by nouns and between WordNet types referred to by verbs. The graphs must be rooted (i.e., they must have a head node) and the identifiers must be either nouns or verbs. Semantic checks are automatically performed using the distinction between essential types and role types (e.g. essential types are exclusive unless they are related by generalization relations).

2.1.3. Nouns for identifiers

Generally a sentence can be rephrased to avoid the use of adjectives and verbs (with the exception of "to be" and "to have"). For instance, "A cat named Tom jumps toward a wooden table" may be rephrased into "The cat which has for name Tom is agent of a jump that has for destination a table the material of which is some wood". This sentence (which is a correct sentence in FE) seems unnatural but makes the concepts and their relations explicit and therefore exploitable by an automated analyzer. In this FE sentence, determiners (e.g. "a" and "the") explicitly quantify concepts when instance identifiers are not used (this explains why "wood" is preceded by "some").

The convention of using nouns, compound nouns or nominal forms of verbs whenever possible within representations not only makes them more explicit, it also efficiently reduces the lexical and structural ways they may be expressed. It therefore increases the possibilities of matching them.
Concept types referred to by adjectives can rarely be organized by generalization relations but may be decomposed into concept types referred to by nouns. Concept types referred to by verbs can be organized by generalization relations (though the organization of top-level types is difficult) but cannot be inserted into the hierarchy of concept types referred to by nouns (and therefore cannot be compared with them) unless verb nominal forms are used. These nominal forms, e.g. Driving, also recall the need to represent the time frame or frequency of the referred processes. For similar reasons, value restrictors should also be represented via nouns, e.g. ImportantWeightForAMouse and ImportantWeightForAnElephant, rather than via adjectives such as Important.

Most identifiers in current ontologies are nouns (e.g. the Dublin Core [DC] or the Upper Cyc Ontology [CYC]), even in relation type ontologies such as the Generalized Upper Model relation hierarchy. Avoiding adverbs for relation type names is sometimes difficult, e.g. for spatial/temporal relations. But this does not create any problem in organizing relation types by generalization relations. What should be avoided is the introduction of relation type names such as isDefinedBy and seeAlso (both proposed in [RDFSchema]). Better names seem to be Definition and AdditionalInformation. These new names are consistent with the usual conventions (including RDF conventions) for reading graph triplets {concept source, relation, concept destination}: "<concept source> HAS FOR <relation> <concept destination>" or "<concept source> IS <relation> <concept destination>" or "<concept destination> IS THE <relation> OF <concept source>".

In RDF/XML, relations cannot be reversed, as for instance in the way the keyword "of" reverses the relation in the last above reading. The user must write his/her graphs in other ways or introduce new relation type names, e.g. partOf. This last case decreases graph matching possibilities unless the links between the type names and their inverses are specified by the users and the analyzers exploit such rules. This extra work for users and analyzers could be avoided by extending the RDF/XML syntax for relations, e.g. with an attribute of. FCG and FE allow the use of the keyword "of" after the relation type name to indicate a change of direction.
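
The two options discussed above (an explicit "of" marker, or rules linking relation type names to their inverses) can be sketched in Python as follows. This is an illustrative sketch only: the `Triple` type, the `INVERSES` table and the `canonical` function are hypothetical names introduced for the example, and the table content is an assumption.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    source: str
    relation: str
    destination: str

# Hypothetical user-supplied rules linking relation type names
# to the relation types they are inverses of.
INVERSES = {"partOf": "part"}

def canonical(triple, of=False):
    """Return the triple in its canonical direction.

    of=True mimics the FE/FCG keyword "of": the written source is in
    fact the destination of the relation.
    """
    source, relation, destination = triple
    if of:
        return Triple(destination, relation, source)
    if relation in INVERSES:
        # Replace an inverse relation name by its base relation type.
        return Triple(destination, INVERSES[relation], source)
    return Triple(source, relation, destination)
```

With this normalisation, a graph written with "a person, creator of: the home_page" and a graph written from the home page with the relation creator produce the same triple and can therefore be matched.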

2.1.4. Singular nouns for identifiers

Most identifiers in ontologies are singular nouns. Category names must be in the singular in the Meta Content Framework Using XML [MCF/XML]. For the sake of normalization, it is therefore better to avoid the use of plural identifiers whenever possible, for instance when representing relations about each member of a collection rather than about the collection itself. In RDF/XML, the keyword to use is "aboutEach" in the first case and "about" in the second case. Unfortunately, "about" is also normally used for describing relations from/to objects which are not collections. This leads to mistakes as in the following excerpt of an example about the "Dublin Core Metadata" given in [RDFModel, Section 7.4.].
D-Lib Program - Research in Digital Libraries
Research; statistical methods
Education, research, related topics
Library use Studies
World Wide Web Home Page

This representation asserts that "" has for subject a bag. The keyword "aboutEach" should have been used instead of "about" to assert that "" has for subject the different members of the bag. Here, the declaration and use of the relation type name subjects or the concept type name Subjects would not be a good idea.

2.2. Structural and semantic normalisation

We have seen that lexical conventions and facilities influence the structural and semantic aspects of representations. We now focus on these aspects.

2.2.1. First-class relations

Relations (or "properties") in RDF are not local to an object; they are also known outside the scope of an object. A relation is a first-class object itself and can be connected to any instance of the types given in the signature of the relation. This permits distributed development (anyone may represent anything about any object) and eases the process of comparing representations (and therefore merging them). This still permits the representation of relations typically or necessarily associated with objects of a particular type. Thus, although most frame-based systems also allow local relations, it is better to avoid them for the sake of knowledge reuse.

2.2.2. Binary basic relations

As with most frame-based models, RDF only has binary and unary relations. Relationships of greater arity may be represented by using structured objects or collections, or using more primitive relations. For instance, "the point A is between the points B and C" may be represented using the relation type between and a collection object grouping B and C, or using the relation types left and right, above and under, etc. Most often, decomposition makes a representation more explicit, precise and comparable with other representations.

Thus, relations should rather refer to simple relationships because complex relationships cannot be compared unless special rules are added. For instance, a direct representation of "Tom is watching a mouse" with the relation type watching will not be directly comparable with assertions or queries for "a watching that has for agent an animal and for object an animal" which involves the concept type Watching and the relation types agent and object. Similarly, only the last case permits another user to easily refine the "watching" object by adding relations to it. As a rule of thumb, relations should not refer to processes and should rather be named with simple "relational nouns", e.g. part and characteristic. Some complex relational nouns such as child and driver are often too handy to be avoided but imply additional lexical or structural facilities (e.g. those of Ontoseek).
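
The decomposition recommended above can be sketched in Python: a relation that refers to a process is replaced by a concept of the corresponding type, connected by the relation types agent and object. The function name `reify_process`, the node naming scheme and the use of a relation named type are assumptions made for this sketch.

```python
import itertools

# Counter used to generate fresh identifiers for the new process concepts.
_fresh = itertools.count(1)

def reify_process(source, relation, destination):
    """Turn (source, watching, destination) into a Watching concept
    linked to its participants by the relations agent and object."""
    node = "_:%s%d" % (relation, next(_fresh))
    concept_type = relation[0].upper() + relation[1:]
    return [
        (node, "type", concept_type),
        (node, "agent", source),
        (node, "object", destination),
    ]

triples = reify_process("Tom", "watching", "Mouse1")
watching = triples[0][0]
# Another user can now refine the "watching" concept with new relations.
triples.append((watching, "time", "1999"))
```

The explicit concept is what makes the representation comparable with a query about "a watching that has for agent an animal", and what lets further relations be attached to it.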

2.2.3. Avoid disjunctions, negations and collections

Representations including disjunctions, negations or collections are generally less efficiently exploitable for logical inferencing than conjunctive existential formulas and IF-THEN rules based on these formulas [BRML].

It is often possible to avoid disjunctions and negations without loss of expressivity using IF-THEN rules or by exploiting type hierarchies. For instance, instead of writing that an object X is an instance of DirectFlight OR of IndirectFlight, it is better to declare X as an instance of a type Flight that has DirectFlight and IndirectFlight as exclusive subtypes (i.e. types that cannot have common subtypes or instances). Exclusion links between types or in some cases between whole formulas are kinds of negations that can be handled efficiently, and are included in many expressive but efficient logic models, e.g. Courteous logic on which the Business Rules Markup Language [BRML] is based.
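
A minimal Python sketch of how exclusion links between types can be checked is given below. The type names come from the example above; the data structures (`SUBTYPE_OF`, `EXCLUSIVE_PAIRS`) and function names are hypothetical, and a single-inheritance hierarchy is assumed for brevity.

```python
# Each type is mapped to its direct supertype (single inheritance assumed).
SUBTYPE_OF = {
    "DirectFlight": "Flight",
    "IndirectFlight": "Flight",
    "Flight": "Thing",
}

# Pairs of types declared exclusive (no common subtype or instance).
EXCLUSIVE_PAIRS = [("DirectFlight", "IndirectFlight")]

def supertypes(type_name):
    """Return the set made of the type and all its supertypes."""
    result = {type_name}
    while type_name in SUBTYPE_OF:
        type_name = SUBTYPE_OF[type_name]
        result.add(type_name)
    return result

def are_exclusive(type1, type2):
    """True if the two types fall under types declared exclusive."""
    s1, s2 = supertypes(type1), supertypes(type2)
    return any((a in s1 and b in s2) or (b in s1 and a in s2)
               for a, b in EXCLUSIVE_PAIRS)

def conflicting_types(instance_types):
    """List the pairs of exclusive types asserted for one instance."""
    types = list(instance_types)
    return [(t1, t2) for i, t1 in enumerate(types)
            for t2 in types[i + 1:] if are_exclusive(t1, t2)]
```

An object declared only as a Flight raises no conflict, whereas an object declared both as a DirectFlight and as an IndirectFlight is reported by `conflicting_types`.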

The introduction of identifiers for collections may also often be avoided using "distributive collections", i.e. by using a keyword such as "aboutEach". Distributive collections are often easy to handle since they can be considered as syntactic shortcuts for representing relations about each of their members. Type definitions describing typical or necessary relations associated with the type instances are also a way of representing facts about collections of objects that knowledge representation systems generally handle more efficiently than if these relations were directly represented using (real) collections inside other assertions.
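
The view of distributive collections as syntactic shortcuts can be sketched in Python as follows. The function `describe` and its parameters are hypothetical; the sketch only contrasts the two readings and is not a model of the RDF/XML parsing of "about" and "aboutEach".

```python
def describe(collection_id, members, relation, value, each=False):
    """Return the triples asserted about a collection.

    each=False: one relation from the collection taken as a whole.
    each=True:  one relation per member (distributive reading); the
                collection identifier itself is no longer needed.
    """
    if each:
        return [(member, relation, value) for member in members]
    return [(collection_id, relation, value)]

pages = ["Page1", "Page2", "Page3"]
collective = describe("Pages", pages, "creator", "OraLassila")
distributive = describe("Pages", pages, "creator", "OraLassila", each=True)
```

Here `collective` contains a single triple about the collection, whereas `distributive` contains three triples, one for each page.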

Contexts are most often unavoidable for the sake of expressivity but they can be handled efficiently if treated as positive contexts [Chein & Mugnier, 1997], that is, if only their structures are taken into account, and not the special semantics of the terms they include.

For efficiency reasons again, many knowledge representation systems, especially frame-based systems, work on rooted graphs. In a rooted graph, there is a head node representing the "central" object of the representation, the object the other nodes detail. Therefore, to help the translations of representations for various kinds of systems, it seems better to begin each representation with its central object, whatever the language used (RDF/XML, FCG, FE, etc.).

2.2.4. Precision, term definitions and constraints

The more precise the representations are, the less chance they conflict with each other, and the more they can be cross-checked, merged and exploited to answer queries adequately. Representations should rather be contextualized in space, time and author origin. No relevant concepts should be left implicit. For instance, instead of representing that "birds fly", it seems better to represent that "a study made by Dr Foo found that in 1999, 93% of healthy birds can fly" and to categorize the species of birds under the exclusive subtypes BirdWhichCanNormallyFly and BirdWhichCannotNormallyFly.
FE: ```93% of [birds with chrc a good health] can be agent of a flight' time 1999'.
with source the study that has for author'.
FE: Bird has for partition {{BirdWhichCanNormallyFly, BirdWhichCannotNormallyFly}}.

(Notes: (i) the FE analyzer interprets "93% of birds ..." as a statement about the instances of the type Bird; (ii) "good" is a keyword in FE, not a type name; (iii) it would have been very difficult to represent "birds of most species can fly" and the result hardly exploitable for logical inferences).

It is stated in [RDFModel, Section 2.3] that for some uses, writing property values without qualifiers is appropriate, e.g. "the price of that pencil is 75" instead of "the price of that pencil is 75 U.S. cents". However, a representation of the first sentence would be ambiguous, not comparable with other prices (except perhaps by US applications concerning pencils) and therefore not reusable. This violates the original purpose of RDF.

To improve precision and allow consistency checks, it is important to use precise types and associate constraints about their use. At least, signatures should be associated to relation types, and exclusion between types represented. Top-level ontologies and domain ontologies organize terms and provide constraints for their use.

To improve the retrieval of technical types (terms) or representations using them, it is important that they specialize usual terms. A way to do this is to specialize formal terms from a natural language ontology such as WordNet with the domain-oriented terms. Extending such an ontology is often quicker and safer than creating an ontology from scratch, ensures a better reusability of the representations and automatic comparisons with representations based on the same ontology. These issues are discussed and implemented in Ontoloom/Powerloom [LOOM].

As is now apparent, high-level notations such as FCG or FE and the lexical and syntactic facilities associated with them make knowledge representations easier to edit and more reusable than the direct use of low-level notations such as RDF/XML. Authoring tools can provide additional semantic facilities such as semantic checks or the automatic contextualisation of representations with their creation dates and author e-mail addresses.

3. Terms and notations for logical cases

We now propose RDF/XML representations for the most common knowledge representation syntactic/logical cases. When necessary, we extend RDF/XML or introduce new types. We hope these representations will provide an initial basis for discussion about conventions to adopt in these different cases.

We also show how the FE and FCG high-level languages can be applied in such cases to ease the design of representations and increase their reusability. The EBNF, Yacc and Lex grammars of FE and FCG are available at

3.1. Simple graphs: individuals and existential quantifiers

Existential statements are statements that can be translated to a conjunctive existentially quantified formula of first order logic. For instance, the FE sentence "there is a document which has for author the person Tom" can be translated into the KIF logic formula "(Exists ((Document ?d)) (And (Person Tom) (Author ?d Tom)))". Existential statements are simple and RDF need not be extended to represent them: they can only refer to individual objects via their identifiers or existentially quantified variables. However, existential statements are common knowledge representations and logical inferences can be drawn from them by simple structural generalization. For instance, from the previous statement, it is possible to deduce that "there is a document" and "there is a document which has for author a person". By exploiting the generalization hierarchy over types, it is also possible to deduce that "there is a thing that has for author a living_entity" or "Tom is author of a thing". Conversely, if any of these last four sentences is used as a query (e.g. in FE: "Is Tom the author of a thing?"), the first can be retrieved and is an exact answer. Even though logical deductions cannot be performed in the same way with more complex statements, the same technique can be applied to efficiently retrieve them. For instance, the statement "`Tom is the author of a document' is false" does not entail "Tom is the author of a thing" but is still an interesting answer to the query "Is Tom the author of a thing?".
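
Retrieval by structural generalization can be sketched in Python for the simplest case, a statement made of a single relation between two typed concepts. The type hierarchy, the encoding of a concept as a (type, identifier) pair and the function names are assumptions made for this sketch; real graphs would require a more general graph matching procedure.

```python
# Hypothetical generalization hierarchy: each type maps to its supertype.
SUBTYPE_OF = {
    "Document": "Thing",
    "Person": "LivingEntity",
    "LivingEntity": "Thing",
}

def is_a(type_name, general_type):
    """True if type_name equals general_type or specializes it."""
    while type_name is not None:
        if type_name == general_type:
            return True
        type_name = SUBTYPE_OF.get(type_name)
    return False

def generalises(query, fact):
    """True if the query triple is a generalization of the fact triple.

    A concept is a pair (type, identifier); the identifier is None for
    an existentially quantified concept.
    """
    (q_stype, q_sid), q_rel, (q_dtype, q_did) = query
    (f_stype, f_sid), f_rel, (f_dtype, f_did) = fact
    return (q_rel == f_rel
            and is_a(f_stype, q_stype) and is_a(f_dtype, q_dtype)
            and q_sid in (None, f_sid) and q_did in (None, f_did))

# "There is a document which has for author the person Tom."
fact = (("Document", None), "author", ("Person", "Tom"))
# "Is Tom the author of a thing?"
query = (("Thing", None), "author", ("Thing", "Tom"))
answer = generalises(query, fact)
```

Under these assumptions the query matches the fact, while the same query about another individual, or about a document authored by a document, does not.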

Here is an existential statement in English (E) followed by translations in FE, FCG and RDF. For simplicity, formal terms (except RDF terms) are not prefixed by an ontology identifier; they are assumed to be in the default name space (ontology). In FE and FCG, terms need not be prefixed: the analyzer generates an error message if a non-prefixed term exists in different loaded ontologies. All the terms we use in our examples - apart from keywords and URLs and some relation types - are WordNet nouns. The case of letters is not significant in FE and FCG except for individual identifiers, which should begin with an uppercase letter. The FE/FCG keywords "a", "the", and "some" all specify an existential quantifier. The FE keywords "there", "which", "who" and "that" are optional and ignored by the analyzer.
E: Mary likes herself and the cat named "Tom".
FE: There is a liking which has for agent Mary, for object Mary and for object the cat that has for name "Tom".
FE: Mary is liking Mary and is liking the cat that has for name "Tom".
FCG: [Mary, agent of: (a liking, object: Mary, object: (the cat, name: "Tom"))]
RDF:

This example shows two equivalent FE translations, the second using the form "is" + NAME_ENDING_BY_ing as an abbreviation for the explicit use of the relations agent and object. Without the keywords "and" and ",", an FE sentence is a chain of concepts and relations. The equivalent keywords "and" and "," must be placed before a relation to specify that it is not to be connected to the previous concept but to its predecessor in the chain. If ", and" is used, the predecessor of the predecessor is referred to, and so on (e.g. ",,and"). Thus, graph structures may be represented, not only chains. In FCG, this structuring is specified via parentheses.

This example also shows how a coreference can be represented via an individual identifier. A name (e.g. "Tom") cannot be used as an identifier in FE, FCG and RDF. The following example illustrates how a coreference can be represented via an existentially quantified variable, how it can connect separated statements, and how the use of the keyword "of" reverses the direction of a relation in FE and FCG.
E: A person created the home page This person is named Ora Lassila.
FE: A person *p is the creator of the home_page *p has for name "Ora Lassila".
FCG: [a person *p, creator of: the home_page] [*p, name: "Ora Lassila"]
FCG: [a person, name: "Ora Lassila", creator of: the home_page]
RDF:

The introduction of the keyword "of" in RDF/XML as in the following example seems interesting though here the head of the graph should rather be the HomePage concept.
RDF: <Person name="Ora"> <creator of><HomePage/></creator> </Person>

In our examples, variables and most individuals are typed. We use simple notations but we could also explicitly use the special relation type. The keyword "is" may also be used in the FE notation.
E: Joe is a smoker.
FE: The smoker Joe.
FE: Joe is a smoker.
FE: Joe has for type Smoker.
FCG: [Joe, type: Smoker].
RDF:
RDF:

The scope of an existentially quantified variable extends to the end of the file in which it is defined or until it is redefined (it also needs to be defined before being referenced). When a variable is declared, it is possible to specify that it does not refer to another individual (referred to by its identifier or a variable) as in the following example.
E: A cat "c" that is not Tom is on a mat.
FE: There is a cat *c that is different_from Tom and that is on a mat.
FE: A cat *c!=Tom is on a mat.
FCG: [a cat *c!=Tom, on: a mat]
RDF:

In FE and FCG, identifiers and operators may be aliased. For instance, "Mary is giving a letter to Tom" could be represented in FE by:
alias to "and with recipient"; Mary is_ giving a_ letter to Tom.
The previous FE sentence is automatically expanded into "Mary is agent of a giving with object a letter and with recipient Tom". A user may find aliases useful to abbreviate constructs that are common in his/her representations. XML, and therefore RDF/XML, already permits some forms of aliases, e.g. names for identifiers in different languages. Since RDF/XML should preferably not be written directly by users, such aliases are probably sufficient.
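
The substitution of user-declared aliases can be sketched in Python as a simple token replacement. This sketch covers only the alias declared by the user (here "to"); the expansion of the built-in abbreviations of FE is not modelled, and the function name `expand_aliases` is hypothetical.

```python
def expand_aliases(sentence, aliases):
    """Replace each token that is a declared alias by its expansion.

    Tokens are separated by white space; other tokens are unchanged.
    """
    return " ".join(aliases.get(token, token) for token in sentence.split())

aliases = {"to": "and with recipient"}
expanded = expand_aliases("Mary is_ giving a_ letter to Tom.", aliases)
```

A real analyzer would apply such substitutions during lexical analysis, so that aliases are expanded before the structure of the sentence is parsed.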

3.2. Contexts

Syntactically, a context is a concept which embeds other concepts (possibly linked by relations). Semantically, a context represents a situation (i.e. relationships between objects in a real or imaginary world) or a statement (i.e. a description of a situation). Like any other concept, a context may be referred to via an individual identifier or an existentially quantified variable. Thus, it is possible to describe relations from/to contexts and therefore about their content. The embedding of statements can be done via backquotes and quotes in FE, square brackets in FCG, and the use of the keyword "aboutEach" in RDF to reference relations inside a representation.

Some of these relations involve situations, e.g. to situate them in time, while others involve statements, e.g. to state that they have been authored by someone at a certain time. The signatures of such relations are important information for checking or classifying the types of contexts connected by such relations. However, explicitly typing contexts with Situation or Statement to comply with the relation signatures is not intuitive, and it leads to lengthy representations and rather arbitrary decisions. For instance, can logical relations be directly connected to Situation concepts or is an intermediary Statement concept always necessary? This problem is not yet tackled by the RDF specifications [RDFModel, RDFSchema]: only one type of context is used, rdf:Description.

We propose that the RDF parsers still accept rdf:Description as a generic type for contexts, but automatically deduce their adequate types (Situation or Statement) and the implicit intermediate contexts. A similar facility exists for FE and FCG. Even with this facility, the RDF/XML notation for contexts quickly becomes cumbersome. Here is an example.
E: Tom believes that Mary now likes him (in 1999) and that before she did not.
FE: alias time in; Tom is believer of ` *s `Mary is liking Tom' in 1999' and is believer of `~*s is before 1999'.
FCG: [Tom, believer of: [*s [Mary,agent of:(a liking,object:Tom)], time:1999], believer of: [~*s, before: 1999] ]
RDF:

In FE and FCG, an existentially quantified variable defined in a context is known both in this context and inner contexts. It may be redefined in these inner contexts.
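
This scoping rule can be sketched in Python with a chain of nested contexts. The class `Context` and its methods are hypothetical names introduced for the example; the sketch only models the visibility of variables, not the content of the contexts.

```python
class Context:
    """A context that may be embedded in a parent context."""

    def __init__(self, parent=None):
        self.parent = parent
        self.variables = {}

    def define(self, name, referent):
        # Defining a variable here hides any definition made in an
        # outer context, for this context and its inner contexts only.
        self.variables[name] = referent

    def lookup(self, name):
        # Search this context first, then the embedding contexts.
        context = self
        while context is not None:
            if name in context.variables:
                return context.variables[name]
            context = context.parent
        raise KeyError(name)

outer = Context()
inner = Context(parent=outer)
outer.define("*s", "situation1")
seen_from_inner = inner.lookup("*s")   # visible in the inner context
inner.define("*s", "situation2")       # redefinition in the inner context
```

After the redefinition, `inner.lookup("*s")` refers to the new concept while `outer.lookup("*s")` still refers to the original one.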

We have represented negation using the relation type truth. A better convention might be to allow the typing of the rdf:Description concept with a concept type Negated_description or simply Not. [Berners-Lee, 1999] also proposes these terms (truth and Not), plus an IF-THEN construct to allow the representation of rules. Though we reuse this construct for some RDF representations below, an alternative is to use relations with types such as implication or equivalence. In FE and FCG, the operators "=>" and "<=>" may be used instead of these types.

Since individual identifiers may be URLs in FE, FCG and RDF, documents or certain parts of documents may be referred to inside representations and their content or characteristics described. However, as shown by the following example, explicitly describing the representation process itself is a more generic and precise approach. The FCG in this example represents how, when, by whom and with which ontology the title of a certain document has been represented. The title (in English) and its representation (in FE) are enclosed within the strings "$(" and ")$". Such uncommon strings permit the delimitation of arbitrary parts of documents. (The title is also about a representation process but that is a coincidence).
FCG: [a representation, agent:, language: FE, ontology:, creationDate: "21/01/1999", expirationDate: "22/7/9999", source: [, part: the title], object: $( KADS-I models in CGs )$, result: $( KADS-I has for part several models object of a representation which has for language CG. )$ ]

[Berners-Lee, 1999] proposes a special construct for ontology inclusion. The above solution seems more general/elegant. Berners-Lee also goes beyond logic-based representation by proposing a trust model based on digital signatures.

Contexts can also be used for connecting representations by rhetorical relations, argumentation relations, and for representing modalities. Some terms are proposed in Section 4. It should be noted that contexts do not permit the expression of higher-order statements, such as for instance the logical properties of transitive relations.

3.3. Collections and intervals

Relations from a collection considered as a whole (i.e. as a single object) can be represented via existential statements. The keywords "group of" or "together" must be used in FE and FCG to specify that these relations do not apply to each member of the collection. The following example uses sets to specify that the members cannot be identical. Since RDF does not introduce it, we use the type name Set without the prefix "rdf:". The existentially quantified variable *g refers to the entire collection. The variable *e refers to each element.
E: Fred, Wilma and another man collectively approved a resolution.
FE: A resolution has for approver *g the group of *e {Fred,Wilma,a man}.
FE: A resolution has for approver a group that has for member Fred, for member Wilma and for member a man different_from Fred and different_from Wilma.
FE: Together *c {Fred,Wilma,a man} is approver of a resolution.
FCG: [*g the group of *e {Fred,Wilma,a man}, approver of: a resolution]
RDF:

Representing relations applying to each member of a collection requires the use of a universal quantifier over the collection. Therefore, the scope of the various quantifiers in a statement must be expressed via nested contexts and variables, or via the order of the involved concepts in a serialization syntax, as in natural languages. Since the last solution is more intuitive, we have adopted it in the FE and FCG linear notations. No keyword (such as "certain" in English) can be used to modify the scope of the quantifiers. Here are two examples that illustrate the influence of the concept order in English, FE and FCG. For the second RDF representation, we have re-used the constructs proposed by [Berners-Lee, 1999] for universal and existential quantification.
E: Fred, Wilma and another man have each approved a certain resolution.
E: A resolution has been approved by Fred, Wilma and another man.
FE: A resolution has for approver the persons {Fred, Wilma, a man}.
FE: A resolution has for approver *e. //reference to *e in previous example
FCG: [a resolution, approver: the persons *e]
RDF:

E: Fred, Wilma and another person have each approved a resolution.
FE: {Fred, Wilma, a person} are approvers of a resolution.
FCG: [{Fred,Wilma, a person}, approver of: a resolution]
RDF: #p

In the last example, the members of the collection may or may not have approved the same resolution. It is the same if an individual resolution is specified via an identifier or an existentially quantified variable that has been declared outside the scope of the collection (outside the "forall" construct in RDF; before the collection or in a different context or statement in FE and FCG). In such cases, the two above RDF constructs should be treated as equivalent by the RDF analyzers which handle them. There is no keyword in FE, FCG or RDF that would allow us to specify that the resolutions approved by each member of the collection are different.
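
The difference between the collective and the distributive readings can be sketched in Python with plain triples. The functions collective and distributive and the naming scheme for nodes (e.g. "resolution#2") are assumptions made for this illustration. In the distributive case, each generated name stands for a separate existential variable: as noted above, nothing states whether these variables denote the same resolution or different ones.

```python
def collective(members, relation, target):
    """One relation from the group as a whole, plus its membership relations."""
    group = "group(" + ",".join(members) + ")"
    triples = [(group, "member", m) for m in members]
    triples.append((group, relation, target))
    return triples


def distributive(members, relation, target_type):
    """One relation per member, each with its own existential variable."""
    return [
        (m, relation, f"{target_type}#{i}")
        for i, m in enumerate(members, start=1)
    ]


people = ["Fred", "Wilma", "a_man"]
together = collective(people, "approver of", "resolution#1")
each = distributive(people, "approver of", "resolution")
```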

As pointed out earlier, the 's' at the end of the terms used for representing collections is automatically removed in FE and FCG. Commas are used for separating elements of a set but the strings "&", "<", "=<", "|" and "/" may also be used to respectively represent a bag, an ordered set, an ordered bag ("sequence" in RDF: rdf:Seq), an OR-bag ("alternatives" in RDF: rdf:Alt) and an XOR-set ([RDFModel] does not provide a term for it but XORset seems to be an adequate type name). As in RDF, the elements of these collections may be any concept, e.g. collections or contexts. For instance:
E: Tom is on a mat or on the floor.
FE: `Tom is on a mat' or `Tom is on the floor'.
FE: {`Tom is on a mat' | `Tom is on the floor'}.
FCG: [{[Tom, on: a mat] | [Tom, on: the floor]}]
RDF:
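
The mapping from separators to collection kinds can be sketched in Python as below. This is a simplified illustration rather than an FE or FCG parser: the function parse_collection is a hypothetical name, and it only handles flat collections whose members do not themselves contain a separator string, braces or nested contexts.

```python
# Separator strings listed in the text, mapped to collection kinds.
# "=<" is tested before "<" because it contains it.
SEPARATORS = [
    ("=<", "ordered bag (rdf:Seq)"),
    ("&", "bag"),
    ("<", "ordered set"),
    ("|", "OR-bag (rdf:Alt)"),
    ("/", "XOR-set"),
    (",", "set"),
]


def parse_collection(text):
    """Return (kind, members) for a flat collection such as '{a | b}'."""
    body = text.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("a collection must be enclosed in braces")
    body = body[1:-1]
    for separator, kind in SEPARATORS:
        if separator in body:
            return kind, [m.strip() for m in body.split(separator)]
    # A single member without any separator is treated as a set.
    return "set", [body.strip()]
```

For instance, parse_collection("{Fred, Wilma, a man}") yields a set of three members, while parse_collection("{20 min | 40 min}") yields an OR-bag of two members.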

RDF introduces a special keyword to allow statements about files with a certain prefix. This may also be done in FE and FCG by adding the wildcard '*' at the end of the prefix.
E: The files prefixed by "" have been authored by Foo.
FE: The files "*" have for author Foo.
FCG: [the files "*", author: Foo]
RDF:

FE and FCG provide many other keywords to represent collections. For instance: "several", "many", "most", "mostly", "at most", "at least", "dozens", "hundreds", "thousands", "millions", "billions". Numbers are also allowed. Similar keywords could be used in RDF constructs for collections or universal quantifiers as in the following example. (Note: this example refers to a pre-declared set of resolutions *r.)
E: At least 3 persons, including Fred, approved most of the resolutions.
FE: Most of *r have for approver at least 3 persons {Fred,*}.
FCG: [most of *r, approver: at least 3 persons {Fred,*}]
RDF:

As shown in the next two examples, intervals can be represented using XOR-sets or normal sets.
E: Tom ran between 14 min and 15 min.
FE: Tom is agent of a run with duration 14 to 15 minutes.
FE: Tom is agent of a run with duration between 14 and 15 minutes.
FCG: [Tom, agent of: (a run, duration: 14 to 15 minutes)]
RDF:

E: Tom ran between 14hrs and 15hrs (2pm to 3pm).
FE: Tom is agent of a run with time from 14 to 15 hour__time_of_day.
FCG: [Tom, agent of: (a run, time: from 14 to 15 hour__time_of_day)]
RDF:

3.4. Universal quantifiers, type definitions, typical relations

The constructions or keywords usable for quantifying over the members of a collection may be reused to quantify over the instances of a type. Restrictions may be made via keywords or additional relations on the instances as in the two following examples. The first example mixes universal and existential quantifiers. Their order is therefore important.
E: Every person in this office has approved a resolution.
FE: Every [person that is in *o] is approver of a resolution.
FCG: [every (person, in: *o), approver of: a resolution]
RDF: #p

E: At least 2% of persons like most cats.
FE: At least 2 % of persons are liking most of cats.
FCG: [at least 2 % of persons, agent of: (a liking, object: most of cats)]
RDF: #p

The keywords "every" and "any" can be used in the descriptions of relations "necessarily" connected to the instances of a certain type, that is, relations which "by definition" must be connected to each instance (if an object does not have the described characteristics, it is not of this type). For example, "every rectangle has for part 4 sides" and "any tree has for part a trunk".

Types for real-world objects can rarely be defined, i.e. "necessary" relations can rarely be used with them. One may be tempted to use "typical" relations. It is however more precise to describe relations that apply to a certain percentage of individuals of this type. Intervals such as "23% to 40%" or "at least 70%" are allowed in FE and FCG. "Most" can be used as an abbreviation for "at least 75%". Number intervals and keywords such as "at least" and "at most" may be used for representing the relationships of "entity-relationship" models, as in "every company has for employee at least 1 person".
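
To make the meaning of these percentage keywords concrete, here is a minimal Python sketch that checks whether a count of individuals satisfies a quantifier. The function satisfies is a hypothetical helper, it covers only the few forms listed in its docstring, and it assumes that total is greater than zero.

```python
def satisfies(quantifier, count, total):
    """Check whether `count` out of `total` individuals meets a quantifier.

    Supported forms: "every", "most", "at least N%", "at most N%", "N% to M%".
    """
    share = 100.0 * count / total
    q = quantifier.strip().lower()
    if q == "every":
        return count == total
    if q == "most":
        # "Most" is treated as an abbreviation for "at least 75%".
        q = "at least 75%"
    if q.startswith("at least "):
        return share >= float(q[len("at least "):].rstrip("%"))
    if q.startswith("at most "):
        return share <= float(q[len("at most "):].rstrip("%"))
    if " to " in q:
        low, high = (float(p.strip().rstrip("%")) for p in q.split(" to "))
        return low <= share <= high
    raise ValueError(f"unsupported quantifier: {quantifier}")
```

With this reading, 3 individuals out of 4 satisfy "most", whereas 7 out of 10 do not.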

Often, one simply wants to represent the fact that every concept of a type t1 can be connected to another concept of type t2 via a relation of a type tr. The following example shows how this can be addressed in FE and FCG and how we propose this can be done in RDF. It should be noted that such statements are more precise than simply representing that relations of type tr have for domain concepts of type t1 and for range concepts of type t2.
E: Any characteristic can be the characteristic of something.
FE: Any characteristic can be chrc of a thing.
FCG: [any characteristic, chrc of <= a thing]
RDF: #c

Relations "necessarily" connected to the instances of a certain type are also traditionally represented by directly connecting relations to the type itself, especially to represent generalization relations between types, and domain and range for relations. This is the case in [RDFModel] and is therefore also possible in FE and FCG. Here are three examples. We use the terms "subtype of" and "partition" instead of "subClassOf" and "subPropertyOf" in the FE and FCG representations. A relation of type partition relates a type to a set of exclusive subtypes.
E: A minivan is a van and a passenger vehicle.
FE: Minivan is subtype of Van and subtype of Passenger_vehicle.
FCG: [Minivan, subtype of: {Van, Passenger_vehicle}]
RDF:

E: Motorized and unmotorized vehicles are exclusive kinds of vehicles.
FE: Vehicle has for partition {{Motorized_vehicle, Unmotorized_vehicle}}. //Set of set: { {...}, {...}, ... }
FCG: [Vehicle, partition: {{Motorized_vehicle, Unmotorized_vehicle}}]
RDF:

E: The relation Age relates a person to an integer and is a physical characteristic relationship.
FE: Age has for domain Person, for range Integer and for supertype PhysChrc.
FCG: [Age, domain: Person, range: Integer, supertype: PhysChrc].
RDF:
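
The following Python sketch illustrates how subtype and partition relations such as those above could support a simple exclusion check. It is an illustration under stated assumptions: the dictionaries subtype_of and partitions and the functions supertypes and exclusive are hypothetical, and the entries for Van and Bicycle are added only to make the example complete.

```python
from itertools import combinations

# Direct supertypes of each type.
subtype_of = {
    "Minivan": {"Van", "Passenger_vehicle"},
    "Motorized_vehicle": {"Vehicle"},
    "Unmotorized_vehicle": {"Vehicle"},
    "Van": {"Motorized_vehicle"},        # assumption added for illustration
    "Bicycle": {"Unmotorized_vehicle"},  # assumption added for illustration
}
# Each type maps to a list of sets of mutually exclusive subtypes.
partitions = {"Vehicle": [{"Motorized_vehicle", "Unmotorized_vehicle"}]}


def supertypes(type_name):
    """Return type_name together with its direct and indirect supertypes."""
    seen, stack = set(), [type_name]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subtype_of.get(current, ()))
    return seen


def exclusive(t1, t2):
    """True if t1 and t2 fall under two different members of a partition."""
    s1, s2 = supertypes(t1), supertypes(t2)
    for groups in partitions.values():
        for group in groups:
            for a, b in combinations(group, 2):
                if (a in s1 and b in s2) or (b in s1 and a in s2):
                    return True
    return False
```

Under these assumptions, exclusive("Minivan", "Bicycle") holds, so an individual declared with both types could be reported as inconsistent.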

Finally, here is how many types for an individual, and many instances for a type, can be declared.
E: Joe is a smoker and a driver.
FE: Joe is a smoker and is a driver.
FE: Joe has for type Smoker and for type Driver.
FCG: [Joe, type:{Smoker,Driver}].
RDF:

E: Joe, Jack and William are smokers.
FE: {Joe,Jack,William} have for type Smoker.
FCG: [{Joe,Jack,William}, type: Smoker]
FCG: [the smokers {Joe,Jack,William}]
RDF:

3.5. Second-order statements

In first-order statements, it is possible to use quantifiers over individuals but not over types. It is therefore possible to describe that "the part of a part is also a part" but not possible to describe in general what a transitive relation is. [Berners-Lee, 1999] proposed a syntax in RDF to refer to a type via a variable and represent the definition of the "range of a relation". Here it is, preceded by the FE and FCG versions.
E: If rt is a relation type connected to the type ct by the relation "range" then if r is a relation of type rt and with destination the concept y then y has for type ct.
FE: `a relationType *rt has for range a conceptType *ct' = ` `a thing *x has for *rt a thing *y' = `*y has for type *ct' '.
FCG: [ [a relationType *rt, range: a conceptType *ct], = [ [a thing *x, *rt: a thing *y], = [*y, type: *ct] ] ]
RDF: #ct #y #ct
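
The effect of this definition of "range" can be sketched in Python over a list of triples. This is a minimal sketch, not the construct of [Berners-Lee, 1999]: the function apply_range_rule and the sample facts are assumptions, and a relation type is assumed to have at most one declared range.

```python
def apply_range_rule(triples):
    """Infer (y, "type", ct) from (rt, "range", ct) and (x, rt, y)."""
    # Collect the declared range of each relation type.
    ranges = {s: o for s, p, o in triples if p == "range"}
    # Every destination of a relation with a declared range gets that type.
    inferred = {(o, "type", ranges[p]) for s, p, o in triples if p in ranges}
    # Report only the statements that were not already present.
    return inferred - set(triples)


facts = [
    ("Age", "range", "Integer"),
    ("Joe", "Age", "42"),
]
new_facts = apply_range_rule(facts)
```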

4. Terms and conventions for ontological cases

We now focus on how different kinds of objects can be represented. We only use the FCG notation in our examples since its translation into FE and RDF/XML has been shown.

4.1. Some general categories

Relatively few top-level concept types are required for the signatures of most kinds of relations useful for general information, e.g. sentences or images from documents. These concept types and the constraints associated with them (e.g. the exclusion links between them) are useful for organizing ontologies, guiding knowledge modelling and preventing certain inconsistencies. We detailed this in [Martin and Eklund, 1999] and introduced a top-level ontology of 150 top-level concept types and 150 basic relation types. We have used these concept types to organize the upper levels of the WordNet ontology. In the remainder of this article, we focus on the most generic of these concept types and represent the necessary or possible relationships from their instances. In this way, we also propose a model for knowledge representation.

Before doing so, let us highlight with an example how these general types help make the knowledge representation process more precise. We obtained a hierarchy of terms about computer and network technology. It appeared that the hierarchy was not a generalization hierarchy since it mixed various kinds of objects and therefore various kinds of relations. Here is a commented extract.
Computing //process or domain?
Computer_hardware //physical object!
Compiler //hardware or software?
Computer_software //description!
Software_language //description medium!
Applications //process description (may or may not be software)!

Below is a part of the ontology we have built to make explicit the general category of the above terms and thus permit semantic checks and the use of relations associated with the general categories. The author of the previous hierarchy had to be contacted to determine what some of the terms meant.
[Domain_object, subtype: {Computing_object, Networking_object}]
[Networking_object,partition:{{Network_software,Network_hardware}}]
[Physical_entity, subtype: {Computer_hardware, Network_hardware}]
[Hardware, subtype: {Computer_hardware, Network_hardware}]
[Network_hardware, partition: {{Local area network, Switch}}]
[Description, subtype: {Software, Application}]
[Descr_medium, subtype: {Software_language, Network_language}]

Here are the generalization and exclusion relations between our most generic concept types.
[Thing (^the supertype of all first order concept types^), partition: { {Entity, Situation}, {Something_playing_a_role} }]
[Situation (^type of things that occur in a region of time and space^), partition: {{State, Process},{Phenomenon},{Situation_playing_a_role}}]
[Process, subtype: Event (^an event is a process the duration of which is considered nearly instantaneous for the application, e.g. in astronomy, the explosion of a star^)]
[Entity (^type of things that may be involved in a situation^), partition: { {Information_entity, Temporal_entity, Spatial_entity}, {Collection}, {Entity_playing_a_role} }]
[Spatial_entity (^type of entities that occupy a space region^), partition:{{Space_location, Physical_entity, Imaginary_spatial_entity}, {Spatial_entity_with_a_remarkable_characteristic}}]
[Physical_entity (^type of spatial entities made of matter^), partition: { {Inanimate_object, Living_thing}, {Goal_directed_entity(^a cognitive entity^)} }]
[Information_entity (^type for the information or its representations^), partition: { {Description, Descr_container, Characteristic, Measure, Measure_unit} }]
[Description (^description of a situation^), partition: { {Description_content, Description_medium} }]
[Description_content,partition:{{Belief,Hypothesis,Narration,Argument}}]
[Description_medium, partition: {{Symbol,Syntax,Language,Abstract_data_type,Script}}]
[Description_container,partition:{{File,Image,Hologram,Document_element}}]
[Characteristic (^a dimension of something^), partition: {{Psychological_characteristic, Physical_characteristic, Situation_characteristic}}]
[Physical_characteristic, partition: {{Mass, Length, Form, Color}}]
[Measure, partition: {{Psychological_measure, Physical_measure,Situation_measure}}]
[Measure_unit, partition: {{Psychological_measure_unit, Physical_measure_unit,Situation_measure_unit}}]

4.2. Descriptions

We now focus on necessary and possible relationships from concepts of the types listed above. We begin with descriptions and related objects.
[any thing,
   descr <= a description,              //anything may be abstracted/represented
   descr_in<= a descr_container,        //e.g. in a file, a hologram, software
]
[any description,                       //in Logic, it is called a "proposition"
   descr of : a thing,                  //a description is about something
   descr_medium : a descr_medium,       //it uses symbols or a language
   descr_container: a descr_container,  //it is stored, e.g. in a file
   author : 1 entity,                   //any description has a unique author
   believer <= a cognitive_agent,       //it may have some believers
   modality <= a modality,              //it may be put in context
   logical_relation <=a description,    //and connected to other descriptions
   rhetorical_relation<=a description   //by logical or rhetorical relations
]
[any descr_container, descr_support: a physical_entity, descr_in of: a thing]
[any descr_medium, descr_medium of: a thing]
[ [a thing *t, descr_in: a descr_container *c],
  <= [*t, descr: (a description, descr_container: *c)] ]

Examples of logical relation types are Or and Xor. Examples of rhetorical relation types are summary, motivation and antithesis.

4.3. Characteristics and measures

When representing characteristics, it is important that the characteristic and its measure(s) are distinguished, as in [a pen, physChrc: (a length, measure: 12 cm)]. Since this habit does not come naturally, abbreviations such as the following should probably be adopted as conventions: [a pen, length: 12 cm]. However, this implies additional constraints on the ontology and its exploitation by the analyzers. For instance, Length should be declared as a subtype of a type Characteristic and this would have to be a predefined type in any analyzer. Here are relations for explicit representations.
[any thing,
   chrc <= a characteristic      //e.g. length, speed, ingenuity
]
[any characteristic,
   measure <= a measure,         //any characteristic may be measured
   chrc of <= a thing            //and may be the characteristic of things
]
[any measure,
   quantity: a number,           //(note: the measure of an object's
   unit: a measure_unit,         // characteristic may be unknown)
   measure of: a characteristic
]
[any physical_entity, physChrc <= a physical_characteristic]
[any goal_directed_entity, physChrc <= a psychological_characteristic]
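
The abbreviation convention discussed above could be exploited by an analyzer roughly as in the following Python sketch. The table SUPERTYPE is a small hypothetical fragment of an ontology, the functions ancestors and expand are illustrative names, and FCG statements are handled as plain strings rather than parsed structures.

```python
# Hypothetical ontology fragment: direct supertype of each type.
SUPERTYPE = {
    "Length": "Physical_characteristic",
    "Mass": "Physical_characteristic",
    "Physical_characteristic": "Characteristic",
}


def ancestors(type_name):
    """Return type_name followed by its successive supertypes."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = SUPERTYPE.get(type_name)
    return chain


def expand(concept, relation, value):
    """Expand an abbreviation such as [a pen, length: 12 cm]."""
    chain = ancestors(relation.capitalize())
    if "Characteristic" not in chain:
        # Not the name of a characteristic: leave the relation as it is.
        return f"[{concept}, {relation}: {value}]"
    link = "physChrc" if "Physical_characteristic" in chain else "chrc"
    return f"[{concept}, {link}: (a {relation}, measure: {value})]"
```

Here expand("a pen", "length", "12 cm") produces the explicit form given at the start of this section, while a relation that does not name a characteristic is returned unchanged.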

A lot of concept types may be found in WordNet for physical or psychological features, e.g. Memory and Cognition, but unfortunately those types are not well organized and often mixed with types which should not be represented as such, e.g. mind, lexicon and structure.

4.4. Situations, processes and temporal entities

Here are some representations of relationships that are or may be connected to situations, processes and temporal entities. Most of them were proposed by [Sowa, 1984].
[any situation,
   s_succ of: a situation,         //any situation follows (an)other one(s)
   s_succ : a situation,           //and is followed by (an)other one(s)
   location : a spatial_entity,    //a situation occurs somewhere
   time : a temporal_entity,       //at a time (even an imaginary situation)
   duration : a duration,          //and for a certain duration
   situationChrc<= a situation_characteristic
]
[any temporal_entity,
   time of <= a situation,
   duration of <= a situation,
   temporal_order_relation <= a temporal_entity
]
[any process,
   triggering_event<= an event,
   ending_event <= an event,
   ending <= a state,
   ending of <= a state,
   precondition <= a state,
   postcondition <= a state,
   agent <= an entity,
   initiator <= a goal_directed_agent,
   instrument <= an entity,
   object <= a thing,
   experiencer<= a conscious_agent,
   recipient <= an agent,
   result <= a thing,
   sub_process<= a process,
   method <= a description,
   manner <= a situation_characteristic,
   source <= a spatial_entity,
   destination<= a spatial_entity,
   path <= a spatial_entity
]

The relation types input and output may respectively be declared as subtypes of object and result. Further specializations are input_output, object_to_modify and object_to_mute.

Given that we may use intervals and exclusive sets for representing time concepts, only two relation types seem necessary between situations and temporal entities: time and duration. Here are examples of two typical cases.
E: On 21/12/1999, John went to his office between 13h30 and 14h.
FCG: [a travel, agent: John, destination: (an office, owner: John), time: 13.5 to 14 hour]

E: John usually takes 20 min or 40 min to go to his office.
FCG: [most of (travel, agent: John, destination: (an office, owner: John)), duration: {20 min | 40 min}]

These examples do not violate the signatures of time and duration. The following representation, which uses a context, should be considered by the analyzers as equivalent to the first of the above two examples.
[ [John, agent of: travel, destination: (an office, owner: John)],
time: 13.5 to 14 hour]

4.5. Miscellaneous

Here are additional common kinds of relationships.
[any thing, part <= a thing] //anything may be decomposed
[any physical_entity, material: a physical_entity]
[any collection, subset <= a collection, element<= a thing, count: a natural]
[Order_relation, domain: Thing, range: Thing, partition: { {Spatial_order_relation, Temporal_order_relation}, {Meet,In,Near,Before,After} }]
[Spatial_order_relation, subtype: {On,Above,Below}]
[On, subtype of: {Meet,Above}]

We have defined relation types such as meet and near as direct subtypes of Order_relation and allowed them to connect any pair of concepts. Specializations of these types, e.g. spatial_meet and temporal_meet, could be defined to allow the use of more precise and constrained types upon which further semantic checks can be done. Such precise modelling may be found in the CYC and Ontolingua top-level ontologies (for instance, 2D and 3D spatial relations are distinguished). However, we should not expect the average user to spend his time looking for the most specific terms in such libraries. Nonetheless, these libraries can be exploited by authoring tools to automatically find more specific relations that do not violate the signatures of the general relation used.
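
As an illustration of how an authoring tool might look for more specific relations that respect the signatures, here is a minimal Python sketch. The signatures given in RELATIONS for spatial_meet and temporal_meet are assumptions made for this example, the SUPERTYPE table is a simplified fragment of the hierarchy of Section 4.1, and the function names are hypothetical.

```python
# Hypothetical signatures: relation -> (parent relation, domain, range).
RELATIONS = {
    "meet": (None, "Thing", "Thing"),
    "spatial_meet": ("meet", "Spatial_entity", "Spatial_entity"),
    "temporal_meet": ("meet", "Temporal_entity", "Temporal_entity"),
}
# Simplified concept hierarchy: type -> direct supertype.
SUPERTYPE = {
    "Entity": "Thing",
    "Spatial_entity": "Entity",
    "Temporal_entity": "Entity",
    "Physical_entity": "Spatial_entity",
}


def is_a(type_name, ancestor):
    """True if type_name equals ancestor or is one of its subtypes."""
    while type_name is not None:
        if type_name == ancestor:
            return True
        type_name = SUPERTYPE.get(type_name)
    return False


def specializations(relation, source_type, target_type):
    """Direct specializations of relation whose signature accepts both types."""
    return sorted(
        name
        for name, (parent, domain, range_) in RELATIONS.items()
        if parent == relation
        and is_a(source_type, domain)
        and is_a(target_type, range_)
    )
```

Given a meet relation between a physical entity and a spatial entity, the sketch suggests spatial_meet only; between two concepts merely known to be things, it suggests nothing and the general relation is kept.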

Even though relations such as part or subset are partial order relations similar to subtype, for the sake of precision, they should not be directly connected to concept types. For instance, [Airplane, part: Wing] might be intended to represent the fact that "any airplane has for part a wing", but many alternative interpretations are possible: "any wing is part of a plane", "a wing is part of any plane", etc.

5. Conclusion

Information can be represented in a number of different ways, especially with low-level general languages such as RDF/XML. For knowledge representations to be automatically comparable, conventions must be followed by the authors. We have proposed general lexical, structural and semantic conventions, then examined issues associated with the most common logical and ontological cases, and proposed ways to represent knowledge in these cases. We proposed two high-level notations, FE and FCG, to ease the knowledge representation task. Though we raised a number of issues, we only examined those that the average RDF/FE/FCG user is likely to encounter. As highlighted above, the precise models that are found in the CYC and Ontolingua top-level ontologies are certainly useful but are unlikely to be used directly by human agents for representing information or quickly indexing documents, sentences or images. Instead, we expect people to use a small set of relation types and simply use common words for concept types: given the signatures of the relation types, a lexical database such as WordNet may be exploited by an authoring tool to derive the relevant concept types or ask the user for more precision [Guarino et al., 1999] [Martin & Eklund, 1999]. Users will also probably use scalable multi-user knowledge servers to refer to, fix and complement lexical databases.


This work is supported by a research grant from the Australian Defence Science and Technology Organisation (DSTO).


T. Berners-Lee, The Semantic Toolbox: Building Semantics on top of XML-RDF, W3C Note, 24 May 1999; see also the "Semantic Web Road map" at

T. Berners-Lee, D. Connolly, R. Swick, Web Architecture: Describing and Exchanging Data, W3C Note, 7 June 1999.

M. Chein and M.L. Mugnier, Positive Nested Conceptual Graphs, Proceedings of ICCS'97, 5th International Conference on Conceptual Structures (Springer Verlag, LNAI 1257, pp. 95-109), Seattle, USA, August 4-8, 1997.

N. Guarino, C. Masolo and G. Vetere, Ontoseek: Content-based Access to the Web, IEEE Intelligent Systems, Vol. 14, No. 3, pp. 70-80, May/June 1999.

Ph. Martin & P. Eklund, Embedding Knowledge in Web Documents. Proceedings of WWW8, Eighth International World Wide Web Conference, special issue of The International Journal of Computer and Telecommunications Networking (in press), Toronto, Canada, May 11-14, 1999.

J.F. Sowa, Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.

BRML Business Rules Markup Language.
CGs Conceptual Graphs.
CYC See also: Lenat, D. B. Cyc: A Large-Scale Investment in Knowledge Infrastructure, Communications of the ACM 38, no. 11 (Nov. 1995).
DC Dublin Core.
KIF Knowledge Interchange Format.
LOOM The LOOM knowledge representation system. See also MacGregor & Patil's description at
MCF/XML Meta Content Framework Using XML
RDF Resource Description Framework.
RDFModel RDF Model and Syntax Specification.
RDFSchema RDF Schema Specification.
WN WordNet.

Dr Philippe Martin is a Research Fellow at Griffith University's School of Information Technology (Australia). He received his Ph.D. in Software Engineering (1996) from the University of Nice - Sophia Antipolis (France). This Ph.D. was undertaken at the INRIA of Sophia Antipolis (France), ACACIA project. Since then, Philippe has been a member of the Knowledge Visualisation and Ordering (KVO) group at the University of Adelaide in 1997 and at Griffith in 1998.

Professor Peter W. Eklund is the Foundation Chair of the School of Information Technology at Griffith University and research leader of the KVO group. He received his Ph.D. in Computer Science from Linköping University in Sweden in 1991. Peter was a visiting research scholar at Hosei University (Japan) in 1992 and Senior Lecturer in Computer Science at The University of Adelaide from December 1992 to 1997.