The World Wide Web (WWW) emerged in 1989, when Tim Berners-Lee proposed building a system for sharing information among physicists at CERN (Conseil Européen pour la Recherche Nucléaire), the world’s largest particle physics laboratory. Currently, the WWW is primarily composed of documents written in HTML (hypertext markup language), a language that is useful for visual presentation (Cardoso & Sheth, 2005). HTML is a set of “markup” symbols contained in a Web page intended for display on a Web browser. Most of the information on the Web is designed only for human consumption. Humans can read Web pages and understand them, but the meaning of those pages is not expressed in a way that allows computers to interpret it (Cardoso & Sheth, 2006). Since the visual Web does not allow computers to understand the meaning of Web pages (Cardoso, 2007), the W3C (World Wide Web Consortium) started to work on the concept of the Semantic Web, with the objective of developing approaches and solutions for data integration and interoperability. The goal was to develop ways to allow computers to understand Web information. The aim of this chapter is to present the Web ontology language (OWL), which can be used to develop Semantic Web applications that understand information and data on the Web. This language was proposed by the W3C and was designed for publishing and sharing data, and for automating the understanding of data by computers, using ontologies. To fully comprehend OWL, we first need to study its origin and the basic building blocks of the language. Therefore, we will start by briefly introducing XML (extensible markup language), RDF (resource description framework), and RDF Schema (RDFS). These concepts are important since OWL is written in XML and is an extension of RDF and RDFS.
Every day, the Web becomes more attractive as an information-sharing infrastructure. However, the vast quantity of data made available (for example, Google indexes more than 13 billion pages) makes it difficult to find and access the information required by the wide diversity of users. This limitation arises because most documents on the Web are written in HTML (HTML, 2007), a language that is useful for visual presentation but which is semantically limited. As a result, humans can read and understand HTML Web pages, but the contents of those pages are not defined in a way that computers can understand. If computers are not able to understand the content of Web pages, it becomes impossible to develop sophisticated solutions that enable interoperability and integration between systems and applications.
The aim of the Semantic Web is to make the information on the Web understandable and useful to computer applications in addition to humans. “The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation” (Berners-Lee et al., 2001). The Semantic Web is a vision for the future of the Web, in which information is given explicit meaning, making it easier for machines to automatically process and integrate the information available on the Web.
One of the cornerstones of the Semantic Web is OWL. OWL provides a language that can be used by applications that need to understand the meaning of information instead of just parsing data for display purposes. Nowadays, several projects already rely on semantics to implement their applications. Examples include semantic wikis (Campanini et al., 2004), social networks (Ding et al., 2005), semantic blogs (Cayzer & Shabajee, 2003), and Semantic Web services (McIlraith et al., 2001).
Key Terms in this Chapter
Metadata: Data that describe other data. Generally, a set of metadata describes a single set of data, called a resource.
XML: The extensible markup language (XML) is a simple, very flexible text format derived from SGML (ISO 8879). XML is accepted as a standard for data interchange on the Web; it allows data to be structured but does not convey the meaning of that data.
Semantic Web: The Semantic Web provides a common framework that allows data to be shared and reused across applications, enterprises, and community boundaries. It is a collaborative effort led by W3C with the participation of a large number of researchers and industrial partners.
OWL: A markup language for publishing and sharing data using ontologies on the Internet. OWL is a vocabulary extension of RDF and is derived from the DAML+OIL Web Ontology Language.
Ontology: A description of concepts and relationships that can be used by people or software agents that want to share information within a domain. An ontology document defines the terms used to describe and represent a domain.
RDF: Resource description framework is a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata model using XML but which has come to be used as a general method of modeling knowledge, through a variety of syntax formats.
RDFS: RDF schema is an extensible knowledge representation language, providing basic elements for the definition of ontologies, otherwise called RDF vocabularies, intended to structure RDF resources.
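To make the RDF and RDFS entries above more concrete, the following minimal sketch represents RDF statements as plain (subject, predicate, object) tuples and follows rdfs:subClassOf links to work out which classes a resource belongs to. It is an illustration under stated assumptions, not an implementation of the W3C specifications: the "ex:" resource names and the helper function types_of are hypothetical, and no RDF library is used.

```python
# Hypothetical data: the "ex:" terms are invented for illustration only.
# Each RDF statement is modeled as a (subject, predicate, object) triple.
triples = {
    ("ex:Professor", "rdfs:subClassOf", "ex:Person"),
    ("ex:Person", "rdfs:subClassOf", "ex:Agent"),
    ("ex:alice", "rdf:type", "ex:Professor"),
}


def types_of(resource, graph):
    """Return every class of `resource`, following rdfs:subClassOf links.

    `types_of` is a hypothetical helper, not part of any RDF toolkit.
    """
    found = set()
    # Start from the classes asserted directly with rdf:type.
    pending = [o for s, p, o in graph if s == resource and p == "rdf:type"]
    while pending:
        cls = pending.pop()
        if cls in found:
            continue  # already visited; also guards against cycles
        found.add(cls)
        # A member of a class is also a member of each of its superclasses.
        pending.extend(
            o for s, p, o in graph if s == cls and p == "rdfs:subClassOf"
        )
    return found


print(sorted(types_of("ex:alice", triples)))
# prints ['ex:Agent', 'ex:Person', 'ex:Professor']
```

Only one rdf:type statement is asserted for ex:alice; the other two memberships are derived from the class hierarchy. This is the kind of machine-interpretable meaning that presentation-oriented HTML does not carry.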