A Framework for Semantic Annotation of Arabic Language Web Sites

Project: Research

Project Details

Description

The central idea behind the Semantic Web is to extend the current human-readable Web by annotating Web resources, that is, attaching semantic metadata to them so that their meaning is encoded in machine-readable form. With such semantics encoded, computer programs are better able to search, process, integrate and present the content of these resources in a meaningful way, and can therefore answer intelligent questions about that content. To make semantic annotation possible, a set of tools has been designed to facilitate annotating resources on the Web and exploiting their semantic relationships. These tools supply the described resources with semantic metadata in a machine-readable, machine-understandable form that is usable by anyone interested. Unfortunately, most Semantic Web tools are dedicated to processing languages written in Latin script (English, Italian, Spanish, etc.), and the resulting lack of Arabic script support has left research on the Semantic Web and its applications for the Arabic language largely untackled. Moreover, while numerous studies have been conducted, and others are ongoing, to develop ontology-based and optimized semantic frameworks for the content of English Web resources, very few studies address ontology-based frameworks for the Arabic language. This shortage can be attributed to a lack of adequate resources in terms of skills, funding and interest in the emerging field of Arabic Semantic Web tools. Allocating research funding, providing resources, and attracting interest from a committed community of practice are essential to overcoming this problem. Arabic is the mother tongue of hundreds of millions of people in twenty countries of the Middle East and North Africa.
Arabic is a highly sophisticated language, which can hinder the development of Semantic Web tools for it. We propose this project in response to the national policy for science and technology adopted in the Kingdom of Saudi Arabia, under its information technology track, which encourages work on active research topics of this kind. This research project aims to address the obstacles to Arabic support in the majority of Semantic Web tools, to overcome those obstacles, and to provide innovative solutions that transfer the success achieved by the Semantic Web community in domains such as medicine, e-commerce, e-learning and biology to Arabic Web applications. It will also address the ongoing lack of domain-dependent Arabic ontologies by developing Arabic ontologies that support Arabic text annotation. We will also propose a new framework that adds a Semantic Web layer to current Web-based Arabic applications in order to improve linking, integrating, and intelligently searching different types of Arabic-language information in these applications. Arabic has several particularities, such as short vowels and the absence of capital letters, that make it hard to identify proper names, acronyms, and abbreviations. Arabic is also highly inflectional and derivational, which makes morphological analysis a very complex task. We will provide innovative solutions to all of these obstacles.
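To illustrate the short-vowel issue mentioned above, the following is a minimal Python sketch of one common preprocessing step: removing optional short-vowel marks so that vocalized and unvocalized spellings of a word compare equal before annotation. It is an assumption made for illustration, not the project's actual method, and the function name `strip_diacritics` is hypothetical.

```python
import unicodedata

# Tatweel (kashida) is an elongation character, not a vowel mark,
# but it is commonly removed in the same normalization pass.
TATWEEL = "\u0640"


def strip_diacritics(text: str) -> str:
    """Drop nonspacing marks (short vowels, shadda, sukun, tanween) and tatweel.

    Hypothetical helper for illustration only. The text is deliberately not
    decomposed first, so precomposed letters such as alef with hamza stay intact.
    """
    return "".join(
        ch
        for ch in text
        if ch != TATWEEL and unicodedata.category(ch) != "Mn"
    )


if __name__ == "__main__":
    vocalized = "\u0645\u064F\u062D\u064E\u0645\u0651\u064E\u062F"  # "Muhammad" with vowel marks
    print(strip_diacritics(vocalized))
```

A step like this only handles surface spelling; it does not resolve the ambiguity that unwritten vowels introduce, which is where morphological analysis and ontologies would come in.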
Status: Finished
Effective start/end date: 1/05/15 to 1/04/16
