Semantic Web – A Survey
Advanced Analysis and Algorithms
Dr. Awais Adnan
Roll No 3
Institute of Management Science
Finding what has been searched for and requested, effectively and
efficiently, is a major challenge for the coming years. A typical user spends
considerable time locating the exact information needed. Semantic Web Mining
can help with this problem. Combining the Semantic Web and Web Mining improves
mining by exploiting semantics and, conversely, generates semantics by mining.
Both areas help make the web more meaningful.
To analyze the results provided by different search engines such as Google,
Bing and Yahoo, personalized access to the information available on the Web is
required (Svatopluk et al., 2005). As of 2008, the portion of the web
accessible to search engines was estimated at almost one trillion pages
(Google Blog). The sheer scale of the web, together with its decentralized,
highly redundant and largely inexact nature, makes using the knowledge within
it rather unmanageable. Moreover, the relevant knowledge can be scattered
across many resources, which makes attempts to use all the accessible content
even more complicated.
This problem is commonly referred to as “information overload”. To some extent,
it has been addressed by advanced technologies based on the field of
information retrieval, which power current web search engines and make finding
related resources easy. The information overload problem is being tackled by
many technologies that draw inspiration from various fields of computer
science. Probably the most influential field in this context is information
retrieval, most visibly experienced in the form of web search engines such as
Google (http://www.google.com), Yahoo (http://www.yahoo.com) or Bing (http://www.bing.com).
Information retrieval methods cover a substantial portion of the web content,
but they only float on the surface of the real meaning of the data they index,
because they depend on mere string-based statistics and heuristic ranking. The
Semantic Web tries to complement this rather superficial information retrieval
approach by adding meaning to the strings of the web content. In the next
sections, we start with a brief overview of the areas of Semantic Web and Web
Mining. After that, we discuss the challenges and future trends in Semantic Web
implementation.
The Semantic Web is about providing
meaning to the data from different kinds of web resources, so that machines
can interpret and understand the enriched data and answer web users' requests
precisely. The Semantic Web is a part of the second generation web
(Web 2.0), and its original idea derives from the vision of the W3C's director
and founder of the WWW, Sir Tim Berners-Lee. According to this vision, the
Semantic Web is an extension of the World Wide Web that gives users the
ability to share their data beyond the hidden barriers and limitations of
programs and websites, using the meaning of the web content. Brief overviews
of various emerging Semantic Web technologies are given below.
OWL-S: OWL-S (formerly DAML-S) is a services ontology, within the
OWL-based framework of the Semantic Web, that enables
software agents to discover, invoke, compose, and monitor Web resources.
OWL 2: OWL 2 extends the Web Ontology Language (OWL) with a small but useful
set of features. OWL 2 ontologies provide classes, properties, individuals,
and data values, and are stored as Semantic Web documents.
WSMO: The Web Service Modeling Ontology (WSMO) is a conceptual model for relevant aspects related to Semantic Web
Services. It supports the automation of discovering, combining, and
invoking electronic services over the Web.
WSML: The Web Service Modeling Language (WSML) provides a formal syntax and semantics for the Web Service
Modeling Ontology (WSMO). It consists of several variants:
WSML-Core, WSML-DL, WSML-Flight, WSML-Rule, and WSML-Full.
SWRL: The Semantic Web Rule Language (SWRL) aims to be the standard rule language
of the Semantic Web and is based on a combination of OWL DL, OWL Lite, RuleML and so on.
constitutes a modular family of Web sublanguages including derivation rules,
queries and integrity constraints as well as production and reaction rules.
RIF: The goal of the Rule Interchange Format (RIF) is to be the standard
format for rule interchange on the Semantic Web.
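
The vocabulary of classes, properties, individuals and data values mentioned above can be made more concrete with a small illustration. The following is only a minimal sketch in plain Python, not an OWL or RDF library: statements are held as subject-predicate-object triples and queried by pattern matching. The `ex:` names, the `TRIPLES` data and the helper functions `match` and `individuals_of` are hypothetical and chosen for illustration only.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical statements in subject-predicate-object form.
TRIPLES: List[Triple] = [
    ("ex:Person", "rdf:type", "owl:Class"),  # a class
    ("ex:alice", "rdf:type", "ex:Person"),   # an individual of that class
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),      # a property linking individuals
    ("ex:alice", "ex:age", "30"),            # a property with a data value
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for subj, pred, obj in triples:
        if s is not None and subj != s:
            continue
        if p is not None and pred != p:
            continue
        if o is not None and obj != o:
            continue
        yield (subj, pred, obj)


def individuals_of(triples: Iterable[Triple], cls: str) -> List[str]:
    """Return the sorted subjects that are declared to be of class `cls`."""
    return sorted(subj for subj, _, _ in match(triples, p="rdf:type", o=cls))


if __name__ == "__main__":
    print(individuals_of(TRIPLES, "ex:Person"))
    print(list(match(TRIPLES, s="ex:alice", p="ex:knows")))
```

Because the data carries explicit types and relations, a program can answer "who is a person?" without guessing from strings, which is the contrast with purely string-based retrieval drawn earlier.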
Figure 1: Semantic Web Architecture
In linguistics, meaning is studied by a dedicated subdiscipline, semantics.
Meaning is analyzed at the level of words, phrases, sentences, and larger
units of discourse. Signs are the basic subjects of study in semantics. They
may be understood as discrete units of meaning (words, images, gestures,
scents, tastes, textures, sounds, etc., essentially all forms of a message in
which information can be transferred by the participants in a communication
process). Two major distinct conceptions of signs have been proposed by two
key figures involved in the birth of modern linguistics:
Dualistic signs: According to
Saussure, a sign is composed of the signifier and the signified. The
former is conceived as a linguistic representation of a conceivable and/or obtainable
entity or idea, while the latter is the mental representation, or concept, of
the entity or idea that is being signified. The relation between the signifier
and the signified in a sign is completely arbitrary.
Signs as triadic relations: Peirce rejects the idea of a stable
relationship between a signifier and its signified.
Departing from language-based motivations, he introduced a notion of sign
motivated largely by philosophical logic. His main focus was on proposing a
theory of the production of meaning rather than a theory of language itself.
The result is a notion of sign that produces
meaning through recursive relationships between three sets, corresponding to
three basic semiotic components:
Representamen: The symbolic
representation of the denoted thing, object or idea.
Object: The object being represented by the sign.
Interpretant: The meaning of the sign,
represented by yet another sign determined by the process of interpretation.
The relations between the three sets of semiotic components describe how the meaning of a sign is
linked with its actual representation in language and in the world. The basic tools employed in
investigations that focus on lexical semantics are lexical relations such as
antonymy, synonymy, hyponymy or hyperonymy. The meaning of lexical units is
usually determined in a top-down way by
human experts (lexicographers) after studying relevant language resources. In
distributional, or statistical, semantics, by contrast, the meaning is
established by empirical analysis of general patterns appearing between words
in large-scale data sets. This approach is essentially bottom-up and can be
automated to a large extent.
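
The bottom-up idea behind distributional semantics can be illustrated with a small sketch. It assumes one common realization of the approach, namely counting which words co-occur within a fixed window and comparing the resulting count vectors by cosine similarity; the text above does not prescribe this particular method. The toy `CORPUS`, the window size and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable

# Hypothetical toy corpus; real studies use large-scale data sets.
CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
]


def cooccurrence_vectors(sentences: Iterable[str], window: int = 2) -> Dict[str, Counter]:
    """Count, for every word, the words seen within `window` positions of it."""
    vectors: Dict[str, Counter] = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    vectors = cooccurrence_vectors(CORPUS)
    print("cat ~ dog :", round(cosine(vectors["cat"], vectors["dog"]), 3))
    print("cat ~ milk:", round(cosine(vectors["cat"], vectors["milk"]), 3))
```

On this toy corpus, "cat" and "dog" come out as more similar than "cat" and "milk", because the first pair appears in similar contexts. No lexicographer is involved, which is what makes the approach easy to automate.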
Analysis of the meaning of single words or phrases is only the first step towards
studying the semantics of more complicated natural language structures such as
sentences. The meaning of a sentence is analyzed by first parsing it into its
syntactic tree. The components of the parse tree are then transformed into a
logical form, which is in turn used for the logical analysis of the sentence
by means of associated truth conditions.
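
The pipeline from parse tree to logical form to truth conditions can be sketched for a single, very simple sentence pattern. This is an illustrative assumption rather than a general semantic parser: the tree is written by hand as nested tuples, only the shape S -> NP VP(V NP) is handled, and truth is checked against a small hypothetical model of facts. All names (`TREE`, `MODEL`, `to_logical_form`, `is_true`) are invented for the example.

```python
from typing import Set, Tuple

LogicalForm = Tuple[str, str, str]  # (predicate, subject, object)

# Hand-written parse tree for "Alice knows Bob": (label, children...).
TREE = ("S", ("NP", "Alice"), ("VP", ("V", "knows"), ("NP", "Bob")))

# Hypothetical model: the set of facts that hold in the described world.
MODEL: Set[LogicalForm] = {("knows", "alice", "bob")}


def to_logical_form(tree: tuple) -> LogicalForm:
    """Turn an S -> NP VP(V NP) tree into a predicate(subject, object) triple."""
    if len(tree) != 3 or tree[0] != "S":
        raise ValueError("expected a sentence node with two children")
    _, subject_np, verb_phrase = tree
    if subject_np[0] != "NP" or len(verb_phrase) != 3 or verb_phrase[0] != "VP":
        raise ValueError("expected the shape S -> NP VP(V NP)")
    _, verb, object_np = verb_phrase
    if verb[0] != "V" or object_np[0] != "NP":
        raise ValueError("expected the shape VP -> V NP")
    return (verb[1].lower(), subject_np[1].lower(), object_np[1].lower())


def is_true(form: LogicalForm, model: Set[LogicalForm]) -> bool:
    """Truth condition: the sentence is true iff its logical form is in the model."""
    return form in model


if __name__ == "__main__":
    form = to_logical_form(TREE)
    print(form, is_true(form, MODEL))
```

Swapping subject and object in the tree yields a different logical form, which is false in this model; the truth value therefore depends on the syntactic structure and not only on the words used.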
Web mining is a very interesting research topic that
combines two active research areas, data mining and the World Wide Web. The World Wide Web is a fruitful
area for data mining research because
a large amount of information is available online. Web mining research
relates to the research communities of databases, information retrieval, and AI. The
World Wide Web (Web) is a popular and interactive medium for spreading information
today. The Web is huge and diverse, and thus raises the issues of scalability
and multimedia data, respectively. Oren Etzioni first
coined the term Web mining in his paper in 1996. Etzioni starts from the
hypothesis that the information on the Web is sufficiently
structured, outlines the subtasks of Web mining, and describes the Web mining
processes. Web data mining can be defined as the discovery and analysis of
useful information from World Wide Web data.
CHALLENGES AND FUTURE TRENDS
The Web poses new challenges to traditional data mining algorithms, which
work on flat data. Some of the traditional data mining algorithms have been
extended, and new algorithms have been introduced, to work on Web data. With
the explosive growth of the information resources available on the WWW (World Wide
Web), it has become increasingly necessary for users to utilize automatic tools in
order to locate the required information resources and to track their usage
patterns. These factors give rise to the need to create server-side
and client-side
intelligent systems that can effectively
mine for knowledge. The analysis of huge web log files is a complicated task
not fully addressed by existing
access analyzers. It is hard to find suitable tools for analyzing raw web
log data to retrieve important and useful information. There are some
commercially available web log analysis tools, but most of them are disliked
by users and are considered slow, inflexible, expensive, difficult to maintain
and very limited in the results they can provide.
Although tools that use data mining techniques to provide web log analysis are
being created, the research is still in its infancy. The current techniques for
analyzing web usage resources have different drawbacks, for example
large storage requirements, excessive I/O cost, or scalability problems when
additional information is introduced into the analysis.
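
As a rough illustration of what analyzing raw web log data involves, the sketch below parses log lines and counts requests per page, status code and host. It assumes that the server writes the widely used Common Log Format, which the text above does not specify, and the sample lines, addresses and function name are hypothetical. Lines are processed one at a time, so the whole file never has to be held in memory, which touches on the storage concern mentioned above.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Assumed layout (Common Log Format):
# host ident user [timestamp] "METHOD path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

# Hypothetical sample data; the last entry is deliberately malformed.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2008:13:55:36 +0000] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2008:13:56:01 +0000] "GET /about.html HTTP/1.0" 200 1500',
    '192.0.2.1 - - [10/Oct/2008:13:57:12 +0000] "GET /index.html HTTP/1.0" 304 -',
    '192.0.2.3 - - [10/Oct/2008:13:58:40 +0000] "GET /missing.html HTTP/1.0" 404 209',
    'not a log line',
]


def summarize(lines: Iterable[str]) -> Tuple[Counter, Counter, Counter, int]:
    """Stream over log lines and count hits per page, status code and host.

    Returns (pages, statuses, hosts, skipped), where `skipped` is the number
    of lines that did not match the assumed format.
    """
    pages: Counter = Counter()
    statuses: Counter = Counter()
    hosts: Counter = Counter()
    skipped = 0
    for line in lines:
        found = LOG_PATTERN.match(line.strip())
        if found is None:
            skipped += 1  # keep going instead of failing on bad input
            continue
        pages[found.group("path")] += 1
        statuses[found.group("status")] += 1
        hosts[found.group("host")] += 1
    return pages, statuses, hosts, skipped


if __name__ == "__main__":
    pages, statuses, hosts, skipped = summarize(SAMPLE_LOG)
    print("most requested:", pages.most_common(1))
    print("status codes:", dict(statuses))
    print("unparsed lines:", skipped)
```

For a real log file, the same function can be given an open file object, for example `summarize(open(path))`, since it only iterates over lines. Simple counts like these are the starting point; usage pattern mining would build on top of such parsed records.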
Designing and maintaining web-based information systems, such as Web sites, is a great
challenge. On the Web, it is much easier to find an inconsistent source of
information than a well-structured site. There is an important relation between
structured documents (i.e., Web sites) and programs; the Web is therefore a good
candidate for experimenting with some of the technologies that have been developed
in the area of software engineering.
Web mining is a very new and quickly developing research and application area. With
more collaborative research across different disciplines such as databases, AI,
statistics and marketing, we will be able to develop web mining websites
and applications that are genuinely useful to web-based information
systems. In recent years, Web Mining has been an important topic in data mining research
from the standpoint of supporting human-centered knowledge discovery. The
present-day model of web mining suffers from a number of shortcomings, as listed
earlier. As services over the web continue to grow, there will be a continuing
need to make them fast, robust, scalable and efficient.