Semantic Web – A Survey
Advanced Analysis and Algorithms
Dr. Awais Adnan
Amir Iqbal
MSCS 2017-2019, Roll No 3, First Semester
Institute of Management Science
IMSciences, Hayatabad

Abstract
Finding what has been searched for and requested, effectively and efficiently, will be a major challenge in the coming years. An ordinary user spends a great deal of time locating the exact information needed. Semantic Web Mining is very helpful for this problem. Combining the Semantic Web and Web Mining improves mining by exploiting semantics and, in turn, generates semantics through mining. Together, the two areas help make the web more meaningful.
INTRODUCTION
To analyze the results provided by search engines such as Google, Bing and Yahoo, personalized access to the information available on the Web is required (Svatopluk et al., 2005). As of 2008, the estimated size of the portion of the web accessible to search engines was almost one trillion pages (Google Blog). The sheer scale of the web, together with its decentralized, highly redundant and largely imprecise nature, makes the knowledge within it rather unmanageable. Moreover, the relevant knowledge can be scattered across many resources, which makes attempts to use all the accessible content even more complicated.
This problem is mostly referred to as “information overload”. It is being addressed by many technologies that draw inspiration from various fields of computer science. Likely the most influential field in this context is information retrieval, most visibly experienced in the form of web search engines like Google (http://www.google.com), Yahoo (http://www.yahoo.com) or Bing (http://www.bing.com). To some extent, these technologies have addressed the problem: they power the current web search engines and make finding related resources easier.
Information retrieval methods cover a substantial portion of the web content, but they only float on the surface of the real meaning of the data they index, due to their dependence on mere string-based statistics and heuristic ranking. The Semantic Web tries to complement this rather superficial information retrieval approach by adding meaning to the strings of the web content. In the next sections, we start with a brief overview of the areas of Semantic Web and Web Mining. After that, we give an overview of challenges and future trends in Semantic Web implementation.
SEMANTIC WEB
The Semantic Web is about providing meaning to the data from different kinds of web resources, so that machines can interpret and understand these enriched data and answer web users’ requests precisely. The Semantic Web is a part of the second generation web (Web 2.0), and its original idea derives from the vision of the W3C’s director and WWW founder, Sir Tim Berners-Lee. In this vision, the Semantic Web is an extension of the World Wide Web that gives users the ability to share their data beyond the hidden barriers and limitations of programs and websites, using the meaning of the web content. Overviews of various emerging Semantic Web technologies are given below.
OWL-S: OWL-S (formerly DAML-S) is a services ontology, within the OWL-based framework of the Semantic Web, that enables software agents to discover, invoke, compose, and monitor Web resources.
OWL 2: OWL 2 extends the Web Ontology Language (OWL) with a small but useful set of features. OWL 2 ontologies provide classes, properties, individuals, and data values, and are stored as Semantic Web documents.
WSMO: The Web Service Modeling Ontology (WSMO) is a conceptual model for the relevant aspects of Semantic Web Services. It supports automating the discovery, combination, and invocation of electronic services over the Web.
WSML: The Web Service Modeling Language (WSML) provides a formal syntax and semantics for WSMO. It consists of several variants: WSML-Core, WSML-DL, WSML-Flight, WSML-Rule, and WSML-Full.
SWRL: The Semantic Web Rule Language aims to be the standard rule language of the Semantic Web and is based on a combination of OWL DL, OWL Lite, and RuleML.
RuleML: RuleML constitutes a modular family of Web sublanguages covering derivation rules, queries and integrity constraints, as well as production and reaction rules.
RIF: The goal of the Rule Interchange Format (RIF) is to be the standard of the Semantic Web for rule interchange.

Figure 1: Semantic Web Architecture

LINGUISTICS
Meaning in linguistics is studied by a specific subdiscipline, semantics. Meaning is analyzed at the level of words, phrases, sentences, and larger units of discourse.
Signs are the basic subjects of study in semantics. They may be understood as discrete units of meaning (words, images, gestures, scents, tastes, textures, sounds, etc., essentially all forms of a message in which information can be transferred by the participants in a communication process). Two major distinct conceptions of signs have been suggested by two key figures involved in the birth of modern linguistics:
Dualistic signs: According to Saussure, a sign is composed of the signifier and the signified. The former is conceived as a linguistic representation of a conceivable and/or perceivable entity or idea, while the latter is the mental representation, or concept, of the entity or idea that is being signified. The relation between the signifier and the signified in a sign is completely arbitrary.
Signs as triadic relations: Peirce rejected the idea of a stable relationship between a signifier and its signified. Departing from language-based motivations, he introduced an idea of the sign motivated largely by philosophical logic. His main focus was on proposing a theory of the production of meaning rather than a theory of language itself. The result is the idea of a sign that produces meaning through recursive relationships between three sets, corresponding to three basic semiotic components:
· Representamen: The symbolic representation of the denoted thing, object or idea.
· Object: The object being represented by the sign.
· Interpretant: The meaning of the sign, represented by yet another sign determined by the process of interpretation.
The relations between the three sets of semiotic components describe how the meaning of a sign is linked with its actual representation in the language and in the world.
The basic tools employed in investigations that focus on lexical semantics are lexical relations such as synonymy, antonymy, hyponymy and hypernymy.
The meaning of lexical units is usually decided in a top-down way by human experts (lexicographers) after studying relevant language resources. In contrast, the approach of distributional, or statistical, semantics establishes meaning by empirical analysis of general patterns appearing between words in large-scale data sets. It is essentially bottom-up and can be automated to a large extent. The study of the meaning of single words or phrases is only the first step towards studying the semantics of more complicated natural language structures such as sentences.
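The bottom-up, distributional view of word meaning can be illustrated with a small sketch. The example below is only an assumption about how such an analysis might look, not a method taken from the surveyed work: it builds co-occurrence vectors from a tiny made-up corpus and compares words by cosine similarity. The function names, the window size and the sample sentences are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus; a real study would use a large-scale data set.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
]

vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_milk = cosine(vectors["cat"], vectors["milk"])
print(f"cat ~ dog:  {sim_cat_dog:.3f}")
print(f"cat ~ milk: {sim_cat_milk:.3f}")
```

In this toy corpus "cat" and "dog" appear in similar contexts, so their vectors are closer to each other than those of "cat" and "milk". No lexicographer is involved; the similarity comes only from the patterns in the data.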
The meaning of a sentence is analyzed by first parsing it into its syntactic tree. The components of the parse tree are then transformed into a logical form, which is in turn used for the sentence’s logical analysis by means of associated truth conditions.

WEB MINING
Web mining is a very interesting research topic that combines Data Mining and the World Wide Web, two active research areas. The World Wide Web is a fruitful area for data mining research because a large amount of information is available online.
Web mining research relates to the research communities of databases, information retrieval, and AI. The World Wide Web (Web) is a popular and interactive medium for spreading information today. The Web is huge and diverse, and thus raises scalability and multimedia data issues, respectively. Oren Etzioni first coined the term Web mining in his paper in 1996. Etzioni starts from the hypothesis that the information on the Web is sufficiently structured, outlines the subtasks of Web mining, and describes the Web mining processes. Web data mining can be defined as the discovery and analysis of useful information from World Wide Web data.

CHALLENGES AND FUTURE TRENDS
The Web poses new challenges to traditional data mining algorithms, which work on flat data.
Some traditional data mining algorithms have been extended, and new algorithms have been developed, to work on Web data. With the explosive growth of the information resources available on the World Wide Web (WWW), it has become increasingly necessary for users to utilize automated tools to locate the required information resources and to track their usage patterns. These factors give rise to the need to create server-side and client-side intelligent systems that can effectively mine for knowledge. The analysis of huge web log files is a complicated task not fully addressed by existing web access analyzers. It is hard to find suitable tools for analyzing raw web log data to retrieve important and useful information. There are some commercially available web log analysis tools, but most of them are disliked by their users and considered very slow, inflexible, expensive, difficult to maintain and very limited in the results they can provide.
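To make the task of analyzing raw web log data more concrete, the following is a minimal sketch, assuming the server writes entries in the widely used Common Log Format. The sample entries, addresses and paths are made up for illustration, and `summarize_log` is a hypothetical helper, not part of any tool mentioned in this survey.

```python
import re
from collections import Counter

# Common Log Format: host ident user [timestamp] "METHOD path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)


def summarize_log(lines):
    """Count successful hits per page and requests per client host.

    Lines that do not match the expected format are counted and skipped.
    """
    page_hits = Counter()
    visitors = Counter()
    skipped = 0
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            skipped += 1
            continue
        # Only 2xx responses count as successful page views.
        if match.group("status").startswith("2"):
            page_hits[match.group("path")] += 1
        visitors[match.group("host")] += 1
    return page_hits, visitors, skipped


# Hypothetical sample entries; a real log would be read from a file.
sample_log = [
    '10.0.0.1 - - [10/Oct/2017:13:55:36 +0500] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2017:13:56:01 +0500] "GET /papers.html HTTP/1.1" 200 5120',
    '10.0.0.1 - - [10/Oct/2017:13:57:12 +0500] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.3 - - [10/Oct/2017:13:58:40 +0500] "GET /missing.html HTTP/1.1" 404 512',
    'this line is not a valid log entry',
]

page_hits, visitors, skipped = summarize_log(sample_log)
print("Most requested pages:", page_hits.most_common(2))
print("Requests per host:", dict(visitors))
print("Unparseable lines:", skipped)
```

A single pass with in-memory counters like this is enough for small logs. The storage, I/O and scalability concerns for huge log files are exactly where such a naive approach stops being adequate.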
Although some tools that use data mining techniques to provide web log analysis are being created, the research is still in its infancy. The current techniques for analyzing web usage have various drawbacks, for example large storage requirements, excessive I/O cost, or scalability problems when more information is introduced into the analysis.

CONCLUSION
Creating and maintaining web-based information systems, such as Web sites, is a great challenge. On the Web, it is much easier to find an inconsistent source of information than a well-structured site. There is an important relation between structured documents, i.e., Web sites, and programs; the Web is a great candidate for experimenting with some of the technologies that have been developed in the area of software engineering. Web mining is a very new and quickly developing research and application area.
With more collaborative research across disciplines such as databases, AI, statistics and marketing, we will be able to develop web mining websites and applications that are very useful to web-based information systems. In recent years, Web Mining has been an important topic in data mining research from the standpoint of supporting human-centered discovery of knowledge. The present-day model of web mining suffers from a number of shortcomings, as listed earlier. As services over the web continue to grow, there will be a continuing need to make them fast, robust, scalable and efficient.