General description
This course is divided into two parts.
The first part introduces Web mining technology from a practical point of view and aims to give students a solid grasp of how Web mining techniques can be applied to solve problems in real-world applications.
Web mining aims to discover useful information or knowledge from Web hyperlinks, page contents and usage data. Because of the richness and diversity of this information, and other Web-specific characteristics, Web mining is not just an application of data mining: it has developed many of its own methods, ideas, models and algorithms. This course will cover the following topics:
• Introduction to WWW and Web Mining Systems
• Learning and Knowledge Discovery from the Web
• Information Retrieval (IR) and Web Search
• Web Crawling and Information Integration
• Web Link Analysis such as Social Network Analysis, PageRank and HITS
• Opinion and Sentiment Mining
• Web Aspect Search and Mining
• Web Usage Mining
• Web Mining Applications such as Web Blog Mining and Online Medical Data Analysis
The aim of the second part of the course is for students to understand how knowledge is represented and processed in the Semantic Web. The Semantic Web is an evolving extension of the World Wide Web in which the meaning of information and services on the Web is defined. Several enabling technologies have been developed to provide standard specifications for data exchanged on the Internet. These technologies include the Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, N3, Turtle, N-Triples), and notations such as RDF Schema (RDFS) and the Web Ontology Language (OWL). All of these aim to provide a formal description of the concepts, terms and relationships within a given area of knowledge.