Research on Context-Based Statistical Relational Learning

Tutor: Gao Wen
School: Institute of Computing Technology
Course: Applied Computer Technology
Keywords: statistical relational learning, context models, multiscale mining, contextual depe
Type: PhD thesis
Year: 2005

The vast majority of work in statistical machine learning has focused on "flat" data: data consisting of identically structured entities, typically assumed to be independent and identically distributed (IID). However, many real-world datasets are innately relational: hypertext, web pages and sites, web images, scientific papers, e-books, educational resources and more. Such semi-structured relational data consist of entities of different types, where each entity is characterized by a different set of attributes and generally has complex internal structure. Entities are related to each other via different types of relations. This relational structure is an important source of semantic information that traditional statistical learning methods often ignore. The thesis therefore focuses on how to exploit such relational information explicitly in statistical learning tasks, so as to build more effective and more robust models.

The main methodology stems from context-based modeling and analysis. Here, the context is defined as a collection of relevant objects and surrounding influences that make the semantics of an object unique and comprehensible. Accordingly, a contextual dependency can be regarded as a special relationship among related objects that conveys explicit semantic correlation. Starting with an in-depth discussion of related work on context analysis methods and statistical relational learning, the thesis investigates several statistical contextual learning methods in different application domains. The contributions are as follows.

First, the thesis proposes a novel web site representation and mining algorithm using multiscale semantic models. In general, a web site can be regarded as a hypertext document with complex internal structure. The thesis uses a multiscale tree as the representation model of web sites, and proposes four kinds of context models to characterize the topical correlation among nodes in the multiscale site tree. Using this model, it presents an HMT-based two-phase classification algorithm and a multiscale classification algorithm for web sites. Both employ the hidden Markov tree (HMT) model as the statistical model of the tree-based data structure, and both explicitly exploit the contextual topical correlation among nodes to improve the classification accuracy of web sites. To further improve performance while reducing classification overhead, a two-stage denoising procedure is adopted to remove noise within sites, and an entropy-based strategy is introduced to dynamically prune the
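
As a rough illustration of what entropy-based pruning on a site tree could look like, the sketch below is offered under stated assumptions and is not the thesis's algorithm. The abstract breaks off before saying what is pruned, so the sketch assumes that whole subtrees are dropped once a node's topic distribution is already confident (low Shannon entropy). The `SiteNode` class, the `topic_probs` field, the `prune` function, and the `threshold` value are all hypothetical names chosen for the example.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SiteNode:
    """One node of a hypothetical site tree (a page or a group of pages)."""
    name: str
    topic_probs: dict  # assumed per-node topic distribution, e.g. from a page classifier
    children: list = field(default_factory=list)


def entropy(probs):
    """Shannon entropy in bits; zero-probability topics contribute nothing."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def prune(node, threshold):
    """Assumed rule: drop the subtree below any node whose entropy is at most `threshold`."""
    if entropy(node.topic_probs) <= threshold:
        node.children = []
    else:
        for child in node.children:
            prune(child, threshold)
    return node


def count_nodes(node):
    """Number of nodes in the tree rooted at `node`."""
    return 1 + sum(count_nodes(c) for c in node.children)


# Toy site with made-up topic probabilities.
site = SiteNode("root", {"sports": 0.5, "news": 0.5}, [
    SiteNode("scores", {"sports": 0.97, "news": 0.03}, [
        SiteNode("scores/today", {"sports": 0.99, "news": 0.01}),
    ]),
    SiteNode("misc", {"sports": 0.6, "news": 0.4}, [
        SiteNode("misc/about", {"sports": 0.5, "news": 0.5}),
    ]),
])
pruned = prune(site, threshold=0.5)
```

In this toy tree the `scores` node is confident enough that its child is dropped, while the ambiguous `misc` branch is kept for finer-scale inspection. Such a rule trades some detail for fewer nodes to classify. Whether the thesis prunes in this direction, or on this quantity, is not recoverable from the abstract.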