Feature Representation in Mining and Language Processing
General Material Designation
[Thesis]
First Statement of Responsibility
Vu, Thuy
Subsequent Statement of Responsibility
Parker, D. Stott
PUBLICATION, DISTRIBUTION, ETC.
Date of Publication, Distribution, etc.
2017
DISSERTATION (THESIS) NOTE
Body granting the degree
Parker, D. Stott
Text preceding or following the note
2017
SUMMARY OR ABSTRACT
Text of Note
Feature representation has been one of the most important factors in the success of machine learning algorithms. Since 2006, deep learning has been applied widely to problems across many disciplines and has frequently reset state-of-the-art results, thanks to its ability to learn highly abstract representations of data. I focus on extracting additional structural features in network analysis and natural language processing (NLP) by learning novel vector-based representations, usually known as embeddings. For network analysis, I propose learning representations for nodes, called node embeddings, for social network applications. These embeddings are computed from the attributes and links of nodes in the network. Experimental studies on community detection and mining tasks suggest that node embeddings can reveal deeper structure in the network. For NLP, I address the learning of representations at three levels: words, word relations, and linguistic expressions. First, I propose extending the standard word embedding training process into two phases, treating context as second-order in nature. This strategy can effectively compute embeddings for polysemous concepts of words, adding an extra conceptual layer on top of standard word embeddings. Second, I introduce representations of "semantic binders" for words. These representations are learned using categorial grammar and are shown to handle disambiguation effectively, especially when the meaning of a word depends heavily on a specific context. Finally, I present a three-layer framework for learning representations of linguistic expressions, addressing the semantic compositionality problem with recurrent neural networks driven by categorial-grammar combinatory rules. This strategy specifically addresses the limitation of recurrent neural network approaches in deciding how, and when, to incorporate individual pieces of information into the compositional embedding.
The framework is flexible and can be integrated with the proposed representations. I study the effectiveness of the proposed representations in several NLP applications: word analogies, subject-verb-object agreement, paraphrasing, and sentiment analysis.
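The compositional idea sketched in the abstract, merging child embeddings bottom-up along a parse, can be illustrated with a minimal toy example. This is not the thesis's implementation: the dimensions, the random stand-in word vectors, and the single composition matrix `W` are all hypothetical assumptions; the actual framework selects composition behavior per categorial combinatory rule rather than using one shared matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # toy embedding size (hypothetical)

# Random stand-ins for trained word embeddings.
vocab = {w: rng.normal(size=DIM) for w in ["the", "cat", "sat"]}

# A single composition matrix; the thesis's framework would instead
# drive composition with categorial-grammar combinatory rules.
W = rng.normal(scale=0.1, size=(DIM, 2 * DIM))

def compose(x, y):
    """Merge two child embeddings into one parent embedding."""
    return np.tanh(W @ np.concatenate([x, y]))

# Compose along a binary parse: ((the cat) sat)
np_phrase = compose(vocab["the"], vocab["cat"])
sentence = compose(np_phrase, vocab["sat"])
print(sentence.shape)  # prints (8,)
```

Each merge produces a vector of the same dimension as its children, so the sentence embedding can be fed to any downstream task (e.g. sentiment classification) regardless of sentence length.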