Steve Skiena, PhD
Director of the Institute for AI-Driven Discovery and Innovation & Distinguished Professor
Stony Brook University

Steven Skiena is Distinguished Teaching Professor of Computer Science and Director of the Institute for AI-Driven Discovery and Innovation at Stony Brook University. His research interests include the design of graph, string, and geometric algorithms, and their applications (particularly to biology). He is the author of six books, including "The Algorithm Design Manual," "The Data Science Design Manual," and "Who's Bigger: Where Historical Figures Really Rank". Skiena received his Ph.D. in Computer Science from the University of Illinois in 1988 and is the author of over 150 technical papers. He is a former Fulbright scholar and a recipient of the ONR Young Investigator Award and the IEEE Computer Science and Engineering Teaching Award. More info at http://www.cs.stonybrook.edu/~skiena/.

Contact Information:  steven.skiena@stonybrook.edu

Abstract

Word and Graph Embeddings for Machine Learning Models

Distributed word embeddings (word2vec) provide a powerful way to reduce large text corpora to concise features readily applicable to a variety of problems in NLP and data science. I will introduce word embeddings and review several of our recent efforts to apply them to natural language processing (NLP), including the Polyglot system (entity recognition, POS tagging, and sentiment analysis) for over 100 different languages.

DeepWalk is an approach we have developed to construct vertex embeddings: vector representations of vertices which can be applied to a very general class of problems in data mining and information retrieval. DeepWalk exploits an appealing analogy between sentences as sequences of words and random walks as sequences of vertices to transfer deep learning (unsupervised feature learning) techniques from natural language processing to network analysis. DeepWalk has become extremely popular, having been cited by over 2000 research papers since its publication at KDD 2014. In this talk, I will introduce the notion of graph embeddings, explain how DeepWalk constructs them, and demonstrate why they make such powerful features for machine learning applications.
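
The analogy between sentences and random walks can be illustrated with a minimal Python sketch. This is not the DeepWalk implementation: the toy graph, the function names (`random_walk`, `build_corpus`, `skipgram_pairs`), and the parameter values are all hypothetical choices made for illustration. The sketch only shows how truncated random walks turn a graph into a "corpus" of vertex sequences, and how (center, context) pairs of the kind a skip-gram model such as word2vec trains on could be read off each walk. The embedding training step itself is omitted.

```python
import random


def random_walk(graph, start, length, rng):
    """Return a uniform random walk of at most `length` vertices from `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def build_corpus(graph, walks_per_vertex, walk_length, seed=0):
    """Treat each random walk as a 'sentence' whose 'words' are vertices."""
    rng = random.Random(seed)
    vertices = list(graph)
    corpus = []
    for _ in range(walks_per_vertex):
        rng.shuffle(vertices)  # visit start vertices in a fresh random order
        for v in vertices:
            corpus.append(random_walk(graph, v, walk_length, rng))
    return corpus


def skipgram_pairs(walk, window):
    """List (center, context) pairs within `window` positions of each other."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# Hypothetical toy graph (adjacency lists): two triangles joined by edge c-d.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e", "f"],
    "e": ["d", "f"],
    "f": ["d", "e"],
}

corpus = build_corpus(graph, walks_per_vertex=2, walk_length=5)
for walk in corpus[:3]:
    print(" ".join(walk))
```

Under these assumptions, the resulting `corpus` could be handed to any skip-gram trainer in place of tokenized sentences; vertices that co-occur often in short walks (here, vertices in the same triangle) would then tend to receive nearby vectors.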
