3/10/14: Shlomo Dubnov

Science of Story

Shlomo Dubnov
Department of Music

Abstract: Story is a delicate balance. Can a computer appreciate a good story? Probably not, at least not yet. But it can help weigh the options and make cold-blooded calculations of how well story elements combine according to some commonly accepted script-writing formulas. In the talk I will describe ongoing research on using NLP to track the structure of narrative in film scripts by embedding scenes in a semantic space and tracing their evolution over time. The method allows matching theoretical elements of story structure, such as theme, turning points, B-story, climax and resolution, and many other elements outlined in so-called "beat-sheet" formulas, to actual changes in word statistics over the duration of a film script. This automated analysis can be compared to previous research on green-lighting movie scripts that uses human evaluations as predictors of the commercial success of movies. Some speculations on the universality of story structure in relation to human perception of epic / myth and musical form will be discussed.

Shlomo Dubnov is a Professor in music technology at UCSD. He received his PhD in Computer Science from Hebrew University and was a researcher at IRCAM, Paris, and a faculty member in Communication Systems Engineering at Ben-Gurion University, Israel. Among his main contributions are new methods for statistical audio analysis/synthesis, modeling of emotions and aesthetics, and machine learning systems for musical improvisation. He co-edited the book “The Structure of Style: algorithmic approaches to understanding manner and meaning” and served as secretary of the IEEE Technical Committee on Computer Generated Music. Currently he serves as a co-lead editor of ACM Computers in Entertainment and directs Qualcomm Institute's Center for Research on Entertainment and Learning (CREL).

3/3/14: Ery Arias-Castro

Community detection in a random network

Ery Arias-Castro
Department of Mathematics

Abstract: We formalize the problem of detecting a community in a network as testing whether a given (random) graph contains a subgraph that is unusually dense. Specifically, we observe an undirected and unweighted graph on N nodes. Under the null hypothesis, the graph is a realization of an Erdős-Rényi graph with connection probability p_0. Under the (composite) alternative, there is an unknown subgraph of n nodes where the probability of connection is p_1 > p_0. We derive lower bounds for detecting such a subgraph in terms of (N, n, p_0, p_1) in various regimes, and exhibit a number of tests that achieve that lower bound in some particular regime: the scan statistic and variants, the size of the largest connected component, the number of triangles, the eigengap of the adjacency matrix, etc. We also consider the problem of testing in polynomial time. Our detection bounds are sharp, except in the Poisson regime where we were not able to fully characterize the constant arising in the bound. Joint work with Nicolas Verzelen (INRA, France).
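
As a rough illustration of the setup in this abstract, the sketch below samples a graph under the null and under an alternative with a planted dense subgraph, then compares the triangle count (one of the statistics the abstract lists) with its expectation under the null. The function names and the parameter values (N = 60, n = 20, p_0 = 0.1, p_1 = 0.9) are assumptions chosen to make the effect obvious; they are not taken from the talk, and the sketch does not reproduce the detection thresholds derived there.

```python
import random
from itertools import combinations
from math import comb


def sample_graph(N, p0, rng, community=(), p1=None):
    """Sample an undirected graph as a list of neighbour sets.

    Pairs inside `community` are connected with probability p1,
    every other pair with probability p0. An empty community gives
    a plain Erdős-Rényi graph (the null hypothesis).
    """
    members = set(community)
    adj = [set() for _ in range(N)]
    for i, j in combinations(range(N), 2):
        p = p1 if (i in members and j in members) else p0
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def count_triangles(adj):
    """Count each triangle once by enumerating i < j < k."""
    total = 0
    for i, neighbours in enumerate(adj):
        for j in neighbours:
            if j > i:
                total += sum(1 for k in adj[i] & adj[j] if k > j)
    return total


def expected_null_triangles(N, p0):
    """Expected triangle count of an Erdős-Rényi graph on N nodes."""
    return comb(N, 3) * p0 ** 3


# Illustrative (assumed) parameters: a deliberately easy regime.
rng = random.Random(0)
N, n, p0, p1 = 60, 20, 0.1, 0.9

null_graph = sample_graph(N, p0, rng)
planted_graph = sample_graph(N, p0, rng, community=range(n), p1=p1)

baseline = expected_null_triangles(N, p0)
null_count = count_triangles(null_graph)
planted_count = count_triangles(planted_graph)
```

Comparing `null_count` and `planted_count` against `baseline` shows how a sufficiently dense planted subgraph inflates the triangle count. A real test would also need the null variance of the statistic (or a simulated null distribution) to set a rejection threshold.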

Ery Arias-Castro is an associate professor of statistics at UCSD. He received his PhD in Statistics from Stanford University in 2004. His M.A. is in artificial intelligence, from the École Normale Supérieure de Cachan (France) and Washington University, Saint Louis, while his B.S. is in mathematics from the École Normale Supérieure de Cachan. He joined the faculty in the mathematics department at UCSD in 2005. His research interests are in high-dimensional statistics, machine learning, spatial statistics, image processing, and applied probability.

2/24/14: Chun-Nan Hsu 

Identifying Transformative Research

Chun-Nan Hsu 

Abstract. Transformative research refers to research that shifts or disrupts established scientific paradigms. Identifying potential transformative research early and accurately is important for funding agencies to maximize the impact of their investments. It also helps scientists identify and focus their attention on promising emerging works. In this talk, I will present a data-driven approach where citation patterns of scientific papers are analyzed to quantify how much a potential challenger idea shifts an established paradigm. I will present experimental results showing that some successful transformative research works disrupt established paradigms in Physics, Biomedical Sciences and Computer Science, regardless of whether the challenger paradigm is an instant hit or a classic whose contribution is formally recognized with a Nobel Prize decades later.

Chun-Nan Hsu has been an Associate Professor in the Division of Biomedical Informatics, UC San Diego, since November 2013. He is interested in biomedical data mining, text mining and cell image analysis. He has developed several widely used bioinformatics tools and services, including top-performing text mining systems in the BioCreative international contest series. His recent work focuses on advanced text-mining algorithms to establish knowledge bases of phenotypes and genetic diseases. He was elected president of the Taiwanese Association for Artificial Intelligence in 2009 and is an IBM Faculty Award recipient. He is a senior member of ACM.

2/17/14: No seminar

Monday February 17 is Presidents' Day and an official holiday at UCSD. 

2/10/14: Amit K. Roy Chowdhury 

Situation Awareness from Wide-Area Vision Networks

Amit K. Roy Chowdhury 
University of California, Riverside

Abstract. Over the past decade, large-scale camera networks have become increasingly prevalent in a wide range of applications, such as security and surveillance, disaster response, and environmental modeling. However, the analysis of the acquired videos has been largely manual and post-facto. Thus, the development of algorithms capable of analyzing a scene covering a wide area is extremely important. In this talk, we will focus on three inter-related problems in this domain.
i)      The performance of the analysis algorithms often suffers because of the inability to effectively acquire the desired images. We will discuss our recent work on integrated sensing and analysis in a distributed camera network so as to maximize various scene-understanding performance criteria (e.g., tracking accuracy, best shot, and image resolution). We will show how the existing work in autonomous multiagent systems can be leveraged for this purpose - more specifically, game theory-based distributed optimization algorithms for dynamic camera network reconfiguration.
ii)     In application domains where there is no central processor accumulating all the data (e.g., disaster response), inferences have to be drawn through local decisions at the camera nodes and negotiations with the neighbors. We will present our work on distributed reasoning in vision networks, especially the recently proposed Information Weighted Consensus Filter (ICF). The application of ICF to multi-target tracking will then be presented.
iii)    Finally, we will address the issue of higher-level scene understanding. The recognition of activities in video is an essential step in this regard. Activities happening over a wide area are often related in space and time. We will show how graphical inference methods can be used to robustly recognize such activities, specifically taking into account the contextual relationships between them.

Amit K. Roy Chowdhury received his undergraduate degree in electrical engineering from Jadavpur University, Calcutta, India, his Master's degree in systems science and automation from the Indian Institute of Science, Bangalore, India, and his Ph.D. degree in electrical engineering from the University of Maryland, College Park. He is a Professor of Electrical Engineering and a cooperating faculty member in the Department of Computer Science at the University of California, Riverside. His broad research interests include image processing and analysis, computer vision, and statistical signal processing and pattern recognition. Together with his students and collaborators, he has over 100 technical publications in these areas, including one best student paper award. His current research projects include intelligent camera networks, wide-area scene analysis, motion analysis in video, activity recognition and search, video-based biometrics (face and gait), and biological video analysis. He is the first author of the book Camera Networks: The Acquisition and Analysis of Videos over Wide Areas, the first research monograph on this topic. He has been on the organizing and program committees of multiple computer vision and image processing conferences and is serving on the editorial boards of multiple journals.

2/3/14: Christian Shelton

Machine Learning and Critical Care Pediatrics

Christian Shelton
University of California, Riverside

Electronic health records provide the opportunity for data-driven medical discovery, even with all of their current flaws. Intensive care units are a particularly interesting microcosm, as their data are relatively frequent and some types of outcomes are more quickly known. In this talk, I will first outline the type of data, scientific questions, and domain challenges Children's Hospital Los Angeles and my group have been working on to improve critical care. Then I will describe one project on estimating blood gas levels for children on mechanical ventilation. This project aims to remove invasive tests and provide faster weaning off of ventilation to decrease costs and improve health.

Christian Shelton is an Associate Professor of Computer Science at the University of California at Riverside. He joined the faculty in 2003. His research interest is in statistical approaches to artificial intelligence, mainly in the areas of machine learning and dynamic processes. He has been the Managing Editor of the Journal of Machine Learning Research and on the editorial board of the Journal of Artificial Intelligence Research. Dr. Shelton received his B.S. in Computer Science from Stanford University in 1996 and his Ph.D. from MIT in 2001. From 2001 to 2003, he was a postdoctoral scholar back at Stanford. He has been a visiting researcher at Intel Research (2003-2004) and Children's Hospital Los Angeles (2012-2013).


Reports of Interesting Research from NIPS

In this session, UCSD grad students will describe especially interesting research presented at NIPS in December 2013.

1/20/14: No seminar

No meeting because of Martin Luther King Jr. Day.

1/13/14: Sumithra Velupillai

Shades of Certainty -- Working with Swedish Medical Records

Sumithra Velupillai
Division of Biomedical Informatics

Abstract. Different levels of knowledge certainty, or factuality levels, are expressed in clinical health record documentation. This information is currently not fully exploited, as the subtleties expressed in natural language cannot easily be machine analyzed. 

Two annotated corpora have been created for capturing speculations and uncertainties in Swedish medical records. One model distinguishes certain and uncertain expressions on a sentence level, and is applied to medical documentation from several clinical departments. Differences between clinical practices are also studied. A second model makes more fine-grained certainty level distinctions, with two polarities along with three levels of certainty, and is applied on a diagnostic statement level to records from an emergency department. Overall agreement results for both models are promising, but differences are seen depending on clinical practice, the definition of the annotation task and the level of domain expertise among the annotators. 

Using annotated resources for automatic classification of certainty levels is also studied by employing machine learning techniques. Encouraging overall results using local context information are obtained. The fine-grained certainty level model is also used for building classifiers for coarser-grained, real-world e-health scenarios, showing that fine-grained annotations can be used for several e-health scenario tasks. 

This talk will also present ongoing research on Swedish medical records and the Stockholm EPR Corpus from the Clinical Text Mining Group at the Department of Computer and Systems Sciences, Stockholm University. 

Sumithra Velupillai, Ph.D., is a postdoctoral researcher at UCSD, coming from the Department of Computer and Systems Sciences at Stockholm University. She has been awarded an international postdoctoral fellowship from the Swedish Research Council along with a Fulbright scholarship, under which she will conduct most of her research at UCSD during 2014-2015. She successfully defended her thesis "Shades of Certainty – Annotation and Classification of Swedish Medical Records" on April 27th, 2012. Velupillai has participated in several national and international research projects, among others the Interlock project, a research collaboration between Stockholm University and DBMI at UCSD. Velupillai has a background in Computational Linguistics and specializes in research covering Language Technology, Information Access and Extraction, and Health Informatics.

1/6/14: Steve Gallant

Representing Free Text for Machine Learning Using High-Dimensional, Distributed Vectors

Steve Gallant
MultiModel Research
Cambridge, MA

We want a way to represent natural language, including sentence structure, that makes machine learning (back propagation, perceptron learning) easy. This places certain constraints upon the representation. Here the binding problem is especially important; for example, we need to represent the binding of adjectives to the nouns they modify, and nouns to their roles (actor, agent). The talk will present a neurally-inspired representation, MBAT (matrix binding of additive terms), that represents a sentence by a single high-dimensional distributed vector. We argue that this representation satisfies the constraints. It also permits us to prove that certain concepts can be learned, rather than relying on simulations. A project is underway to test MBAT techniques with real-world problems, such as sentiment analysis. The architecture also has implications for Cognitive Science.
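
The abstract does not spell out how MBAT is constructed, so the following is only a toy sketch of one plausible reading of "matrix binding of additive terms": the words in a phrase are added together, the sum is multiplied by a fixed random matrix to bind it to a role, and the bound phrases are added into one sentence vector. The random ±1 word vectors, the use of one matrix per role, the role names, the dimension, and the example sentence are all assumptions made for illustration and are not taken from the talk.

```python
import random


def random_vector(dim, rng):
    """Random +1/-1 vector standing in for a word vector (assumption)."""
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]


def random_matrix(dim, rng):
    """Random binding matrix, scaled to roughly preserve vector length."""
    scale = dim ** -0.5
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]


def add(*vectors):
    """Component-wise sum of equal-length vectors (the 'additive terms')."""
    return [sum(components) for components in zip(*vectors)]


def matvec(matrix, vector):
    """Matrix-vector product (the 'matrix binding')."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


rng = random.Random(0)
DIM = 400  # hypothetical dimension, kept small so pure Python stays fast

words = {w: random_vector(DIM, rng)
         for w in ("smart", "girl", "saw", "gray", "elephant")}
roles = {r: random_matrix(DIM, rng) for r in ("actor", "verb", "object")}

# One vector for "smart girl saw gray elephant".
sentence = add(
    matvec(roles["actor"], add(words["smart"], words["girl"])),
    matvec(roles["verb"], words["saw"]),
    matvec(roles["object"], add(words["gray"], words["elephant"])),
)

# Probe: is "girl" bound as the actor or as the object?
girl_as_actor = dot(sentence, matvec(roles["actor"], words["girl"]))
girl_as_object = dot(sentence, matvec(roles["object"], words["girl"]))
```

In this toy setting `girl_as_actor` comes out much larger than `girl_as_object`, which is the sense in which a single vector can keep track of which noun fills which role. A learner such as a perceptron could then operate directly on `sentence`.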

Steve Gallant developed several neural network learning algorithms while an Associate Professor of Computer Science at Northeastern University. He showed how to interpret a neural network as an expert system, and was one of the first to extract rules from a neural network. He led a document retrieval project at HNC in San Diego based upon context vectors (which he invented), a precursor to his current research; unlike the current work, context vectors could not represent sentence structure. (HNC created a spinoff based upon this technology.)

Currently Steve leads an NSF-supported research project on representation and machine learning at a Cambridge, MA startup: MultiModel Research. He has over 40 publications and a book on “Neural Network Learning.”

Education: MIT: Undergrad (Math), Stanford: Ph.D. (Operations Research), University of Waterloo: Post Doc (Combinatorics & Optimization).