Graduate Research Conference Program (GRC) on Wed. 3/25

UMBC’s Graduate Research Conference Program (GRC) will be held on campus on Wednesday, March 25, from 9:00 am to 5:00 pm. There will be a variety of presentations for faculty and students (both graduate and undergraduate). Featured events include professional development workshops, a keynote panel, and a research information fair.

Twenty-eight CSEE graduate students will describe their research in oral or poster presentations. Feel free to attend as many sessions as your schedule allows.

Please note that registration is required for both presenters and attendees. Registration is particularly important for securing a seat at the lunch and in the professional development workshops.

For more information, visit the GRC web site or email .

To register, please go to https://www.eventbrite.com/e/37th-annual-graduate-researchconference-registration-13201250295.

Please find a link to the program guide and the events flyer listed below:

talk: Visual Exploration of Big Urban Data, Noon Thu. 3/12, ITE325b, UMBC

Visual Exploration of Big Urban Data

Dr. Huy Vo
Center for Urban Science and Progress, New York University

12:00-1:00pm Thursday, 12 March 2015, ITE 325b

About half of humanity lives in urban environments today and that number will grow to 80% by the middle of this century. Cities are thus the loci of resource consumption, of economic activity, and of innovation; they are the cause of our looming sustainability problems but also where those problems must be solved. Data, along with visualization and analytics, can help significantly in finding these solutions.

In this talk, I will discuss the challenges of visual exploration of big urban data and showcase our approaches in a study of New York City taxi trips. Taxis are valuable sensors and can provide unprecedented insight into many different aspects of city life. But analyzing these data presents many challenges. The data are complex, containing geographical and temporal components in addition to multiple variables associated with each trip. Consequently, it is hard to specify exploratory queries and to perform comparative analyses. This problem is largely due to the size of the data: there are almost a billion records of taxi trips collected over a 5-year period. I will present TaxiVis, a tool that allows domain experts to visually query taxi trips at interactive speed and perform tasks that were unattainable before. I will also discuss our key contributions in this work: the visual querying model and a novel indexing scheme for spatio-temporal datasets.
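To make the idea of a spatio-temporal query concrete, here is a minimal Python sketch that selects trips by pickup time window and bounding box. It is only an illustration: the `Trip` fields, the `query_trips` helper, and the sample coordinates are assumptions made for this example, and the linear scan stands in for the indexing scheme mentioned in the abstract, which is not described here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Trip:
    """One taxi trip; the fields are illustrative, not the real schema."""
    pickup_time: datetime
    pickup_lon: float
    pickup_lat: float
    fare: float


def query_trips(trips, start, end, bbox):
    """Return trips picked up in [start, end) inside bbox.

    bbox is (min_lon, min_lat, max_lon, max_lat). This is a plain linear
    scan; a system working at interactive speed over a billion records
    would need a spatio-temporal index instead.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    return [
        t for t in trips
        if start <= t.pickup_time < end
        and min_lon <= t.pickup_lon <= max_lon
        and min_lat <= t.pickup_lat <= max_lat
    ]


# Made-up sample data for the sketch.
trips = [
    Trip(datetime(2011, 5, 1, 8, 15), -73.985, 40.758, 12.5),
    Trip(datetime(2011, 5, 1, 9, 40), -73.780, 40.641, 45.0),
    Trip(datetime(2011, 5, 2, 8, 5), -73.990, 40.750, 9.0),
]
midtown = (-74.00, 40.74, -73.97, 40.77)
morning = query_trips(
    trips, datetime(2011, 5, 1, 8), datetime(2011, 5, 1, 10), midtown
)
```

With the sample data, only the first trip falls inside both the time window and the bounding box.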

Dr. Huy Vo is a Research Scientist at the Center for Urban Science and Progress (CUSP), New York University. His research focuses on large-scale data analysis and visualization, big data systems, and scalable displays. He has also been a Research Assistant Professor of Computer Science and Engineering at NYU’s Polytechnic School of Engineering since 2011. He is one of the co-creators of VisTrails, an open-source scientific workflow and provenance management system, for which he led the design of the VisTrails Provenance SDK. He received his B.S. in Computer Science (2005) and PhD in Computing (2011) from the University of Utah and was a two-time recipient of the NVIDIA Fellowship (2009-2010 and 2010-2011).

Host: Jian Chen

talk: Physics, Simulation, and Computer Animation, Noon Mon 3/9, ITE325b

Physics, Simulation, and Computer Animation

Professor Adam W. Bargteil
University of Utah

12:00-1:00 Monday, 9 March 2015, ITE325b

Physics-based computer animation has revolutionized the world of special effects. I will talk about several success stories, including my Academy Award-winning work on fracture, particle skinning, and large-scale splashing liquids. I will also talk about moving beyond cinematic special effects to create tools for artistic authoring of interactive animations and to enable visually predictive simulations that promise to revolutionize industrial design.

Adam W. Bargteil is an assistant professor at the University of Utah. His primary research interests lie in the area of physics-based animation. He earned his Ph.D. in computer science from the University of California, Berkeley and spent two years as a post-doctoral fellow in the School of Computer Science at Carnegie Mellon University. From 2005 to 2007, he was a consultant at PDI/DreamWorks, developing fluid simulation tools that were used in “Shrek the Third” and “Bee Movie.”

talk: Topic Modeling with Structured Priors for Text-Driven Science

Topic Modeling with Structured Priors for Text-Driven Science

Michael Paul, JHU

12:00pm – 1:00pm, Monday, 2 March 2015, ITE 325

Many scientific disciplines are being revolutionized by the explosion of public data on the web and social media, particularly in health and social sciences. For instance, by analyzing social media messages, we can instantly measure public opinion, understand population behaviors, and monitor events such as disease outbreaks and natural disasters. Taking advantage of these data sources requires tools that can make sense of massive amounts of unstructured and unlabeled text. Topic models, statistical models that describe low-dimensional representations of data, can uncover interesting latent structure in large text datasets and are popular tools for automatically identifying prominent themes in text. However, to be useful in scientific analyses, topic models must learn interpretable patterns that accurately correspond to real-world concepts of interest.

In this talk, I will introduce Sprite, a family of topic models that can encode additional structures such as hierarchies, factorizations, and correlations, and can incorporate supervision and domain knowledge. Sprite extends standard topic models by formulating the Bayesian priors over parameters as functions of underlying components, which can be constrained in various ways to induce different structures. This creates a unifying representation that generalizes several existing topic models, while creating a powerful framework for building new models. I will describe a few specific instantiations of Sprite and show how these models can be used in various scientific applications, including extracting self-reported information about drugs from web forums, analyzing healthcare quality in online reviews, and summarizing public opinion in social media on issues such as gun control.

Michael Paul is a PhD candidate in Computer Science at Johns Hopkins University. He earned an M.S.E. in CS from Johns Hopkins University in 2012 and a B.S. in CS from the University of Illinois at Urbana-Champaign in 2009. He has received PhD fellowships from Microsoft Research, the National Science Foundation, and the Johns Hopkins University Whiting School of Engineering. His research focuses on exploratory machine learning and natural language processing for the web and social media, with applications to computational epidemiology and public health informatics.

— more information and directions: http://bit.ly/UMBCtalks

Two technical talks by Amazon senior staff, 4-6:30pm Tue 3/3

Senior Amazon staff members will give two technical talks next week, on Tuesday, March 3, in the UC Ballroom on topics of great practical interest and utility.

  • Lydia Fitzpatrick, Senior Technical Program Manager for Amazon Mobile Business, will give a talk on “Web Performance Optimization” from 4:00pm to 5:00pm.
  • Leo Zhadanovsky, Senior Solutions Architect for Amazon Web Services, will present an “Introduction to Amazon Web Services (AWS)” from 5:30pm to 6:30pm. The talk will introduce cloud computing and discuss the various Networking, Compute, Database, Storage, Application, Deployment and Management services that AWS offers. It will demonstrate how to launch a full three-tier LAMP stack in minutes, as well as how to set up a simple web server on AWS. The presentation will also discuss several use cases, demonstrating how customers such as Enterprises, Startups, and Government Agencies are using AWS to power their computing needs.

The talks will be preceded and followed by an open networking opportunity with Amazon Human Resources representatives. Amazon is interested in students for internships and full-time positions who are majoring in Information Systems, Business Technology Administration, Computer Engineering, Computer Science, and Cybersecurity.

PhD proposal: User Identification in Wireless Networks

Ph.D. Dissertation Proposal

User Identification in Wireless Networks

Christopher Swartz

9:00-11:00am Friday, 27 February 2015, ITE 325B

Wireless communication using the 802.11 specifications is almost ubiquitous in daily life through an increasing variety of platforms. Traditional identification and authentication mechanisms employed for wireless communication commonly mimic physically connected devices and do not account for the broadcast nature of the medium. Both stationary and mobile devices that users interact with are regularly authenticated using a passphrase, pre-shared key, or an authentication server. Current research requires unfettered access to the user’s platform or information that is not normally volunteered.

We propose a mechanism to verify and validate the identity of 802.11 device users by applying machine learning algorithms. Existing work substantiates the application of machine learning for device identification using Commercial Off-The-Shelf (COTS) hardware and algorithms. This research seeks to refine and investigate features relevant to identifying users. The approach is segmented into three main areas: a data ingest platform, processing, and classification.

Initial research showed that we can properly classify target devices with high precision, recall, and ROC using a sufficiently large real-world data set and a limited set of features. The primary contribution of this work is exploring the development of user identification through data observation. The objective is a combination of identifying new features, creating an online system, and limiting user interaction. We will create a prototype system and test the effectiveness and accuracy of its ability to properly identify users.

Committee: Drs. Joshi (Chair/Advisor), Nicholas, Younis, Finin, Pearce, Banerjee

talk: Visual understanding of human actions, 12-1 Fri 2/27, ITE325b

Visual understanding of human actions

Dr. Hamed Pirsiavash

Postdoctoral Research Associate
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology

12:00-1:00pm Friday, 27 February 2015, ITE 325B

The aim in computer vision is to develop algorithms for computers to “see” the world as humans do. Central to this goal is understanding the behavior of humans as intelligent agents functioning in the visual world. For instance, in order for a robot to interact with us, it should understand our actions to produce the proper response. My work explores several directions towards computationally representing and understanding human actions.

In this talk, I will focus on detecting actions and judging their quality. First, I will describe simple grammars for modeling long-scale temporal structure in human actions. Real-world videos are typically composed of multiple action instances, where each instance is itself composed of sub-actions with variable durations and orderings. Our grammar models capture such hierarchical structure while admitting efficient, linear-time parsing algorithms for action detection. The second part of the talk will describe our algorithms for going beyond detecting actions to judging how well they are performed. Our learning-based framework provides feedback to the performer to improve the quality of his/her actions.

Host: Mohamed Younis

PhD proposal: Scalable Storage System for Big Scientific Data

Ph.D. Dissertation Proposal

MLVFS: A Scalable Storage System For Managing Big Scientific Data

Navid Golpayegani

3:00-5:00pm Tuesday 24 February 2015, ITE 346

Managing petabytes or exabytes of data with hundreds of millions to billions of files is a necessary first step towards an effective big data computing and collaboration environment for distributed systems. Current file system designs have focused on providing better and faster data distribution. Managing the directory structure for data discovery becomes an essential element of the scalability problems for big data systems. Recent designs are addressing the challenge of exponential growth of files. Still largely unexplored is the research for dealing with the organizational aspect of managing big data systems with hundreds of millions of files. Most file systems organize data into static directory structures, making data discovery hard and slow for large data sets.

This thesis will propose a unique Multiview Lightweight Virtual File System (MLVFS) design to primarily deal with the data organizational management problem in big data file systems. MLVFS is capable of the dynamic generation of directory structures to create multiple views of the same data set. With multiple views, the storage system is capable of organizing available data sets by differing criteria such as location or date without the need to replicate data or use symbolic links. In addition, MLVFS addresses scalability issues associated with the growth of the stored files by removing the internal metadata system and replacing it with generally available external metadata information (e.g., database servers, project compute servers, remote repositories, etc.). This thesis, moreover, proposes to add plug-in capabilities not normally found in file systems that make this system highly flexible in terms of specifying sources of metadata information, dynamic file format streaming, and other file handling features.
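The notion of several views over one data set can be illustrated with a small Python sketch. It is not MLVFS code: the `FILES` records and the `build_view` helper are hypothetical, and an in-memory list stands in for the external metadata sources the proposal mentions. The point is only that each "directory tree" is computed from metadata on demand, so no file is copied or symlinked.

```python
from collections import defaultdict

# Hypothetical metadata records, standing in for an external metadata source.
FILES = [
    {"name": "a.hdf", "location": "pacific", "date": "2015-02-01"},
    {"name": "b.hdf", "location": "atlantic", "date": "2015-02-01"},
    {"name": "c.hdf", "location": "pacific", "date": "2015-02-02"},
]


def build_view(records, key):
    """Group file names into virtual directories named by records[key]."""
    view = defaultdict(list)
    for rec in records:
        view[rec[key]].append(rec["name"])
    return dict(view)


# Two different "directory structures" over the same three files.
by_location = build_view(FILES, "location")
by_date = build_view(FILES, "date")
```

Here `by_location` groups the files under `pacific` and `atlantic`, while `by_date` groups the same files under the two dates.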

The performance of MLVFS will be tested in both simulated and real-world environments. MLVFS will be installed on the BlueWave cluster at UMBC for simulated load testing to measure the performance for various loads. Simultaneously, a stable version of MLVFS will run in real-world production environments such as those of the NASA MODIS instrument processing system (MODAPS). The MODAPS system will be used to show examples of real-world use cases for MLVFS. Additionally, other systems will be explored for the real-world use of MLVFS, such as at NIST for research into Biomedical Image Stitching.

Committee: Drs. Milton Halem (Chair, Advisor), Yelena Yesha, Charles Nicholas, John Dorband, Daniel Duffy

talk: Understanding Social Spammers, Noon Tue 2/24, ITE325

Understanding Social Spammers: A Data Mining Perspective
Xia “Ben” Hu

Computer Science and Engineering
Arizona State University

12:00-1:00 Tuesday, 24 February 2015

With the growing popularity of social media, social spamming has become rampant on all platforms. Many (fake) accounts, known as social spammers, are employed to overwhelm legitimate users with unwanted information. Social spammers are unique due to their coordinated efforts to launch attacks such as distributing ads to generate sales, disseminating pornography and viruses, executing phishing attacks, or simply sabotaging a system’s reputation. In this talk, I will introduce a novel and systematic analysis of social spammers from a data mining perspective to tackle the challenges raised by social media data for spammer detection. Specifically, I will formally define the problem of social spammer detection and discuss the unique properties of social media data that make this problem challenging. By analyzing the two most important types of information, network and content information, I will introduce a unified framework by collectively using heterogeneous information in social media. To tackle the labeling bottleneck in social media, I will show how we can take advantage of the existing information about spam in email, SMS, and on the web for spammer detection in microblogging. I will also present a solution for efficient online processing to handle fast-evolving social spammers.

Xia Hu is a Ph.D. candidate in Computer Science and Engineering at Arizona State University, supervised by Professor Huan Liu. His research interests include data mining, machine learning, and social network analysis. As a result of his research work, he has published nearly 40 papers in several major academic venues, including WWW, SIGIR, KDD, WSDM, IJCAI, AAAI, CIKM, and SDM. One of his papers was selected for the Best Paper Shortlist in WSDM’13. He is the recipient of the IEEE “Atluri Award” Scholarship, ASU’s 2014 President’s Award for Innovation, and the Faculty Emeriti Fellowship. He has served on program committees for several major conferences such as WWW, IJCAI, SDM and ICWSM, and reviewed for multiple journals, including IEEE TKDE, ACM TOIS and Neurocomputing. His research attracts a wide range of external government and industry sponsors, including NSF, ONR, AFOSR, Yahoo!, and Microsoft.

— more information and directions: http://bit.ly/UMBCtalks

talk: Labrou on Studying Internet Latency via TCP Queries to DNS, 1:30pm Fri 2/27

ACM Tech Talk

Studying Internet Latency via TCP Queries to DNS

Dr. Yannis Labrou
Principal Data Architect, Verisign

1:30-2:30pm Friday, 27 February 2015, ITE 456, UMBC

Every day Verisign processes upwards of 100 billion authoritative DNS requests for .COM and .NET from all corners of the earth. The vast majority of these requests are via the UDP protocol. Because UDP is connectionless, it is impossible to passively estimate the latency of the UDP-based requests. A very small percentage of these requests, though, are over TCP, thus providing the means to estimate the latency of specific requests and paths for a subset of the hosts that interact with Verisign’s network infrastructure.
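The abstract does not say how latency is derived from the TCP requests. One common passive approach, assumed here purely for illustration, uses the TCP three-way handshake as seen from the server: the time between sending the SYN-ACK and receiving the client's ACK approximates one round trip. The sketch below applies that idea to made-up timestamps; the `handshake_rtt_ms` helper, the sample addresses, and the per-client median are all assumptions, not Verisign's method.

```python
from statistics import median


def handshake_rtt_ms(synack_sent_s, ack_received_s):
    """Round-trip estimate in milliseconds from server-side timestamps (seconds)."""
    if ack_received_s < synack_sent_s:
        raise ValueError("ACK cannot precede SYN-ACK")
    return (ack_received_s - synack_sent_s) * 1000.0


# Hypothetical (client, SYN-ACK sent, ACK received) observations.
observations = [
    ("198.51.100.7", 10.000, 10.050),
    ("198.51.100.7", 20.000, 20.070),
    ("203.0.113.9", 30.000, 30.200),
]

# Collect the per-handshake estimates for each client.
rtts = {}
for client, sent, received in observations:
    rtts.setdefault(client, []).append(handshake_rtt_ms(sent, received))

# Summarize each client with the median, which is robust to outliers.
median_rtt = {client: median(values) for client, values in rtts.items()}
```

With these sample values the first client comes out at roughly 60 ms and the second at roughly 200 ms.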

In this work, we combine this relatively small number of datapoints from TCP (on the order of a few hundred million per day) with the much larger dataset of all DNS requests. Our focus is the process of data analysis of real world, imperfect data at very large scale with the goals of understanding network latency at an unprecedented magnitude, identifying large volume, high latency clients and improving their latency. We discuss the techniques we used for data selection and analysis and we present the results of a variety of analyses, such as deriving regional and country patterns, estimations for query latency for different countries and network locations, and techniques for identifying high latency clients.

It is important to note that the latency results we will report are based on passive measurements from, essentially, the entire Internet. For this experiment we do not have control over the client side: where the clients are, which software they run, their configuration, or their network congestion. This is significantly different from latency studied in any active measurement infrastructure such as PlanetLab, RIPE Atlas, ThousandEyes, Catchpoint, etc.

Dr. Yannis Labrou is Principal Data Architect at Verisign Labs, where he leads efforts to create value from the wealth of data that Verisign’s operations generate every day. He brings to Verisign 20 years of experience in conceiving, creating, and bringing innovations to fruition, combining thinking big with laboring through the pains of materializing ideas. He has done so in an academic environment, at a startup company, while conducting government and DoD/DARPA sponsored research, and for a global Fortune 200 company.

Before joining Verisign, Dr. Labrou was a Senior Researcher at Fujitsu Laboratories of America; Director of Technology and member of the executive staff of PowerMarket, an enterprise application software start-up company; and a Research Assistant Professor at UMBC. He received his Ph.D. in Computer Science from UMBC, where his research focused on software agents, and a Diploma in Physics from the University of Athens, Greece. He has authored more than 40 peer-reviewed publications, with almost 4000 citations, and he has been awarded 14 patents from the USPTO. His current research focus is data through the entire lifecycle, from generation to monetization.

— more information and directions: http://bit.ly/UMBCtalks
