Talk: Large Data Transfer over the Wide Area Network, 4/26

UMBC CSEE Colloquium

Large Data Transfer over the Wide Area Network

Jim Finlayson
Laboratory for Physical Sciences
Advanced Computing Systems Group

1:00pm Friday, 26 April 2013, ITE 227, UMBC

The Department of Defense faces challenges in transferring large data sets over long distances. This talk will survey some of the investigations into potential solutions in this space.

Jim Finlayson is a File Systems and I/O researcher for the Laboratory for Physical Sciences' Advanced Computing Systems Group. Mr. Finlayson has a long history in data storage infrastructure. He graduated from the University of Maryland, College Park with a BS in Computer Science and later received his MS in Computer Science from The Johns Hopkins University's Whiting School of Engineering.

Talk: Aho on Quantum Computer Compilers, 3pm 4/25

Center for Hybrid Multicore Productivity Research
Distinguished Computational Science Lecture Series

Quantum Computer Compilers

Professor Alfred V. Aho

Department of Computer Science, Columbia University

3:00pm Thursday, 25 April 2013, ITE 456, UMBC

Quantum computing is an exciting emerging field that offers great potential for next generation information processing but also presents great scientific and engineering challenges. Assuming that someday we will be able to build scalable and reliable quantum computers, we will need to create programming languages and compilers that will allow programmers to harness quantum phenomena. In this talk, Alfred Aho will look at quantum computing from a compiler writer's perspective and discuss some of the formidable challenges that face quantum computer compilers.

Alfred Aho is the Lawrence Gussman Professor of Computer Science at Columbia University. He received a B.A.Sc. in Engineering Physics from the University of Toronto and a Ph.D. in Electrical Engineering/Computer Science from Princeton University. Prior to his current position, he served as vice president of the Computing Sciences Research Center at Bell Labs, the lab that invented UNIX, C and C++. He is the "A" in AWK, a widely used pattern-matching language. His current research interests include programming languages, compilers, algorithms, software engineering and quantum computing. He has won the IEEE John von Neumann Medal and is a Member of the National Academy of Engineering and of the American Academy of Arts and Sciences. He is a Fellow of the AAAS, ACM, Bell Labs and IEEE. In 2003 he received the Great Teacher Award from the Society of Columbia Graduates.

Host: Professor Milton Halem

ISCOM talk: Freeman Hrabowski on Technology, Diversity and Lifelong Learning

Information Systems Council of Majors Speaker Series

Technology, Diversity and Lifelong Learning

Dr. Freeman Hrabowski
President, University of Maryland, Baltimore County

3:00-4:30pm Friday, 26 April 2013, ITE 102

As an ongoing service to UMBC, our student group, the Information Systems Council of Majors (ISCOM), has developed a speaker series that brings industry professionals, academic luminaries, and prominent regional figures to campus to discuss topics relating to technology, education, or topics of the speaker's choosing.

This month our very special guest is Dr. Freeman Hrabowski. He will meet with members of ISCOM and UMBC students who are interested in hearing him speak about diversity, lifelong learning and, of course, technology. There will be interactive sessions prior to his remarks, and our president, Tabitha Haverkamp, will provide closing remarks.

PhD defense: Analysis of brain network connectivity using spatial information

PhD Dissertation Defense

Analysis of brain network connectivity
using spatial information

Sai Ma

1:00pm Thursday, 18 April 2013, ITE 325b

In current functional magnetic resonance imaging (fMRI) research, one of the most active areas involves exploring statistical dependencies among brain regions, known as functional connectivity analysis. Data-driven methods, especially independent component analysis (ICA), have been successfully applied to fMRI data to extract distributed brain networks and offer an opportunity to investigate functional connectivity on a network level, and thus at a multivariate level. However, the independence assumption in ICA is not necessarily, and typically not, satisfied in real applications, so an extension is desirable. Furthermore, most current ICA-based studies focus on the use of temporal information and second-order statistics for functional connectivity analysis. Taking spatial information and higher-order statistics in fMRI data into account is expected to lead to a better understanding of overall brain network connectivity in healthy controls and also in patients with mental disorders such as schizophrenia.
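
As background for readers unfamiliar with spatial ICA as used in fMRI studies, here is a minimal Python sketch of the basic decomposition into spatial maps and time courses, using scikit-learn's FastICA on random stand-in data; the shapes, component count, and variable names are illustrative assumptions, not the dissertation's pipeline.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5000))  # 200 time points x 5000 voxels (stand-in data)

ica = FastICA(n_components=20, random_state=0)
# Treat voxels as samples so each estimated source is a spatial map;
# the mixing matrix columns are then the corresponding time courses.
spatial_maps = ica.fit_transform(X.T).T  # shape (20, 5000): one map per component
time_courses = ica.mixing_               # shape (200, 20): one time course per component
```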

We develop a dependent component analysis (DCA) framework to generalize the ICA-based connectivity analysis methods by grouping components into maximally independent clusters. First, we define functional network connectivity as the statistical dependence among spatial components, instead of the typically used temporal correlation. Based on this definition, we use a hypothesis test to automatically generate functional connectivity structure for a large number of brain networks. After that, we separate dependent components within a given cluster using prior information, such as sparsity and experimental paradigm information, to achieve a better decomposition. We also combine this DCA-based clustering analysis with graph-theoretical analysis to discover significant group differences in topological properties of functional connectivity structure. To extend the methodologies currently available for functional connectivity, we propose an independent vector analysis (IVA) based scheme to extract and analyze dynamic functional connectivity.
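
As a rough illustration of the hypothesis-test step, the sketch below flags component pairs whose spatial maps show statistically significant dependence and assembles them into a connectivity structure. The dissertation targets general statistical dependence; plain Pearson correlation and the fixed significance level used here are simplifying stand-ins.

```python
import numpy as np
from scipy.stats import pearsonr

def connectivity_structure(maps, alpha=0.01):
    """maps: (n_components, n_voxels) spatial maps; returns a boolean adjacency matrix."""
    n = maps.shape[0]
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            _, p = pearsonr(maps[i], maps[j])   # test H0: no linear dependence
            adj[i, j] = adj[j, i] = p < alpha   # keep significantly dependent pairs
    return adj
```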

The methods we develop offer advantages for effective and efficient examination of not only static but also dynamic functional connectivity among different brain networks. We identify significant differences in functional connectivity structure between healthy controls and patients with schizophrenia, which may serve as potential biomarkers for diagnosis. We also find task-induced modulations in functional connectivity when comparing different active states in the brain. Furthermore, we observe temporal variability in functional connectivity structure and physiologically meaningful group differences in dynamic connectivity among several brain networks. Our methods can provide insights into the functional characteristics of brain network organization in healthy individuals and patients with schizophrenia.

Committee: Dr. Adali (Chair), Dr. Morris, Dr. Rutledge, Dr. LaBerge, Dr. Phlypo, Dr. Calhoun, and Dr. Westlake

PhD defense: Data-driven group analysis of complex-valued fMRI data

PhD Dissertation Defense

Data-driven group analysis of complex-valued fMRI data

Pedro A. Rodriguez

11:00am Tuesday, 16 April 2013, ITE 346, UMBC

Analysis of functional magnetic resonance imaging (fMRI) data in its native, complex form has been shown to increase the sensitivity of the analysis both for data-driven techniques such as independent component analysis (ICA) and for model-driven techniques. The promise of an increase in sensitivity and specificity in clinical studies provides a powerful motivation for utilizing both the phase and magnitude data; however, the unknown and noisy nature of the phase poses a challenge for successful study of the fMRI data. In addition, complex-valued analysis algorithms, such as ICA, suffer from an inherent phase ambiguity, which introduces additional difficulty for group analysis and visualization of the results. We present solutions for these issues, which have been among the main reasons phase information has been traditionally discarded, and show their effectiveness when used as part of a complex-valued group ICA algorithm application. The methods we develop become key components of a framework that enables new fully complex data-driven and semi-blind methods to process, analyze, and visualize fMRI data.

In this dissertation, we first introduce the methods developed as part of the fully complex framework for ICA of fMRI data. We introduce a physiologically motivated de-noising method that uses phase quality maps to successfully identify and eliminate noisy voxels (3D pixels) in the fMRI complex images so they can be used in individual and group studies. We also introduce a phase correction scheme that can either be applied subsequent to ICA of fMRI data or be incorporated into the ICA algorithm in the form of prior information, eliminating the need for further phase-correction processing. Finally, we present two visualization methods that are used to augment the sensitivity and specificity in the detection of activated voxels. We show the benefits of using the developed methods on actual complex-valued fMRI data.
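
A hedged sketch of the de-noising idea: flag voxels whose phase is too unstable over time to be trusted. The actual phase quality criterion in the dissertation may differ; the circular variance and threshold below are illustrative assumptions.

```python
import numpy as np

def phase_quality_mask(complex_ts, max_circ_var=0.5):
    """complex_ts: (n_timepoints, n_voxels) complex fMRI data; True = keep voxel."""
    phases = np.angle(complex_ts)
    # Circular variance in [0, 1]: 1 - length of the mean resultant vector.
    # Near 0 means the phase is stable over time; near 1 means it is noise-like.
    circ_var = 1.0 - np.abs(np.exp(1j * phases).mean(axis=0))
    return circ_var <= max_circ_var
```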

In the remainder of the dissertation, we focus on developing constrained ICA (C-ICA) algorithms for complex-valued fMRI data. C-ICA uses prior information, providing a balance between model-based and purely data-driven approaches such as ICA, to improve source estimation performance and robustness to noise. C-ICA algorithms have been used to improve estimation performance on real-valued fMRI data but, to our knowledge, have not been applied to complex-valued fMRI data. We develop the first C-ICA algorithm that uses complex-valued references to constrain either the sources or the mixing coefficients. The designed algorithm is not restricted to having a unitary demixing matrix, which is a major assumption in existing C-ICA algorithms. We show, on both simulated and actual fMRI data, how the performance of ICA improves by using prior information about the fMRI paradigm.

Committee: Dr. Adali (Chair), Dr. Morris, Dr. Rutledge, Dr. LaBerge, Dr. Phlypo, and Dr. Calhoun

PhD defense: Independent Vector Analysis: Theory, Algorithms and Applications, 4/17

PhD Dissertation Defense

Independent Vector Analysis:
Theory, Algorithms, and Applications

Matthew Anderson

1:45pm Wednesday, 17 April 2013, ITE 325B

The field of blind source separation (BSS) is a well-studied discipline within the signal processing community due to its applicability to a variety of problems in which the data observation model is poorly known or difficult to model. For example, in the study of the human brain with functional magnetic resonance imaging (fMRI), a neuroimaging sensor, BSS algorithms are able to provide medical researchers and practitioners with a decomposition of a three-dimensional ‘movie’ of the brain that is amenable to analysis. BSS algorithms achieve this decomposition with only a few justifiable assumptions; this is contrary to methods based on the general linear model, which require prespecified models of the expected or desired response to achieve analysis of fMRI data.

Most BSS algorithms consider just a single dataset, but it is also desirable to have methods that can analyze multiple subjects or data collections in fMRI jointly, so as to provide insights beyond those achieved with individual analysis of single datasets. Several frameworks for using BSS on multiple datasets jointly have been proposed. The subject of this dissertation is the study of one of these frameworks, which has been termed independent vector analysis (IVA). IVA is a recent extension of the classical independent component analysis (ICA) model to BSS of multiple datasets, and it has been the subject of significant research interest. In this dissertation, we provide a formulation of IVA that accounts for sources which (a) follow Gaussian or non-Gaussian distributions; (b) have samples that are independently and identically distributed (iid) or dependent; and (c) exhibit either linear or nonlinear dependence between datasets. The proposed IVA formulation utilizes the likelihood to define the objective function, and this formulation admits theoretical analysis. In particular, we provide the identification conditions, i.e., we determine when the sources can be ‘blindly’ recovered by IVA, and give a lower bound on the source separation performance.
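
For context, one common mutual-information form of the IVA objective from the literature is sketched below; the notation is a generic choice, and the exact formulation in the dissertation may differ.

```latex
% Mutual-information form of the IVA cost (a standard statement in the
% IVA literature; notation is generic, not necessarily the dissertation's):
\[
  \mathcal{C}_{\mathrm{IVA}}
    = \sum_{n=1}^{N} H(\mathbf{y}_n)
      - \sum_{k=1}^{K} \log\bigl|\det \mathbf{W}^{[k]}\bigr|
      - C,
\]
% where \(\mathbf{y}_n = [y_n^{[1]}, \dots, y_n^{[K]}]^{\top}\) is the n-th
% source component vector (SCV) collecting the n-th source across all K
% datasets, \(\mathbf{W}^{[k]}\) is the demixing matrix for dataset k, and
% C gathers terms constant in the demixing matrices. Minimizing this cost
% makes distinct SCVs mutually independent while preserving the dependence
% within each SCV that links the datasets.
```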

Several algorithms exist for achieving IVA. We provide several new approaches to developing IVA algorithms and apply these approaches using a Gaussian distribution source model and a more general Kotz distribution model. The former, in addition to leading to efficient IVA algorithms, serves as the distribution model that directly connects canonical correlation analysis (CCA) and ICA.  

Committee: Dr. Tulay Adali (Chair), Dr. Joel Morris, Dr. Anindya Roy, Dr. Ronald Phlypo, and Dr. Mike Novey

PhD defense: Digital Forensics for Infrastructure-as-a-Service Cloud Computing

Dissertation Defense

Digital Forensics for
Infrastructure-as-a-Service Cloud Computing

Josiah Dykstra

10:00am Tuesday, 16 April 2013, ITE 325b

We identify important issues in the application of digital forensics to Infrastructure-as-a-Service cloud computing and develop new practical forensic tools and techniques to facilitate forensic exams of the cloud. When investigating suspected cases involving cloud computing, forensic examiners have been poorly equipped to deal with the technical and legal challenges. Because data in the cloud are remote, distributed, and elastic, these challenges include understanding the cloud environment, acquiring and analyzing data remotely, and applying the law to a new domain. Today, digital forensics for cloud computing is challenging at best, but it can be performed in a manner consistent with federal law using the tools and techniques we developed.

The first problem is understanding how and why criminal and civil actions in and against cloud computing are unique and difficult to prosecute. We analyze a digital forensic investigation of crime in the cloud and present two hypothetical case studies that illustrate the unique challenges of acquisition, chain of custody, trust, and forensic integrity. These issues introduce legal challenges that are also important for federal, state, and local law enforcement, who will soon be called upon to conduct cloud investigations.

The second problem is the lack of practical technical tools to conduct cloud forensics. We examine the capabilities for forensics today, evaluate the use of existing tools including EnCase and FTK, and discuss why these tools are incapable of trustworthy cloud acquisition. We design consumer-driven forensic capabilities for OpenStack, including new features for acquiring trustworthy firewall logs, API logs, and disk images.

The third problem is a deficit of legal instruments for seizing cloud-based electronically stored information. We analyze the application of existing policies and laws to the new domain of cloud computing by examining case law and legal opinions about digital evidence discovery, and we suggest modifications that would enhance the prosecution of cloud-based crimes. We offer guidance about how to author a search warrant for cloud data and what pertinent data to request.

This dissertation enhances our understanding of technical, trust, and legal issues needed to investigate cloud-based crimes and offers new tools and techniques to facilitate such investigations.

Committee: Dr. Alan T. Sherman (Chair), Dr. Charles Nicholas, Dr. Richard Forno, Dr. Simson Garfinkel (Naval Postgraduate School), Mr. Donald Flynn, JD (Department of Defense Cyber Crime Center)

Sharma on a multilayer framework to catch data exfiltration, 10:30 4/8

UMBC graduate student Puneet Sharma talks about his research on developing a multilayer framework to catch data exfiltration, 10:30am Monday, April 8 in room ITE 325b at UMBC. Here is the abstract.

Data exfiltration is the unauthorized leakage of confidential data from a particular system. It is a very specific form of intrusion that is particularly hard to catch because of its most common cause: an insider entity responsible for the leak. That entity could be a real person employed by the organization, or even a malicious piece of hardware bought from an unreliable third party. Catching such intrusions can therefore be extremely difficult. What is proposed is a framework with a multitude of parameters to be constantly monitored on a system. These parameters would cover the entire stack of the computer architecture, from the hardware up to the application layer. A more spread-out and comprehensive monitoring framework should ensure that designing an attack becomes extremely difficult, since the intruder must now devote significantly more time and effort to bypassing the multiple checks and avoiding alarms.
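
To make the idea concrete, here is a toy Python sketch of multilayer monitoring that samples indicators from the network, storage, and OS layers and alerts on deviations; it requires the psutil package, and the probes, threshold, and alert logic are illustrative assumptions, not the framework proposed in the talk.

```python
import time
import psutil

def sample():
    """Snapshot a few indicators from different layers of the stack."""
    return {
        "net_bytes_sent": psutil.net_io_counters().bytes_sent,  # network layer
        "disk_reads": psutil.disk_io_counters().read_count,     # storage layer
        "n_processes": len(psutil.pids()),                      # OS layer
    }

def monitor(interval=5, tx_limit=50_000_000):
    """Alert if outbound traffic in any interval exceeds tx_limit bytes."""
    prev = sample()
    while True:
        time.sleep(interval)
        cur = sample()
        if cur["net_bytes_sent"] - prev["net_bytes_sent"] > tx_limit:
            print("ALERT: unusually high outbound network traffic")
        prev = cur
```

A real multilayer framework would add probes below the OS (firmware, hardware counters) and above it (per-application audit hooks), so that evading one layer still trips another.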

talk: Sensor-based assessment of human motion during therapeutic exercise

UMBC Information Systems

Sensor-based assessment of the quality
of human motion during therapeutic exercise

Dr. Portia Taylor
Social Security Administration

12-1pm Wednesday, 10 April 2013, ITE 459

Advances in technology and research have been employed in recent years to develop efficient mechanisms for delivering home-based exercise therapy to patients suffering from knee osteoarthritis, a degenerative disease associated with aging. Essential to the success of a therapeutic home-exercise program is the quality of the motion performed by the patient. The unsupervised nature of home-based exercise may lead to incorrect exercise performance by patients; however, current home-based exercise programs do not provide mechanisms for monitoring the quality of motion performed or for providing feedback to the patient. This lack of support has been found to be a factor in patient non-compliance with home exercise programs.

Our goal is to provide a motion sensor-based system that can evaluate the quality of exercise to support home rehabilitation. We introduce the Quality Assessment Framework (QAF) that uses low-cost motion sensors with data processing and machine learning techniques to assess the quality of human motion performed during therapeutic exercises. Data from fifteen persons with knee osteoarthritis was collected in a laboratory environment, and a classifier was trained using multi-label learning methods to detect descriptive characteristics of the patient's motion. These characteristics represent errors in the exercise performance as well as variables regularly monitored by the patient's therapist, such as speed.
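
As a rough illustration of the multi-label learning step, the sketch below trains one classifier per motion-quality label on synthetic stand-in data; the features, labels, and model choice are illustrative assumptions, not the QAF's actual design.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier

# Stand-in for per-repetition sensor features and motion-quality labels
# (e.g., "too fast", "incomplete range of motion").
X, Y = make_multilabel_classification(n_samples=300, n_features=20,
                                      n_classes=5, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

clf = MultiOutputClassifier(RandomForestClassifier(random_state=0))
clf.fit(X_tr, Y_tr)           # fits one classifier per label
print(clf.score(X_te, Y_te))  # subset accuracy: all labels must match
```

Per-label classifiers like this ignore correlations among labels, which dedicated multi-label learning methods can exploit.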

Results from multi-label learning are presented, and recommendations are made on requirements for an in-home therapeutic exercise system. The QAF can be adapted to the home therapy needs of conditions other than knee osteoarthritis. We present a preliminary design of the InForm Exercise System, which utilizes the QAF and has the potential to present feedback to patients completing home exercise programs.

Portia Taylor received her BS degree in Computer Science from Grambling State University in 2007 and a Ph.D. degree in Biomedical Engineering from Carnegie Mellon University in 2012. At CMU, she was part of the Quality of Life Technology Center, an NSF Engineering Research Center dedicated to the development of technologies for the elderly and disabled. Currently, Dr. Taylor works at the Social Security Administration as an IT Fellow. Her research interests include machine learning applications in biomedical engineering, intelligent systems for rehabilitation and physical therapy, and health information technology.

talk: Machine learning for predicting chronic diseases

UMBC CSEE Colloquium

Machine learning techniques for predicting chronic diseases

Vladimir Korolev

1:00pm Friday, 5 April 2013, ITE 227, UMBC

In recent years we have seen an explosion of cheap genetic tests, which has led to the emergence of personalized medicine. Personalized medicine is defined as the practice of medicine tailored to the specifics of an individual patient. My work addresses the problem of predicting an individual’s predisposition towards certain chronic diseases based on their genetic makeup. Such work allows for more selective administration of invasive tests such as biopsies, which are known to cause health problems themselves.

Recently, NIH has conducted a number of Genome-Wide Association Studies that resulted in massive datasets containing subjects’ genetic makeup, labeled with clinical data including the occurrence of chronic diseases. Unfortunately, given the relatively small number of patients in such studies and the vast number of genes possessed by human beings, these datasets cannot be analyzed with traditional statistical predictive models, which require a large number of samples (patients) with very few features per sample.

My work attempts to solve this problem by employing state-of-the-art machine learning techniques. In the past year I have built a software system that is capable of crunching multi-terabyte-scale datasets to refactor the NIH data into a form palatable to modern big data systems, and I have run the initial stages of feature selection. I will present the current state of the work and future plans. Another goal of this work is to ensure the repeatability of the experiments and the flexibility to run with any similar dataset from current and future studies.
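
As one plausible illustration of a feature-selection stage on GWAS-style data, where variants vastly outnumber patients, the sketch below applies univariate screening; everything in it (data shapes, the SelectKBest screen, the number of retained variants) is an assumption for illustration, not the system described in the talk.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(500, 20_000))  # 500 patients x 20k variants (0/1/2 allele counts)
y = rng.integers(0, 2, size=500)            # disease occurred or not

selector = SelectKBest(f_classif, k=500)    # keep the 500 most associated variants
X_small = selector.fit_transform(X, y)
print(X_small.shape)                        # (500, 500): now tractable for standard models
```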

Vlad Korolev is a PhD student in the UMBC Computer Science program. His research interests are in the area of personalized medicine, machine learning, and large-scale data processing. Vlad has considerable industry experience specializing in IT security, large-scale data processing, and the organization of software development processes.
