talk: Reinventing the Classroom, Harry Lewis, Noon Fri 3/1

Students at the CS20 whiteboard. (Photos by Eliza Grinnell, Harvard SEAS Communications.)

Reinventing the Classroom:
creating a new course and a space to teach it

Professor Harry Lewis

Gordon McKay Professor of Computer Science
Harvard University

12:00-2:00 ITE 456, UMBC


slides (ppt); talk (video); discussion (video)

TALK AND LUNCH: 12:00-1:00. Lunch courtesy of Dr. Warren DeVries, Dean of the College of Engineering and Information Technology. RSVP on my.umbc.edu by Monday, February 25 to reserve a spot. Email requests from outside UMBC to .

DISCUSSION: 1:00-2:00. The community is invited to stay after the talk for an open discussion and conversation with Professor Lewis and your UMBC colleagues about designing new classroom spaces for active learning and the flipped classroom approach.

For decades my lectures kept getting better, my enrollments kept going up, and the number of warm bodies in the lecture hall kept going down. So I decided to try something entirely different, a "flipped classroom." Students watched lectures over the Internet at night in their rooms, and spent class time solving problems under supervision in small groups. The subject matter was discrete mathematics, which is well suited to this pedagogical style, but the class was so successful that it is being adapted for use in other Harvard courses. I will report on some of the conceptual and practical problems I encountered, including the creation of a new teaching space, which had to be cheap to construct and adaptable in use since the experiment might have failed.

Harry Lewis is Gordon McKay Professor of Computer Science at Harvard, where he has taught since 1974. He is uncertain whether he should be proud of his role in launching the careers of Bill Gates and Mark Zuckerberg, both of whom dropped out of Harvard shortly after taking his course. From 1995 to 2003 Lewis served as Dean of Harvard College. In this capacity he oversaw the undergraduate experience, including residential life, career services, public service, academic and personal advising, athletic policy, and intercultural and race relations. He is a longtime member of the College’s Admissions Committee.

For more information, see his article Reinventing the Classroom in the Fall 2013 issue of Harvard Magazine.

Ph.D. dissertation proposal: Huguens Jean

In developing countries, people are now more likely to have access to a mobile phone than to clean water, making cellular-based technology the only viable medium for collecting, aggregating, and communicating local data so that it can be turned into useful information.

UMBC Computer Science and Electrical Engineering
Ph.D. Dissertation Proposal

Paper form digitization for information systems strengthening and socio-economic development in developing countries

Huguens Jean

3:00pm Tuesday, 5 March 2013, ITE346, UMBC

In developing countries, people are now more likely to have access to a mobile phone than to clean water, making cellular-based technology the only viable medium for collecting, aggregating, and communicating local data so that it can be turned into useful information. While mobile phones have found broad application in reporting health, financial, and environmental data, many data collection methods still suffer from delays, inefficiency, and difficulties in maintaining quality. In environments with insufficient IT support and infrastructure, and among populations with limited education and experience with technology, paper forms rather than electronic methods remain the predominant means of data collection. To meet the digitization needs of paper-driven data collection practices, this thesis proposes the development and study of a software platform that automatically converts unknown paper forms into digital structured data and uses human intelligence when necessary to improve its performance.

We begin by identifying a high-level system architecture for dealing with infrastructure constraints and human resource limitations. We then break the architecture into its integral pieces and organize them into three distinct, interacting functional stages: data collection, data conversion, and crowdsourcing. In the collection phase, we focus on visually detecting structurally identical form instances and transmitting the images of their raw input data to a remote server. For this phase, we present a novel framework for identifying specific form types by generating a multipart template for unknown forms and decomposing the form identification problem into three distinct tasks: similar image retrieval, learning, and duplicate matching. The conversion phase uses a mixture of Optical Character Recognition (OCR) and human annotation techniques to convert images into digital information and group structurally identical forms into their respective database tables. In the crowdsourcing phase, we investigate how to use low-end smartphones to collect training information that improves OCR-related tasks and verifies the accuracy of converted input values. We place special emphasis on identifying natural forms of interaction that lower the technical and knowledge threshold for local residents. Furthermore, because crowdsourcing can also provide income to the mobile workers of its micro-tasking platform, we concurrently explore how systems that facilitate collaboration between humans and machines to improve the quality of intelligent information systems can be used as a vehicle for delivering socioeconomic opportunities to developing countries.
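
The three-stage architecture described above can be illustrated with a toy pipeline. This is only a sketch of the idea, not the proposed system: the form type, field names, and the OCR confidence heuristic below are all hypothetical stand-ins, and real OCR and crowdsourcing components would replace the stubs.

```python
from dataclasses import dataclass

@dataclass
class FormImage:
    form_type: str   # assumed already identified by template matching, e.g. "clinic-visit"
    fields: dict     # field name -> raw image region (a string stands in for pixels here)

def collect(images):
    """Collection stage: batch structurally identical forms by template type."""
    batches = {}
    for img in images:
        batches.setdefault(img.form_type, []).append(img)
    return batches

def ocr(region):
    """Conversion stand-in: a real system would run OCR on the image region."""
    text = region.strip()
    confidence = 0.95 if text.isdigit() else 0.4   # toy confidence heuristic
    return text, confidence

def convert(batches, threshold=0.8):
    """Convert each form; route low-confidence fields to the crowdsourcing queue."""
    tables, crowd_queue = {}, []
    for form_type, forms in batches.items():
        rows = []
        for form in forms:
            row = {}
            for name, region in form.fields.items():
                text, conf = ocr(region)
                if conf < threshold:
                    crowd_queue.append((form_type, name, region))  # humans verify later
                row[name] = text
            rows.append(row)
        tables[form_type] = rows   # structurally identical forms share one table
    return tables, crowd_queue

forms = [FormImage("clinic-visit", {"patient_id": "1042", "diagnosis": "malaria "}),
         FormImage("clinic-visit", {"patient_id": "1043", "diagnosis": "flu"})]
tables, queue = convert(collect(forms))
```

The point of the sketch is the routing decision: fields the machine reads confidently go straight into the structured table, while uncertain ones become micro-tasks for human workers.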

Committee: Dr. Timothy Oates (Chair), Dr. Janet Rutledge, Dr. Fow-Sen Choa, Dr. Jesus Caban

talk: Integration of HBase and Lucene for real-time big data analysis, 1pm Fri 2/22

UMBC CSEE Colloquium

Integration of HBase and Lucene for real-time big data analysis

Yin Huang
CSEE Department, UMBC

1:00 pm Friday, 22 February 2013, ITE 227, UMBC

The increasing size of data sets has posed several challenges for real-time big data analysis, Business Intelligence for example, in terms of system scalability and data availability. Business Intelligence focuses on mining big data, providing multidimensional visualization, and thus supporting business decision making, ideally in real time. Traditional relational database management systems fail to provide a flexible and stable solution. Several NoSQL database systems, such as Cassandra and HBase, have been proposed to tackle these challenges. HBase, however, does not support full-text search; the current implementation of HBase offers only row-key based indexing. In this talk, we introduce building a Lucene index on top of HBase to support multidimensional queries for a data mart under the MapReduce framework, serving as the cornerstone for future data analysis and business reporting.
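
HBase and Lucene are Java systems, and the talk will cover the actual integration; the Python sketch below only illustrates the core idea under simplified assumptions: a full-text index whose postings point back to HBase row keys, so that a term query resolves to rows that row-key indexing alone could not find. The table contents are made up for the example.

```python
from collections import defaultdict

# Toy stand-in for an HBase table: row key -> {column: value}.
hbase_rows = {
    "row1": {"title": "big data analysis", "body": "scalable column store"},
    "row2": {"title": "business intelligence", "body": "real time big data"},
}

# Secondary full-text index over cell values. Each posting stores the HBase
# row key, mirroring a Lucene document that points back to its source row.
index = defaultdict(set)
for row_key, columns in hbase_rows.items():
    for value in columns.values():
        for term in value.lower().split():
            index[term].add(row_key)

def search(term):
    """Resolve a full-text term to HBase row keys via the secondary index."""
    return sorted(index.get(term.lower(), set()))
```

In the real integration the index would itself be stored in and served from HBase and built under MapReduce, but the lookup pattern (term, then row key, then row) is the same.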

Yin Huang obtained his B.S. in Computer Science from Nanchang University in 2009 and studied at Chongqing University for two years for his M.S. He started his Ph.D. program in Computer Science at the University of Maryland, Baltimore County in 2011. In 2012, he interned at the IBM Ottawa lab for four months, focusing on using a multicore-enhanced Hadoop system for Business Intelligence. His current research areas are databases, data mining, and parallel computing.

talk: Analytics for Cancer Survival Time, 1pm Fri 2/15, ITE227

UMBC CSEE Colloquium

Analytics for Cancer Survival Time

Dr. Shujia Zhou and Ran Qi
CSEE Department, University of Maryland, Baltimore County

1:00 pm Friday, 15 February 2013, ITE 227, UMBC

With the advent of new medical technologies, more and more prognostic factors are discovered and used in predicting cancer survival time. Consequently, the number of cancer patient types increases significantly. However, there are limited therapies available for cancer patients. Therefore, there is an urgent need to develop accurate algorithms for grouping cancer patients so that a doctor can choose the optimal therapy for each patient. In this talk we will introduce current grouping algorithms, discuss new approaches to improving their efficiency, and present a prototype prognostic system for cancer patients.
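
The talk does not specify which grouping algorithms it will cover; as one representative example of grouping patients by prognostic factors, here is a minimal k-means sketch. The patient vectors, factor names, and cluster count are entirely hypothetical.

```python
def kmeans(points, k, iters=20):
    """Minimal k-means sketch: group patients by prognostic-factor vectors."""
    centers = [tuple(p) for p in points[:k]]   # deterministic init for the demo
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep old center if empty).
        centers = [tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups

# Hypothetical patients described by (tumor marker level, age); two separable groups.
patients = [(1.0, 50), (1.2, 55), (0.9, 52), (5.0, 70), (5.5, 68), (4.8, 72)]
centers, groups = kmeans(patients, k=2)
```

Real survival-time grouping would work with many more factors and censored outcomes, but the underlying question is the same: which patients are similar enough to be treated as one group.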

Dr. Shujia Zhou is a research associate professor of Computer Science and Electrical Engineering at UMBC. He received a Ph.D. from Washington University in St. Louis in 1993. He held a Director’s-funded Postdoctoral Fellowship at Los Alamos National Laboratory (LANL) and became a LANL technical staff member in 1996. At LANL, his research on large-scale molecular dynamics simulations was published in Science and reported on in both Science and Nature. In 2000, he joined Northrop Grumman Corporation and worked on NASA Computation Technology Projects. He is a pioneer in accelerating climate and weather applications with multicore processors such as IBM’s Cell B.E. processor. His current research interests are in big data analytics, particularly in finance and health.

Ran Qi received her M.S. degree in Computer Science from Lamar University in 2009 and started her Ph.D. program in Computer Science at the University of Maryland, Baltimore County in 2010. In 2011, she worked on train monitoring system development at Norfolk Southern for four months as a co-op intern. In January 2012, she worked on personalized medicine at the IBM Toronto Lab for three weeks. Her current Ph.D. research focus is data mining in health analytics and personalized medicine, including cancer survival analysis and the development of a cancer prognostic system.

Talk: Energy Efficiency in Large-Scale Computing, to be rescheduled

CSEE Colloquium

Energy Efficiency in Large-Scale Computing

David Prucnal, PE
Advanced Computing Systems

*TO BE RESCHEDULED, ITE 227, UMBC

Data center power demand and energy consumption have grown substantially over the past 10-20 years. For high-performance computing, power has become one of the main limiting factors. The Supercomputing Top500 List is now dominated by individual machines that demand nearly 10 MW, which is equivalent to the technical load of an entire data center just 10 years ago. In addition, these machines require another 5-10 MW to power the necessary cooling systems. This talk will examine the power problem and discuss some approaches to improving energy efficiency in large-scale computing environments. In particular, it will look at demand-side techniques for fully exploiting existing infrastructure, and at the use of immersion cooling.
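
The figures in the abstract translate directly into the standard data-center efficiency metric, Power Usage Effectiveness (PUE): total facility power divided by IT equipment power. Treating the abstract's 10 MW machine with 5-10 MW of cooling as the only overhead (an assumption; real facilities have other overheads) gives:

```python
def pue(it_load_mw, overhead_mw):
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    return (it_load_mw + overhead_mw) / it_load_mw

# Abstract's figures: a ~10 MW machine plus 5-10 MW of cooling overhead.
low, high = pue(10, 5), pue(10, 10)   # PUE of 1.5 to 2.0
```

An ideal facility approaches a PUE of 1.0; demand-side techniques and immersion cooling both attack the overhead term.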

Mr. Prucnal has been active as a Professional Engineer in the field of power engineering for over 25 years. Over the past 15 years he has been involved with designing, building, and optimizing high-reliability data centers. He joined the Agency as a power systems engineer and was one of the first to recognize the power, space, and cooling problem in high-performance computing. He moved from the facilities engineering directorate to the research directorate to pursue solutions to the HPC power problem from the demand side rather than the infrastructure supply side. Mr. Prucnal leads the Energy Efficiency Thrust within the Advanced Computing Systems research team. His current work includes power-aware data center operation and immersion cooling. He also oversees projects investigating single/few electron transistors, 3D chip packaging, low-power electrical and optical interconnects, and power efficiency through enhanced data locality.

more information and directions

Ph.D. defense: Multi-Source Option-Based Policy Transfer

Ph.D. Defense

Multi-Source Option-Based Policy Transfer

James MacGlashan

10:00am Friday, 25 January 2013, ITE 325B

Reinforcement learning algorithms are very effective at learning policies (mappings from states to actions) for specific well-defined tasks, thereby allowing an agent to learn how to behave without extensive deliberation. However, if an agent must complete a novel variant of a task that is similar to, but not exactly the same as, a previous version for which it has already learned a policy, learning must begin anew and there is no benefit to having previously learned anything. To address this challenge, I introduce novel approaches for policy transfer. Policy transfer allows the agent to follow the policy of a previously solved, but different, task (called a source task) while it is learning a new task (called a target task). Specifically, I introduce option-based policy transfer (OPT). OPT enables policy transfer by encapsulating the policy for a source task in an option (Sutton, Precup, & Singh 1999), which allows the agent to treat the policy of a source task as if it were a primitive action. A significant advantage of this approach is that if there are multiple source tasks, an option can be created for each of them, thereby enabling the agent to transfer knowledge from multiple sources and to combine their knowledge in useful ways. Moreover, this approach allows the agent to learn in which states of the world each source task is most applicable. OPT's approach to constructing and learning with options that represent source tasks allows OPT to greatly outperform existing policy transfer approaches. Additionally, OPT can utilize source tasks that other forms of transfer learning for reinforcement learning cannot.
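
The key construct here, the option of Sutton, Precup & Singh, can be sketched in a few lines: a policy packaged with an initiation set and a termination condition, executed as if it were one action. The corridor environment and the "already learned" source policy below are toy assumptions, not part of the dissertation.

```python
class Option:
    """An option packages a source-task policy as a single, temporally extended action."""
    def __init__(self, policy, can_start, should_stop):
        self.policy = policy              # state -> primitive action
        self.can_start = can_start        # initiation set: state -> bool
        self.should_stop = should_stop    # termination condition: state -> bool

def run_option(option, state, step, max_steps=50):
    """Follow the wrapped source-task policy until the option terminates."""
    assert option.can_start(state)
    for _ in range(max_steps):
        state = step(state, option.policy(state))
        if option.should_stop(state):
            break
    return state

# Toy 1-D corridor: states 0..10, primitive actions -1/+1. A "source task"
# already learned to walk to state 5; we reuse that policy as one option.
step = lambda s, a: max(0, min(10, s + a))
go_to_5 = Option(policy=lambda s: 1 if s < 5 else -1,
                 can_start=lambda s: True,
                 should_stop=lambda s: s == 5)
final = run_option(go_to_5, 0, step)
```

In OPT, a target-task learner would choose among primitive actions and one such option per source task, and learn (via ordinary value updates) in which states each option is worth invoking.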

Challenges for policy transfer include identifying sets of source tasks that would be useful for a target task and providing mappings between the state and action spaces of source and target tasks. That is, it may not be useful to transfer from all previously solved source tasks. If a source task has a different state or action space than the target task, then a mapping between these spaces must be provided. To address these challenges, I introduce object-oriented OPT (OO-OPT), which leverages object-oriented MDP (OO-MDP) (Diuk, Cohen, & Littman 2008) state representations to automatically detect related tasks and redundant source tasks, and to provide multiple useful state and action space mappings between tasks. I also introduce methods to adapt value function approximation techniques (which are useful when the state space of a task is very large or continuous) to the unique state representation of OO-MDPs.

Committee: Dr. Marie desJardins (Chair), Dr. Tim Finin, Dr. Michael Littman, Dr. Tim Oates, Dr. Yun Peng

talk: Phlypo on Letting the data speak — from blind to semi-blind source separation, 1pm Fri 2/1

Functional magnetic resonance imaging or functional MRI (fMRI) is a type of specialized MRI scan used to measure the hemodynamic response (change in blood flow) related to neural activity in the brain or spinal cord of humans or other animals.

Letting the data speak — from blind to semi-blind source separation

Dr. Ronald Phlypo
Research Associate, MLSP lab, UMBC

1:00pm Friday, 1 February 2013, ITE 227, UMBC

Blind source separation underwent a vivid and rapid expansion during the nineties. Alleviating the need for prior physical knowledge, such as the geometry of the antenna array, allowed for data-driven exploration based on the sole, but natural, assumption of independence. In this talk, I will focus on blind source separation, with specific applications in biomedical signal processing. Since independence makes the model identifiable under very few assumptions on the data, it is widely praised as a candidate objective for source separation. However, it will be shown that independence alone is not always sufficient to yield a physically or physiologically interpretable signal. I will present some proposed solutions that add minimal extra assumptions on the data, allowing us to identify physiological "sources" from electroencephalography, electrocardiography, and functional magnetic resonance imaging. I will also briefly demonstrate why the linear mixture model is indeed an appropriate model for these biophysical signals.
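
The independence objective the abstract refers to can be demonstrated end to end on synthetic data. The sketch below is not Dr. Phlypo's method; it is a minimal two-channel example of the classical recipe (whiten the mixtures, then search for the rotation maximizing a kurtosis-based independence contrast), with made-up sine and square-wave sources and a made-up 2x2 mixing matrix.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def center(xs):
    m = mean(xs)
    return [x - m for x in xs]

def cov(a, b):
    return mean([p * q for p, q in zip(a, b)])

def kurt(xs):
    v = cov(xs, xs)
    return mean([x ** 4 for x in xs]) / (v * v) - 3.0   # excess kurtosis

# Two hypothetical sources: a sine wave and a square wave (both sub-Gaussian).
n = 400
t = [i / n for i in range(n)]
s1 = [math.sin(2 * math.pi * 5 * u) for u in t]
s2 = [1.0 if math.sin(2 * math.pi * 3 * u) >= 0 else -1.0 for u in t]

# Observations follow the linear instantaneous mixture model x = A s.
x1 = center([0.6 * a + 0.4 * b for a, b in zip(s1, s2)])
x2 = center([0.5 * a - 0.7 * b for a, b in zip(s1, s2)])

# Whitening via the closed-form inverse square root of the 2x2 covariance.
c11, c12, c22 = cov(x1, x1), cov(x1, x2), cov(x2, x2)
s = math.sqrt(c11 * c22 - c12 * c12)        # sqrt(det C)
w = math.sqrt(c11 + c22 + 2 * s)
k = 1.0 / (s * w)
z1 = [k * ((c22 + s) * a - c12 * b) for a, b in zip(x1, x2)]
z2 = [k * ((c11 + s) * b - c12 * a) for a, b in zip(x1, x2)]

# After whitening only a rotation remains; pick the angle maximizing the
# kurtosis contrast, a simple proxy for statistical independence.
def rotate(th):
    y1 = [math.cos(th) * a + math.sin(th) * b for a, b in zip(z1, z2)]
    y2 = [math.cos(th) * b - math.sin(th) * a for a, b in zip(z1, z2)]
    return y1, y2

angles = [i * (math.pi / 2) / 180 for i in range(180)]
best = max(angles, key=lambda th: sum(kurt(y) ** 2 for y in rotate(th)))
y1, y2 = rotate(best)

def corr(a, b):
    a, b = center(a), center(b)
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))
```

Each recovered component correlates strongly with one of the true sources, up to the usual sign and permutation ambiguity; the semi-blind methods in the talk add further assumptions precisely to resolve ambiguities like these and to make the recovered "sources" physiologically meaningful.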

Ronald Phlypo obtained a degree in industrial engineering from the KHBO, Ostend, Belgium ('03) and a master's degree in artificial intelligence from the KULeuven, Leuven, Belgium ('04), where he completed his master's thesis under the supervision of Prof. S. Van Huffel. While pursuing his Ph.D. at the University of Ghent, Ghent, Belgium, he visited the I3S lab and worked with P. Comon, M. Antonini, and V. Zarzoso. From January 2010 to February 2012 he was a research associate at the GIPSA lab, Grenoble, France, and since April 2012 he has been a research associate at the UMBC MLSP lab, Baltimore, USA. His research interests are in blind source separation, statistical signal processing, and machine learning.

Public tutorials on high performance computing research and technologies

The Center for Hybrid Multicore Productivity Research is a collaborative research center sponsored by the National Science Foundation with two university partners (UMBC and University of California San Diego), six government members, and seven industry members. The Center's research is focused on addressing productivity, performance, and scalability issues in meeting the insatiable computational demands of its members' applications through the continuous evolution of multicore architectures and open source tools.

As part of its annual industrial advisory board meeting next week, the center will hold an afternoon of public tutorials from 1:00pm to 4:00pm on Monday, 17 December 2012 in room 456 of the ITE building at UMBC. The tutorials will be presented by students doing research sponsored by the Center and will feature some of the underlying technologies being used and some of their applications. The tutorials are:

• GPGPUs – Tim Blattner and Fahad Zafa
• Cloud Policies – Karuna Joshi
• Human Sensors Networks – Oleg Aulov
• Machine Learning Disaster Warnings – Han Dong
• Graph 500 – Tyler Simon
• HBase – Phuong Nyguen

The tutorial talks are free and open to the public. If you plan to attend, please RSVP by email to Dr. Valerie L. Thomas, .

PhD defense: Supporting Citizen Science and Biodiversity Informatics on the Semantic Web

Ph.D. Dissertation Defense

Supporting Citizen Science and
Biodiversity Informatics on the Semantic Web

Joel Sachs

10:00am Friday, 14 December 2012, ITE 325b

It is common for Semantic Web documents to use terms from multiple ontologies, with no expectation that the full semantics of each ontology will be imported by consuming applications. This makes sense, because importing all ontologies referenced by a document causes both practical and logical problems. But it has the drawback of leaving it to the consuming application to determine appropriate semantics for the terms being used. We describe an approach to constructing ontologies by layer, designed to make it easier for both data publishers and application developers to tailor-fit semantics to use cases.

The layers that we develop correspond to patterns in the RDF graph. This contrasts with typical approaches to modular ontology development, where the layers are domain based. The three primary motivations for this approach are i) preserving computational tractability; ii) enabling easy coupling and decoupling with foundational ontologies; and iii) maintaining cognitive tractability. This third motivation is still under-studied in semantic web development; we consider it in relation to making it harder for ontology users to publish data that accidentally implies things they do not mean. This is always important, but becomes especially so in citizen science, where users will naturally bring intuitive semantics to the terms that they encounter.

We describe case studies that involved deploying our approach in the context of citizen science activities, and which provided opportunities to assess its capabilities and limitations. We also describe subsequent work aimed at addressing these limitations, and, by applying newly defined layers over the underlying data, show that we are able to improve the competency of our knowledge base. More generally, we show that appropriately combining triple-pattern-based layers allows us to support a wide variety of use cases with varied (and occasionally conflicting) requirements.
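
The idea of a triple-pattern-based layer can be sketched concretely. Below, triples are plain tuples and a "layer" is a rule that matches a triple pattern and adds entailed triples over the underlying data. The specific rule (anything bearing a dwc:scientificName is a dwc:Occurrence) is invented for illustration and is not drawn from the dissertation or from Darwin Core itself.

```python
# Triples as (subject, predicate, object) strings; namespace prefixes abbreviated.
data = {
    ("obs1", "dwc:scientificName", "Rana clamitans"),
    ("obs1", "dwc:eventDate", "2012-06-01"),
}

# A hypothetical "occurrence" layer: a triple-pattern rule applied over the
# data, independent of any full ontology import.
def occurrence_layer(triples):
    inferred = {(s, "rdf:type", "dwc:Occurrence")
                for (s, p, o) in triples if p == "dwc:scientificName"}
    return triples | inferred

enriched = occurrence_layer(data)
```

Because each layer is just a pattern-level rule, publishers and consumers can opt into exactly the entailments they want, rather than importing (and reasoning over) an entire foundational ontology.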

In addition to our approach to semantic layering, contributions include an improved understanding of how to blend social and semantic computing to support citizen science, and a collection of layers for representing biodiversity information in RDF, with a focus on invasive species. Compared with other proposed “semanticizations” of the Darwin Core standard for representing biodiversity occurrence data, these layers involve minimal modification to the Darwin Core vocabulary, and make maximal use of the Darwin Core namespace, thereby simplifying the transition of current practices onto the semantic web.

Committee: Drs. Tim Finin (Chair), Anupam Joshi, Tim Oates, Cynthia Parr, Yelena Yesha, Laura Zavala
