Data Integration Systems and the Plethora of Standards: Joining Research and Pragmatics

Arnon Rosenthal, The MITRE Corporation

Thursday, March 23, 2017
10:30am
EBII 3211 — NCSU Centennial Campus
(Directions to Centennial campus and parking information)

This talk is part of the Taming the Data invited-speaker series, held in the Department of Computer Science at NC State University.

Talk Title: Data Integration Systems and the Plethora of Standards: Joining Research and Pragmatics

Talk abstract:

This talk will have two parts, both of which try to crystallize general, pragmatic lessons.
• Extending and targeting research to improve adoptability: Over the years, we have seen much excellent algorithmic research that has failed to make it into products or routine practice. This phenomenon has certainly harmed data integration (to which we give a brief introduction). Using simple examples from data integration research, we identify general patterns by which research choices could increase the chances for adoption and generate additional research questions.
• Disrupting today’s limits on interoperability: The government organizations we know rarely use modern tools. The default tool for data engineering is Microsoft Office, and artifacts rapidly become shelfware. We explain why organizations such as CDC or SEC that run submission hubs (collecting and forwarding data from many sources) may be best placed to break the logjam, but report that even they are too conservative.

Data standards are the alternative to integration tools, and are more powerful in that they guide organizations in deciding what information they need to collect. However, each release takes years, requires agreement on a broad range of data, and offers little flexibility to serve urgent or niche needs. We sketch an alternative mode that is attuned to the fundamental challenges: independent actors, diverse preferences, and a need for local simplicity. Our approach is incremental and radically decentralized: a web of overlapping, loosely coupled topic ontologies, with automated mediation and provision for incremental change. We close by identifying the research and practical challenges in carrying out such an approach.

About the speaker:

Arnon Rosenthal has consulted and published in data sharing and administration, databases, clouds, data security, policy-based systems, and graph algorithms. He has (according to ResearchGate) 150+ publications and 4000+ citations. His work tries to address many sides of a problem simultaneously: to clarify and decompose the challenges, understand the pragmatics, and simplify and generalize components of a solution for a realistically imperfect world.

He has worked at The MITRE Corporation, Computer Corporation of America, and Sperry Research; was a visiting researcher at IBM Almaden Research and ETH Zurich; and was a faculty member at the University of Michigan (Ann Arbor). He holds a Ph.D. from the University of California, Berkeley.

This invited-speaker series has been made possible thanks to generous support from:

Please send your comments to Rada Chirkova