Mike Stonebraker graduated from Princeton University in 1965 and went on to earn his master's degree and PhD from the University of Michigan in 1967 and 1971, respectively. He was a professor at the University of California, Berkeley for 29 years. Stonebraker, who specializes in database research, is currently an adjunct professor at MIT, a Principal Investigator at MIT CSAIL, and the founder of several companies. Over his career he has won several awards, including the IEEE John von Neumann Medal and the SIGMOD Edgar F. Codd Innovations Award, and has been named a Fellow of the Association for Computing Machinery. Stonebraker was elected a member of the National Academy of Engineering and won the Turing Award in 2015, which is given for contributions “of lasting and major technical importance to the computer field.”
Aside from his many academic achievements, Stonebraker has founded several database companies, including Ingres Corporation, Illustra, Paradigm4, StreamBase Systems, Tamr, Vertica, and VoltDB.
In 1973, inspired by papers published by Edgar F. Codd, Stonebraker and Eugene Wong began researching relational database systems. Together they created Ingres, which, along with System R from IBM, was one of the first systems to show that it was viable to build an implementation of the relational model that Codd had written about. Several key components of Ingres are still used in relational systems today, such as primary-copy replication and B-trees. Ingres competed with IBM, threatening its market, and eventually came to be used by many companies because it was distributed under a variation of the BSD license for a nominal fee. In 1994, Ingres was acquired by Computer Associates; it was spun out in 2005, and the company was later renamed Actian.
After founding Relational Technology, Stonebraker and his colleagues addressed the limits of the relational model and began Postgres, with the goal of adding richer support for data types to database systems. Postgres allowed users to register new data types, along with functions that operate on them. It was also easier for programmers to change and modify, allowing them to work with the optimizer, query language, runtime, and indexing frameworks. Postgres has served as the basis for the products of several startup companies, such as Aster Data Systems, EnterpriseDB, and Greenplum.
Stonebraker is one of the leads of the Database Group at MIT, which includes professor and Principal Investigator Sam Madden and an assortment of graduate students, postdoctoral researchers, and visiting scientists. The group focuses its research on all areas relating to database systems and information management. Projects range from the design of new user interfaces and query languages to low-level query execution issues. The group's areas of impact are cybersecurity and wireless, and its research areas are programming languages and software engineering, security and cryptography, and systems and networking.
One of Stonebraker’s current projects in the Database Group is “Data Discovery”. The goal is to help people with their data-related tasks: from discovering data to cleaning and transforming it, the idea is to shape the data so that it is easier for people to analyze. Led by Stonebraker and Principal Investigator Sam Madden, the group breaks the problem into two parts. First, data is stored across multiple storage systems, ranging from databases to data lakes. Second, data scientists do not operate within the limits of well-defined schemas; instead, they look for data across their organization to answer increasingly complex business questions. To tackle data discovery at scale, the group created Aurum, which is part of the Data Civilizer project. Aurum provides a discovery language, SRQL, that lets users decide what is relevant through a set of primitives that expose the relationships in the underlying data. Aurum answers queries at human-scale latencies using an Enterprise Knowledge Graph (EKG), and it scales by building the EKG in linear time despite the difficulty of extracting relationships among thousands of data sources.
Online transaction processing (OLTP) database management systems are an essential part of operations in many large businesses. Stonebraker leads a team of Marco Serafini, Yu Lu, Ashraf Aboulnaga, Ricardo Mayerhofer, and Francisco Andrade in a project called “Predictive Elastic Database Systems”. The project focuses on several aspects of elasticity, including mechanisms for online data migration and algorithms for determining when to reconfigure the system and which data to move.
In addition, Stonebraker leads, along with Sam Madden and graduate students Aaron Zalewski and Oscar Ricardo Moll Thomae, the “Building a Scalable Database for Autonomous Vehicles” project. The group addresses the challenges presented by the potential scale of autonomous vehicle data and by the unique characteristics of that data. A single autonomous vehicle can produce up to 30 GB of data per hour, which means that even a team with one car will accumulate data at a challenging scale. While platforms do exist to manage sensor data, it is difficult to find good tools for querying specific sensor data over a large dataset. Stonebraker and the Database Group plan to address these challenges.
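To give a rough sense of that scale, the short Python sketch below turns the 30 GB/hour figure into yearly totals. Only the 30 GB/hour rate comes from the text above; the driving hours, the fleet size, the decimal unit conversion, and the function name `data_volume_gb` are illustrative assumptions, not figures from the project.

```python
# Back-of-the-envelope estimate of autonomous vehicle data volume.
# Only GB_PER_HOUR comes from the text; every other number is an assumption.

GB_PER_HOUR = 30  # upper bound quoted for one vehicle


def data_volume_gb(cars, hours_per_day, days, gb_per_hour=GB_PER_HOUR):
    """Return the total data produced, in GB, for a fleet over a period."""
    return cars * hours_per_day * days * gb_per_hour


if __name__ == "__main__":
    # Assumed workload: 8 hours of driving per day, every day of the year.
    single_car = data_volume_gb(cars=1, hours_per_day=8, days=365)
    fleet = data_volume_gb(cars=100, hours_per_day=8, days=365)

    # Decimal units are assumed here: 1 TB = 1000 GB.
    print(f"1 car:    {single_car / 1000:.1f} TB per year")
    print(f"100 cars: {fleet / 1000:.1f} TB per year")
```

Under these assumptions, a single car already produces tens of terabytes per year, and the total grows linearly with the number of cars and the hours driven.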