GIS Meetup and Educational Seminar Summary #11
A presentation at: 2021 Los Angeles Geospatial Summit
Presentation by: USC Spatial Sciences Institute
Presented on: 26 February 2021
Presenter:
- Sean O’Brien, Ph.D., Chief Technologist, Northrop Grumman
External link:
https://geospatialsummit.secure-platform.com/a/gallery/rounds/1/details/1#videos
Overview
In this presentation Sean O’Brien makes the case that AI engineering should be a separate discipline from both computer science and data science. Moreover, he claims that this distinction is truly relevant to geospatial science. To set the context for this proposition, he explains that AI is not new, but has exploded in the past decade due to a combination of factors including Moore's law, the internet, cloud computing, and developments in deep learning, especially deep neural networks.
Challenges
Today, deep learning algorithms are used in natural language processing (NLP), computer vision, structured game playing, and prediction. However, O’Brien claims that today's deep learning neural networks are brittle and limited.
Current issues with deep learning neural networks include: requirements for large training datasets; inability to explain AI behavior; lack of trustworthiness; shallowness, i.e., limited capacity for transfer between domains; inability to integrate prior knowledge; and difficulty engineering AI into "missionized" systems.
As a result, AI development teams often face a wide array of challenges such as a mismatch between expectations, requirements, and technology; bad data; ill-conditioned data which has not been optimized for AI algorithms; poor algorithm engineering processes; and general "hype" which damages trust.
Poorly Defined Roles
Today, data scientists, computer scientists, and mathematicians are typically drafted to engineer AI systems. However, roles in AI teams are poorly defined, and these roles are poorly integrated into engineering teams. In addition, AI teams use non-standard tools, and the return on investment for AI projects is poorly understood. Moreover, AI projects are often viewed by management as "fairy dust." Consequently, AI is approaching a crisis of confidence, as questions are raised about the fairness, reliability, and governance of AI systems.
Great Software Crisis of 1967
O'Brien claims that we have seen this situation before and that the current state of AI engineering is analogous to the "Great Software Crisis of 1967," which led to the rise of software engineering as a separate discipline. Following the NATO conference in Garmisch in 1968, it was agreed that engineers should focus on one discipline; that they did not need to be electrical engineers before they could become software engineers; and that computer software engineering should be abstracted from computer hardware engineering. In short, the scale of the problems had outgrown the development processes then available.
New Discipline
O'Brien claims that an analogous situation prevails in the development of AI systems today and that the solution is to apply similar ideas to AI engineering. Specifically, he recommends that AI engineering should be abstracted from software engineering; some of the software complexity should be hidden to allow AI engineers to focus on a single discipline; and AI engineers should not have to be software engineers first. In short, he proposes that the new discipline of AI engineering should be built on the principles of abstraction; modularity and power; and simplicity and robustness.
AI Systems Behave Differently
Unlike most software systems, the behavior of AI systems is non-deterministic and evolutionary, which means they are subject to "cognitive drift." AI systems depend on both the AI model and the data. Therefore, to achieve robust AI systems, it is necessary to continuously track the "provenance" of, and relationship between, the model and the data over the life of the system. Unfortunately, the current software "DevOps" approach is not a good fit for this purpose. O'Brien outlines a series of best practices and changes to improve AI engineering. He claims that it is possible to solve more complex problems by building more complex systems. A key aspect of AI engineering will be to enable non-software engineers to participate in the development and use of AI systems more directly. This has powerful implications for the application of AI to fields like geospatial science.
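The provenance-tracking idea can be illustrated with a minimal sketch (not from the presentation; all names and the hashing scheme are illustrative assumptions): recording a cryptographic fingerprint of the model alongside a fingerprint of the data it was trained on makes any later divergence between the two detectable and auditable.

```python
# Minimal sketch of model/data provenance tracking.
# Illustrative only: function names and the record format are assumptions,
# not anything specified in the presentation.
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(payload: bytes) -> str:
    """Return a SHA-256 hex digest identifying a model or dataset snapshot."""
    return hashlib.sha256(payload).hexdigest()


def record_provenance(model_bytes: bytes, data_bytes: bytes) -> dict:
    """Bind a model snapshot to the exact data it was trained on."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_sha256": fingerprint(model_bytes),
        "data_sha256": fingerprint(data_bytes),
    }


# The same dataset paired with a retrained model yields a new record,
# so drift between model and data leaves an auditable trail.
v1 = record_provenance(b"model-weights-v1", b"training-set-2021")
v2 = record_provenance(b"model-weights-v2", b"training-set-2021")
print(json.dumps(v1, indent=2))
```

In a real system the fingerprints would cover serialized model weights and dataset files, and the records would be appended to an immutable log so that any deployed prediction can be traced back to a specific model/data pair.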
Ten Principles of AI System Design
He proposes the following ten principles of AI system design.
- AI is core to many systems; it cannot be added like fairy dust.
- Sensors and platforms should be designed to serve AI.
- AI systems depend on both the algorithm and the data.
- Test and evaluation must be continuous.
- AI traceability and governance rely on the code, model, and data.
- AI systems development is a multidisciplinary challenge.
- Algorithm engineering, data engineering, and software engineering are separate but complementary disciplines.
- Data engineering and machine learning do not by themselves provide complete solutions.
- AI models require much cleaner datasets than humans do.
- User experience requirements should be considered from the beginning.
Reaction
For me, this was a fascinating presentation. It was good to hear about some of the challenges that large AI projects face. I believe that the principles outlined here are truly relevant to the application of AI to geospatial problem solving. However, this connection was only mentioned briefly in the presentation, which was a little disappointing. A key takeaway for me was the need for explicit traceability and governance of AI systems. It was also eye-opening to begin to understand the motivation for a transition from supervised to unsupervised learning. I find the prospect of these systems being deployed in the defense sector unnerving.