GIS Meetup and Educational Seminar Summary #9
Presented at: 2021 Los Angeles Geospatial Summit, USC Spatial Sciences Institute
Presented on: 26 February 2021
Presenter:
- Orhun Aydin, Ph.D., Researcher, Esri Spatial Statistics; Lecturer, Spatial Sciences Institute
External link:
https://geospatialsummit.secure-platform.com/a/gallery/rounds/1/details/2
Overview
The growing availability of data sources and computing infrastructure enables spatial data science to solve complex multidisciplinary problems. This workshop covers the field of spatial data science and introduces the capabilities of ArcGIS in this area, including open source and open data integrations.
Spatial data science is used to summarize, represent, and model observations of spatial and spatial-temporal phenomena. Some typical users of spatial data science include geo-designers, city planners, hydrogeologists, and epidemiologists.
Growth in Spatial Data
The quantity of spatial data available is growing rapidly. Data collected by many smart devices has a spatial dimension, and many devices now include GPS receivers. All this spatial data presents greater opportunities for mapping and modelling. However, not all data will be available for analysis due to privacy concerns. A 2020 forecast by Microsoft predicted the following volumes of data:
- Small city: 250 PB/day
- Smart device: 20 GB/IoT device
- People: 1.5 GB/day
- Smart home: 1.5 GB/day
- Autonomous vehicle: 5 TB/day
- Smart office: 150 GB/day
- Stadium: 200 TB/game
- Connected factory: 1 PB/day
High Resolution Data
Much of this data is available in exceptionally fine detail that surpasses the resolution available from satellite imagery. One example is the road network data collected by lidar from autonomous vehicles. This data may be used to understand agent-based movement patterns and consumer behavior, and to perform social sensing tasks. In addition, the advent of microsatellites is increasing the variety and volume of data available.
Spatial Data Science Workflow
The spatial data science workflow enables the creation of impactful spatial data products and the dissemination of spatial knowledge. This workflow includes the following steps:
- Clean and wrangle data
- Exploratory data analysis
- Model observed data
- Model relationships and make predictions
- Make decisions
Workflow Elements
These workflow steps make use of the following elements:
- Data sources: spatial, temporal, tabular, written materials.
- Analysis framework: ArcGIS toolbox, ArcPy, TensorFlow, PySAL, R packages.
- Relevant tools: hot-spot analysis, deep neural networks, MaxEnt, mean centers.
- Data products: maps, widgets, and apps.
Artificial Intelligence
Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) are current buzzwords that are often used incorrectly and out of context. Data is represented explicitly, but it does not explicitly represent knowledge. AI is possible without ML: for example, a knowledge base relies on people to create its rules. ML, however, enables AI to infer rules and patterns from the data. While ML can be accomplished through statistical methods, DL is a specific ML method that uses neural networks.
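The difference between a hand-written rule and a rule inferred from data can be shown with a very small sketch. Everything here is invented for illustration and was not part of the workshop: the task (flagging a road segment as congested from its average speed), the 30 km/h cutoff, and the six training samples are all assumptions.

```python
def rule_based_is_congested(speed_kmh):
    """Knowledge-base style: a person wrote the rule and chose the cutoff."""
    return speed_kmh < 30.0  # hypothetical cutoff picked by hand


def learn_threshold(samples):
    """ML style: infer the cutoff from labelled (speed, congested) examples.

    Tries the midpoint between each pair of adjacent sorted speeds and keeps
    the one that misclassifies the fewest training samples (a one-feature
    decision stump).
    """
    speeds = sorted({speed for speed, _ in samples})
    candidates = [(a + b) / 2 for a, b in zip(speeds, speeds[1:])]

    def errors(threshold):
        return sum((speed < threshold) != label for speed, label in samples)

    return min(candidates, key=errors)


# Invented training data: (average speed in km/h, was the segment congested?)
training = [(12, True), (18, True), (25, True),
            (41, False), (55, False), (63, False)]

learned_cutoff = learn_threshold(training)
print("hand-written cutoff: 30.0, learned cutoff:", learned_cutoff)
```

In the first function the knowledge lives in the code; in the second it is extracted from the data, which is the distinction the presenter drew between AI without ML and AI with ML.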
Open Data
Open data is crucial to facilitate interoperability and enable large scale problem solving. The components of open data include open standards and formats, direct product integration, and open software architecture.
Use Cases
There are three main use cases for spatial data science: prediction, clustering, and classification.
- Prediction uses the known to predict the unknown. For example, a global climate model can be "downscaled" to accurately predict the impact of climate change on local temperatures. Time series prediction can be used to predict the future from the past; for example, earthquake measurements from a sensor network can be used to predict the magnitude of an earthquake as a time series.
- Clustering groups observations based on similarity of values and location. For example, density-based clustering has been used to analyze a dataset of 50,000 GPS tracks collected by the Waze application, between 5pm and 6pm, in Los Angeles, to identify congested highways and intersections.
- Classification decides which category an object should be assigned to, based on a training dataset. For example, high-resolution imagery can be used to classify impervious surfaces to inform effective preparations for storm and flood events. It is possible to perform classification using non-spatial statistical methods including maximum likelihood classification, random forest, and support vector machine (SVM) algorithms.
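To make the clustering use case more concrete, the following is a minimal sketch of density-based clustering using the generic DBSCAN algorithm in plain Python. It is not the tool or the data used in the workshop: the point coordinates are invented, they are assumed to be in a projected coordinate system (so that straight-line distance is meaningful), and the `eps` and `min_pts` values are arbitrary.

```python
import math

NOISE = -1  # label for points that belong to no dense region


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points)
            if math.hypot(x - xi, y - yi) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # core point: keep growing
    return labels


# Invented GPS fixes: two dense groups (e.g. slow traffic) and one stray point.
gps_points = [(0, 0), (1, 0), (0, 1), (1, 1),
              (50, 50), (51, 50), (50, 51), (51, 51),
              (200, 200)]

cluster_labels = dbscan(gps_points, eps=2.0, min_pts=3)
print(cluster_labels)
```

Dense groups of points receive a shared label while isolated points are marked as noise, which is the behaviour that makes this family of methods suitable for picking out congested stretches of road from scattered GPS observations.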
Downscaling Demonstration
The workshop concluded with a demonstration of the climate downscaling problem using a Python notebook. This is a regression problem which uses 19 predictors from a global climate model to predict local temperatures. The demonstration compared the performance of several non-spatial statistical methods with that of spatial models. Three non-spatial techniques were used: SVM, ridge regression, and random forest. These were compared with two spatial methods: geographically weighted regression (GWR) and empirical Bayesian kriging (EBK) regression. For the spatial methods, the number of input variables was reduced to three. Overall, the spatial methods produced models that were simpler, and results that were more informative.
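The contrast between a non-spatial and a spatial regression can be sketched in a few lines of plain Python. This is not the notebook from the workshop: it uses one predictor rather than three, synthetic noise-free data, a Gaussian distance kernel, and an arbitrary bandwidth, all of which are assumptions made to keep the example short. The data is constructed so that the relationship between the coarse model value and the local temperature strengthens from west to east.

```python
import math


def weighted_line_fit(p, y, w):
    """Weighted least squares for y = a + b * p; returns (a, b)."""
    sw = sum(w)
    swp = sum(wi * pi for wi, pi in zip(w, p))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swpp = sum(wi * pi * pi for wi, pi in zip(w, p))
    swpy = sum(wi * pi * yi for wi, pi, yi in zip(w, p, y))
    b = (sw * swpy - swp * swy) / (sw * swpp - swp * swp)
    a = (swy - b * swp) / sw
    return a, b


def gwr_fit_at(target, coords, p, y, bandwidth):
    """Local fit at `target`: nearby observations get larger weights."""
    weights = [
        math.exp(-0.5 * (math.hypot(target[0] - cx, target[1] - cy)
                         / bandwidth) ** 2)
        for cx, cy in coords
    ]
    return weighted_line_fit(p, y, weights)


# Synthetic observations: ten sites along a west-east line, three coarse
# model values per site. The true local slope is 1 + 0.5 * x.
coords, coarse, local_temp = [], [], []
for x in range(10):
    for value in (0.0, 1.0, 2.0):
        coords.append((float(x), 0.0))
        coarse.append(value)
        local_temp.append(2.0 + (1.0 + 0.5 * x) * value)

# Non-spatial model: every observation has equal weight, one slope for all.
_, global_slope = weighted_line_fit(coarse, local_temp, [1.0] * len(coarse))

# Spatial model: a separate slope estimated at each location of interest.
_, west_slope = gwr_fit_at((0.0, 0.0), coords, coarse, local_temp, bandwidth=1.0)
_, east_slope = gwr_fit_at((9.0, 0.0), coords, coarse, local_temp, bandwidth=1.0)

print("global:", global_slope, "west:", west_slope, "east:", east_slope)
```

The global fit returns a single averaged slope, while the locally weighted fits recover a weaker relationship in the west and a stronger one in the east. That spatial variation in the coefficients is the kind of extra information a geographically weighted model can expose.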
Reaction
For me, this was a coherent, informative, and inspiring workshop. Although I have seen much of this material before in Esri marketing presentations, I felt that Orhun Aydin gave a more coherent, less buzzy presentation in this more academic setting. It is interesting to observe how he tailors his material to different audiences. Overall, I feel that this presentation has given me a better understanding of the capabilities and limitations of statistical analysis versus deep learning. I particularly liked the explanations of real-world applications of this technology.
Using Artificial Intelligence, Spatial Modeling and Python to Break Down Big Data Barriers
Presented at: 2021 Los Angeles Geospatial Summit, USC Spatial Sciences Institute
Presented on: 26 February 2021
Duration: 40 minutes
Presenters:
- Michael Ann Lane, Global Education and Inside Sales Manager, Hexagon Geospatial, mike.lane@hexagon.com
- Bradley C. Skelton, Product Line Director, ERDAS IMAGINE and M.App X, Hexagon Geospatial
External link:
https://geospatialsummit.secure-platform.com/a/gallery/rounds/1/details/4
Deep Learning
This presentation introduced the ERDAS IMAGINE image geoprocessing software and demonstrated a workflow that uses deep learning to identify candidate images containing an object of interest.
The presentation began by introducing the terms artificial intelligence (AI), machine learning (ML), and deep learning (DL) and explaining the relationship between these technologies. The desire to use AI to automate the task of processing images is driven by the ever-increasing demand for image processing and the rapidly growing number of images. This is illustrated by a quote from Robert Cardillo’s 2017 NGA address where he stated: “In five years, there may be a million times more than the amount of geospatial data we have today.”
ERDAS IMAGINE Spatial Modeler
The ERDAS IMAGINE spatial modeler is like a big box of tools that can be connected into a customized workflow to build applications. The pieces can be put together to build sophisticated systems. The use of these tools to create a model workflow was demonstrated, including the following steps:
- Create a training set
- Build a repeatable model
- Run the model and detect images of interest
- Incorporate Python to watch a folder for new image files and automatically execute the model
- Call a Python script to send email when an object of interest is detected in an image
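The last two steps can be sketched with the Python standard library alone. This is not the script shown in the presentation and it does not call ERDAS IMAGINE: the `run_model` callback is a hypothetical stand-in for executing the trained model, and the file extensions, file names, and email addresses are invented for illustration.

```python
import os
import tempfile
from email.message import EmailMessage

IMAGE_EXTENSIONS = {".tif", ".tiff", ".img", ".jp2"}  # assumed formats


def find_new_images(folder, seen):
    """Return image files in `folder` not yet in `seen`, and record them."""
    new_files = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        extension = os.path.splitext(name)[1].lower()
        if (os.path.isfile(path) and extension in IMAGE_EXTENSIONS
                and path not in seen):
            seen.add(path)
            new_files.append(path)
    return new_files


def build_alert(image_path, sender, recipient):
    """Compose (but do not send) an email about a flagged image."""
    message = EmailMessage()
    message["Subject"] = ("Object of interest detected in "
                          + os.path.basename(image_path))
    message["From"] = sender
    message["To"] = recipient
    message.set_content(f"The model flagged {image_path} for review.")
    return message


def process_folder_once(folder, seen, run_model, sender, recipient):
    """One polling pass: run the model on each new image, collect alerts."""
    alerts = []
    for path in find_new_images(folder, seen):
        if run_model(path):  # hypothetical hook for the real model
            alerts.append(build_alert(path, sender, recipient))
    return alerts


def stand_in_model(path):
    """Placeholder detector: pretends only scene_b.tif contains the object."""
    return path.endswith("scene_b.tif")


# Demonstration in a temporary folder with empty placeholder files.
with tempfile.TemporaryDirectory() as watch_folder:
    for name in ("scene_a.tif", "scene_b.tif", "notes.txt"):
        open(os.path.join(watch_folder, name), "wb").close()
    seen_paths = set()
    first_pass = process_folder_once(watch_folder, seen_paths, stand_in_model,
                                     "watcher@example.com", "analyst@example.com")
    second_pass = process_folder_once(watch_folder, seen_paths, stand_in_model,
                                      "watcher@example.com", "analyst@example.com")

print([alert["Subject"] for alert in first_pass], second_pass)
```

In practice `process_folder_once` would be called in a loop with a pause between passes, and each message would be handed to `smtplib.SMTP(...).send_message()`. Both are left out here so that the sketch runs without a mail server.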
Deploy Models
After the model has been trained once, it may be used many times to detect objects in new images. The best accuracy is achieved by training the model on hundreds of images. Some examples of object detection were shown, including:
- Car detection in a parking lot
- Oil palm mapping (using RGB rather than near-IR band imagery)
- Well stand detection
Extensions
The ERDAS IMAGINE spatial modeler can be extended using Python in two ways:
- Calling Python scripts from existing models
- Using spatial modeler operators in existing Python scripts
Reaction
This was a narrowly focused demonstration of one aspect of the ERDAS IMAGINE remote sensing software. Consequently, I did not get a good understanding of the market position and overall value proposition of the whole solution. I would have preferred a broader view of this product’s capabilities and benefits. I am unsure how this product stacks up against the capabilities built into ArcGIS and what the pros and cons are of using a separate product for image detection. However, it was good to see how easily a product like this can be integrated using Python, and this was another demonstration of the power of deep learning for image classification. Clearly this technology is already widely used in defense and intelligence gathering applications.