How much math does a GIS Analyst need to know? Many prospective and practising GIS professionals ask this question. Some ask because they want to become analysts; others are already analysts but are unsure of their competencies.

It is important to distinguish between the different types of analysts.

On one side is the GIS theorist, who knows a lot of mathematics, statistics and computer science and is interested in developing new spatial analysis models. On the other is the practitioner, who does not necessarily need to know how to manipulate complex matrix algebra or the formula for Euclidean distance, but who uses these measures through GUIs like ESRI’s ArcGIS and ArcMap.

So the question is: what kind of analyst are you? Are you a theorist or just a practitioner?

For the Theorist:

Spatial analysis is a highly quantitative subject. To be great at developing models for academic or industry purposes, one needs a very good foundation in mathematics, ideally with at least a degree in a quantitative discipline. Failing a full degree, one should have taken some modules at college level in order to decipher the abstract language used to formulate these models. One should think of oneself as a Spatial Data Scientist who has the following skills:

Technical Skills

  • Math (e.g. linear algebra, calculus and probability)
  • Statistics (e.g. hypothesis testing and summary statistics)
  • Machine learning tools and techniques (e.g. k-nearest neighbors, random forests, ensemble methods, etc.)
  • Software engineering skills (e.g. distributed computing, algorithms and data structures)
  • Data mining
  • Data cleaning and munging
  • Data visualization (e.g. ggplot and d3.js) and reporting techniques
  • Unstructured data techniques
  • R and/or SAS languages
  • SQL databases and database querying languages
  • Python (most common), C/C++, Java, Perl
  • Big data platforms like Hadoop, Hive & Pig
  • Cloud tools like Amazon S3
  • Geology, geography, GIS, or a related field
  • Manipulating vector and raster data
  • Geographic information systems and visualization software packages, including geologic cartographic production and quantitative spatial analyses, e.g. the ArcGIS software suite (ArcMap, ArcCatalog, ArcToolbox, ArcGIS Server)
  • Ground-based and remotely sensed data collection methodologies and processing
  • Geodatabase management, and demonstrated ability to deal with large datasets (Big Data techniques)
  • Written and verbal communication skills, as demonstrated through published papers and presentations
  • Project management abilities

Business Skills

  • Analytic Problem-Solving: Approaching high-level challenges with a clear eye on what is important; employing the right approach/methods to make the maximum use of time and human resources.
  • Effective Communication: Detailing your techniques and discoveries to technical and non-technical audiences in a language they can understand.
  • Intellectual Curiosity: Exploring new territories and finding creative and unusual ways to solve problems.
  • Industry Knowledge: Understanding the way spatial analysis in GIS and Regional Science functions and how data are collected, analyzed and utilized.

For the Practitioner:

Now, there are people within the GIS community who are truly afraid of mathematics and statistics. They do not have programming skills and rely heavily on GUIs like ArcGIS and ArcMap to click their lives away. What kind of maths and stats do they need? To help us answer this, I looked around the web for similar questions and stumbled onto the following:

I make my living applying mathematics and statistics to solving the kinds of problems a GIS is designed to address. One can learn to use a GIS effectively without knowing much math at all: millions of people have done it. But over the years I have read (and responded to) many thousands of questions about GIS and in many of these situations some basic mathematical knowledge, beyond what’s usually taught (and remembered) in high school, would have been a distinct advantage.

The material that keeps coming up includes the following:

  • Trigonometry and spherical trigonometry. Let me surprise you: this stuff is overused. In many cases trig can be avoided altogether by using simpler, but slightly more advanced, techniques, especially basic vector arithmetic.
  • Elementary differential geometry. This is the investigation of smooth curves and surfaces. It was invented by C. F. Gauss in the early 1800’s specifically to support wide-area land surveys, so its applicability to GIS is obvious. Studying the basics of this field prepares the mind well to understand geodesy, curvature, topographic shapes, and so on.
  • Topology. No, this does not mean what you think it means: the word is consistently abused in GIS. This field emerged in the early 1900’s as a way to unify otherwise difficult concepts with which people had been grappling for centuries. These include concepts of infinity, of space, of nearness, of connectedness. Among the accomplishments of 20th century topology was the ability to describe spaces and calculate with them. These techniques have trickled down into GIS in the form of vector representations of lines, curves, and polygons, but that merely scratches the surface of what can be done and of the beautiful ideas lurking there. (For an accessible account of part of this history, read Imre Lakatos’ Proofs and Refutations. This book is a series of dialogs within a hypothetical classroom that is pondering questions that we would recognize as characterizing the elements of a 3D GIS. It requires no math beyond grade school but eventually introduces the reader to homology theory.)

    Differential geometry and topology also deal with “fields” of geometric objects, including the vector and tensor fields Waldo Tobler has been talking about for the latter part of his career. These describe extensive phenomena within space, such as temperatures, winds, and crustal movements.

  • Calculus. Many people in GIS are asked to optimize something: find the best route, find the best corridor, the best view, the best configuration of service areas, etc. Calculus underlies all thinking about optimizing functions that depend smoothly on their parameters. It also offers ways to think about and calculate lengths, areas, and volumes. You don’t need to know much Calculus, but a little will go a long way.
  • Numerical analysis. We often have difficulties solving problems with the computer because we run into limits of precision and accuracy. This can cause our procedures to take a long time to execute (or be impossible to run) and can result in wrong answers. It helps to know the basic principles of this field so that you can understand where the pitfalls are and work around them.
  • Computer science. Specifically, some discrete mathematics and methods of optimization contained therein. This includes some basic graph theory, design of data structures, algorithms, and recursion, as well as a study of complexity theory.
  • Geometry. Of course. But not Euclidean geometry: a tiny bit of spherical geometry, naturally; but more important is the modern view (dating to Felix Klein in the late 1800’s) of geometry as the study of groups of transformations of objects. This is the unifying concept to moving objects around on the earth or on the map, to congruence, to similarity.
  • Statistics. Not all GIS professionals need to know statistics, but it is becoming clear that a basic statistical way of thinking is essential. All our data are ultimately derived from measurements and heavily processed afterwards. The measurements and the processing introduce errors that can only be treated as random. We need to understand randomness, how to model it, how to control it when possible, and how to measure it and respond to it in any case. That does not mean studying t-tests, F-tests, etc; it means studying the foundations of statistics so that we can become effective problem solvers and decision makers in the face of chance. It also means learning some modern ideas of statistics, including exploratory data analysis and robust estimation as well as principles of constructing statistical models.
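
To make the first point in the list above concrete, here is a minimal sketch of how basic vector arithmetic can stand in for trigonometry in two everyday planar GIS tasks: the distance from a point to a line segment, and which side of a directed line a point falls on. It assumes projected (planar) x/y coordinates rather than latitude/longitude, and the function names are made up for this illustration; they do not come from any GIS package.

```python
import math


def sub(a, b):
    """Vector from point b to point a."""
    return (a[0] - b[0], a[1] - b[1])


def dot(a, b):
    """Dot product of two 2D vectors."""
    return a[0] * b[0] + a[1] * b[1]


def cross(a, b):
    """Z component of the 2D cross product (a signed area)."""
    return a[0] * b[1] - a[1] * b[0]


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    ab = sub(b, a)
    ap = sub(p, a)
    length_sq = dot(ab, ab)
    if length_sq == 0.0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(ap[0], ap[1])
    # Project p onto the line through a and b, then clamp to the segment.
    t = max(0.0, min(1.0, dot(ap, ab) / length_sq))
    closest = (a[0] + t * ab[0], a[1] + t * ab[1])
    d = sub(p, closest)
    return math.hypot(d[0], d[1])


def side_of_line(p, a, b):
    """Which side of the directed line a->b the point p lies on."""
    c = cross(sub(b, a), sub(p, a))
    if c > 0:
        return "left"
    if c < 0:
        return "right"
    return "on the line"


if __name__ == "__main__":
    a, b = (0.0, 0.0), (10.0, 0.0)
    print(point_segment_distance((3.0, 4.0), a, b))   # 4.0
    print(point_segment_distance((13.0, 4.0), a, b))  # 5.0
    print(side_of_line((3.0, 4.0), a, b))             # left
```

No angle, sine or cosine is computed anywhere: the dot product handles the projection and the sign of the cross product handles orientation.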