A tutorial for performing fast point-in-polygon tests with Python's Shapeley and Geopandas.
by Martin D. Maas, Ph.D
Last updated: 2021-05-04
On this page:
Performing point-in-polygon queries is one of the most commonplace operations in geospatial analysis and GIS. As Python is a great tool for automating GIS workflows, being able to solve this basic problem of computational geometry as efficiently as possible is often a prerequisite for further analysis.
How to check if a point is inside a polygon in Python
To perform a Point in Polygon (PIP) query in Python, we can resort to the Shapely library's functions .within(), to check if a point is within a polygon, or .contains(), to check if a polygon contains a point.
Here's some example code on how to use Shapely.
from shapely.geometry import Point, Polygon
# Create Point objects
p1 = Point(24.82, 60.24)
p2 = Point(24.895, 60.05)
# Create a square
coords = [(24.89, 60.06), (24.75, 60.06), (24.75, 60.30), (24.89, 60.30)]
poly = Polygon(coords)
# PIP test with 'within'
p1.within(poly) # True
p2.within(poly) # False
# PIP test with 'contains'
poly.contains(p1) # True
poly.contains(p2) # False
To perform a Point in Polygon (PIP) query in Python, we can resort to the Shapely library's functions .within(), to check if a point is within a polygon, or .contains(), to check if a polygon contains a point.
Here's some example code on how to use Shapely.
from shapely.geometry import Point, Polygon
# Create Point objects
p1 = Point(24.82, 60.24)
p2 = Point(24.895, 60.05)
# Create a square
coords = [(24.89, 60.06), (24.75, 60.06), (24.75, 60.30), (24.89, 60.30)]
poly = Polygon(coords)
# PIP test with 'within'
p1.within(poly) # True
p2.within(poly) # False
# PIP test with 'contains'
poly.contains(p1) # True
poly.contains(p2) # False
How to get a list of points inside a polygon in Python
One of the simplest and most efficient ways of getting a list of points inside a polygon with Python is to rely on the Geopandas library and perform a spatial join. To do this, we simply create one geodataframe with points and another one with polygons, and join them with 'sjoin'.
In this example, we will use, as polygons, the shapefile of US counties and a list of random points, uniformly distributed in the lat-long region of (23,51)x(-126,-64).
Let's begin by importing all the libraries we are going to need
import geopandas as gpd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt from shapely.geometry
import Point
Now we open a shapefile containing our polygons
# Open the shapefile counties = gpd.GeoDataFrame.from_file('cb_2018_us_county_20m.shp')
That's it for the polygons, let's move on now to generate a GeoDataFrame with some points
# Generate random points
N=10000
lat = np.random.uniform(23,51,N)
lon = np.random.uniform(-126,-64,N)
# Create geodataframe from numpy arrays
df = pd.DataFrame({'lon':lon, 'lat':lat})
df['coords'] = list(zip(df['lon'],df['lat']))
df['coords'] = df['coords'].apply(Point)
points = gpd.GeoDataFrame(df, geometry='coords', crs=counties.crs)
Finally, let's perform the spatial join:
# Perform spatial join to match points and polygons
pointInPolys = gpd.tools.sjoin(points, counties, op="within", how='left')
That's it, we have just matched points and polygons!
Now, let's use this to retrieve the points from a specific polygon, as follows:
# Example use: get points in Los Angeles, CA. pnt_LA = points[pointInPolys.NAME=='Los Angeles']
That's it, now let's plot everything
# Plot map with points in LA in red
base = counties.boundary.plot(linewidth=1, edgecolor="black")
points.plot(ax=base, linewidth=1, color="blue", markersize=1)
pnt_LA.plot(ax=base, linewidth=1, color="red", markersize=8)
plt.show()
This will produce the following image:
Retrieving points in a specific polygon from a list of points-in-polygonsYou can download the complete script here
One of the simplest and most efficient ways of getting a list of points inside a polygon with Python is to rely on the Geopandas library and perform a spatial join. To do this, we simply create one geodataframe with points and another one with polygons, and join them with 'sjoin'.
In this example, we will use, as polygons, the shapefile of US counties and a list of random points, uniformly distributed in the lat-long region of (23,51)x(-126,-64).
Let's begin by importing all the libraries we are going to need
import geopandas as gpd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt from shapely.geometry
import Point
Now we open a shapefile containing our polygons
# Open the shapefile counties = gpd.GeoDataFrame.from_file('cb_2018_us_county_20m.shp')
That's it for the polygons, let's move on now to generate a GeoDataFrame with some points
# Generate random points
N=10000
lat = np.random.uniform(23,51,N)
lon = np.random.uniform(-126,-64,N)
# Create geodataframe from numpy arrays
df = pd.DataFrame({'lon':lon, 'lat':lat})
df['coords'] = list(zip(df['lon'],df['lat']))
df['coords'] = df['coords'].apply(Point)
points = gpd.GeoDataFrame(df, geometry='coords', crs=counties.crs)
Finally, let's perform the spatial join:
# Perform spatial join to match points and polygons
pointInPolys = gpd.tools.sjoin(points, counties, op="within", how='left')
That's it, we have just matched points and polygons!
Now, let's use this to retrieve the points from a specific polygon, as follows:
# Example use: get points in Los Angeles, CA. pnt_LA = points[pointInPolys.NAME=='Los Angeles']
That's it, now let's plot everything
# Plot map with points in LA in red
base = counties.boundary.plot(linewidth=1, edgecolor="black")
points.plot(ax=base, linewidth=1, color="blue", markersize=1)
pnt_LA.plot(ax=base, linewidth=1, color="red", markersize=8)
plt.show()
This will produce the following image:
You can download the complete script here
No comments:
Post a Comment