Simplify a LineString using the Ramer–Douglas–Peucker or Visvalingam-Whyatt algorithms

`pip install simplification`

Please use a recent (>= 8.1.2) version of `pip`

.

- Python 3.7
- Python 3.8
- Python 3.8 (Linux and macOS Darwin only)
- Python 3.9 (Linux and macOS Darwin only)
- Python 3.10 (Linux and macOS Darwin only)

- Linux (
`manylinux1`

-compatible) - macOS Darwin
- Windows 32-bit / 64-bit

```
import numpy as np
from simplification.cutil import (
simplify_coords,
simplify_coords_idx,
simplify_coords_vw,
simplify_coords_vw_idx,
simplify_coords_vwp,
)
# Using Ramer–Douglas–Peucker
coords = [
[0.0, 0.0],
[5.0, 4.0],
[11.0, 5.5],
[17.3, 3.2],
[27.8, 0.1]
]
# For RDP, Try an epsilon of 1.0 to start with. Other sensible values include 0.01, 0.001
simplified = simplify_coords(coords, 1.0)
# simplified is [[0.0, 0.0], [5.0, 4.0], [11.0, 5.5], [27.8, 0.1]]
# Using Visvalingam-Whyatt
# You can also pass numpy arrays, in which case you'll get numpy arrays back
coords_vw = np.array([
[5.0, 2.0],
[3.0, 8.0],
[6.0, 20.0],
[7.0, 25.0],
[10.0, 10.0]
])
simplified_vw = simplify_coords_vw(coords_vw, 30.0)
# simplified_vw is [[5.0, 2.0], [7.0, 25.0], [10.0, 10.0]]
```

Passing empty and/or 1-element lists will return them unaltered.

`simplification`

now has:

`cutil.simplify_coords_idx`

`cutil.simplify_coords_vw_idx`

The values returned by these functions are the **retained** indices. In order to use them as e.g. a masked array in Numpy, something like the following will work:

```
import numpy as np
from simplification.cutil import simplify_coords_idx
# assume an array of coordinates: orig
simplified = simplify_coords_idx(orig, 1.0)
# build new geometry using only retained coordinates
orig_simplified = orig[simplified]
```

You can use the topology-preserving variant of `VW`

for this: `simplify_coords_vwp`

. It's slower, but has a far greater likelihood of producing a valid geometry.

No problem; Decode them to LineStrings first.

```
# pip install pypolyline before you do this
from pypolyline.cutil import decode_polyline
# an iterable of Google-encoded Polylines, so precision is 5. For OSRM &c., it's 6
decoded = (decode_polyline(line, 5) for line in polylines)
simplified = [simplify_coords(line, 1.0) for line in decoded]
```

FFI and a Rust binary

I should think so.

Using `numpy`

arrays for input and output, the library can be reasonably expected to process around 2500 1000-point LineStrings per second on a Core i7 or equivalent, for a 98%+ reduction in size.

A larger LineString, containing 200k+ points can be reduced to around 3k points (98.5%+) in around 50ms using RDP.

This is based on a test harness available here.

All benchmarks are subjective, and pathological input will greatly increase processing time. Error-checking is non-existent at this point.

Get A Weekly Email With Trending Projects For These Topics

No Spam. Unsubscribe easily at any time.

Python (1,141,503)

Geospatial (825)

Geo (316)

Computational Geometry (250)

Rdp (132)

Related Projects