Posted year ago

Making our own tiny Google Maps

I wanted to write this post for a long time. In fact, I made a draft of it half year ago and never managed to publish it. But it finally did happen, and in this one, I will combine knowledge from literally every post I've written to this date on this blog. That's why I've posted those seemingly useless or unrelated tricks and algorithms (which you should definitely check out!).

What makes a map engine like Google Maps?

First, it involves conversion of data coordinates using Web Mercator. Then data is divided into square tiles (which works perfectly with the fact that Mercator makes the world a giant square). Each tile can be assigned different data with different complexity that can be either precalculated or queried dynamically from a special database. The rest is about determining which tiles user is seeing, downloading them from API and rendering.

This time I will skip tiling as it isn't necessary to see the result of the renderer. We will simply start with a dataset that is small enough to render it as a whole and make it zoomable and draggable (by the way, the name for such thing is a slippy map).

Data preparation

I will download a bunch of buildings surrounding my university. I've used Get Lat+Lon to find out coordinates of the center point and then calculated the bounding box of $200m \times 200m$.

(you can get degreesToMeters and metersToDegrees from Web Mercator projection article).

const center = degreesToMeters(50.068, 19.913)
const ne = metersToDegrees(center.x + 100, center.y + 100)
const sw = metersToDegrees(center.x - 100, center.y - 100)

Remember the bounding box format, it was: [bbox:south,west,north,east]. And we get it from:

const boundingBox = `${sw.lat},${sw.lng},${ne.lat},${ne.lng}`

And now the query. To start with something simple to build upon, I will go with just the buildings for now. Try it in overpass turbo:

const query = `
  [out:json][bbox:${boundingBox}];
  (
    way[building];
    relation[building];
  );
  (._;>;);
  out;`

Downloading

const cleared = query.replace(/\s+/g, '')
const command = `wget -O data.json http://overpass-api.de/api/interpreter?data="${cleared}"`

I am skipping a bit of details here, but hope you won't mind. Run the content of command either copy pasted to the terminal or wrapped in some NodeJS script.

Conversion to GeoJSON

Convert it to GeoJSON for convenience in rendering using the osmtogeojson.

osmtogeojson data.json > data.geojson

Normalization

Before proceeding, I will convert and normalize data a bit.

let minX = Infinity
let minY = Infinity

// For each polygon, stores its points as two-element arrays.
const coordinates = data.features
  .filter(f => f.geometry.type === 'Polygon')
  .map(f =>
    f.geometry.coordinates[0].map(p => {
      const { x, y } = degreesToMeters(p[1], p[0])
      if (x < minX) minX = x
      if (y < minY) minY = y
      return [x, y]
    })
  )

// Takes the previous array, 'normalizes' coordinates by subtracting
// minimum value on both axes and flattens each polygon's array.
const offseted = coordinates.map(polygon =>
  polygon.flatMap(p => [p[0] - minX, p[1] - minY])
)

Triangulation

Then I am triangulating the whole thing:

// Maps each polygon with an array of vertices from triangulating it.
const shapes = offseted.map(polygon => {
  const node = linkedList(polygon)
  const triangles = earCut(node)
  const vertices = triangles.flatMap(i => [polygon[2 * i], polygon[2 * i + 1]])
  return vertices
})

// Join everything together for rendering (because it's just triangles,
// as long as we use the same set of shaders, it can be one big blob
// sent to the GPU).
const vertices = new Float32Array(shapes.flat())

Renderer

In case something goes wrong, I am for now showing the wireframes like in the Wireframes with barycentric coordinates. The whole renderer is basically the same, so I won't cover it here in depth.

Making it scrollable and draggable

I am using some variables to help with calculations. Width is the gl.canvas.width divided by pixel ratio of the device. Height... you know. Zoom is changed by the scrolling. Factor is there to steer how fast the zoom works. Mouse is for storing mouse position on the screen.

Setup

let width = 0
let height = 0
let zoom = 0
let offset = { x: 0, y: -500 }
const factor = 5

Scroll event

When user scrolls, a bit of magic happens. First of all, zoom is updated by a constant rate (the constant is there to normalize it across different browsers; you never know).

Then happens update of offset. It is necessary to make the scroll look like expected, i.e. when we are scrolling over some point, that point stays under the cursor.

The key here is to notice that when we are scaling the world after zooming, the points between mouse cursor and the screen edge are also scaled. We can deduce that the solution is to offset the world back. Exactly by the difference between size in pixels of current mouse position and the unscaled one. In this case: $o = o - m(1.01^{z'} - 1.01^{z})$.

const handleScroll = event => {
  const sign = event.deltaY >= 0 ? 1 : -1
  const previous = Math.pow(1.01, zoom)
  zoom += sign * factor
  const delta = Math.pow(1.01, zoom) - previous
  offset.x -= event.offsetX * delta
  offset.y -= event.offsetY * delta

  event.preventDefault()
  requestAnimationFrame(render)
}

Setting cursors

To make it look nicer, I am setting cursors so that it indicates to the user that the map can be dragged.

// Somewhere close to top of the code
document.body.style.cursor = 'grab'

// ...

const handleMouseDown = () => {
  document.body.style.cursor = 'grabbing'
}

const handleMouseUp = () => {
  document.body.style.cursor = 'grab'
}

Mouse move

When user moves mouse, we want to apply offset. I am doing it only when the left mouse button is pressed. Note that by the use of event.buttons we can entirely skip listening to mousedown and mouseup. The information is available for us every time.

The offset must be also scaled to keep the point by which user drags under the cursor. Without that, the user will intuitively notice something is wrong. Try it.

const handleMouseMove = event => {
  if (event.buttons !== 1) return // If mouse is not pressed
  offset.x -= event.movementX * Math.pow(1.01, zoom)
  offset.y -= event.movementY * Math.pow(1.01, zoom)
  requestAnimationFrame(render)
}

Rerendering

Now a bit about why I've put requestAnimationFrame in two of previous handlers. This handy callback tells the browser to execute given function before next repainting of the screen. This way we are making sure that any change in the information important for us makes the browser refresh our map. And thanks to that, we don't have to constantly rerender the map just to heat up the user's GPU.

Keep in mind that each call to the requestAnimationFrame adds another request to run a function. This way we can run into a situation when while scrolling we will fire those faster than the browser can render them. But it's not a noticeable problem and I am leaving it for now.

I am also wrapping resize in a similar function:

const handleResize = () => {
  requestAnimationFrame(render)
}

Draw function

I am using a 4-dimensional matrix as in the 3D example. You can do as you want. I put scale in the projection. The typical camera for me is about translating the view by the offset. You could of course write it as inverse(translation(offset.x, offset.y, 0)), but here I am saving CPU cycles for the better times. Then, because the whole scene is flipped on the Y axis, I am scaling that one by $-1$ and moving all triangles a screen down.

const pX = width * Math.pow(1.01, zoom)
const pY = height * Math.pow(1.01, zoom)
const p = projection(pX, pY, 1)
const v = translation(-offset.x, -offset.y, 0)
const m = multiply(scaling(1, -1, 1), translation(0, -height, 0))
const matrix = multiply(p, v, m)

Event handlers

The last missing part is adding the event listeners. Just a bunch of them.

window.addEventListener('resize', handleResize, false)
window.addEventListener('mousewheel', handleScroll, false)
window.addEventListener('mousedown', handleMouseDown, false)
window.addEventListener('mouseup', handleMouseUp, false)
window.addEventListener('mousemove', handleMouseMove, false)

Result

You can see it on a separate page for convenience.

What's next?

You've entered an endless road to creating a perfect map service. The opportunities to sink more and more hours into it are endless. Here's what you can do:

Add roads and rivers

Add different types of objects rendered and fetched from the API. Roads, rivers, parks and forests are a way to go. You need to come up with a way to render them differently. You can go with separate shaders for each type, which gives the biggest flexibility if you want to add more sophisticated effects to them. Or you can go with a simple solution and sneak color information in the third and fourth dimension of points (that's a trick that graphic programmers love to do)

Simplify shapes

Have you noticed that some shapes are unnecessarily complicated while they are just some tiny background buildings? There are algorithms for simplifying shapes that could be used here.

Add tiling

This point is actually a bit of a trap. This involves two parts: having client for displaying and some server that will provide those tiles. Client needs to figure out what tiles user can currently see and fetch them from API. API has to find a way to serve them to the user. So it needs to either fetch them from some third party (but this would make it painfully slow), be limited to map that fits into the memory, or use a database (probably one with a good support for geospatial data).

There are also other aspects that you would have to find a balance on:

binary data or trivial to load but heavier for transfer JSON
triangulate server side (with cache) or make the client do it on the fly

Put labels on the map

One more thing that makes your map different from the Google Maps (maybe not the only, but this one will soon be very noticeable) is lack of text.

There's one huge hidden trap that came with taking the path of a WebGL renderer: there is no text here.

As we had to transform data into triangles ourselves, the same problem comes with text. Unfortunately this time, it will be much harder and broader topic.

If you want to get started with it, be advised that there are several approaches to the problem, but one that seems to work quite OK for maps is SDF text rendering (it is used for example by Mapbox).

Go 3D

Now that you've created a WebGL renderer, nothing stops you from going 3D. This involves fetching any useful data from OSM and using it to generate 3D shapes. You can find more in the article about Simple 3D buildings on the OSM wiki. There were already a lot of attempts (more or less successful) that you can take inspiration from.

Geocoding

It means translating input text like address to a latitude and longitude coordinates.

Check out Pelias, an open source geocoder built on top of Elasticsearch, part of the post-mapzen legacy.

Routing

Which means telling which sequence of roads gives the shortest driveable path between two places. For that you can have a look at, for example, OSRM.

Cool resources

Get Lat+Lon – handy service mentioned before that I used for checking coordinates of a point on the map.

overpass turbo – site visualizing results of Overpass API queries.

osmtogeojson – converter between OSM JSON and GeoJSON.

Map label placement in Mapbox GL – more on the topic of labels.

<- home

@tchayen

React (Native) dev. SECCLO @AaltoUniversity. He/him.

Twitter / Github