State of The BuzMap, 2019 Report

I have created a single map that lets you easily browse all bus, tram, metro, ferry and train routes. I’ve loaded some data from Europe, Israel and India. It’s called BuzMap, and here is an update.

I have done significant work over the last 3 months to get BuzMap closer to a usable product, to at least prove the scalability of the architecture and, I hope, to give a glimpse of where we can go with this. I have what I think is a viable user interaction model. Upon touching or hovering over a section of the map you (should, see bugs below) get the longest service that runs through that part of the map highlighted on the map itself, and an instance of its timetable in the margin. To move the mouse over to the margin, or manipulate the map, without crossing other lines and thus selecting them, click on the map to “hold” that route. Below the timetable should be all the other services that use that line. You can click on any of those and get that route highlighted along with a timetable instance. Click back on the map to resume browsing other routes. There is currently only one timetable instance.

Multi modal route planning is a well covered subject. BuzMap is something else. It’s a map. It is intended as a discovery tool that can direct you to detailed timetables and booking platforms elsewhere. It could easily be integrated as the background to a route finder or perhaps a hotel booking application, and be as reactive as the application requires.
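The selection rule described above (highlight the longest service through the touched section, list the rest beneath its timetable) can be sketched roughly as follows. This is only an illustration in Python, not the BuzMap client code; the `Service` type, its fields and the sample route IDs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    route_id: str      # hypothetical identifier
    length_km: float   # hypothetical total length of the service


def select_for_section(services):
    """Split the services through a map section into (highlighted, others).

    The longest service is highlighted; the others are listed under its
    timetable so the user can click through to them.
    """
    if not services:
        return None, []
    ranked = sorted(services, key=lambda s: s.length_km, reverse=True)
    return ranked[0], ranked[1:]


# Made-up services sharing one stretch of road.
through_here = [Service("local-12", 8.5), Service("express-7", 142.0), Service("night-3", 21.0)]
highlighted, others = select_for_section(through_here)
print(highlighted.route_id, [s.route_id for s in others])
```

Ranking by length is one simple way to pick a default; any other ordering could be swapped in through the sort key.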

The publication model works very well. I make extensive use of the amazing works of Graphhopper, Overpass and not least the MapBox box of tricks, especially the geojson to vector tile technology, which is preposterously fast. I can manufacture 2,500 tiles a second on an entire reprint thanks to the above software, and the simplification method works great. The heaviest tiles are in large metropolitan areas at level 10 and are under 60k. I believe this is the upper limit: no matter how dense your map is, the worst tiles will still be under 60k and most will normally be under 15k. We only need to reprint the entire map if we are making system wide changes, which right now is of course a common occurrence. For a single operator the data can be live in minutes.

I have loaded most of the European data I can find, plus the whole of Israel, and also Indian Railways. Due to the bugs mentioned below we are missing several, but not all, of the famous narrow gauge lines in India, and a lot of lines are broken on railways across the map from Tralee to Ledo. There is a particular railway in Sweden that links those highland lakes together which is completely missing despite the data being available. It is precisely the kind of service BuzMap is here to encourage you to investigate. The US and EU are where the vast bulk of the available data is, and perhaps where any viable business model might lie. I have at least been able to address many performance bottlenecks by tackling a data set as significant as 85,000 routes from over 500 different operators, with saturation coverage in five countries: Ireland, Holland, Sweden, Estonia and Israel.

I have yet to load much of the available GTFS data outside Europe. I haven’t even touched North America as I just don’t have the disk space left. I am going to need at least another 10TB to handle all currently existing GTFS without having to repeatedly delete and re-download the OSM data, the GTFS and the various stages of transformation, and still have room to run test suites. There are currently 8.3 million main map tiles in total, plus a further 316 million highlighting tiles. Every route has its own tile set. I imagine these numbers will easily double if we load all of North America and also Australia, which has a lot of data available, plus all the bits and bobs dotted about the south. Switzerland, which predictably has supplied its data at country level, failed to load and I haven’t had a chance to debug it, nor do I now have the disk space. As a railway enthusiast I regard Switzerland as hallowed ground. I’ll be making sure we’ve got every yard of metre gauge, narrow gauge and rack railway they’ve given us. I think we’ve even got cable cars in it.

Even if all the data currently held by the Google route-finder were in public view, most of the world’s transit networks across South America, Africa and Asia would still be blank. There is a lot of work to do to encourage both state and privately run bus companies in these regions to get their bus stops geocoded and present their timetables publicly. Some operators, such as Indian Railways, actually forbid the caching of their timetable data without a license, so I may be in breach of this. The India map is both incomplete and uses data that is at least 4 years old. I will gladly redact the times, perhaps replacing them with the number of hours and minutes it takes to travel, if that keeps BuzMap on the right side of IR rules.
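As a rough illustration of that redaction, clock times could be swapped for the time elapsed since the first stop. The sketch below is in Python and assumes GTFS-style `HH:MM:SS` strings, where the hour may run past 23 for trips that cross midnight. The function names and the sample times are made up for the example, and whether this would satisfy IR’s terms is a separate question.

```python
def gtfs_seconds(hms: str) -> int:
    """Parse a GTFS HH:MM:SS time; hours may exceed 23 for trips past midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def elapsed_times(departures: list[str]) -> list[str]:
    """Replace absolute times with hours and minutes elapsed since the first stop."""
    start = gtfs_seconds(departures[0])
    out = []
    for hms in departures:
        minutes = (gtfs_seconds(hms) - start) // 60
        out.append(f"{minutes // 60}h {minutes % 60:02d}m")
    return out


# Hypothetical overnight service: clock times become journey durations.
print(elapsed_times(["22:15:00", "23:40:00", "25:05:00"]))
# → ['0h 00m', '1h 25m', '2h 50m']
```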

Bugs, there are two serious ones:

  1. Route Merging. The consolidation of routes is working well topographically and we are getting a good map, at least on roads/buses. However, we are not merging all the Route IDs properly. I have a good idea why this is so, but at the time of writing it still isn’t working properly and I need a lot more disk space to re-run. I’ve also had to colour trams in as buses due to another merge inheritance failure. I don’t store the list of route IDs in the tiles, only a pointer/id to a group of them, i.e. a single value. If a new service adds no new geography it will have zero impact on the tile sizes. The list of routes will end up in the 100s for some sections, and while I claim this will have no impact on the tile sizes if we’ve added no more geography, the client UI will need refining.

  2. Train lines are more seriously broken. There are several reasons for this that I am aware of, relating to the assignment of possible platform lines/ways through stations. In a city like Glasgow, for instance, this is certainly a non-trivial task to do automatically; a radius search certainly won’t do the job. There is another problem in there, though, which I need to debug: there are railway lines broken, or missing entirely, for no apparent reason. I think I am going to have to build an interactive editing tool where candidate platform nodes can be selected by the cartographer. Add to that a relative speed check on the service to flag suspect shunting around cities, and a comparison of the calculated polyline length with the estimated point to point length of the route to highlight broken train lines, and it shouldn’t be too onerous a task to identify most of the problem areas. It’s feasible to do this level of work on railways; buses are too heavy and need to work entirely automatically.
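The length comparison mentioned in the second item could look something like the sketch below, which divides the length of the routed polyline by the summed straight-line distance between consecutive stops. It is only an illustration in Python: the function names and both thresholds are hypothetical and untuned, and the haversine formula stands in for whatever distance measure the real pipeline uses.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def path_km(points):
    """Total length of a path given as a list of (lat, lon) points."""
    return sum(haversine_km(p, q) for p, q in zip(points, points[1:]))


def detour_ratio(polyline, stops):
    """Routed polyline length divided by the stop-to-stop straight-line length."""
    direct = path_km(stops)
    return path_km(polyline) / direct if direct else float("inf")


def is_suspect(polyline, stops, min_ratio=0.9, max_ratio=2.0):
    """Flag geometry that is much longer or shorter than the stops suggest.

    A high ratio hints at shunting or a long detour; a ratio well below 1
    hints at missing geometry. Both thresholds are illustrative guesses.
    """
    ratio = detour_ratio(polyline, stops)
    return ratio > max_ratio or ratio < min_ratio


# Made-up coordinates: two stops one degree apart, and a routed line that
# wanders off on a three-sided detour between them.
stops = [(0.0, 0.0), (0.0, 1.0)]
detour = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(is_suspect(stops, stops), is_suspect(detour, stops))
```

Routes flagged this way would still need a human look, which fits the idea of reviewing railways by hand in an editing tool.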

There is some bizarre usage of ferries in Sweden by some bus routes, due to stations being reused as bus stops, something my importer needs to resolve. A comparison of speeds between sections will help detect these anomalies, e.g. where a bus actually does need to use the ferry but, due to a map encoding error, doesn’t and skirts the entire estuary instead. We’ve also got bus only lanes to be considered in the Graphhopper encoder. I am sure these are behind some of the craziness in St Petersburg, and Paris too.

Some of the highlight tiles didn’t get made before we hit 99% of the disk, so sometimes you get a timetable but no highlight. There are also “bugs” in the map itself where road junctions are not passable in one or both directions, e.g. some roundabouts and at least one checkpoint on the Israeli border which nevertheless has a bus service straight through it. These will require individual consideration and careful editing of the map itself, but can be detected simply by relative speed checking.
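A minimal sketch of such a relative speed check, in Python, might compare the speed implied by each section (routed distance divided by scheduled time) against the median for the whole route. The function name, the factor of 3 and the sample figures are assumptions for illustration only, and the sketch assumes every section has a positive scheduled time.

```python
from statistics import median


def suspect_sections(section_km, section_minutes, factor=3.0):
    """Return indexes of sections whose implied speed is far from the route median.

    section_km: routed distance of each stop-to-stop section.
    section_minutes: scheduled time for each section (assumed > 0).
    factor: hypothetical tolerance; a section is flagged when its implied
    speed is more than `factor` times faster or slower than the median.
    """
    speeds = [km / (mins / 60) for km, mins in zip(section_km, section_minutes)]
    typical = median(speeds)
    return [
        i for i, speed in enumerate(speeds)
        if speed > typical * factor or speed < typical / factor
    ]


# Made-up route: the third section is routed around an estuary (48 km)
# although the timetable only allows 15 minutes for it.
print(suspect_sections([5.0, 6.0, 48.0, 4.0], [10, 12, 15, 8]))
# → [2]
```

A section that is routed the long way round shows up as implausibly fast, which is how an impassable junction or a missing ferry link would reveal itself.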

The interaction doesn’t work very well yet on touch-screen/mobile devices: I’m only rarely getting an event from touching a line. I have tried several approaches and cannot reliably get an event. I am hoping to get help on this now we have something tangible. It is of course critical that it works well on smartphones. I’ve also just noticed I am losing events even with a mouse; sometimes, but not always, I have to wave the cursor over a line several times. The client is the least developed part so far, as for the most part it’s all Leaflet and vector tiles doing the work. It’s going to get re-written from the top before any significant attempt to launch this.

What Next 

My main concern is nailing the railways, which have a lot of failures, including the merge bugs also seen in buses. I have to get an encoder working for Graphhopper that will consume all roads including bus only ways, rail of all kinds as long as it is not “unused”, cable cars, cable trams, and ferries. Due to the nature of how the merging works, railways of any kind can, and need to, be treated as the same network type.

I need to enable ferries ASAP. They are just the kind of thing I envisage the map helping people see. I did implement a hack for them previously, which I avoided re-hacking into the latest rewrite. The Baltic coast in particular comes alive, and the Aegean would also look superb if the data were available anywhere.

We are going to need to do things like overlay place names, as I have scribbled all over them with my bus routes. Another cool trick would be to get the different networks, ferry, tram, metro, railway or road, to flip to the foreground as you touch them. Both of these are possible thanks to these amazing vector tiles. Once you get saturation coverage of a metropolis this becomes not just a gimmick but quite necessary, allowing us to pack far more information into just the one map and still have it work.

The refinement of the UI is going to be a challenge. This isn’t a multi modal route finder, it’s a map. Sure, as soon as the user has homed in on somewhere to investigate, they are obviously going to need that level of detail about exactly when and how often. I could attempt to run Open Trip Planner for everywhere, as I have the data, but it is a distraction from the job of building a working map. If BuzMap can establish itself as a default first point of investigation for any terrestrial travel, directing users to relevant local route finding platforms and booking engines, it will be serving us well.

Once I have obtained some more disk I will process all of the US and Canada data. New York, especially with its ferries, is going to be spectacular. Amtrak, which I have tackled previously, is a most important addition, as the US rail network is very heavily freight only. I think less than 20% of it actually takes passengers. I would also love to map Greyhound or any other national bus operators in the US if I could just locate their GTFS.

Most European countries have little bus data available, the UK in particular has virtually none currently visible. I have Transport for Greater Manchester but it didn’t get past the public domain GTFS loader. I have seen there is a project run by the UK government aimed at providing “local bus data in England .. by early 2020”.  This would be a great step forward and I would hope it bears results and grows to be a comprehensive listing of operators of any size across the UK. This data can effectively be held in a version control repository, as the Belgian rail group have done. I will use the Belgian and any other new data available in the next phase.

My keen interest is in South & Central America, Africa and Asia. For Africa I am aware of GTFS for Cairo, Tunisian and Algerian Railways, Accra, Nairobi and I think Cape Town but that’s about it. 

In Asia, a continent festooned with state run bus operators and home to a score of mega cities, the situation is arguably worse. I have next to nothing for Japan and less still for South Korea. We are in a position to build a bus map of the entire PRC if we can get hold of the data. Elsewhere, there are bus booking engines for long distance travel in south east and southern Asia that have the key data, so at that level we could build a map quite quickly. Manila has data, and Delhi is highly regulated, so we should be able to do this there. What I’d really like is for the likes of Himachal Pradesh or Nagaland to compile and then give us their State Bus Company’s GTFS.

Some of the major cities of Argentina, Brazil and Chile have data. There is an awful lot more that is possible everywhere. 

Special thanks to OpenCage for lending me a 32G machine so that I can run a continental sized Graphhopper instance and for their continued moral support and encouragement, Digital Ocean for giving me a free 8G machine for 6 months, and also to the Graphhopper group themselves and MapBox, without which none of this is possible. And of course everyone who has ever contributed to OpenStreetMap in any way.

Thank you for your interest.

Mark Lester