Scraping 241 UK council planning portals – 2.6M decisions so far (news.ycombinator.com)
UK planning data is technically public. In practice it's locked behind 400+ different council portals, some still running bespoke ASP.NET that looks like it dates from 2004, some behind AWS WAF, all with subtly different schemas. I've spent four months scraping them. I'm now at 241 councils and 2.6 million decisions across England, Scotland and Wales.
The scraping problem
Most UK councils run one of a handful of portal systems, Idox being the most common. In theory this makes things easy. In practice every council has configured theirs differently: some block non-browser requests via TLS fingerprinting, some have rate limits that will get you banned inside 10 minutes, and a handful are running the aforementioned bespoke ASP.NET.
I ended up writing several scrapers: a standard requests-based one, a Playwright-based one for councils that block anything that doesn't look like a real browser, and a curl_cffi one for TLS fingerprinting. Some councils I still can't get. Liverpool's portal sits behind AWS WAF with a JavaScript challenge. I have a working Playwright-based scraper that solves the challenge once and reuses cookies, but the WAF rate-limits the IP after about 10 requests and then blocks me for a day. So I have 60k Liverpool decisions from an old scrape and no easy way to add more.
What I found
The approval rate stuff is what most people come for. Nationally it's around 88%, but it varies wildly by ward within a council, not just between councils.
The more interesting finding came from the time-to-decision data. Across 119 English and Welsh councils, 36.5% of home extension applications missed the statutory 8-week target in 2025, up from 27.9% in 2019. Guildford is the worst at scale: 66% of decisions over target, averaging 13.3 weeks.
What it is now
A postcode checker (free) and paid PDF reports (£19/£79). Zero paying customers so far, which is fine. I've been heads down on data quality and coverage.
Site is planninglens.co.uk if you want to poke around. AMA on the scraping side – that's where the interesting problems are.
I understand wanting to get money, but honestly, there is no way I would give money to this website in its current state; you are giving me far too little info before asking me to hand over a credit card.
Then, if someone gives you £19 (honestly a crazy amount of money), the last page of the report is an advert asking for four times more!
I don't know if I would pay £19 for a general state-of-the-area report. I would almost certainly have paid £100-300 for a service that took my planning application, critically reviewed it and told me which aspects were and were not likely to pass, with references to specific examples within my local area.
Heads up tho... it's behind AWS WAF with a JS challenge. Solving the challenge once works fine, but the WAF rate-limits the IP after ~10 requests and blocks for the rest of the day. So getting a session is doable, getting through 80,000 decisions is the hard bit. If you crack it I want to know! Cheers.
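A toy model of that budget, using the numbers above (the challenge-solving and cookie reuse are elided; this just tracks when a session has burned its ~10 requests and entered the day-long block):

```python
# Illustrative sketch only: track the WAF's per-IP request budget so the
# scraper stops before triggering the block rather than after.
import time

class WafBudget:
    def __init__(self, limit: int = 10, block_seconds: int = 86_400,
                 clock=time.monotonic):
        self.limit = limit                # requests allowed per window
        self.block_seconds = block_seconds  # penalty once the limit is hit
        self.clock = clock                # injectable for testing
        self.used = 0
        self.blocked_until = 0.0

    def allow(self) -> bool:
        """True if it is safe to send another request right now."""
        now = self.clock()
        if now < self.blocked_until:
            return False
        if self.used >= self.limit:
            self.blocked_until = now + self.block_seconds
            self.used = 0
            return False
        self.used += 1
        return True
```

With an 80k-decision backlog and ~10 requests a day, the arithmetic is the real blocker, which is why rotating IPs or an official data route matters more than cleverer cookie handling.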
On the grind, why not get an agent to help you build the long tail of deterministic scrapers? Claude etc is really shockingly good at this kind of moderate-complexity iterative work, it will just keep going around the fetch/parse/understand loop until it has what you're looking for.
It would be good to add appeal data in (also a public gateway) to show which councils are just being unreasonable.
I personally think the planning regulations in this country are the cause of many ills, including the housing shortage. It just costs so much to get through planning these days, it is often just not worth it. Data like this could help us get that changed.
You guys have all kinds of pro-individualistic, borderline nonsensical residential housing laws like "right to light" and "right to view". It's completely incompatible with "build more". Most British people view their privacy (or perceived privacy) as a higher priority than fixing the housing market. "It's so overlooked" is such a common comment, and it's almost bizarre to someone used to living in a higher-density environment (which the UK very much is).
> You may not use automated tools to scrape, copy, or bulk-download data from our service.
Pot kettle, huh.
For the more challenging scrapes, I'd highly recommend using the Chrome DevTools MCP so you can attach the network requests the browser actually makes to the site as context for your agent/LLM chat. This approach really helped me write a solid API-based scraper (also using curl_cffi) and let me drop the tedious Playwright-based approach I used to rely on.
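Replaying a captured browser request roughly means turning the HAR-style entry from the network tab into kwargs for your HTTP client. A sketch, with the dict shape assumed; cookie and HTTP/2 pseudo-headers are dropped since the client manages those itself:

```python
# Hypothetical helper: convert one HAR-style network-tab entry into the
# method/url/headers an API client (requests, curl_cffi, ...) can replay.
def har_to_request(entry: dict) -> dict:
    req = entry["request"]
    headers = {
        h["name"]: h["value"]
        for h in req["headers"]
        # skip cookies (session-managed) and ":authority"-style pseudo-headers
        if not h["name"].lower().startswith(("cookie", ":"))
    }
    return {"method": req["method"], "url": req["url"], "headers": headers}
```

Once the portal's underlying search endpoint is captured this way, you can usually skip rendering the page at all.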
Careful not to expose the councils too publicly before they shut you off
There's the Royal Town Planning Institute; they probably have a magazine you could advertise in (but equally that might get you blocked, idk).
RICS people could probably use the data too? I guess it's useful house-buyer info; houses in the vicinity had successful loft conversions, say.
On the data side - it's something of a moat for you now, but I could see you being successful with FOI requests. An MP might be interested in championing open data access.
I appreciate that won't necessarily capture live / recent data. But it might be quicker than waiting for rate-limits to reset.
> UK planning data is technically public.
it's public, but still copyrighted by those who submitted it
the councils also have database rights over their database, unless you've obtained explicit permission from them directly
https://en.wikipedia.org/wiki/Database_right#United_Kingdom
> I ended up writing several scrapers: a standard requests-based one, a Playwright-based one for councils that block anything that doesn't look like a real browser, and a curl_cffi one for TLS fingerprinting.
so they're explicitly trying to stop you doing this, and ... you're openly admitting to bypassing their technical measures to try and stop you?
have you heard of the Computer Misuse Act?
I doubt the 240 councils are going to be happy once they find out you've done this, especially if you're selling it on for profit
See also the open addresses project by Data Adaptive [1] which is using Freedom of Information requests to publish public council tax address data. The problem they have run into there is that their address datasets are derived from proprietary Ordnance Survey data.
It looks like data.gov.uk is in the process of standardising the planning application process, and publishing them under OGL [2].
[1]: https://www.owenboswarva.com/blog/post-addr44.htm [2]: https://www.planning.data.gov.uk/dataset/planning-applicatio...
I did a search for my postcode and got given results for a different area and council miles away
The script was extracting prefixes from individual application addresses, and Ceredigion's database happened to have a chunk of records with HR* postcodes (data entry errors at source. The addresses are Welsh places like Aberystwyth and Borth, but with Hereford postcodes attached). Those errors polluted the lookup table, so HR* ended up pointing at Ceredigion.
Fixed by trusting only the manual postcode-to-council mappings for councils that have them, rather than supplementing from address extraction. The postcode you sent now correctly shows "not yet in coverage" rather than results from rural Wales. Cheers for flagging. That was a zinger of a bug.
1. Brilliant! Governments (and corps) treat public data like it’s theirs not ours. Information yearns to be free.
2. Having said that, you are likely violating T&Cs by scraping at all.
3. It is a lot easier to defend your position if you are making it free and public yourself.
4. But paying for food is nice
5. I suggest the business model here is providing architects and lawyers with strong evidence of prior planning decisions nationally
Most people applying for (difficult) planning have experience locally. But the planning system is a mess because it is not coherent nationally or regionally. The win here is not providing a copy of your data (that has legal issues) but providing pointers to decisions that support the case of the person paying you.
So I want to turn an old pub into tasteful housing and a cafe for the local village. The local planning team don’t like it, I could spend money bribing them and the councillors (see how much I understand British democracy) or I could get from you the fifteen pub to housing conversion decisions from around the country and use that to help my bribed councillors defend their u-turn
Everyone wins :-)
But it's a big mindset change (one that will benefit the whole country), and it's slow.
I think the “push for public policy improvements” angle if genuine will get you a lot more respect and kudos when things get sticky. Good luck
No one has figured out how to make money off open source while sticking to its basic principles (Jeff Bezos makes a fortune off of it).
Most people who open source their code that I have known and still wanted to be paid / recognised for their effort have always been disappointed
Can I suggest you mentally put the work you have done to date into a box marked “the past”, open the data, start yourself as part of the community trying to make government code and data open, and sell your skills - the old “consultancy paying the bills” approach
Trying to make cash off public data will just confuse the message, and start to build resentment. Make a clean and clear statement, Sell your services on top. Expand to other forms of data scraping in government.
It’s a tough road - good luck
A few weeks ago, I'd have said the SaaS product is the play, but now I'm not so sure. Cheers for taking the time.
If you really want the data, just FOI it for goodness' sake.
I get the distinct impression that many of these outfits aren't really advocating for improved transparency but are simply trying to exploit and monetise illicitly obtained government data to make a quick buck.
I'm not implying that anything would get deliberately redacted, but it seems likely that information released through other channels would not match the web. A request might also reveal information that was not on the web.
What other choices are there?
The open source angle is something I'm increasingly considering, especially after a local government IT person made a fair point on this thread about the strain it causes. It won't fix the scraping load directly, but might frame the project as public-interest rather than attempting to make a bit of extra money. Tbh, the real value is probably in serving property developers and consultants, not emailing £19 PDFs to homeowners. Got a lot to think about.
I am not experienced at this at all, but I managed to get around 107k from Wirral. The parsing afterwards was a bit crap. I could probably dig out the code from somewhere. The approach I took was quite long-winded:
I created a txt file with ID numbers 100000 to 200000 (or similar), then I think I used Playwright to pick a random number, scrape it, and write the details to a CSV. Then in another script I checked the CSV to see whether the decision date was present, and if it wasn't, repopulated the txt file. A massive pain, and of course it changes every week!
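That re-queue step boils down to a "which IDs still lack a decision date" check. A rough sketch of the idea, with assumed CSV column names (`id`, `decision_date`):

```python
# Illustrative sketch: given the candidate ID range and the results CSV,
# work out which IDs still need scraping (no decision date recorded yet).
import csv
import io

def ids_to_retry(id_range, csv_text: str) -> list[int]:
    done = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row.get("decision_date"):          # blank date = not decided/scraped
            done.add(int(row["id"]))
    return [i for i in id_range if i not in done]
```

Regenerating the worklist from the CSV each run, instead of maintaining the txt file by hand, also copes with the weekly changes automatically.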
I can think of a good use case and people who would be interested in this data, though. I would love to see your work, especially for Idox. Don't give up; as others have said, change your business model. Members of the public are not your audience.
I don't think so. Look at selling the whole database rather than one line item; you will probably want copies of the PDF decision notices etc. as well. Oh, and to the people who say you should do an FOI... ha! You would need to FOI every single house and street in the borough, and the work that would create for the council would be exponential. Councils charge businesses to access this free data. They should make the data available as a free hosted download; then all this would stop.