Some thoughts:

1. Brilliant! Governments (and corps) treat public data like it's theirs, not ours. Information yearns to be free.
2. Having said that, you are likely violating T&Cs by scraping at all.
3. It is a lot easier to defend your position if you are making it free and public yourself.
4. But paying for food is nice.
5. I suggest the business model here is providing architects and lawyers with strong evidence of prior planning decisions nationally.
Most people applying for (difficult) planning have experience locally. But the planning system is a mess because it is not coherent nationally or regionally. The win here is not providing a copy of your data (that has legal issues) but providing pointers to decisions that support the case of the person paying you.
So I want to turn an old pub into tasteful housing and a cafe for the local village. The local planning team don't like it. I could spend money bribing them and the councillors (see how much I understand British democracy), or I could get from you the fifteen pub-to-housing conversion decisions from around the country and use that to help my bribed councillors defend their u-turn.

Everyone wins :-)
Cheers, appreciate the feedback. The architect/consultant precedent angle is interesting and a couple of other commenters have already nudged me in similar directions. Tbh... you're likely right that the strongest commercial play isn't B2C £19 reports, it's giving someone fighting a contested case the national pattern across 15 similar pub conversions, the appeal outcomes, what stuck and what didn't. That's a very different product to what I have now but the data supports it.
On the T&Cs/legal stuff... I'm not going to pretend I have perfect clarity on it. The position I'd defend is that the data is statutorily public, councils are required by law to publish it, I respect rate limits, and I'm aggregating not republishing in bulk. But there is this grey area between data being public to view and being usable for a commercial product, and I haven't fully nailed it down.
I agree on the “public” data issue - I spent a long time campaigning for better FOSS / data access in government and there are some great people pushing in and outside local and central gov.
But it's a big mindset change (one that will benefit the whole country), and it's slow.
I think the “push for public policy improvements” angle if genuine will get you a lot more respect and kudos when things get sticky. Good luck
As somebody who works in local government IT, constant scraping of our data like this is the bane of our lives. We get hit by thousands of these scrapers, many with no rate limiting, making hugely intensive requests that cause downtime and knock-on effects for actual customers and citizens. We block IPs, add captchas, and yet it persists.
If you really want the data, just FOI it for goodness' sake.
I get the distinct impression that many of these outfits aren't really advocating for improved transparency but are simply trying to exploit and monetise illicitly obtained government data to make a quick buck.
Fair points and yeah you must be sick of unrate-limited mass scraping. I run with 1.5-3 second delays from a single residential IP and back off when portals push back, but from your side I look the same as someone hammering you.
On your point regarding FOI, what you say is fair. I should probably have led with that for the trickier councils. The honest reason I haven't is that firing off 240 FOI requests at scale felt like it'd put a different kind of strain on councils. But if you're telling me the scraping is worse, then I take that seriously.
On "monetise illicitly obtained data"... I'm not going to pretend the £19 is altruism. But there is a public interest in this data being navigable across council boundaries, and that's not something individual councils can deliver. I must stress that I'm not sure I've got the model right yet and a lot of feedback today is pushing me toward more free, which I'm seriously considering.
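For illustration, the delay-and-back-off behaviour described here could be sketched roughly like this (a simplified hypothetical, not the real scraper; `fetch` stands in for whatever client actually makes the request, and the thresholds are made up):

```python
import random
import time

def backoff_delay(attempt: int, base: float = 1.5, cap: float = 60.0) -> float:
    """Exponential back-off with jitter, capped: ~1.5s, up to 3s, 6s, 12s...
    Jitter between the base and the capped exponential stops many retrying
    clients from synchronising and hitting a portal in lockstep."""
    delay = min(cap, base * (2 ** attempt))
    return random.uniform(base, delay)

def polite_fetch(fetch, url: str, max_attempts: int = 5):
    """Call `fetch(url)`; back off and retry when the portal pushes back
    (429 Too Many Requests / 503 Service Unavailable)."""
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if status not in (429, 503):        # portal is happy
            return body
        time.sleep(backoff_delay(attempt))  # portal pushed back: wait longer
    raise RuntimeError(f"gave up on {url} after {max_attempts} attempts")
```

The point of the shape is that "back off when portals push back" is a property of the fetch loop, not of any one council's scraper.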
Maybe I'm just naive, but why wouldn't a citizen do both?
I'm not implying that anything would get deliberately redacted, but it seems likely that information released through other channels would not match the web. A request might also reveal information that was not on the web.

What other choices are there?
So, this sounds exciting to me, but the postcode checker really feels like spam as a user. All it tells me is 'Mixed results'. I could make a website that prints 'mixed results'; I bet most results are 'mixed'!
I understand wanting to get money, but honestly, there is no way I would give money to this website in its current state. You are giving me far too little info before asking me to hand over a credit card.
Then, if someone gives you £19, a crazy amount of money honestly, the last page of the report is an advert to give them 4 times more!
Really useful feedback, cheers. Yeah, "Mixed results" is kinda rubbish as you say. It should give you something concrete before asking for anything. I'll fix that today.
Fair point on the £79 upsell at the end of a £19 report too. That's tone deaf and I'll move it.
On the £19... I'll think about it, but you're right the site needs to do more to justify the spend before pulling out a card. Appreciate the honest take!
Just a quick follow up: if my reply seemed very harsh, view that as a sign of how enthusiastic I was to see the website at first. I understand wanting to make money, but I'd seriously consider giving a lot more (maybe even the basic report stuff) away for free. I'd love to explore my local area, my parents', and be nosey about what life is like in Oxford (a place I previously lived), but even if I was willing to pay (I'm not), having to stop, get a PDF, and download it really breaks the flow.
No, that's absolutely a fair follow-up and not harsh at all. It's very helpful. The "be nosey about places you used to live" use case is exactly what the postcode tool should serve (thinking about it), and right now it doesn't. You're right that PDF-downloads break flow badly. Tbh... that's a hangover from the "people want a thing they can save" assumption that I'm still stuck in, I guess.
I'm still on the fence about giving the paid reports away wholesale, but the gap between "tells you nothing" and "£19 PDF" is way too big. I'm gonna need a middle layer of free but actually useful exploration on the site. Will have a solid think about this today. Appreciate the feedback!
I'm also enthusiastic, it's not often you see people find a genuinely underserved niche and you have.
I don't know if I would pay £19 for a general state-of-the-area report. I would almost certainly have paid £100-300 for a service that took my planning application, critically reviewed it and told me which aspects were and were not likely to pass, with references to specific examples within my local area.
Thanks, honestly that means a lot! Yeah, the pre-submission review idea is interesting and I've thought about it. I have the data to surface "applications similar to yours in your ward, here's what got approved and what didn't" but I haven't built it as a workflow because it requires the user to upload their plans... and that's a different kind of trust ask, but yeah, it is definitely worth revisiting. £100-500 is also a much more honest price for something that genuinely changes a decision. £19 is in the awkward "too much for curiosity, too little for stakes" zone you and the other commenter are both pointing at.
Just checking, are you using an LLM to reply? Your replies are riddled with things LLMs are good at, like making quoted analogies that make no sense. They're not even analogies
What benefit would people gain from the reports? Average rate of success/time is interesting, but I'm not sure what you'd do with this information other than a bit of local press discourse. I suppose it's nicely timed for the council elections?
Honest answer... I don't fully know, zero paying customers so it's still very much a hypothesis. The two use cases I think hold up: (1) people pre-buying a house with extension potential, who otherwise guess or pay £500+ for a planning consultant; (2) homeowners about to commission £2-5k of architect drawings who want a sanity check before proceeding. Someone else suggested £100-500 for a proper pre-submission review which is probably better for that second case than my £19 report. The "general state-of-area" framing is the weakest one and you're right it's mostly local press discourse — that's marketing not revenue.
I work with public data, and I'd love to get access to this data, but I suspect that although you have scraped the data from public websites, there are licensing and copyright implications for actually using it.
See also the open addresses project by Data Adaptive [1], which is using Freedom of Information requests to publish public council tax address data. The problem they have run into there is that their address datasets are derived from proprietary Ordnance Survey data.
It looks like data.gov.uk is in the process of standardising the planning application process, and publishing applications under OGL [2].

[1]: https://www.owenboswarva.com/blog/post-addr44.htm
[2]: https://www.planning.data.gov.uk/dataset/planning-applicatio...
Thanks and yeah, some of my boundary data (for the choropleth) comes from ONS open boundary files which I think are OGL but I'd need to check the chain of derivation. On the data.gov.uk standardisation, I've seen it but last I looked it was policy and boundaries, not actual decisions. Has that changed? If they're publishing decisions under OGL I'd gladly ditch the scraping for a proper feed.
On licensing more generally... I haven't fully nailed it down. Showing aggregates and pointing back to source, but yeah there's a gap between "data is public" and "do whatever you want with it commercially".
Great site. This data should really be more accessible. Planning in the UK is a total crapshoot, subject to the whims of the planning authorities. In our case, a simple rear extension and dormer loft conversion, similar to hundreds of thousands across the country, we ended up having to appeal which added 2 years and tens of thousands of pounds in costs to our extension project. Our area shows up as a high refusal area, which tracks.
It would be good to add appeal data in (also a public gateway) to show which councils are just being unreasonable.
I personally think the planning regulations in this country are the cause of many ills, including the housing shortage. It just costs so much to get through planning these days, it is often just not worth it. Data like this could help us get that changed.
Maybe a tongue-in-cheek comment but regulations are that way because you guys want it that way (maybe not you personally). If it wasn't like that, nothing would stop a garbage incinerator or a quarry popping up a few hundred meters from houses (which happens in European countries with more deregulated planning/zoning regulations).
You guys have all kinds of pro-individualistic, borderline nonsensical residential housing laws like "right to light" and "right to view". It's completely incompatible with "build more". Most British people view their privacy (or perceived privacy) as a higher priority than fixing the housing market. "It's so overlooked" is such a common comment, and it's almost bizarre to someone used to living in a higher density environment (like the UK very much is).
Waste disposal and planning for quarrying and mineral extraction are different functions, decided at a higher tier of local government, and are not directly comparable to development management/planning.
This is awesome! Worked on something similar albeit a different industry.
For the more challenging scrapes, I'd highly recommend using the Chrome DevTools MCP to attach the network requests being made by the browser to the site as context for your agent/LLM chat. This approach really helped me write a solid API-based scraper (also using curl_cffi) and bypassed the old tedious Playwright-based approach I used to rely on.
Nice thinking. Hadn't thought of DevTools MCP that way. Curl_cffi I've used for TLS fingerprinting (Edinburgh was the first one) but the discovery side I've been doing manually... open DevTools, look at the request, copy as cURL, work out which params can be pruned. Automating that loop with an LLM in the middle would speed things up a lot, especially for the bespoke long tail. Will look into that this week. Thanks!
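The "work out which params can be pruned" step is mechanical enough to script. A rough sketch (the `still_works` callable is illustrative: in practice it would replay the copied cURL request with a candidate param set and check whether result rows still come back):

```python
def prune_params(params: dict, still_works) -> dict:
    """Greedily drop query params one at a time, keeping only the ones the
    endpoint actually needs. `still_works(candidate)` replays the request
    with the candidate param dict and returns True if results still load."""
    needed = dict(params)
    for key in list(params):
        candidate = {k: v for k, v in needed.items() if k != key}
        if still_works(candidate):
            needed = candidate  # the endpoint didn't need this param
    return needed
```

Greedy one-at-a-time removal isn't guaranteed minimal if params interact, but for typical search endpoints it gets you from a 15-param "copy as cURL" blob to the two or three that matter.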
Have you spoken to any planners? A quick search for similar applications in other LAs might be a useful thing for them.
There's the Royal Town Planning Institute (RTPI); they probably have a magazine you could advertise in (but equally that might get you blocked, idk).
RICS people could probably use the data too? I guess it's useful house-buyer info; houses in the vicinity had successful loft conversions, say.
On the data side - it's something of a moat for you now, but I could see you being successful with FOI requests. An MP might be interested in championing open data access.
All good points. I've been so busy with the data collection and just "irl stuff" that I haven't spoken to planners directly which is an oversight on my part — they're the obvious power users. RTPI/RICS are both on my list but as I said, I've been focusing on data more than distribution. Probably the wrong order tbh.
FOI is interesting, especially for the trickier portals (Liverpool's WAF, the dead-portal ones). It might be cleaner than scraping. MP/open-data angle is definitely something I hadn't seriously considered. Worth thinking about tho! Thanks.
It is the most ridiculous situation with council technology that they all use different providers for what are fundamentally the same functions. It's the same for council tax and a host of other services as it is for planning. Consequently, at least from the various portals I've used, they all do it badly. This absolutely could and should be done by a single, well funded central team.
GDS was a national effort, and it certainly did a better, albeit not perfect, job than the myriad of private solutions councils use. There just doesn't appear to be the capability to properly specify and source IT at a council level.
> I ended up writing several scrapers: a standard requests-based one, a Playwright-based one for councils that block anything that doesn't look like a real browser, and a curl_cffi one for TLS fingerprinting.
so they're explicitly trying to stop you doing this, and ... you're openly admitting to bypassing their technical measures to try and stop you?
have you heard of the Computer Misuse Act?
I doubt the 240 councils are going to be happy once they find out you've done this, especially if you're selling it on for profit
Fair points and I appreciate the feedback.
Database right is real but the threshold is "substantial part". I'm literally only showing aggregates and letting people search by postcode. I'm not completely republishing council databases. Think that's defensible, but not gonna pretend that it's 100% black and white.
On CMA, I'd push back. That's about unauthorised access. These portals are public-facing and the data's published deliberately for people to view. Rotating user-agents isn't bypassing security in any meaningful way... I'm not breaking auth or guessing passwords. I back off when portals signal they're unhappy (Liverpool's WAF actively rate-limited me which is why that data's stale).
No council has reached out so far. Could change ofc. Solo founder with no legal team though, so happy to be told I've got it wrong.
Thanks for the feedback. On TOS: the same answer as I gave others... the data is statutorily public, I respect rate limits. That being said, I admit it's a grey area I haven't 100% nailed down.
The postcode bug is more concerning. That shouldn't happen. Do you mind sharing which postcode or city/county? It could be that it's falling back to the wrong council because I don't have data for the right one, or it's a bug in my mapping. Either way, it needs fixing asap! Cheers for flagging.
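One way to make that failure mode explicit rather than silent, sketched with a toy postcode table and made-up coverage data (none of this is the real mapping code):

```python
# Councils we actually have scraped data for, keyed by local authority code.
COVERED = {"E07000010": "Fenland"}

# Outward postcode -> local authority district code (illustrative entries).
POSTCODE_TO_LAD = {
    "PE13": "E07000010",  # covered
    "L1": "E08000012",    # known council, but no data scraped yet
}

def lookup(postcode: str) -> dict:
    """Resolve a postcode, surfacing 'no data' instead of silently
    falling back to a neighbouring council we do have data for."""
    outward = postcode.split()[0].upper()  # assumes "PE13 1AA"-style input
    lad = POSTCODE_TO_LAD.get(outward)
    if lad is None:
        return {"status": "unknown_postcode"}
    if lad not in COVERED:
        return {"status": "no_data", "council_code": lad}
    return {"status": "ok", "council": COVERED[lad]}
```

The design point is just that "we don't cover this council yet" is a distinct, honest answer, which also makes the mapping-vs-coverage bug diagnosable from the response itself.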
Ace, I can see how this could actually be quite useful for house conveyancing.
You've put a lot of effort into this. How are you affected by the upcoming changes to local government? There'll no doubt be some rationalisation at some point.
Is any of the data on gov.uk, and any scraping tips there? I've tried scraping some patent tribunal data but haven't been successful (just using Python, copying in session data; I guess Playwright might be useful there).
Planning data on gov.uk is really patchy and not useful for what I want. There's planning.data.gov.uk which has some boundary/policy data but no actual decisions. The decisions only exist on council portals, which is the whole reason this project exists.
On patent tribunal, I haven't looked into that one specifically but a few general gov.uk tips: most gov.uk content is actually clean HTML (way easier than council portals), so if requests isn't working it's usually either JS-rendered content (Playwright fixes this) or session/cookie weirdness. Things that have helped me elsewhere: Playwright with page.wait_for_selector rather than networkidle, copying real browser headers wholesale (not just User-Agent), and checking if there's a hidden JSON API behind the page (open devtools → Network tab → look for XHR/fetch requests when you click search). Often there's a clean JSON endpoint that the page is using, which is way easier to scrape than the rendered HTML.
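That last step (Network tab, look for XHR/fetch) can also be done offline over a HAR export from the Network panel. A small sketch that picks out the JSON-returning requests, using the standard HAR 1.2 field layout:

```python
import json

def json_endpoints(har_text: str) -> list[str]:
    """Given a HAR export from DevTools, list request URLs whose responses
    came back as JSON: likely candidates for a clean API to scrape."""
    har = json.loads(har_text)
    urls = []
    for entry in har["log"]["entries"]:
        mime = entry["response"]["content"].get("mimeType", "")
        if "json" in mime:
            urls.append(entry["request"]["url"])
    return urls
```

Click through a search in the browser, export the HAR, and whatever this prints is usually the hidden endpoint worth scraping instead of the rendered HTML.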
Working on a similar problem in another domain. I found the agentic direction powerful: browser use plugged into a multimodal LLM with strong agentic capability (like gpt 5.4 mini), working in a loop with an orchestrator and an evaluator/judge.
Nice! Yeah, I went the other way... deterministic scrapers per portal type because once you've worked out the search form quirks for an Idox or Northgate or Ocellaweb, it's the same shape across every council using that platform. So the marginal cost of adding council N is config not code. The agentic approach gets more interesting for the long tail though — the bespoke ASP.NET ones where every council is its own snowflake... and it is a GRIND honestly. How are you finding the loop on cost vs reliability?
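For anyone curious what "council N is config, not code" looks like, a minimal hypothetical shape (the platform names are real, everything else here is illustrative):

```python
# One scraper per portal platform; each council reduces to a config row.
SCRAPERS = {}

def portal(name):
    """Decorator registering a scraper function for a portal platform."""
    def register(fn):
        SCRAPERS[name] = fn
        return fn
    return register

@portal("idox")
def scrape_idox(cfg):
    # Real code would drive the Idox search form; this just shows the shape.
    return f"idox search at {cfg['base_url']}"

@portal("northgate")
def scrape_northgate(cfg):
    return f"northgate search at {cfg['base_url']}"

COUNCILS = [
    {"name": "Exampleshire", "platform": "idox",
     "base_url": "https://planning.exampleshire.gov.uk"},
    {"name": "Othershire", "platform": "northgate",
     "base_url": "https://plans.othershire.gov.uk"},
]

def run(council):
    return SCRAPERS[council["platform"]](council)
```

Adding council N is then one more dict in `COUNCILS`; only a genuinely new platform costs new code.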
Deterministic scrapers are almost certainly the right answer for this task, because once those special snowflakes have paid for their bespoke IT system, they'll never change it.
On the grind, why not get an agent to help you build the long tail of deterministic scrapers? Claude etc is really shockingly good at this kind of moderate-complexity iterative work, it will just keep going around the fetch/parse/understand loop until it has what you're looking for.
Yeah, that's essentially what I'm doing. Claude handles most of the "look at the portal, work out the search form, write the config" loop. The actual bottleneck isn't code tbh, it's that every (snowflake) council needs 30+ minutes of investigation before you can even get going, and a chunk dead-end because the portal's broken or migrated. I already hit three this morning: Worcester returns connection refused, Breckland's URL is dead, Rother migrated to a different platform. The grind is "is this portal even alive" more than the scraper itself.
Cheers! Yeah, it's honestly mental how fragmented it is. Every council is its own little island. On the shutting-off worry: the data is statutorily public. Councils are legally required to publish it, and I'm respecting rate limits and not hammering anyone. So far no council has objected. Touch wood this remains the case. Tbh, I think the risk is more from the platform vendors than the councils themselves. It seems Idox etc have a commercial interest in this data being awkward to access.
No, I haven't tried Browserless. So far, it has all been from a single residential IP which is probably the bigger issue with Liverpool than the WAF challenge itself. Once I have a valid session cookie I can solve the JS challenge fine, the rate limit is per-IP. Rotating residential proxies (or Browserless behind one) might be the answer... I'm just reluctant at this stage to bite the bullet on the cost for a single (albeit huge) council. Have you used it for similar stuff?
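A rotating-proxy setup for a per-IP rate limit could be as simple as round-robin over a pool with a minimum interval per exit IP; an illustrative sketch (not any particular provider's API):

```python
import itertools
import time

class ProxyPool:
    """Round-robin over a proxy pool, enforcing a per-proxy minimum interval
    so no single exit IP hits the portal faster than `min_interval` seconds."""
    def __init__(self, proxies, min_interval: float = 2.0):
        self._cycle = itertools.cycle(proxies)
        self._last_used = {}
        self.min_interval = min_interval

    def next(self) -> str:
        proxy = next(self._cycle)
        elapsed = time.monotonic() - self._last_used.get(proxy, 0.0)
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)  # stay polite per IP
        self._last_used[proxy] = time.monotonic()
        return proxy
```

With N proxies this gives roughly N times the polite throughput without any single IP looking less polite than before.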
Fair catch and pretty embarrassing... ngl. That's a generic template clause I didn't think hard enough about at the time and it's obviously contradictory given what the site does. I'll rewrite it today. The position I want to take is: scrape responsibly, respect rate limits, don't republish bulk data, which is what I try to do with the councils. Will fix the wording. Thanks.
Updated and pushed live: planninglens.co.uk/terms. Acceptable Use clause now permits programmatic access that respects rate limits, while still protecting our derived analysis and reports. Thanks for the kick.
Around four months part-time. The bulk was the first 6 to 8 weeks building the three main scrapers (Idox, Northgate, Ocellaweb). After that, councils on those platforms are mostly config. The rest has been a long tail of bespoke portals, each taking anywhere from an evening to "give up and revisit and repeat".
Hadn't thought of that tbh. Worth a go on Liverpool especially... that's the AWS WAF one I'm currently blocked on and it is doing my head in. The challenge there is volume rather than access (~80k decisions to backfill), so even if an LLM gets through the wall I'd still need to script around it. But could be a way in for the initial cookie. Cheers for the tip and will look into it.
I would try and go open source as fast as possible before a legal letter lands on your desk. Then worry about the commercialisation. Also, I have a feeling you could charge SERIOUS coin for some app for property developers based around this. But someone is almost certainly going to come at you because, you know, us Brits hate clever clogs.
Some thoughts
1. Brilliant! Governments (and corps) treat public data like it’s theirs not ours. Information yearns to be free.
2. Having said that, you are likely violating T&Cs by scraping at all.
3. It is a lot easier to defend your position if you are making it free and public yourself.
4. But paying for food is nice
5. I suggest the business model here is providing architects and lawyers with strong evidence of prior planning decisions nationally
Most people applying for (difficult) planning have experience locally. But the planning system is a mess because it is not coherent nationally or regionally. The win here is not providing a copy of your data (that has legal issues) but providing pointers to decisions that support the case of the person paying you.
So I want to turn an old pub into tasteful housing and a cafe for the local village. The local planning team don’t like it, I could spend money bribing them and the councillors (see how much I understand British democracy) or I could get from you the fifteen pub to housing conversion decisions from around the country and use that to help my bribed councillors defend their u-turn
Everyone wins :-)
Cheers, appreciate the feedback. The architect/consultant precedent angle is interesting and a couple of other commenters have already nudged me in similar directions. Tbh... you're likely right that the strongest commercial play isn't B2C £19 reports, it's giving someone fighting a contested case the national pattern across 15 similar pub conversions, the appeal outcomes, what stuck and what didn't. That's a very different product to what I have now but the data supports it. On the T&Cs/legal stuff... I'm not going to pretend I have perfect clarity on it. The position I'd defend is that the data is statutorily public, councils are required by law to publish it, I respect rate limits, and I'm aggregating not republishing in bulk. But there is this grey area between data being public to view and being usable for a commercial product, and I haven't fully nailed it down.
I agree on the “public” data issue - I spent a long time campaigning for better FOSS / data access in government and there are some great people pushing in and outside local and central gov.
But it’s a big mindset chnage (one that will benefit the whole country) but it’s slow.
I think the “push for public policy improvements” angle if genuine will get you a lot more respect and kudos when things get sticky. Good luck
As somebody who works in local government IT, consistent scraping of our data like this is the bane of our life. We get hit by thousands of these, many with no rate limiting, making hugely intensive requests, that cause downtime and knock-on effects for actual customers and citizens. We block IPs, add captchas, and yet it persists.
If you really want the data, just FOI it for goodness' sake.
I get the distinct impression that many of these outfits aren't really advocating for impoved transparency but are simply trying to exploit and monetise illicitly obtained government data to make a quick buck.
Fair points and yeah you must be sick of unrate-limited mass scraping. I run with 1.5-3 second delays from a single residential IP and back off when portals push back, but from your side I look the same as someone hammering you. On your point regarding FOI, what you say is fair. I should probably have led with that for the trickier councils. But the honest reason I haven't is doing 240 FOI requests at scale felt like it'd put a different kind of strain on councils, but if you're telling me the scraping is worse then I take that seriously. On "monetise illicitly obtained data"... I'm not going to pretend the £19 is altruism. But there is a public interest in this data being navigable across council boundaries, and that's not something individual councils can deliver. I must stress that I'm not sure I've got the model right yet and a lot of feedback today is pushing me toward more free, which I'm seriously considering.
Maybe I'm just naive, but why wouldn't a citizen do both?
I'm not implying that anything would get deliberately redacted, but it seems likely that information released through other channels would not match the web. A request might also reveal information that was not on the web.
What other choices are there?
So, this sounds exciting to me, but the postcode checker really feels like a spam as a user. All it tells me is 'Mixed results'. I could make a website that prints 'mixed results', I bet most results are 'mixed'!
I understand wanting to get money, but honestly, there is no way I would give money to this website in it's current state, you are giving me far too little info before asking me to hand over a credit card.
Then, if someone gives you £19, a crazy amount of money honestly, the last page of the report is an advert to give them 4 times more!
Really useful feedback, cheers. Yeah, "Mixed results" is kinda rubbish as you say. It should give you something concrete before asking for anything. I'll fix that today. Fair point on the £79 upsell at the end of a £19 report too. That's tone deaf and I'll move it. On the £19... I'll think about it, but you're right the site needs to do more to justify the spend before pulling out a card. Appreciate the honest take!
Just a quick follow up, if my reply seemed very harsh, view that as a sign of how enthusiastic I was to see the website at first. I understand wanting to make money, but I'd seriously consider giving a lot more away (maybe even the basic report stuff) away for free, I'd love to explore my local area, my parent's, be nosey what life is like in Oxford (a place I previously lived), but even if I was willing to pay (I'm not), having to stop, get PDF, download, really breaks the flow.
No, that's absolutely a fair follow-up and not harsh at all. It's very helpful. The "be nosey about places you used to live" use case is exactly what the postcode tool should serve (thinking about it), and right now it doesn't. You're right that PDF-downloads break flow badly. Tbh... that's a hangover from the "people want a thing they can save" assumption that I'm still stuck in, I guess. I'm still on the fence about giving the paid reports away wholesale, but the gap between "tells you nothing" and "£19 PDF" is way too big. I'm gonna need a middle layer of free but actually useful exploration on the site. Will have a solid think about this today. Appreciate the feedback!
I'm also enthusiastic, it's not often you see people find a genuinely underserved niche and you have.
I don't know if I would pay £19 for a general state-of-the-area report. I would almost certainly have paid £100-300 for a service that took my planning application, critically reviewed it and told me which aspects were and were not likely to pass, with references to specific examples within my local area.
Thanks, honestly that means a lot! Yeah, the pre-submission review idea is interesting and I've thought about it. I have the data to surface "applications similar to yours in your ward, here's what got approved and what didn't" but I haven't built it as a workflow because it requires the user to upload their plans... and that's a different kind of trust ask, but yeah, it is definitely worth revisiting. £100-500 is also a much more honest price for something that genuinely changes a decision. £19 is in the awkward "too much for curiosity, too little for stakes" zone you and the other commenter are both pointing at.
Just checking, are you using an LLM to reply? Your replies are riddled with things LLMs are good at, like making quoted analogies that make no sense. They're not even analogies
What benefit would people gain from the reports? Average rate of success/time is interesting, but I'm not sure what you'd do with this information other than a bit of local press discourse. I suppose it's nicely timed for the council elections?
Honest answer... I don't fully know, zero paying customers so it's still very much a hypothesis. The two use cases I think hold up: (1) people pre-buying a house with extension potential, who otherwise guess or pay £500+ for a planning consultant; (2) homeowners about to commission £2-5k of architect drawings who want a sanity check before proceeding. Someone else suggested £100-500 for a proper pre-submission review which is probably better for that second case than my £19 report. The "general state-of-area" framing is the weakest one and you're right it's mostly local press discourse — that's marketing not revenue.
I work with public data, and I'd love to get access to this data, but I suspect that although you have scraped the data from public websites, there are licensing and copyright implications for actually using it.
See also the open addresses project by Data Adaptive [1] which is using Freedom of Information requests to publish public council tax address data. The problem they have run into there is that their address datasets are derived from proprietary Ordnance Survey data.
It looks like data.gov.uk is in the process of standardising the planning application process, and publishing them under OGL [2].
[1]: https://www.owenboswarva.com/blog/post-addr44.htm [2]: https://www.planning.data.gov.uk/dataset/planning-applicatio...
Thanks and yeah, some of my boundary data (for the choropleth) comes from ONS open boundary files which I think are OGL but I'd need to check the chain of derivation. On the data.gov.uk standardisation, I've seen it but last I looked it was policy and boundaries, not actual decisions. Has that changed? If they're publishing decisions under OGL I'd gladly ditch the scraping for a proper feed. On licensing more generally... I haven't fully nailed it down. Showing aggregates and pointing back to source, but yeah there's a gap between "data is public" and "do whatever you want with it commercially".
Great site. This data should really be more accessible. Planning in the UK is a total crapshoot, subject to the whims of the planning authorities. In our case, a simple rear extension and dormer loft conversion, similar to hundreds of thousands across the country, we ended up having to appeal which added 2 years and tens of thousands of pounds in costs to our extension project. Our area shows up as a high refusal area, which tracks.
It would be good to add appeal data in (also a public gateway) to show which councils are just being unreasonable.
I personally think the planning regulations in this country are the cause of many ills, including the housing shortage. It just costs so much to get through planning these days, it is often just not worth it. Data like this could help us get that changed.
Maybe a tongue-in-cheek comment but regulations are that way because you guys want it that way (maybe not you personally). If it wasn't like that, nothing would stop a garbage incinerator or a quarry popping up a few hundred meters from houses (which happens in European countries with more deregulated planning/zoning regulations).
You guys have all kinds of pro-individualistic, borderline nonsensical residential housing laws like "right to light" and "right to view". It's completely incompatible with "build more". Most British people view their privacy (or perceived privacy) as a higher priority than fixing the housing market. "It's so overlooked" is such a common comment, and it's almost bizarre to someone used to living in a higher-density environment (which the UK very much is).
Waste disposal and planning for quarrying and mineral extraction are different functions, decided at a higher tier of local government, and are not directly comparable to development management/planning.
This is awesome! Worked on something similar albeit a different industry.
For the more challenging scrapes, I'd highly recommend using the Chrome DevTools MCP to attach the network requests the browser makes to the site as context for your agent/LLM chat. This approach really helped me write a solid API-based scraper (also using curl_cffi) and replaced the tedious Playwright-based approach I used to rely on.
Nice thinking. Hadn't thought of DevTools MCP that way. Curl_cffi I've used for TLS fingerprinting (Edinburgh was the first one) but the discovery side I've been doing manually... open DevTools, look at the request, copy as cURL, work out which params can be pruned. Automating that loop with an LLM in the middle would speed things up a lot, especially for the bespoke long tail. Will look into that this week. Thanks!
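That manual loop (copy as cURL, prune params, replay) can be partly mechanised even before putting an LLM in the middle. A minimal stdlib sketch, with a hypothetical portal URL and function name of my own: it yields one URL variant per dropped query parameter, so you can replay each variant and see which parameters the portal actually requires.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def pruned_variants(url: str):
    """Yield (dropped_param, url_without_it) for each query parameter.

    Replay each variant against the portal; if the response is unchanged,
    the dropped parameter was not required and can be pruned.
    """
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    for i, (name, _) in enumerate(params):
        kept = params[:i] + params[i + 1:]
        yield name, urlunsplit(parts._replace(query=urlencode(kept)))
```

From there the replay step is just fetching each variant and diffing the responses, which is exactly the loop an agent could drive.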
Have you tried using FoI to get the data? I've had some success with data requests - often getting dumps in CSV or similar.
I appreciate that won't necessarily capture live / recent data. But it might be quicker than waiting for rate-limits to reset.
Have you spoken to any planners, a quick search for similar applications in other LAs might be a useful thing for them.
There's a Royal Institute of Town Planners, they probably have a magazine you could advertise in (but equally that might get you blocked, idk).
RICS people could probably use the data too? I guess it's useful house-buyer info; houses in the vicinity had successful loft conversions, say.
On the data side - it's something of a moat for you now, but I could see you being successful with FOI requests. An MP might be interested in championing open data access.
All good points. I've been so busy with the data collection and just "irl stuff" that I haven't spoken to planners directly which is an oversight on my part — they're the obvious power users. RTPI/RICS are both on my list but as I said, I've been focusing on data more than distribution. Probably the wrong order tbh. FOI is interesting, especially for the trickier portals (Liverpool's WAF, the dead-portal ones). It might be cleaner than scraping. MP/open-data angle is definitely something I hadn't seriously considered. Worth thinking about tho! Thanks.
It looks like this kind of data will start to be more open in the future. New legislation introduces mandatory data standards in England: https://mhclgdigital.blog.gov.uk/2026/04/22/data-standards-l...
It's the most ridiculous situation with council technology that they all use different providers for what are fundamentally the same functions. It's the same for council tax and a host of other services as it is for planning. Consequently, at least from the various portals I've used, they all do it badly. This absolutely could and should be done by a single, well funded central team.
Unless you use a nationalised product for this, this is the best outcome.
GDS was nationalised and they certainly did a better, albeit not perfect, job than the myriad of private solutions councils use. There just doesn't appear to be the capability to properly specify and source IT at a council level.
I hate to be a downer but...
> UK planning data is technically public.
it's public, but still copyrighted by those who submitted it
the councils also have database rights over their database, unless you've obtained explicit permission from them directly
https://en.wikipedia.org/wiki/Database_right#United_Kingdom
> I ended up writing several scrapers: a standard requests-based one, a Playwright-based one for councils that block anything that doesn't look like a real browser, and a curl_cffi one for TLS fingerprinting.
so they're explicitly trying to stop you doing this, and ... you're openly admitting to bypassing their technical measures to try and stop you?
have you heard of the Computer Misuse Act?
I doubt the 240 councils are going to be happy once they find out you've done this, especially if you're selling it on for profit
Fair points and I appreciate the feedback. Database right is real but the threshold is "substantial part". I'm literally only showing aggregates and letting people search by postcode. I'm not completely republishing council databases. Think that's defensible, but not gonna pretend that it's 100% black and white. On CMA, I'd push back. That's about unauthorised access. These portals are public-facing and the data's published deliberately for people to view. Rotating user-agents isn't bypassing security in any meaningful way... I'm not breaking auth or guessing passwords. I back off when portals signal they're unhappy (Liverpool's WAF actively rate-limited me which is why that data's stale). No council has reached out so far. Could change ofc. Solo founder with no legal team though, so happy to be told I've got it wrong.
I'd be careful because even though it's 'public' data, scraping it might not be legal due to the TOS of the various sites.
I did a search for my postcode and got given results for a different area and council miles away
Thanks for the feedback. On TOS: the same answer as I gave others... the data is statutorily public, I respect rate limits. That being said, I admit it's a grey area I haven't 100% nailed down. The postcode bug is more concerning. That shouldn't happen. Do you mind sharing which postcode or city/county? It could be that it's falling back to the wrong council because I don't have data for the right one, or it's a bug in my mapping. Either way, it needs fixing asap! Cheers for flagging.
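The "falling back to the wrong council" failure mode suggested above has a simple structural fix: resolve the postcode's outward code and return nothing when there's no coverage, rather than defaulting to a neighbour. A sketch, assuming a precomputed outward-code-to-council mapping (the function name and mapping are mine, not the site's actual code):

```python
from typing import Optional

def council_for_postcode(postcode: str,
                         outward_to_council: dict) -> Optional[str]:
    """Map a UK postcode to a council via its outward code.

    Returns None when there is no data for the area, so the caller can
    show "no coverage yet" instead of silently falling back to a
    different council miles away.
    """
    pc = postcode.strip().upper()
    # The inward code is always the final three characters (e.g. '1AA'),
    # whether or not the user typed the space.
    outward = pc.split()[0] if " " in pc else pc[:-3]
    return outward_to_council.get(outward)
```

The key design point is that the lookup is total and explicit: every miss surfaces as None at the call site instead of becoming a wrong answer.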
OK I have emailed your hello@planninglens... address with screenshots
Thanks. Will look into that right away!
Ace, I can see how this could actually be quite useful for house conveyancing. You've put a lot of effort into this. How are you affected by the upcoming changes to local government? There'll no doubt be some rationalisation at some point.
Is any of the data on gov.uk, and any scraping tips there? I've tried scraping some patent tribunal data but haven't been successful (just using Python and copying in session data; I guess Playwright might be useful there).
Planning data on gov.uk is really patchy and not useful for what I want. There's planning.data.gov.uk which has some boundary/policy data but no actual decisions. The decisions only exist on council portals, which is the whole reason this project exists. On patent tribunal, I haven't looked into that one specifically but a few general gov.uk tips: most gov.uk content is actually clean HTML (way easier than council portals), so if requests isn't working it's usually either JS-rendered content (Playwright fixes this) or session/cookie weirdness. Things that have helped me elsewhere: Playwright with page.wait_for_selector rather than networkidle, copying real browser headers wholesale (not just User-Agent), and checking if there's a hidden JSON API behind the page (open devtools → Network tab → look for XHR/fetch requests when you click search). Often there's a clean JSON endpoint that the page is using, which is way easier to scrape than the rendered HTML.
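The "copy real browser headers wholesale" tip can be turned into a tiny helper: parse the header flags out of a DevTools "Copy as cURL" string and reuse them as a dict. A stdlib-only sketch (function name is mine):

```python
import shlex

def headers_from_curl(curl_cmd: str) -> dict:
    """Extract -H/--header values from a DevTools 'Copy as cURL' command.

    Lets you replay the browser's full header set wholesale rather than
    hand-picking User-Agent alone, which is often what trips up scrapers.
    """
    tokens = shlex.split(curl_cmd)
    headers = {}
    for i, tok in enumerate(tokens):
        if tok in ("-H", "--header") and i + 1 < len(tokens):
            name, _, value = tokens[i + 1].partition(":")
            headers[name.strip()] = value.strip()
    return headers
```

The resulting dict can be passed straight to `requests.get(url, headers=...)`, and the same trick works for finding the hidden JSON endpoint: copy the XHR request as cURL, lift its headers, and hit the JSON URL directly.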
Working on a similar problem in another domain. I found the agentic direction powerful: browser use plugged into a multimodal LLM with strong agentic capability (like gpt 5.4 mini), working in a loop with an orchestrator and evaluator/judge.
Nice! Yeah, I went the other way... deterministic scrapers per portal type because once you've worked out the search form quirks for an Idox or Northgate or Ocellaweb, it's the same shape across every council using that platform. So the marginal cost of adding council N is config not code. The agentic approach gets more interesting for the long tail though — the bespoke ASP.NET ones where every council is its own snowflake... and it is a GRIND honestly. How are you finding the loop on cost vs reliability?
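The "council N is config, not code" idea can be sketched as a platform registry: one scraper class per portal vendor, and each new council on a known platform is just a config entry routed to the right class. All class and field names below are hypothetical stand-ins, not the project's actual code:

```python
from dataclasses import dataclass

@dataclass
class CouncilConfig:
    name: str
    platform: str      # "idox", "northgate", "ocellaweb", or "bespoke"
    base_url: str

# Stubs standing in for the real per-platform scrapers.
class IdoxScraper:
    def __init__(self, cfg):
        self.cfg = cfg

class NorthgateScraper:
    def __init__(self, cfg):
        self.cfg = cfg

class OcellawebScraper:
    def __init__(self, cfg):
        self.cfg = cfg

SCRAPERS = {
    "idox": IdoxScraper,
    "northgate": NorthgateScraper,
    "ocellaweb": OcellawebScraper,
}

def scraper_for(cfg: CouncilConfig):
    """Route a council to its platform scraper; bespoke portals need code."""
    if cfg.platform not in SCRAPERS:
        raise ValueError(f"{cfg.name}: bespoke portal, needs a dedicated scraper")
    return SCRAPERS[cfg.platform](cfg)
```

The long tail is then exactly the set of configs that raise: every council whose platform isn't in the registry.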
Deterministic scrapers are almost certainly the right answer for this task, because once those special snowflakes have paid for their bespoke IT system, they'll never change it.
On the grind, why not get an agent to help you build the long tail of deterministic scrapers? Claude etc is really shockingly good at this kind of moderate-complexity iterative work, it will just keep going around the fetch/parse/understand loop until it has what you're looking for.
Yeah, that's essentially what I'm doing. Claude handles most of the "look at the portal, work out the search form, write the config" loop. The actual bottleneck isn't code tbh, it's that every (snowflake) council needs 30+ minutes of investigation before you can even get going, and a chunk of them dead-end because the portal's broken or migrated. I already hit three this morning: Worcester returns connection refused, Breckland's URL is dead, Rother migrated to a different platform. The grind is "is this portal even alive" more than the scraper itself.
Amazing! It’s so bloody hard to access this information or even to know what there is.
Careful not to expose the councils too publicly before they shut you off
Cheers! Yeah, it's honestly mental how fragmented it is. Every council is its own little island. On the shutting-off worry: the data is statutorily public. Councils are legally required to publish it, and I'm respecting rate limits and not hammering anyone. So far no council has objected. Touch wood this remains the case. Tbh, I think the risk is more from the platform vendors than the councils themselves. It seems Idox etc have a commercial interest in this data being awkward to access.
There was a story about a similar initiative for scraping court decisions being shut down.
Send a message to infoshareplus.com. They might be interested in your data because they operate a business around local governments.
Thanks, hadn't come across them. I will have a poke around and reach out. Appreciate the pointer.
Have you tried using Browserless/similar to scrape around tricky hosts?
No, I haven't tried Browserless. So far, it has all been from a single residential IP which is probably the bigger issue with Liverpool than the WAF challenge itself. Once I have a valid session cookie I can solve the JS challenge fine, the rate limit is per-IP. Rotating residential proxies (or Browserless behind one) might be the answer... I'm just reluctant at this stage to bite the bullet on the cost for a single (albeit huge) council. Have you used it for similar stuff?
Your terms:
> You may not use automated tools to scrape, copy, or bulk-download data from our service.
Pot kettle, huh.
Fair catch and pretty embarrassing... ngl. That's a generic template clause I didn't think hard enough about at the time and it's obviously contradictory given what the site does. I'll rewrite it today. The position I want to take is: scrape responsibly, respect rate limits, don't republish bulk data, which is what I try to do with the councils. Will fix the wording. Thanks.
Updated and pushed live: planninglens.co.uk/terms. Acceptable Use clause now permits programmatic access that respects rate limits, while still protecting our derived analysis and reports. Thanks for the kick.
How long did the scraping take you to build?
Around four months part-time. The bulk was the first 6 to 8 weeks building the three main scrapers (Idox, Northgate, Ocellaweb). After that, councils on those platforms are mostly config. The rest has been a long tail of bespoke portals, each taking anywhere from an evening to "give up and revisit and repeat".
Worth trying claude/gemini to see if they'll do some scraping for you. I've found some paywall sites only too happy to allow Gemini past the wall.
Hadn't thought of that tbh. Worth a go on Liverpool especially... that's the AWS WAF one I'm currently blocked on and it is doing my head in. The challenge there is volume rather than access (~80k decisions to backfill), so even if an LLM gets through the wall I'd still need to script around it. But could be a way in for the initial cookie. Cheers for the tip and will look into it.
I would try to go open source as fast as possible, before a legal letter lands on your desk. Then worry about the commercialisation. Also, I have a feeling you could charge SERIOUS coin for an app for property developers built around this. But someone is almost certainly going to come at you because, you know, us Brits hate clever clogs.