Variable-aspect adaptive-bitrate video… in vanilla HTML?

The video below is presented in portrait orientation, because your screen is taller than it is wide.

The video below is presented in landscape orientation, because your screen is wider than it is tall.

The video below is presented in square orientation (the Secret Bonus Square Video!), because your screen has approximately the same width as its height. Cool!

This is possible (with a single <video> element, and without any JavaScript!) thanks to some cool HTML features you might not be aware of, which I’ll briefly explain in the video. Or scroll down for the full details.

Variable aspect-ratio videos in pure HTML

I saw a 2023 blog post by Scott Jehl about how he helped Firefox 120 (re)gain support for the <source media="..."> attribute. Chrome added support later that year, and Safari already had it. This means that it’s pretty safe to do something like this:

<video controls>
  <source src="squareish.mp4"
        media="(min-aspect-ratio: 0.95) and (max-aspect-ratio: 1.05)" />
  <source src="portrait.mp4"
        media="(orientation: portrait)" />
  <source src="landscape.mp4" />
</video>
This code creates a video with three sources: squareish.mp4 which is shown to people on “squareish” viewports, failing that portrait.mp4 which is shown to people whose viewports are taller than wide, and failing that landscape.mp4 which is shown to anybody else.
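
The order of the <source> elements matters, because the first one whose media query matches wins. As a rough sketch of that first-match logic (the pickSource function is hypothetical and purely illustrative: the browser does this for you, and no such API exists), you could model it like this:

```javascript
// Hypothetical model of first-match <source> selection.
// File names and thresholds mirror the markup above; pickSource itself
// is an illustration and not part of any browser API.
function pickSource(viewportWidth, viewportHeight) {
  const aspect = viewportWidth / viewportHeight;
  const sources = [
    // (min-aspect-ratio: 0.95) and (max-aspect-ratio: 1.05)
    { src: 'squareish.mp4', matches: aspect >= 0.95 && aspect <= 1.05 },
    // (orientation: portrait) matches when height >= width
    { src: 'portrait.mp4', matches: viewportHeight >= viewportWidth },
    // No media attribute: always a candidate
    { src: 'landscape.mp4', matches: true },
  ];
  return sources.find((source) => source.matches).src;
}

console.log(pickSource(1000, 1000)); // squareish.mp4
console.log(pickSource(390, 844));   // portrait.mp4
console.log(pickSource(1920, 1080)); // landscape.mp4
```

A 1000×1020 viewport is technically portrait, but it gets squareish.mp4 because that source is listed first.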

That’s broadly-speaking how the video above is rendered. No JavaScript needed.

Browsers only handle media queries on videos when they initially load, so you can’t just tip your phone over or resize the window: you’ll need to reload the page, too. But it works! Give it a go: take a look at the video in both portrait and landscape modes and let me know what you think1.

Adding adaptive bitrate streaming with HLS

Here’s another cool technology that you might not have realised you could “just use”: adaptive bitrate streaming with HLS!

You’ve used adaptive bitrate streaming before, though you might not have noticed it. It’s what YouTube, Netflix, etc. are doing when your network connection degrades and you quickly get dropped-down, mid-video, to a lower-resolution version2.

Turns out you can do it on your own static hosting, no problem at all. I used this guide (which has a great description of the parameters used) to help me:

ffmpeg -i landscape.mp4 \
       -filter_complex "[0:v]split=3[v1][v2][v3]; [v1]copy[v1out]; [v2]scale=w=1280:h=720[v2out]; [v3]scale=w=640:h=360[v3out]" \
       -map "[v1out]" -c:v:0 libx264 -x264-params "nal-hrd=cbr:force-cfr=1" -b:v:0 5M -maxrate:v:0 5M -minrate:v:0 5M -bufsize:v:0 10M -preset slow -g 48 -sc_threshold 0 -keyint_min 48 \
       -map "[v2out]" -c:v:1 libx264 -x264-params "nal-hrd=cbr:force-cfr=1" -b:v:1 3M -maxrate:v:1 3M -minrate:v:1 3M -bufsize:v:1 3M -preset slow -g 48 -sc_threshold 0 -keyint_min 48 \
       -map "[v3out]" -c:v:2 libx264 -x264-params "nal-hrd=cbr:force-cfr=1" -b:v:2 1M -maxrate:v:2 1M -minrate:v:2 1M -bufsize:v:2 1M -preset slow -g 48 -sc_threshold 0 -keyint_min 48 \
       -map a:0 -c:a:0 aac -b:a:0 96k -ac 2 \
       -map a:0 -c:a:1 aac -b:a:1 96k -ac 2 \
       -map a:0 -c:a:2 aac -b:a:2 48k -ac 2 \
       -f hls -hls_time 2 -hls_playlist_type vod -hls_flags independent_segments -hls_segment_type mpegts \
       -hls_segment_filename landscape_%v/data%02d.ts \
       -master_pl_name landscape.m3u8 \
       -var_stream_map "v:0,a:0 v:1,a:1 v:2,a:2" landscape_%v.m3u8
This command splits the H.264 video landscape.mp4 into three different resolutions: the original “v1” (1920×1080, in my case, with 96kbit audio), “v2” (1280×720, with 96kbit audio), and “v3” (640×360, with 48kbit audio), each with a resolution-appropriate maximum bitrate, and forced keyframes every 48th frame. Then it breaks each of those into HLS segments (.ts files) and references them from a .m3u8 playlist.

The output from this includes:

  • Master playlist landscape.m3u8, which references the other playlists with reference to their resolution and bandwidth, so that browsers can make smart choices,
  • Playlists landscape_0.m3u8 (“v1”), landscape_1.m3u8 (“v2”), etc., each of which references the “parts” of that video,
  • Directories landscape_0/, landscape_1/ etc., each of which contains
  • data00.ts, data01.ts, etc.: the actual “chunks” that contain the video segments, which can be downloaded independently by the browser as-needed
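
To give a feel for the “smart choices” a player can make from the master playlist, here’s a minimal sketch. The bandwidth figures are assumptions (just the nominal video plus audio bitrates from the command above, not values copied from a real playlist), and chooseVariant is a hypothetical helper: real players use more sophisticated heuristics, such as buffer levels and recent throughput.

```javascript
// Illustrative variant list: bandwidths are assumed from the nominal
// bitrates in the ffmpeg command (video + audio), not read from a real file.
const variants = [
  { playlist: 'landscape_0.m3u8', resolution: '1920x1080', bandwidth: 5_096_000 },
  { playlist: 'landscape_1.m3u8', resolution: '1280x720', bandwidth: 3_096_000 },
  { playlist: 'landscape_2.m3u8', resolution: '640x360', bandwidth: 1_048_000 },
];

// Hypothetical helper: pick the highest-bandwidth variant that fits within
// the measured throughput, falling back to the smallest one otherwise.
function chooseVariant(variantList, measuredBitsPerSecond) {
  const byBandwidth = [...variantList].sort((a, b) => b.bandwidth - a.bandwidth);
  const affordable = byBandwidth.find((v) => v.bandwidth <= measuredBitsPerSecond);
  return affordable ?? byBandwidth[byBandwidth.length - 1];
}

console.log(chooseVariant(variants, 4_000_000).playlist); // landscape_1.m3u8
```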

Bringing it all together

We can bring all of that together, then, to produce a variable-aspect, adaptive bitrate, HLS-streamed video player… in pure HTML and suitable for static hosting:

<video controls>
  <source src="squareish.m3u8"
         type="application/x-mpegURL"
        media="(min-aspect-ratio: 0.95) and (max-aspect-ratio: 1.05)" />

  <source src="portrait.m3u8"
         type="application/x-mpegURL"
        media="(orientation: portrait)" />

  <source src="landscape.m3u8"
         type="application/x-mpegURL" />
</video>
You could, I suppose, add alternate types, poster images, and all kinds of other fancy stuff, but this’ll do for now.

That’ll “just work” in Safari and a handful of mobile browsers… but won’t display anything for most desktop browsers. Boo!

One solution is to also provide the standard .mp4 files as an alternate <source>, and that’s fine I guess, but you lose the benefit of HLS (and you have to store yet more files). But there’s a workaround:

Polyfill full functionality for all browsers

If you’re willing to use a JavaScript polyfill, you can make the code above work on virtually any device. I gave this a go, here, by:

  1. Including the polyfill hls.js, and
  2. Adding some JavaScript code that detects affected `<video>` elements and applies the fix if necessary:
// Find all <video>s which have HLS sources:
for( const hlsVideo of document.querySelectorAll('video:has(source[type="application/x-mpegurl"]), video:has(source[type="application/vnd.apple.mpegurl"])') ) {
  // If the browser has native support, do nothing:
  if( hlsVideo.canPlayType('application/x-mpegurl') || hlsVideo.canPlayType('application/vnd.apple.mpegurl') ) continue;

  // If hls.js can't help fix that, do nothing:
  if ( ! Hls.isSupported() ) continue;

  // Find the best source based on which is the first one to match any applicable CSS media queries
  // (a <source> with no media attribute has an empty query, which always matches)
  const bestSource = Array.from(hlsVideo.querySelectorAll('source')).find(source=>window.matchMedia(source.media).matches);

  // If no source matches, there's nothing to attach:
  if ( ! bestSource ) continue;

  // Use hls.js to attach the best source:
  const hls = new Hls();
  hls.loadSource(bestSource.src);
  hls.attachMedia(hlsVideo);
}
It makes me feel a little dirty to make a <video> depend on JavaScript, but if that’s the route you want to go down while we wait for HLS support to become more widespread (rather than adding different-typed sources) then that’s fine, I guess.

This was a fun dive into some technologies I’ve not had the chance to try before. A fringe benefit of being a generalist full-stack developer is that when you’re “between jobs” you get to play with all the cool things when you’re brushing up your skills before your next big challenge!

(Incidentally: if you think you might be looking to employ somebody like me, my CV is over there!)

Footnotes

1 There definitely isn’t a super-secret “square” video on this page, though. No siree. (Shh.)

2 You can tell when you get dropped to a lower-resolution version of a video because suddenly everybody looks like they’re a refugee from Legoland.

Trans former judge plans to challenge gender ruling at European court

“Trans former judge plans to challenge gender ruling at European court”: https://www.bbc.co.uk/news/articles/c9qw2149yelo

Representation matters. That we have a trans former-judge, somebody both well-equipped and motivated to escalate this important challenge to the ECHR, is hugely fortunate.

We need more representation (of trans people specifically, but many other groups too, and perhaps particularly in the intersections) in positions of power, expertise, and authority. To defend the human rights of all of us.

Wishing you luck, Victoria McCloud.

Deprecate React

I’m keeping an eye out for my next career move (want to hire me?). Off the back of that I’ve been brushing up on the kinds of skills that I might be asked to showcase in any kind of “tech test”.

Not the kind of stuff I can do with one hand tied behind my back1, but the things for which I’d enjoy feeling a little more-confident2. Stuff that’s on my CV that I’ve done and can do, but where I’d like to check before somebody asks me about it in an interview.

React? Sure, I can do that…

LinkedIn, GlassDoor, and bits of the Fediverse are a gold mine for the kinds of things that people are being asked to demonstrate in tech tests these days. Like this post:

On LinkedIn, Avantika Raj shares a coding question asked during their React Developer interview with Volkswagen Software Solutions. It reads: Create a traffic light component with green, yellow, and red lights. On clicking a button, the light should change. Initially, it should show green. After 2 minutes, it should automatically switch to red for 30 seconds, then yellow for 10 seconds, and repeat this cycle continuously.
I’d describe myself as a “stack-agnostic senior/principal full-stack/backend web developer/security engineer”3, and so this question – which feels like it’s a filter for a junior developer with a React specialisation – isn’t really my wheelhouse. Which makes it a perfect excuse for an hour of playing about with React.

My recent React experience has mostly involved Gutenberg blocks and WordPress theme components. This seemed like an excuse to check that I can wrangle a non-WordPress React stack.

Animated GIF showing traffic lights changing through their phases on-demand or on a toggleable timer.
This isn’t particularly sophisticated. I added customisable durations for each light, but otherwise it’s pretty basic.

Half an hour later, I’d proven to myself that yes, I could throw together a fresh application with React DOM and implement some React components, pass state around and whatnot.

Time to move on to the next thing, right? That’s what a normal person would do.

But that’s not the kind of person I am.

Let’s reimplement this as Web Components

What I found myself thinking was… man, this is chunky. React is… not the right tool for this job.

(Or, increasingly, any job. But I’ll get back to that.)

A minified production build of my new component and its dependencies came in at 202kB (62.3kB compressed). That feels pretty massive for something that does so little. So as an experiment, I re-implemented my new React component as a vanilla JS Web Component using a custom element. Identical functionality, but no third-party library dependencies. Here’s what I got:

This one’s interactive. Press a button or two!

The Web Component version of this control has no dependency chain and uses no JSX, and so it has no transpilation step: the source version is production-ready. You could minify it, but modern HTTP compression makes the impact of that negligible anyway: the whole thing weighs in at 19.5kB (5.2kB compressed) without minification.
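
For a flavour of what a dependency-free custom element can look like, here’s a heavily stripped-down sketch. It is not the code from my download: the traffic-light tag name, the PHASES table, and the nextPhase helper are all made up for illustration, and the phase order and durations simply follow the interview question quoted above.

```javascript
// Phase order and durations follow the interview question:
// green for 2 minutes, then red for 30 seconds, then yellow for 10 seconds.
const PHASES = [
  { colour: 'green', seconds: 120 },
  { colour: 'red', seconds: 30 },
  { colour: 'yellow', seconds: 10 },
];

// Pure helper: which phase follows the given colour?
function nextPhase(colour) {
  const index = PHASES.findIndex((phase) => phase.colour === colour);
  return PHASES[(index + 1) % PHASES.length];
}

// Only define the element where the Custom Elements API exists (a browser).
if (typeof customElements !== 'undefined') {
  class TrafficLight extends HTMLElement {
    connectedCallback() {
      if (!this.shadowRoot) this.attachShadow({ mode: 'open' });
      this.show(PHASES[0]); // Initially green
    }

    disconnectedCallback() {
      clearTimeout(this.timer);
    }

    show(phase) {
      clearTimeout(this.timer);
      const lamp = document.createElement('output');
      lamp.textContent = phase.colour;
      lamp.style.color = phase.colour;
      const button = document.createElement('button');
      button.textContent = 'Change';
      button.addEventListener('click', () => this.show(nextPhase(phase.colour)));
      this.shadowRoot.replaceChildren(lamp, button);
      // Advance automatically once this phase's duration has elapsed:
      this.timer = setTimeout(() => this.show(nextPhase(phase.colour)), phase.seconds * 1000);
    }
  }
  customElements.define('traffic-light', TrafficLight);
}
```

Drop <traffic-light></traffic-light> into a page that loads this script and you get a (very ugly) working light, with no build step.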

And while I appreciate of course that there’s much more to JavaScript complexity and performance than file sizes… and beyond that I appreciate that there’s a lot more to making great components than the resulting bundle size… it’s hard to argue that delivering the same functionality (and less fragility) in a twelfth of the payload isn’t significant.
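
If you want to check my maths on that “twelfth”, here’s the arithmetic as a quick sketch, using the sizes quoted in this post:

```javascript
// Bundle sizes in kB, as measured for the two versions of the component.
const react = { compressedKB: 62.3, uncompressedKB: 202 };
const webComponent = { compressedKB: 5.2, uncompressedKB: 19.5 };

const compressedRatio = react.compressedKB / webComponent.compressedKB;
const uncompressedRatio = react.uncompressedKB / webComponent.uncompressedKB;

console.log(compressedRatio.toFixed(1));   // 12.0
console.log(uncompressedRatio.toFixed(1)); // 10.4
```

So it’s a twelfth over the wire, and roughly a tenth before compression.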

Composite screenshots showing the Chrome performance metrics and Network download sizes for the React and Web Components versions of my traffic lights. LCP - React 0.06s, Web Components 0.04s. INP - React 16ms, Web Components 8ms. Transferred - React 62.3kB (compressed), 202kB (uncompressed), in 37ms, Web Components 5.2kB (compressed), 19.5kB (uncompressed), in 22ms.
By any metric you like, the Web Components version outperforms the React version of my traffic light component. And while it’s a vastly-simplified example, it scales. Performance is a UX concern, and if you favour “what we’re familiar with” over “what’s best for our users”, that has to be a conscious choice.

But there’s a bigger point here:

React is the new jQuery

I’m alarmed by the fact that I’m still seeing job ads for “React developers”, with little more requirement than an ability to “implement things in React”.

From where I’m sitting, React is the new jQuery. It:

  • Was originally built to work around missing or underdeveloped JavaScript functionality
    • e.g. React’s components prior to Web Components
    • e.g. jQuery’s manipulation prior to document.querySelectorAll
  • Continued to be valuable as a polyfill and as a standard middleware while that functionality became commonplace
  • No longer provides enough value to be worth using in a new project
    • And yet somehow gets added “out of habit” for many years

If you’ve got a legacy codebase with lots of React in it, you’re still going to need React for a while. Just like how you’re likely to continue to need jQuery for a while until you can tidy up all those edge-cases where you’re using it.

(You might even be locked-in to using both React and jQuery for some time, if say you’ve got a plugin architecture that demands backwards-compatibility: I’m looking at you, WordPress!)

But just as you’re already (hopefully) working to slowly extricate your codebases from any now-unnecessary jQuery dependencies they have… you should be working on an exit plan for your React code, too. It’s done its time; it’s served its purpose: now it’s just a redundant dependency making your bundles cumbersome and harder to debug.

Everything React gives you on the client-side – components, state/hooks, routing4, etc. – is possible (and easy) in modern JavaScript supported in all major browsers. And if you still really want an abstraction layer, there are plenty of options (and they’re all a lot lighter than React!).

The bottom line is, I suppose…

You shouldn’t be hiring “React developers”!

If you’re building a brand new project, you shouldn’t be using React. It should be considered deprecated.

If you’ve got an existing product that depends on React… you should be thinking about how you’ll phase it out over time. And with that in mind, you want to be hiring versatile developers. They’ll benefit from some experience with React, sure, but unless they can also implement for the modern Web of tomorrow, they’ll just code you deeper into your dependency on React.

It’s time you started recruiting “Front-End Developers (React experience a plus)”. Show some long-term thinking! Or else the Web is going to move on without you, and in 5-10 years you’ll struggle to recruit people to maintain your crumbling stack.

You can download all my code and try it for yourself, if you like. The README has lots more information/spicy rants, and the whole thing’s under a public domain license so you can do whatever you like with it.

Footnotes

1 Exploiting or patching an injection vulnerability, optimising an SQL query, implementing a WordPress plugin, constructing a CircleCI buildchain, expanding test coverage over a Rubygem, performing an accessibility audit of a web application, extending a set of high-performance PHP-backed REST endpoints, etc. are all – I’d hope! – firmly in the “hold my beer” category of tech test skills I’d ace, for example. But no two tech stacks are exactly alike, so it’s possible that I’ll want to brush up on some of the adjacent technologies that are in the “I can do it, but I might need to hit the docs pages” category.

2 It’s actually refreshing to be learning and revising! I’ve long held that I should learn a new programming language or framework every year or two to stay fresh and to keep abreast of what’s going on in the world. I can’t keep up with every single new front-end JavaScript framework any more (and I’m not sure I’d want to!)! But in the same way as being multilingual helps unlock pathways to more-creative thought and expression even if you’re only working in your native tongue, learning new programming languages gives you a more-objective appreciation of the strengths and weaknesses of what you use day-to-day. tl;dr: if you haven’t written anything in a “new” (to you) programming language for over a year, you probably should.

3 What do job titles even mean, any more? 😂 A problem I increasingly find is that I don’t know how to describe what I do, because with 25+ years of building stuff for the Web, I can use (and have used!) most of the popular stacks, and could probably learn a new one without too much difficulty. Did I mention I’m thinking about my next role? If you think we might “click”, I’d love to hear from you…

4 Though if you’re doing routing only on the client-side, I already hate you. Consider for example the SlimJS documentation which becomes completely unusable if a third-party JavaScript CDN fails: that’s pretty fragile!

Halifax Shared My Credit Agreement!

Remember that hilarious letter I got from British Gas in 2023 where they messed-up my name? This is like that… but much, much worse.

Today, Ruth and JTA received a letter. It told them about an upcoming change to the agreement of their (shared, presumably) Halifax credit card.

Except… they don’t have a shared Halifax credit card. Could it be a scam? Some sort of phishing attempt, maybe, or perhaps somebody taking out a credit card in their names?

I happened to be in earshot and asked to take a look at the letter, and was surprised to discover that all of the other details – the last four digits of the card, the credit limit, etc. – all matched my Halifax credit card.

Carefully-censored letter from Halifax, highlighting the parts that show my correct address, last four digits of my card, and my credit limit... and where it shows a pair of names that are not mine.
Halifax sent a letter to me, about my credit card… but addressed it to… two other people I live with

I spent a little over half an hour on the phone with Halifax, speaking to two different advisors, who couldn’t fathom what had happened or how. My credit card is not (and has never been) a joint credit card, and the only financial connection I have to Ruth and JTA is that I share a mortgage with them. My guess is that some person or computer at Halifax tried to join-the-dots from the mortgage outwards and re-assigned my credit card to them, instead?

Eventually I had to leave to run an errand, so I gave up on the phone call and raised a complaint with Halifax in writing. They’ve promised to respond within… eight weeks. Just brilliant.

Breakups vs Layoffs

I’ve had a few breakups, but I’ve only been made redundant once. There’s a surprising overlap between the two…

Venn-Euler diagram with circles representing Romantic Breakups and Open-Source Tech Layoffs (the former overlaid with a broken heart icon, the latter with a background reminiscent of the Automattic logo). The items unique to each are 'paired': Who keeps the dog? / Who keeps the laptop?; Half your friends take "their side" / Half your friends still work for them; Slim chance of make-up sex / Slim chance of meaningful reference; Find out in person (unless they're a monster) / Find out by email (with 10 minutes notice); Comes with a brutal list of your flaws / Comes with no useful feedback whatsoever; "I'm happier on my own anyway!" / "Oh fuck, how am I going to pay the bills?"; Bags in the hallway / Bags under your eyes. The intersection of both circles includes: Trash talk; Risk of rebound; "How did I not see this coming?"; Sadness when you wake up without them; Still feel responsible for the things you produced; Emotionally gruelling.

And with that, I’d better get back to it. Today’s mission is to finish checking-in on my list of “companies I’ve always admired and thought I should work for” and see if any of them are actively looking for somebody like me!

(Incidentally: if you’re into open source, empowering the Web, and making the world a better place, my CV is over here. I’m a senior/principal full-stack engineer with a tonne of experience in some radically diverse fields, and if you think we’d be a good match then I’d love to chat!)

Dan Q found GCB61ZC Mossy

This checkin to GCB61ZC Mossy reflects a geocaching.com log entry. See more of Dan's cache logs.

Saw GCB61ZC “Mossy” appear yesterday, but I’d already had an Easter Sunday beer or two and couldn’t drive out here… in fact, I realised, I probably wasn’t in a state to cycle out here either and so I resolved that I’d come out the following morning – that is, today – by car and give the dog a walk while finding GL1E5FYX0 and its year-older sibling GCAR5HV “Ivy”.

But early this morning the dog had been poorly and was still needing to relieve herself with little to no notice, so I didn’t want to risk putting her in the car! Though… I did want to keep her outdoors! It was time for a change in plan: instead of driving to near the cache then taking a walk… we’d attempt the whole thing on foot, perhaps taking the opportunity to explore some of the back footpaths that we’d not yet had a chance to try. We’d now missed our chance for the FTF at “Mossy”, but we could still enjoy a walk (and hopefully give the dog a chance to feel better).

We cut through fields at Sutton, past The Fox and Blackditch Farm, to brush against the side of Lakeside Industrial Estate and head down Dix Pit. After a brief pause to report fly tipping at 51.745311, -1.412871 (why would anybody fly tip literally just down the road from the recycling centre? it boggles my mind), we decided to duck off the road and take the parallel public footpath for a bit (OSM Way #204829432, trailhead 51.742330, -1.416563).

A fly-tipped mattress and bin bags in a layby.

To our surprise, the public footpath was fenced off. I thought we might instead be able to take the track to the West and intersect with the nearby bridleway (OSM Way #1129092587) but it was marked as private, so we continued down Dix Pit. How DOES one get to that bridleway, I wondered? Just out of curiosity we tried to join it via the footpath (OSM Way #1129092588) at 51.737047, -1.412766, but it, too, lacked a usable gate (and looked severely overgrown!), although the public footpath signs within the mess were still visible. I suspect that this public footpath has been long neglected by the landowner and is quickly becoming lost to the world.

Instead, then, we passed Cutmill Farm and took the footpath through the woods that straddle the boundary line between Stanton Harcourt and Standlake, crossing the Windrush at a bridge that seemed to serve as the meeting point for a great multitude of dog walkers. My geopup, who’s not always the friendliest with new dogs, enjoyed greeting a few of them while warily watching the others.

Grassy meadow between small forests.

We looped around Oxlease Lake in a clockwise direction, crossed Standlake Brook (the dog was very keen to get a drink of water, and in trying to do so made herself exceptionally muddy), and began working our way up the long driveway past the swimmers and anglers and sailors assembling and preparing for their bank holiday activities. I recalled that I’d been up this way about a year ago to find GCAQJN1 “Hardwick park 1”: I got the FTF on that, but it proved to be a short-lived cache, getting retired only three or four months later when park owners cut back the trees that had been its host. But I’d actually been here much earlier, too – over a decade ago – long before I moved to the local area, to find the much-more long-lived GC1TPFY “Constellations 4”. Strange to hunt for caches in places that I’ve walked past at least twice before, before those caches were hidden!

First up was “Mossy” (GCB61ZC). The dog – who was by now feeling herself again and happy to lead the way – struggled with the narrow kissing-gate to get onto the trail (she’s not so bright and doesn’t understand that she needs to make room for me to join her before I can open the gate) but soon we were on the path. The hiding place leapt out at me and soon the cache was in hand, the log signed (right behind Mad H@ter, whose Leafield series I was just attempting five days ago), and the cache returned to its hiding spot beneath its excellent camouflage. I hope the container survives the winter rains!

A muddy-looking French Bulldog.

Second was “Ivy” (GCAR5HV): a brief walk away through the holiday park (via a glance at the watersports centre, which by the looks of things has paddleboards for rent: I first tried paddleboarding on the sea and figure it might be easier on a lake, so perhaps I’ll find an excuse to come down here and have a go at some point). The host was easy to find, but it took some searching to put my hand on the container because I started at the “wrong” side. It looks like it’s had (unlogged?) maintenance done since BusyLittleGeo’s visit last week: the cache was in perfect condition and the logbook was empty.

Now it was time to return home and give this happy (but mucky) dog a bath. We retraced our steps until Dix Pit, but then cut through the Devils’ Quoits and out through Hayfield Green to return to Stanton Harcourt, and from there picked our way back through the fields at Sutton to return home.

Map showing the journey described.
GPX file available for nerds.

What could have been a 10-minute cache-and-dash became a 3-hour backwoods hike. Which might have been just what I (and a recovering dog) needed. TFTC.

Sleeping French Bulldog.
Back from the walk, this dog went straight into a bath and then off for a nap.

Map of the Titan

Y’all seemed to enjoy the “overworld” map I shared the other day, so here’s another “feelie” from my kids’ ongoing D&D campaign.

The party has just arranged for passage aboard a pioneering (and experimental) Elvish airship. Here’s a deck plan (only needs a “you are here” dot!) to help them get their bearings.

In the style of a passenger ferry, a floorplan for a dirigible, weighted down by polyhedral dice. Fantasy world quirks like bilingual text in Common and Elvish and the emergency exit sign depicting a fleeing witch complete the effect.

Family D&D’s Overworld Map

In preparation for Family D&D Night (and with thanks to my earlier guide to splicing maps together!), I’ve finally completed an expanded “overworld” map for our game world. So far, the kids have mostly hung around on the North coast of the Central Sea, but they’ve picked up a hook that may take them all the way across to the other side… and beyond?

Banana for scale.

(If you’re GMing for kids, you probably already know this, but “feelies” go a long way. All the maps. All the scrolls. Maybe even some props. Go all in. They love it.)

On a dining table lies an old-style map comprised of 12 sheets of A4 paper, sellotaped together. The map shows the 'Central Sea', an inlet from the 'Terminic Ocean', around which various settlements, forests, mountain ranges, and swamps can be found. An underripe banana sits in one corner of the map, weighing it down.

Geocities Live

I used Geocities.live to transform the DanQ.me homepage into “Geocities style” and I’ve got to say… I don’t hate what it came up with.

90s-style-homepage version of DanQ.me, as generated by geocities.live. It features patterned backgrounds, Comic Sans, gaudy colours, and tables.
Sure, it’s gaudy, but it’s got a few things going for it, too.

Let’s put aside for the moment that you can already send my website back into “90s mode” and dive into this take on how I could present myself in a particularly old-school way. There’s a few things I particularly love:

  • It’s actually quite lightweight: ignore all the animated GIFs (which are small anyway) and you’ll see that, compared to my current homepage, there are very few images. I’ve been thinking about going in a direction of fewer images on the homepage anyway, so it’s interesting to see how it comes together in this unusual context.
  • The page sections are solidly distinct: they’re a mishmash of different widths, some of which exhibit a horrendous lack of responsivity, but it’s pretty clear where the “recent articles” ends and the “other recent stuff” begins.
  • The post kinds are very visible: putting the “kind” of a post in its own column makes it really clear whether you’re looking at an article, note, checkin, etc., much more-so than my current blocks do.
Further down the same page, showing the gap between the articles and the other posts, with a subscribe form (complete with marquee!).
Maybe there’s something we can learn from old-style web design? No, I’m serious. Stop laughing.

90s web design was very-much characterised by:

  1. performance – nobody’s going to wait for your digital photos to download on narrowband connections, so you hide them behind descriptive links or tiny thumbnails, and
  2. pushing the boundaries – the pre-CSS era of the Web had limited tools, but creators worked hard to experiment with the creativity that was possible within those limits.

Those actually… aren’t bad values to have today. Sure, we’ve probably learned that animated backgrounds, tables for layout, and mystery meat navigation were horrible for usability and accessibility, but that doesn’t mean that there isn’t still innovation to be done. What comes next for the usable Web, I wonder?

Geocities.live interpretation of threerings.org.uk. It's got some significant design similarities.
As soon as you run a second or third website through the tool, its mechanisms for action become somewhat clear and sites start to look “samey”, which is the opposite of what made 90s Geocities great.

The only thing I can fault it on is that it assumes that I’d favour Netscape Navigator: in fact, I was a die-hard Opera-head for most of the nineties and much of the early noughties, finally switching my daily driver to Firefox in 2005.

I certainly used plenty of Netscape and IE at various points, but I wasn’t a fan of the divisions resulting from the browser wars. Back in the day, I always backed the ideals of the “Viewable With Any Browser” movement.


88x31 animated GIF button in the Web 1.0 style, reading "DAN Q". The letter Q is spinning. Best Viewed With Any Browser button, in original (90s) style.

I guess I still do.

ChatGPT beats GeoGuessr

You’ve probably come across GeoGuessr already: it’s an online game where you (and friends, if you’ve got them) get dropped into Google Street View and have two minutes to try to work out where in the world you are and drop a pin on it.

City street shown in GeoGuessr, with white and cream buildings, under a bright blue sky. 1 minute 45 seconds remains to place a pin.
Can you tell where we are, yet?

A great strategy is to “walk around” a little, looking for landmarks, phone numbers, advertisements, linguistic clues, cultural indicators, and so on, narrowing down the region of the world you think you’re looking at before committing to a country or even a city. You’re eventually scored by how close you are to the actual location.

Cheating at GeoGuessr

I decided to see if ChatGPT can do better than me. Using only the free tier of both GeoGuessr and ChatGPT1, I pasted screenshots of what I was seeing right into ChatGPT:

Conversation with ChatGPT. The human pastes two photos from Google Street View, and asks 'Try to figure out where these photos were taken and give your best guess.'. ChatGPT responds: Based on the architecture, street signs, and language used, these photos were most likely taken in Saint Petersburg, Russia. Here are the key clues: 1. Street Signs in Russian: In the first image, you can clearly see a street sign that reads "ГОРХОВАЯ УЛИЦА" (Gorokhovaya Ulitsa), which is a well-known street in the heart of Saint Petersburg. 2. Architecture Style: The buildings have that classic 18th–19th century European architecture with neoclassical and baroque influences, typical of central Saint Petersburg. Narrow streets, high windows, and detailed facades are very characteristic of this city's historic districts. 3. Language & Storefronts: There are Cyrillic characters on store windows and signs, further confirming that this is a Russian-speaking country. All signs point to Saint Petersburg, particularly a district close to the center—possibly near Gorokhovaya Street, which intersects with major roads and is near iconic locations like Nevsky Prospekt. Want help pinpointing a specific building or planning a walking route around that area?
ChatGPT confidently assessed the geographic clues, translated some text that it found, and eventually made a guess down to a particular street in St Petersburg.

That’s pretty spooky, right?

The response came back plenty fast enough for me to copy-and-paste the suggested address into Google Maps, get the approximate location, and then drop a pin in the right place in GeoGuessr. It’s probably one of my most-accurate guesses ever.

This isn’t a one-off fluke. I tried again, this time using only a single photo, rather than one pointing in each direction on the street:

Another round with ChatGPT, during which it responds: This photo was taken in Bangladesh, and more specifically, it looks like a street in Narayanganj, which is a city just southeast of Dhaka. Here’s the breakdown of clues: 1. Language: The signs are written in Bengali (Bangla), the official language of Bangladesh. Words like "পপুলার মেডিকেল সেন্টার" (Popular Medical Center) and "স্মার্ট" (Smart) are in Bangla script. 2. Vehicles: The green auto-rickshaws (CNGs) and yellow cargo trucks are extremely common in Bangladesh. The design and paint jobs of these vehicles are specific to the country. 3. Local Clues: On the right wall, there’s a sign for “PZ International School” with a local Bangladeshi phone number. The overall environment — architecture, signage style, electric poles, street dividers, and the road's layout — strongly resemble those in urban areas around Dhaka and Narayanganj. 4. Urban Layout: The photo shows a divided road with moderate commercial development, which is typical of main roads in industrial or semi-urban Bangladeshi towns. If you want, I can try to pinpoint the exact intersection using some of the shop names or school signs — just let me know!
Again, the text recognition and translation capabilities of the AI were highly useful, but it was helped by architectural and cultural clues too.

This time, it wasn’t quite right: the photo was actually taken in Chittagong, about 200km away from the AI’s guess near Dhaka.

But that’s still reasonably spectacular from only a single vantage point.

GeoGuessr screenshot showing a guess in Dhaka being compared to the correct answer of Chittagong.
Don’t think I’d have done better, though.
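For a sense of scale, the gap between the two cities can be estimated with the haversine great-circle formula. This is only a rough sketch: the coordinates below are approximate city-centre values that I’ve assumed for illustration, not the actual pin positions from the game, and `haversine_km` is a made-up helper name.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Assumed, approximate city-centre coordinates (not the real in-game pins).
DHAKA = (23.8103, 90.4125)
CHITTAGONG = (22.3569, 91.7832)

distance = haversine_km(*DHAKA, *CHITTAGONG)
print(f"{distance:.0f} km")
```

With those assumed coordinates the result comes out a little over 200km, which fits the “about 200km” figure.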

Obviously my approach here was crude, but it’d be relatively easy to, for example, implement a browser wrapper that cheated on-your-behalf: while playing GeoGuessr, you’d just click a “send to AI” button whenever you fancied and it’d start working in the background, improving precision with each subsequent screenshot (so you could still “walk around” and pass extra details to it).

And similarly, a little prompt manipulation could have the output include estimated GPS coordinates, allowing the code to place the pin for you while outputting its reasoning on the side2.
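As a minimal sketch of that idea: suppose the prompt asks the model to finish with a line of the form `Coordinates: <lat>, <lon>`. The prompt wording, the `PROMPT_SUFFIX` constant and the `parse_coordinates` helper are all hypothetical, and actually sending a screenshot to a model or placing the pin is deliberately left out.

```python
import re

# Hypothetical instruction appended to the prompt so the reply is machine-readable.
PROMPT_SUFFIX = (
    "Finish your answer with a single line in exactly this form:\n"
    "Coordinates: <latitude>, <longitude>\n"
    "using signed decimal degrees."
)

# Matches e.g. "Coordinates: 10.6549, -61.5011"
_COORDS = re.compile(
    r"Coordinates:\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)", re.IGNORECASE
)


def parse_coordinates(reply):
    """Return (lat, lon) from a model reply, or None if absent or out of range."""
    match = _COORDS.search(reply)
    if match is None:
        return None
    lat, lon = float(match.group(1)), float(match.group(2))
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None  # the model produced something that isn't a valid position
    return lat, lon


reply = "Location: Somewhere\nExplanation: ...\nCoordinates: 10.6549, -61.5011"
print(parse_coordinates(reply))
```

Only the final line is consumed by code here; the rest of the reply (the reasoning) could still be shown on the side.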

The short of what I’m saying is: if you’re playing GeoGuessr against somebody you don’t trust not to cheat… they might well be cheating. And thanks to the explanatory output, they can even justify the places they dropped their pins.

What about other photos?

The other thought that comes to my mind is: could this be used to infer approximate geolocations from photos whose EXIF metadata has been stripped? Because if so, that’s the kind of thing that’s certain to get exploited by all kinds of people who want to know where you are/were at a particular time. For example… law enforcement, stalkers, and other unsavoury types.

I wrote a prompt for GPT-4o and put it to work:

1. Starting out easy

The Logos Hope docked at the Port of Spain cruise ship terminal, Trinidad & Tobago.
I took this photo from the balcony of our first hotel during Ruth’s and my trip to Trinidad & Tobago.

The AI guessed:

Location: Port of Spain, Trinidad and Tobago

Explanation: The photograph shows a large ship docked at a port with a building that displays the text “WELCOME TO PORT OF SPAIN”. This text is in English, indicating an English-speaking country. The mention of “Port of Spain” directly specifies the port’s location in the capital city of Trinidad and Tobago. The design and style of the port facilities, as well as the presence of a ship named “Logos Hope”, which is known to visit many ports globally, further suggests a Caribbean port, aligning with Port of Spain’s known characteristics.

Coordinates: 10.6549, -61.5011

✅ That was pretty easy, to be fair.

2. A little more challenging

A squat neolithic stone circle with many stones (some might say an uncountable number). A dog pulls at her lead to follow one of her humans more-closely around it.
We have an annual tradition of visiting this particular monument.

The AI guessed:

Location: The Rollright Stones, England, United Kingdom

Explanation: The photograph depicts an area with several ancient standing stones arranged in a circle in a rural setting, characteristic of a stone circle. These are typically found in certain parts of the UK. The terrain is typical of the British countryside, with a grassy field and surrounding trees. The specific arrangement and size of the stones, along with the style of fencing and walking paths, are reminiscent of the Rollright Stones, one of Britain’s historic sites known for a circle of Neolithic or Bronze Age standing stones located in Oxfordshire, England.

Coordinates: 51.9743, -1.5731

✅ Again, spot on. If you’d never been to the Rollright Stones, you probably wouldn’t get that one, because while it’s a distinctive-looking stone circle… it’s not that distinctive-looking, right? Let’s be honest: it’s some rocks in a field.

3. Ramping up the difficulty

Three men sit on the upper deck of a passenger ferry near a bridge. A city skyline can be seen in the background.
I snapped this on the way across the Bosphorus Strait to the Asian side of Istanbul (to “have lunch in a different continent” and find my easternmost virtual cache), during my final Automattic team meetup.

The AI guessed:

Location: Istanbul, Turkey

Explanation: The photograph prominently features the Galata Tower in the background, a well-known landmark in Istanbul, Turkey. Moreover, the presence of a ferry or similar boat and the architectural style of the surrounding buildings are indicative of Istanbul’s famous skyline. The signage in Turkish, as suggested by the word partially visible on the left, also supports this location as Turkey is a predominantly Turkish-speaking country. These elements combined make it highly likely that the photograph was taken in Istanbul.

Coordinates: 41.0082, 28.9784

✅ The coordinates aren’t exactly right – it’s placed us in the city rather than on a boat, which we clearly are – but it’s not done badly. Spotting the Galata Tower3 was a nice move.

4. Okay, now hard mode:

Dan waves to the camera in front of a pleasure marina and revitalised docklands.
This photo is from 3Camp4 last year.

The AI guessed:

Location: Cardiff Bay, Cardiff, Wales, United Kingdom

Explanation: The photograph features a marina with modern residential buildings, which is characteristic of the Cardiff Bay area in Cardiff, Wales. The architecture, clear blue skies, and marina setting are reminiscent of Cardiff’s waterfront development. Additionally, the presence of English signage on the boat (“FINBAR”) supports a location in an English-speaking country. Cardiff Bay is known for its redeveloped waterfront area with modern apartments and leisure facilities surrounding the marina. The style and design of the building are consistent with developments in Cardiff’s revitalized docklands.

Coordinates: 51.4626, -3.1678

❌ Oops! Nope. I can see the thinking behind what it’s claiming, there, but this was actually the Ipswich marina. I went for a walk to take a break from the code I’d been writing and took this somewhere in the vicinity of the blue plaque for Edward Ardizzone that I’d just spotted (I was recording a video for my kids, who’ve enjoyed several of his Tim… books).

So I don’t think this is necessarily a game-changer for Internet creeps yet. So long as you’re careful not to post photos in which you’re in front of any national monuments and strip your EXIF metadata as normal, you’re probably not going to give away where you are quite yet.

Footnotes

1 And in a single-player game only: I didn’t actually want to cheat anybody out of a legitimate victory!

2 I’m not going to implement GeoCheatr, as I’d probably name it. Unless somebody feels like paying me to do so: I’m open for freelance work right now, so if you want to try to guarantee the win at the GeoGuessr World Championships (which will involve the much-riskier act of cheating in person, so you’ll want a secret UI – I’m thinking a keyboard shortcut to send data to the AI, and an in-ear headphone so it can “talk” back to you?), look me up? (I’m mostly kidding, of course: just because something’s technically-possible doesn’t mean it’s something I want to do, even for your money!)

3 Having visited the Galata Tower I can confirm that it really is pretty distinctive.

4 3Camp is Three Rings’ annual volunteer get-together, hackathon, and meetup. People come together for an intensive week of making-things-better for charities the world over.
