What Comes After Next?

By Michael Nunan

(In its original form, the following was presented as the Keynote for AES Toronto “Expo25”.)

Let's talk about the Future.

David Bowie once said, “Tomorrow belongs to those who can hear it coming.” It’s a perfect sentiment for this crowd. But perhaps the more fitting line for today—especially given what we’re about to explore—is from William Gibson, the father of the term "cyberspace": “The future is already here—it’s just not very evenly distributed.”

That line resonates especially in light of what’s known as the Theory of the Adjacent Possible (TAP) (Kauffman). It’s the idea that real innovation doesn’t usually march in through the front door. It slips in through the side—emerging not from linear progress, but from surprising recombinations of previously unconnected ideas. TAP tells us that the next big thing almost never starts out looking like the next big thing. In innovation space, it's almost the equivalent of "the whole is greater than the sum of the parts."

But more than that, TAP suggests the future isn’t something we deduce—it’s something we grow into. Each new tool, format, or workflow opens up new combinatorial possibilities—what Stuart Kauffman called “adjacent” to the current state of things. The future of media, then, won’t be designed in a vacuum—it will be discovered at the edges, through novel recombinations and the unexpected utility of things that weren’t built with us in mind.

Take the example of Auto-Tune. Originally developed as a tool for seismic data analysis in the oil industry, it was reimagined through a musical lens and became one of the most defining audio effects of the last 30 years. It wasn’t just “a better EQ”; it was a leap—the emergence of a new creative capability born from the fusion of disparate disciplines.

The point is, the future often arrives sideways, and TAP gives us a framework to understand why...

And right now, in 2025, the space of "adjacent" capabilities (adjacent, that is, to the main thrust of our attention as an industry) is expanding and maturing rapidly enough that it's possible to start seeing how the confluence of quickly maturing technologies will significantly shape and empower Media in the future. I'm thinking here of AI in all of its guises (Machine Learning, Neural Networks, LLMs, Generative AI… up to the threat of AGI), but also the curiousness of Quantum, the promise of Web3.0, the rise of Game Engines, the arrival of practical Machine Vision and Hearing, the beginnings of Embodied Computing and advanced haptics... the list goes on.

And yet, despite the apparent truism of this TAP idea, we tend to talk about the future (especially in our business) as though it’s a predictable, linear upgrade. Needing always to be on the lookout for the "next new thing" is an occupational hazard for us... but let's be frank: that "new" thing can usually best be described as "finding a new way to do an old job". And more to the point, more threateningly, finding "more efficient ways to do old jobs"... which leads inevitably to the generalized, pessimistic expectation that all of this can be summed up as:

Robots are coming for our jobs.

Cartoon: Julia Nunan

Cue the cartoonish imagery of our apocalypse! Our friendly-neighbourhood MediaBot, Roberta, is learning to mix a show. Seems reasonable, I guess. Eventually the robots will need to learn.

But then, having mastered the complexity of Audio, she (it?) moves on to learn how to Direct. Makes sense; if you’re learning all the jobs, Directing is certainly an important one.

Cartoon: Julia Nunan

But if this is all true, and the robot knows it’s in the business of taking jobs - is there a chance that she might decide that, all things considered, she would rather Present the News?!

Cartoon: Julia Nunan

Let’s be honest—do we really think the robots are going to choose audio? Of course not. Why would they be any different from the rest of the world? They’ll aim for something sexier, easier, more celebrated... It’s just possible that Audio remains our thing by default!

But seriously, here's the thing about this: I'm not guessing. I can prove this - because it's already happened. Sort of. Here’s the story…

Anyone over a certain age remembers "Rosie the Robot" from The Jetsons... and, well, truth is catching up with this particular fiction... now we have the Roomba.

(This is a photo that was taken 2 months ago in Vancouver, at the VCC, backstage at the annual TED Conference... it shows "Neo Gamma", an autonomous bipedal robot. Think "Roomba" with arms and legs - it's literally a manifestation of the idea of Rosie the Robot. "Neo" is huddled with the crew and getting ready to go on.)

PHOTO: Jasmina Tomic / TED

The next photo was taken 3 stories below and several hundred feet from the stage, inside a truck called Dome “Gateway”. It shows my great pal and mentor Howard Baggley; easily one of the finest and most credentialed A1s in the world... and what's he doing? Mixing the robot. NOT the other way 'round!

Humour aside, let’s stay with that image—Howard at the console. Because that console, for me, perfectly represents the situation we’ve put ourselves in. That device doesn’t know anything about television production. Or audio, for that matter. It’s a hyper-specialized tool, which could cost the better part of a million dollars, built expressly for broadcast, and yet it remains a total cipher—oblivious even to its usual infrastructure neighbours, things like switchers or routers or cameras or microphones, let alone the creative context it’s supposed to help Howard serve.

Photo: Michael Nunan

It's a calculator! One that processes silence, pink noise, or the sound of a Stradivarius violin with the same precision and fidelity, AND the same lack of understanding or priority. In many ways, our tools are brilliant—but blind. Or deaf. And to be fair, that’s not an indictment - it's exactly what we asked for. We’ve spent decades - basically the entire history of recorded and electronic sound - building better tools: higher fidelity, easier to use, faster, more powerful, more precise. More convenient.

And here we are today. Tasks that once required enormous effort are so trivial that they escape notice, and things that only a few years ago would have been utter science fiction are now merely inconvenient.

Vince Silvestri, my boss, who helms Research and Development at Evertz, has a favourite analogy for this: the humble light switch. There was a time when electric light was high tech. New. Revolutionary. Divisive and disruptive. Today? Most people don’t even see it as “technology”—it’s just part of the furniture. That’s the mark of a very specific kind of maturity that should be aspirational for us. And just now, I think we're on the eve of something similar: the ability to envision a future in which today’s high-tech, extremely capable (and increasingly more so) but very arcane media tools become tomorrow’s light switches. And they’re going to achieve that level of transparency by becoming tools that don’t just wait to be told what to do—but understand what needs to be done.

Everyone knows that old adage: when all you have is a hammer, every problem looks like a nail... it's funny because it can be true! But only in a world where the hammer can't recognize a nail, and has no say in the matter anyway!

In the coming world, our tools will be cooperative partners—capable of understanding not just content, but context. A world where tools adapt themselves to us—not the other way around.

Enough about us on this side of the camera. Focused outwards, what might all this mean for creativity? For audio? For music? For media?

Let’s imagine a few scenarios. (And for the record, each of these already exists in the world. But they’re science experiments, basically bespoke Proofs-of-Concept, and not broadly available or easy to access. Which, nevertheless, reinforces my opening premise about the future not being evenly distributed!)

  1. An album that sounds one way if you listen to it during the day while you're working, and another way if you listen at night when you're at rest.
  2. A hockey game that automatically switches to a vertical, all-close-up presentation mode when you’re watching on your phone, on the subway on the way home from work.
  3. A newscast that adjusts for your preferred topics, scope, tone, and bias.
  4. An interactive serialized drama you can explore like a “choose your own adventure” novel you may have read as a kid.
  5. A documentary that reshuffles itself every time you watch—curating a new 50-minute experience from a 500-hour repository of source material. (That last one really isn’t science fiction—Gary Hustwit’s “Eno” does exactly that. It’s the world’s first Generative Documentary and it was shortlisted for an Oscar in 2024)

These aren’t just product ideas—they’re signals that location, context, time-of-day, device and display modality, personal preferences and demographic data are all becoming active ingredients in the conception, creation and delivery of media. In a very real sense, media is about to become spatial, adaptive, and situational—whether it wants to or not.
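To make that a little more concrete, here is a minimal, purely illustrative sketch (in Python) of what treating context as an active ingredient could look like. Every field name and profile label here is an assumption of mine, invented for the example; nothing in it describes a real product, format, or standard.

```python
from dataclasses import dataclass

@dataclass
class PlaybackContext:
    """What the player happens to know about the moment of consumption.
    All of these fields are hypothetical; a real system would source them
    from device sensors, account preferences, and platform signals."""
    device: str       # e.g. "phone", "tv", "studio-monitors"
    activity: str     # e.g. "working", "commuting", "at-rest"
    local_hour: int   # 0-23, taken from the listener's clock

def choose_presentation(ctx: PlaybackContext) -> dict:
    """Map a playback context onto a presentation profile. The profiles are
    placeholders for whatever the creators actually authored: alternate
    mixes, a vertical close-up cut, tone and scope settings, and so on."""
    profile = {"mix": "standard-stereo", "video": "widescreen"}

    # Scenario 1: the album that sounds different at night, at rest
    if ctx.activity == "at-rest" and (ctx.local_hour >= 21 or ctx.local_hour < 6):
        profile["mix"] = "late-night-dynamics"

    # Scenario 2: the hockey game that goes vertical and all-close-up on a phone
    if ctx.device == "phone" and ctx.activity == "commuting":
        profile["video"] = "vertical-closeup"

    return profile

# Example: watching on your phone, on the subway home from work
print(choose_presentation(PlaybackContext("phone", "commuting", 18)))
# -> {'mix': 'standard-stereo', 'video': 'vertical-closeup'}
```

The specific rules are beside the point; what matters is that the moment of playback becomes an input to the presentation rather than an afterthought.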

How do we get to any of these examples from where we are today?

And yes, I hear the cynics. It's true: in the far future, maybe most content will be machine-made, just as the bulk of commoditized goods today are machine-made. But just as most photographs today are digital, and increasingly computer-generated, analog photochemical photography still has a place for connoisseurs and artists. Human-made content may become rare, or precious, certainly eccentric, in that far future. And that’s okay. We don't need to settle that now.

The question is: what can we do now?

Do we have to wait for HAL 9000 to take up a surprise interest in mixing hockey before we start making our lives easier? We’ve spent years using technology as a lever to reduce cost and optimize workflows—fine. But what about using it to reduce friction for the human beings involved? And then seeing what they could manage to accomplish in the time allotted!?

Because if you’re honest, in most production environments today, the humans serve the technology—not the other way around.

Our tools are powerful, but they’re dumb. They rely on human operators to tell them exactly what to do, every time. I get it. They’re Tools. And all our investment has gone into raw capability and capacity—not cognitive support. I understand why. I understand history and scientific and technical progress. But also - that was then, and this is now.

What if our tools knew what they were doing? What if they could anticipate, collaborate, adapt? Or at least appear to, for all practical purposes. That’s the kind of shift in thinking we need. Because that would be context-aware tooling and infrastructure. That would be dynamic signal flow. That’s the future I want to point toward.

And to get there, we’ll need to revisit everything—our sensors, our data structures, our UIs, and maybe especially, our definitions of authorship and editorial. But I'd like to suggest this: there are real, concrete steps we can take today to prepare for that future...

Because if you believe, even a little, that some of the scenarios I've described are likely, then here’s the takeaway: the future is metadata.

It doesn’t matter if it’s human-readable or machine-readable. Doesn’t matter when it arrives. If we want our tools to be smarter, if we want the process to be sensitive to what we're trying to accomplish, then we need to start encoding intent in our content and context in our systems. And that requires metadata. And a lot of it.
And I can hear the barely suppressed groans from colleagues of a certain vintage.

Yes, we’ve had a rocky history with metadata. Most of that was static, program-level, usually manually handled, after-the-fact, and often misunderstood or ignored. But imagine it fully granular—down to every source, every channel, every destination, every device, every process. Metadata that doesn’t just describe the content—it helps the system understand what it’s for and what it’s doing. Imagine if signals had agency.
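As a thought experiment, here is one possible shape that kind of granular, intent-carrying metadata could take. It's a sketch only, in Python, and every field name is invented for illustration; none of it is drawn from an existing standard or shipping product.

```python
import json

# Hypothetical per-signal metadata that carries intent and context, not just
# a technical description. The granularity is the point: every source,
# channel, destination, device, and process says what it is *for*.
signal_metadata = {
    "source":       {"device": "mic-07", "type": "lavalier", "talent": "host"},
    "essence":      {"format": "PCM", "sample_rate": 48000, "channels": 1},
    "intent":       {"role": "primary-dialogue", "priority": 1,
                     "must_duck": ["crowd-fx", "music-bed"]},
    "context":      {"program": "hockey-broadcast", "segment": "period-2",
                     "venue": "arena-north"},
    "destinations": ["broadcast-mix", "iso-record", "described-video"],
}

def wants_ducking(meta: dict, other_role: str) -> bool:
    """A toy decision a 'signal with agency' could make for itself:
    should the bus carrying `other_role` be ducked under this signal?"""
    return other_role in meta.get("intent", {}).get("must_duck", [])

print(json.dumps(signal_metadata["intent"], indent=2))
print(wants_ducking(signal_metadata, "crowd-fx"))   # -> True
```

The schema itself is disposable; what matters is the "intent" block, which gives a downstream system something it can reason about, and act on, without a human re-explaining the show every time.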

And of course this is going to require yet more machine intervention, to help us generate, manage, and react to metadata in real time. Otherwise, we’ll have just managed to turn the remaining human operators into glorified data-entry clerks... again serving the technology.

The good news? You can start now.

Michael Nunan is Senior Architect, Live Media Systems, Research & Development at Evertz Microsystems Limited, and previously Senior Manager, Broadcast Operations (Audio) at Bell Media.

LinkedIn: www.linkedin.com/in/michael-nunan-26a25936/
