MAR 10, 2026

Phonopolis

The dystopian imagination is overwhelmingly a visual imagination. Posters, surveillance cameras, the architectural grid. The sonic dimension of actual authoritarian rule, which has been one of the most consistent technologies of large-scale control for at least a thousand years, has been almost completely absent from the genre's standard vocabulary. Phonopolis is the rare game whose central design proposition is that authoritarian power is something you hear before you see.

Image & Sound
Tuesday analysis

How Authoritarianism Actually Sounds

Writer
J. A. Marsh
Lens
Image & Sound
Published
MAR 10, 2026
Length
2,322 words / 10 min
Notes
8 sources
Spoilers
This essay discusses the game's premise and general thematic structure; no specific plot revelations.

In a French village in 1872, the church bell rang at six in the morning, at noon, and at six in the evening, on a schedule that the village had been keeping for several centuries. The bell told the field workers when to begin the day, when to break for the midday meal, and when to return home. It also marked deaths, marriages, baptisms, fires, the arrival of officials, and the start of religious services. The historian Alain Corbin, in his 1994 Les Cloches de la Terre (translated as Village Bells in 1998), documented the legal and political battles that broke out across rural France in the late nineteenth century over who controlled the bell ropes. The bells, in Corbin's careful documentation, were not symbols of authority. They were authority's most reliable broadcast technology - the only medium by which a single instruction could reach every villager simultaneously, regardless of literacy, occupation, or location within the village's territory. To control the bells was to control the rhythm of the village's daily life.

In Moscow in 1933, every newly built apartment block was wired with a centralized public-address speaker that broadcast state radio. The speakers could not be turned off. They could be turned down, but the receiver was a Soviet government device, hard-wired into the building's electrical infrastructure, and the volume control was a courtesy rather than a real means of disabling it. Historians of Soviet acoustic infrastructure have documented the system in detail: the broadcasts began at six in the morning with the national anthem, continued through the workday with news and political instruction, paused briefly in the afternoon, and resumed in the evening with cultural programming. The system was not, on the official rationale, surveillance. It was enlightenment. The state was giving its citizens uninterrupted access to information and culture. The citizens, on the available oral-history record, mostly experienced the system as the unstoppable background sound of their lives.

In Pyongyang, today, the city center is wired with a public-address system that broadcasts a recorded vocal performance of the song "Where Are You, Dear General?" at six in the morning, every morning, across the entire downtown. The song is loud enough to be heard inside apartment buildings with the windows closed. The melody is the most-heard piece of music in the country. The acoustic infrastructure that delivers it is a deliberate inheritance from the Soviet system, refined for the specific purposes of the North Korean state, and operates as the city's audible spine.

These three cases - French village bells, Soviet apartment-block radio, Pyongyang's morning anthem - span roughly seven centuries of governance through sound. They are not the only cases. They are a small selection from a substantial historical record that the standard dystopian imagination has almost completely failed to draw on. Dystopia in fiction works overwhelmingly in a visual register: the surveillance camera, the propaganda poster, the architectural grid, the regimented uniform. The sonic dimension of actual authoritarian rule, which has been one of the most consistent technologies of large-scale control for at least a thousand years, has been almost completely absent from the genre's standard vocabulary.

Phonopolis, the dystopian puzzle adventure released by the Czech studio Amanita Design in May 2026, is the rare game whose central design proposition takes this absence as its subject. The city in the game is governed not by a visible authority but by an audible one: a leader who rules through tone, through frequency, through the unstoppable ambient sound of his voice. The player, a refugee from the city's sonic conformity, has to navigate spaces where listening differently is the political act, and where silence - a brief negotiated absence of the city's voice - is the puzzle the design treats as the game's primary reward.

The frame this essay wants to give the reader extends beyond Phonopolis to a broader proposition the dystopian-fiction tradition has been slow to absorb: authoritarianism, in the empirical historical record, has always been at least as much sonic as visual. The propaganda poster gets the cultural attention because it survives as an object that can be photographed and printed in textbooks. The state radio broadcast does not survive in the same way; it leaves no preservable artifact beyond the recordings of programs nobody listens to today, and the historical experience of being saturated in that broadcast is unrecoverable for any reader who did not live through it. The visual register of authoritarianism has been preserved because it is preservable. The sonic register has been lost because the lived experience of being inside a state-controlled soundscape leaves no physical trace.

The foundational work on what sound does politically was done by the French economist and theorist Jacques Attali, in his 1977 book Noise: The Political Economy of Music. Attali's central claim, drawn from a sweeping survey of music's social functions across European history, was that political organization always involves the organization of sound. Who controls the sonic environment - who can make noise, who can be quiet, who is required to listen, whose voice carries - is a fundamental political question that the standard political-theory vocabulary has consistently ignored. Music, in Attali's account, is not decoration on the political order; music is one of the order's primary technologies. To organize a society is, among other things, to organize what it sounds like.

Attali's framework has been refined and extended by subsequent scholarship in what is now called sound studies. The Canadian composer R. Murray Schafer introduced the term soundscape in 1969 to name the entire acoustic environment of a given place at a given time, in the same sense that the visual environment is called the landscape. Schafer's argument was that soundscapes are designed, even when no single person is designing them. The acoustic environment of a medieval village was designed by the placement of the bell tower, by the muting effect of thatched roofs, by the absence of mechanical engines. The acoustic environment of a contemporary city is designed by the placement of highways, by the regulation of construction noise, by the volume rules in commercial districts. Soundscapes have authors, even when the authorship is distributed across many decisions made by many people over many years.

What authoritarian governance has consistently done, on the historical record, is centralize that authorship. The bell tower, the apartment-block speaker, the Pyongyang loudspeaker, the Nazi-era Volksempfänger receiver that brought state radio into millions of German homes - these are all technologies for consolidating the design of the soundscape in the hands of the central authority. The citizen in a heavily controlled sonic environment does not get to choose what the city sounds like. The state has made that choice. The citizen lives inside it.

Phonopolis is interesting in this longer historical context because the game is set in a city whose central authority has done what the historical cases gestured at but never fully achieved: organize the entire sonic environment around a single voice that brooks no acoustic competition. The game's visual aesthetic - a stylized retro-modernist city in Amanita Design's house style of paper-craft and hand-touched textures - does the necessary work of marking the city as fictional. The sonic premise is the part that draws on the historical record. The dictator's voice in the game's world is everywhere. Buildings carry his frequencies. The architecture has been built to resonate with his tones. To escape the voice is to find spaces the architecture failed to colonize. The puzzle of the game is, in a precise sense, the puzzle of finding silence in a city whose silences have been engineered out.

This is closer to the actual experience of life in heavily controlled acoustic environments than most of the dystopian fiction the medium has produced. The first-hand testimony from former residents of Soviet apartment blocks, gathered in the oral-history projects of the 1990s, consistently emphasizes the inability to escape the radio's voice as one of the daily indignities of life in the system. The testimony from North Korean defectors emphasizes the morning anthem, the impossibility of not hearing it, the way it organized the day before any individual decision could. The dystopia that gets the sonic dimension right is closer to the historical truth than the dystopia that focuses on the visible apparatus.

There is an interesting cognitive-science reason why the sonic register of authoritarianism is so effective, which the political theory has been slow to integrate. Sound, in the architecture of human perception, has properties that vision does not. The visual field can be closed by closing the eyes; the auditory field cannot be closed by any equivalent gesture. The visual field is directional - the eyes face forward, and the rear hemisphere is invisible without head movement; the auditory field is omnidirectional, with the ears receiving signal from every direction simultaneously. The visual field can be tuned out by attention; the auditory field is much harder to ignore, because the perceptual system processes sound for threat-relevance before consciousness has a chance to filter it. These are not subjective preferences. They are properties of how the perceptual machinery works.

The implication for political control is that sound is, in cognitive-architecture terms, the broadcast medium most resistant to the citizen's attempt to opt out. A propaganda poster can be looked away from. A state broadcast in the public square cannot be unheard. The Soviet state did not, on this analysis, choose sound as the delivery medium for its apartment-block radio because of any limitation of 1933 technology. It had access to printing presses; the choice was deliberate. The acoustic delivery reached the citizen in a way the visual delivery did not.

Phonopolis carries the proposition into the player's body in the way that only an interactive medium can. The player, operating the protagonist through the city, encounters spaces where the dictator's voice is louder and spaces where it is softer. The puzzle of the game is, in part, the puzzle of navigation toward acoustic relief. The player's own auditory attention is being managed by the game's design in a small simulation of what the citizen of a sonically authoritarian state experiences continuously. The game does not have to lecture the player about what sonic authoritarianism feels like. The player, playing the game, is having a small version of the experience the historical record has been trying to describe.

The implication for the reader who has not played the game runs in the other direction. Once the frame is available - once the reader understands that authoritarian governance has always been at least as much sonic as visual - the acoustic environment of the reader's own life becomes legible in ways it was not before. The hold music in the corporate phone tree. The store muzak in the shopping mall. The unsolicited ambient music in restaurants and gyms. The constant background of devices producing sound the listener did not request. These are not, in any straightforward sense, authoritarian. They are, however, instances of someone else's organization of the listener's soundscape. The listener did not get to author what the room sounds like; the management did. The cumulative effect across daily adult life is that the contemporary developed-world citizen lives inside acoustic environments other people have designed for them, in degrees that would have been considered remarkable as recently as two generations ago.

This is not equivalent to North Korea's morning anthem. The differences in degree are enormous and matter. But the basic operation - the organization of someone else's soundscape according to interests that are not theirs - is structurally the same operation, just at much lower intensity and with much less coordination. The corporate phone tree is not the Pyongyang loudspeaker. The corporate phone tree is, on a careful look, an instance of the same broader category of sonic-environmental design, scaled down and decentralized but operating on the same cognitive vulnerability that the heavily centralized examples exploit at maximum.

There is one further note worth registering, about the cultural-historical accident that has made the sonic dimension of authoritarianism so hard to remember. The dominant media of contemporary cultural production - film, television, internet video - have visual primacy in their formal structure. The image is the lead element; the sound design is supporting. This is not how the participants in heavily sonic environments experience them. The participants in those environments would describe the sound as primary and the visual as supporting. The cultural production that documents those environments after the fact has structurally biased the surviving record toward the visual. The historical memory of authoritarianism is correspondingly visual, even when the participants' actual experience was not.

What Phonopolis does, at the small scale of one Czech indie game, is reverse the bias for the duration of the player's session. The puzzle of finding silence is the game. The sonic environment is the antagonist. The visual register is genuinely subordinated to the audible one in a way few games have attempted. The result is closer to the historical experience the dystopian tradition has been trying to render than the standard visually-led version manages to be.

The next time the reader encounters a dystopian story - in fiction, on the news, in a contemporary political analysis - the question worth asking is what the place actually sounds like. The visible apparatus will usually be the part the documentation has preserved. The sonic apparatus will usually be more important and less remembered. Attali's claim from 1977 - that the political organization of sound has been one of the consistent unspoken structures of power - has been borne out by the careful historical work of the decades since, and is doing more analytic work than the standard visual dystopian vocabulary has been letting on.

The Czech studio that made the small game about a city ruled by a voice has, in the small way that a puzzle game can do this, made the proposition audible in the only register that lets the proposition actually be heard.
