It might be useful to study the techniques that modems used to transmit data over phone lines. I seem to recall trellis coded modulation being used:
https://en.wikipedia.org/wiki/Trellis_coded_modulation
The acoustic channel is bound to suffer from multipath too, so some equalization may be needed:
https://en.wikipedia.org/wiki/Equalization_(communications) https://www.ti.com/lit/an/spra140/spra140.pdf
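As a toy illustration of what equalization does, here is a least-mean-squares sketch that learns to undo a made-up two-path echo channel (the channel taps, filter length, and step size are all invented for the example, not taken from the linked material):

```python
# Toy LMS equalizer: learn a 3-tap filter that undoes a known
# two-path "echo" channel. All parameters are made up for illustration.
import random

random.seed(0)
CHANNEL = [1.0, 0.5]  # direct path plus a half-strength delayed echo

def through_channel(x):
    """Convolve the symbol stream with the echo channel."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        for k, h in enumerate(CHANNEL):
            if n - k >= 0:
                y[n] += h * x[n - k]
    return y

def lms_train(rx, tx, taps=3, mu=0.05):
    """Adapt filter weights so that w * rx tracks the sent symbols."""
    w = [0.0] * taps
    for n in range(taps, len(rx)):
        window = rx[n - taps + 1:n + 1][::-1]  # newest sample first
        est = sum(wi * xi for wi, xi in zip(w, window))
        err = tx[n] - est
        w = [wi + mu * err * xi for wi, xi in zip(w, window)]
    return w

tx = [random.choice((-1.0, 1.0)) for _ in range(2000)]
w = lms_train(through_channel(tx), tx)
print([round(x, 2) for x in w])  # roughly the inverse of the echo channel
```

After training, the taps approximate the truncated inverse of the channel, so the echo's intersymbol interference is mostly cancelled.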
In order to receive the signal far from the transmitter, some form of spread spectrum encoding could be used, like CDMA. The spreading factor could be negotiated.
https://en.wikipedia.org/wiki/Direct-sequence_spread_spectru...
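To make the idea concrete, here is a toy direct-sequence sketch (the 8-chip code and the bit patterns are made up for illustration; a real system would use longer, carefully chosen codes):

```python
# Sketch of direct-sequence spreading with a fixed pseudo-noise chip
# sequence shared by sender and receiver.
PN = [1, -1, 1, 1, -1, 1, -1, -1]  # 8-chip pseudo-noise code

def spread(bits):
    """Map each data bit to +1/-1 and multiply it by the chip sequence."""
    out = []
    for b in bits:
        s = 1 if b else -1
        out.extend(s * c for c in PN)
    return out

def despread(chips):
    """Correlate each 8-chip block against PN; positive sum means a 1."""
    bits = []
    for i in range(0, len(chips), len(PN)):
        block = chips[i:i + len(PN)]
        corr = sum(x * c for x, c in zip(block, PN))
        bits.append(1 if corr > 0 else 0)
    return bits

tx = spread([1, 0, 1, 1])
# Even with a few chips flipped by noise, correlation recovers the bits.
rx = tx[:]
rx[0], rx[9], rx[20] = -rx[0], -rx[9], -rx[20]
print(despread(rx))  # [1, 0, 1, 1]
```

The spreading factor here is 8; a longer code buys more noise immunity at the cost of throughput, which is the parameter the comment suggests negotiating.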
Did somebody say Spectrum?
https://softspectrum48.weebly.com/notes/tape-loading-routine...
I always assumed that PWM was the go-to method for this kind of low bandwidth / high noise medium, I wonder why the author didn't go that route and used FM instead
> Tape data is encoded as two 855 T-state pulses for binary zero, and two 1,710 T-state pulses for binary one.
Is that not FM, more specifically FSK, just with some extra harmonics?
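Assuming the Spectrum's 3.5 MHz Z80 clock (where a T-state is one clock tick, a fact about the hardware rather than anything in the article), the two pulse widths do work out to two distinct tones, which is why it behaves like FSK:

```python
# Back-of-the-envelope check: convert the tape pulse widths from
# T-states into audio frequencies, assuming a 3.5 MHz Z80 clock.
CLOCK_HZ = 3_500_000

def pulse_tone_hz(t_states):
    """Two pulses (high then low) make one full square-wave cycle."""
    period_s = 2 * t_states / CLOCK_HZ
    return 1 / period_s

print(round(pulse_tone_hz(855)))   # ~2047 Hz for a zero
print(round(pulse_tone_hz(1710)))  # ~1023 Hz for a one
```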
Keep in mind I don't know much about waves, so all of this could be wrong, but I think PWM works by modulating the width of the pulse, in other words the "duration of the note", if you will, so the frequency of the square wave remains constant. You have a high pulse of width t to represent a zero and a high pulse of width 2t to represent a one. The human ear might hear this as a modulating frequency, but that's only because the pulses are changing faster than our brains can recognize pitch, if that makes any sense. I don't know what a T-state is, but I suspect the number is the duration of the pulse in microseconds or something of the sort.
I believe IR remotes work on a similar principle: a series of blinks of two different durations, which represent 0 and 1
Another step to look into if you really want to have fun is implementing some sort of QAM.
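As a rough illustration of what that would involve, here is a minimal 16-QAM mapper/demapper (the bit-to-level mapping is arbitrary for simplicity, not a standard Gray-coded constellation):

```python
# Minimal 16-QAM sketch: each group of 4 bits picks one of 16
# amplitude/phase points, represented here as complex numbers.
import itertools

LEVELS = [-3, -1, 1, 3]
# Map every 4-bit tuple to a constellation point (I + jQ).
CONSTELLATION = {
    bits: complex(LEVELS[bits[0] * 2 + bits[1]], LEVELS[bits[2] * 2 + bits[3]])
    for bits in itertools.product((0, 1), repeat=4)
}
DEMAP = {v: k for k, v in CONSTELLATION.items()}

def modulate(bits):
    """Group bits in fours and look up the matching symbol."""
    return [CONSTELLATION[tuple(bits[i:i + 4])] for i in range(0, len(bits), 4)]

def demodulate(symbols):
    """Snap each received symbol to the nearest constellation point."""
    out = []
    for s in symbols:
        nearest = min(DEMAP, key=lambda p: abs(p - s))
        out.extend(DEMAP[nearest])
    return out

syms = modulate([1, 0, 1, 1, 0, 0, 1, 0])
noisy = [s + complex(0.3, -0.2) for s in syms]  # mild channel noise
print(demodulate(noisy))  # [1, 0, 1, 1, 0, 0, 1, 0]
```

Each symbol carries 4 bits instead of 1, which is exactly the throughput win QAM offers, at the price of needing both amplitude and phase to survive the channel.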
IIRC, there was some commercial product - years ago - that worked by using ultrasonic data transfer.
It went something like this: You install some app on your phone, which then listens for incoming audio in the ultrasonic range. The audio is coded instructions, which then would do things like blink a light on your phone or whatever. The idea was that this could be used at events (sport, music, whatever) to create light shows on the mobiles, without relying on good wifi coverage or similar in the avenue. As you could use the PA for the data transmission.
We demonstrated this in 2003 using the smartphones of the era (they didn’t have good filters, so you could detect ultrasound quite easily).
See https://anil.recoil.org/papers/audio-networking.pdf sec 2.1 for the 2003 paper and some ancient videos at https://anil.recoil.org/projects/ubiqinteraction if you want some Nokia nostalgia :-)
I wrote a very similar web implementation 12 years ago as a proof of concept - if it got the go-ahead, the plan was to test it with some commercial TV broadcasting in the UK, where an ultrasonic short code could be sent to phones using an app with a Web view, or possibly a native app.
Sadly it never got picked up, although we proved the concept could work - but it certainly had its challenges.
https://github.com/tanepiper/adOn-soundlib
https://github.com/tanepiper/adon-ad-platform
Google explored doing this with devices that connect to each other, specifically Chromecast and their early Google Home devices. I don't think it ever launched but Google did some interesting experiments to test the ultrasonic transfer functions on millions of consumer devices. (I worked in audio research at Google at the time but not on this).
I believe the main problem is that it makes dogs go crazy
18-20 kHz is not really ultrasound. Many people can still hear this range, and it's very unpleasant when played (at least to me).
For comparison, medical imaging ultrasound is 2-20 MHz (that's MEGA hertz), I think.
Yeah, 18 kHz isn't really ultrasound, but 50 kHz definitely is and nobody's hearing that. You definitely don't need to go up into the MHz range and can't if the goal is to use existing audio equipment.
They are using the 18-20 kHz range though (https://halcy.de/blog/images/2025_06_27_fsk.png)
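For a sense of what keying data into that band could look like, here is a hypothetical sketch (the sample rate, tone spacing, and symbol length are all made-up parameters, not taken from the article):

```python
# Hypothetical near-ultrasound FSK: each 4-bit symbol picks one of 16
# tones packed into the 18-20 kHz band. All parameters are invented.
import math

RATE = 48_000         # sample rate that can still represent 20 kHz
BASE = 18_000         # bottom of the near-ultrasound band
STEP = 125            # 16 tones spaced 125 Hz apart: 18.000-19.875 kHz
SYMBOL_SAMPLES = 960  # 20 ms per symbol

def tone_for(symbol):
    """Render one symbol (0..15) as a short sine burst."""
    f = BASE + symbol * STEP
    return [math.sin(2 * math.pi * f * n / RATE) for n in range(SYMBOL_SAMPLES)]

samples = []
for sym in (0x5, 0xA, 0xF):
    samples.extend(tone_for(sym))
print(len(samples), "samples,", len(samples) / RATE, "seconds")
```

All 16 tones stay below the 24 kHz Nyquist limit of 48 kHz playback, which is the constraint that pins schemes like this to the top of the audible band.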
Would the stock equipment be able to produce MHz? Consumer gear is targeted specifically at making sounds humans can hear. Since 18-20 kHz is within what 44.1-48 kHz audio tends to play back, it makes sense why it would be near-ultrasound frequencies. The headline specifically mentions without any special equipment.
Theoretically, yes! With 192kbps sample rates you can probably render up to 96 kHz, but the limitation is the speaker and amp components all the way from your DAC to the main amp to the speaker.
bitrate isn't really what we're concerned with here, though, is it? If you use a 96 kHz sampling rate, you can reproduce 48 kHz frequencies according to Nyquist. The bitrate would then be sampling rate * bit depth * number of channels. Your 192kbps would potentially be 96 kHz * 2 channels, but you've left out the bit depth, or you're using a 1-bit sample???
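Worked through with that formula (plain uncompressed PCM, no codec):

```python
# The formula above, made concrete: uncompressed PCM bitrate is
# sample_rate * bit_depth * channels, and the highest representable
# frequency is sample_rate / 2 (Nyquist).
def pcm_bitrate(sample_rate_hz, bit_depth, channels):
    return sample_rate_hz * bit_depth * channels

def nyquist_hz(sample_rate_hz):
    return sample_rate_hz / 2

# 96 kHz / 16-bit / stereo is nowhere near "192 kbps".
print(pcm_bitrate(96_000, 16, 2))  # 3072000 bits/s, i.e. ~3 Mbps
print(nyquist_hz(96_000))          # 48000.0 Hz of usable bandwidth
```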
The web tool is fun! I decided to give it a harder test and turned up some fans and background music up pretty loud and it still managed to decode the message.
Though it stops recognizing it after a certain point, I was still surprised how well it did with noise. I'm sure noise in higher frequencies (or the right harmonics) would be much harder to handle, but solvable in interesting ways too.
Chromecast has used ultrasound in lieu of a pairing code for a while now.
Webex on a laptop can know which conference room it is in by ultrasound, which allows quick screen sharing to the screen/meeting.
Speaking for all the dogs and other pets with hearing abilities above 20kHz ... please don't do this.
So that’s how Bobby Tables got his start
Isn't the "Illustrated for zoomers" version of the frequency domain wrong? I'm pretty sure the bars over the timeline show volume over time, not intensity over frequency. So the middle bar doesn't represent a specific frequency but a specific time interval in the song.
See also: https://github.com/ggerganov/ggwave
GGWave is a really great tool and does support audible and inaudible versions.
We are using it in XRWorkout to automatically sync up in-game recordings with external recordings. We use the audible version instead of the ultrasound version so a human can sync it up too if they are using a regular video editor, instead of doing it automatically in our own tools.
Here is an example how that sounds https://xrworkout.nyc3.digitaloceanspaces.com/data/video/036...
This seems like such a fun way to work out but not with a VR helmet on.
What is it about websites that can't be zoomed? Do the authors not realise their choice of fonts won't always fit on mobile phone screens?
IIRC mobile Firefox ignores the "can't be zoomed" setting, which I find the right thing to do.