The inspiration for this project came from this tweet.
AirPods will now automatically switch between devices by recognizing if you put down your phone and pick up your iPad. Or if you’re on your iPad and you get a phone call, it’ll switch back to the phone. That’s the type of experience only a complete ecosystem can create.
— Marques Brownlee (@MKBHD) June 22, 2020
A core tenet of the Apple camp is that their complete ecosystem makes every product "just work". This is a truism: Apple isn't plagued with fragmentation issues like other platforms. But replicating these functionalities on other platforms isn't rocket science.
Fun fact: There are 1,300 brands with over 24,000 distinct Android devices (source).
PCAS (Peripheral Connection Augmentation System) artificially augments a Bluetooth peripheral's maximum number of concurrent connections. Based on user-initiated events and hardware configurations, PCAS automatically connects or disconnects a profile on a peripheral. PCAS can also multiplex to a single sink: for example, on Android, this brings the theoretical maximum number of audio connections to 30 (the maximum allowed AudioTrack instances).
A single user with multiple hosts no longer has to manually connect/disconnect each peripheral. PCAS does this automatically. This works even on cheap peripherals that don't support multiple concurrent connections natively. Example scenarios:
...
pcas-libs: This is a KMP Gradle project. The clients use the artifacts generated from this project. Run the publish_local.sh script to publish the library artifacts to your Maven local repository.
pcas-clients/pcas-android-client: This is an Android Gradle project.
You can find prebuilt binaries in the repo releases section.
PCAS is designed to be simple and "just work". The only initial setup required is selecting a peripheral for each service of interest.
All hosts must be on the same LAN. E2E encryption is provided but optional (disabled by default). To enable E2E encryption, a key needs to be created and shared with the relevant hosts.
Building PCAS for all platforms is hard, but the underlying concept is quite simple:
This base layer provides a framework for fast, best-effort, zero-configuration, & E2E encrypted communication among devices within proximity.
This allows a host to efficiently and securely send messages to, and receive messages from, other nearby devices without any upfront configuration.
The "unreliable" prefix is misleading, just like when people say UDP is unreliable. This layer is just as reliable as the network stack below it; no additional reliability guarantees are provided.
The transport data unit is a parcel. Parcels are just opaque byte buffers.
This layer is made of three core components:
This channel uses IP multicast to efficiently deliver parcels to multiple devices.
```kotlin
internal interface MulticastChannel {

    @Throws(Exception::class)
    fun init(receiver: MessageReceiver)

    @Throws(Exception::class)
    fun send(parcel: ByteArray, size: Int)

    fun close()
}
```
All hosts can send a parcel to the PCAS multicast group and can join the group to receive parcels. The local IP addresses of discrete hosts are not required. The multicast configs can be found in the TransportConfig class:
```kotlin
internal object TransportConfig {

    // Max possible TTL value: Parcels could potentially leak to the internet.
    const val MULTICAST_TTL = 255
    const val MULTICAST_PORT = 49137
    val MULTICAST_ADDRESS = Address.Ipv4("225.139.089.176")

    const val OFFSET_ZERO = 0

    const val MAX_PARCEL_SIZE_BYTES = 24 * 1024 // 24KB
    const val PARCEL_POOL_CAPACITY = 24
}
```
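To make the multicast channel a bit more concrete, here is a rough JVM-only sketch built on `java.net.MulticastSocket`. It is not the PCAS implementation: the function names (`buildParcelPacket`, `sendParcel`, `receiveOneParcel`) are made up for illustration, the constants simply mirror the config values above, and I write the group address without the leading zero in the third octet to avoid any ambiguity when parsing it.

```kotlin
import java.net.DatagramPacket
import java.net.InetAddress
import java.net.InetSocketAddress
import java.net.MulticastSocket

// Illustrative constants mirroring the multicast config (names are assumptions).
const val GROUP_ADDRESS = "225.139.89.176"
const val GROUP_PORT = 49137
const val MULTICAST_TTL = 255
const val MAX_PARCEL_SIZE_BYTES = 24 * 1024

// Wraps the first `size` bytes of a parcel in a datagram addressed to the group.
fun buildParcelPacket(parcel: ByteArray, size: Int): DatagramPacket {
    require(size in 0..parcel.size) { "size must fit inside the parcel buffer" }
    return DatagramPacket(parcel, 0, size, InetAddress.getByName(GROUP_ADDRESS), GROUP_PORT)
}

// Sends one parcel to every host that has joined the group.
fun sendParcel(parcel: ByteArray, size: Int) {
    MulticastSocket().use { socket ->
        socket.timeToLive = MULTICAST_TTL
        socket.send(buildParcelPacket(parcel, size))
    }
}

// Joins the group and blocks until a single parcel arrives.
fun receiveOneParcel(): ByteArray =
    MulticastSocket(GROUP_PORT).use { socket ->
        val group = InetSocketAddress(InetAddress.getByName(GROUP_ADDRESS), GROUP_PORT)
        // A null interface defers to the socket's default network interface.
        socket.joinGroup(group, null)
        val buffer = ByteArray(MAX_PARCEL_SIZE_BYTES)
        val packet = DatagramPacket(buffer, buffer.size)
        socket.receive(packet)
        buffer.copyOf(packet.length)
    }
```

Note that the sender never needs to know who is listening: it only addresses the group, which is what makes the zero-configuration part possible.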
This channel uses IP unicast to offer high-bandwidth point-to-point communication. It is used for data streaming because multicast has a lower data transfer rate.
```kotlin
internal interface UnicastChannel {

    @Throws(Exception::class)
    fun init(receiver: MessageReceiver)

    @Throws(Exception::class)
    fun send(recipient: HostInfo, parcel: ByteArray, size: Int)

    @Throws(Exception::class)
    fun getPort(): Port

    fun close()
}
```
This component adds E2E encryption to the two channels above. Encryption is optional and is only activated when an encryption key is generated or shared.
Data is encrypted with AES in GCM mode with no padding. A random initialization vector is used once per message and prepended to the head of a parcel. I won't go into details here as there is lots of good material on AES encryption on the internet, like this one.
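As a minimal sketch of the scheme described above (AES/GCM/NoPadding with a fresh random IV prepended to the parcel), something like the following works on the JVM with the standard `javax.crypto` APIs. The function names, the 12-byte IV, the 128-bit tag, and the 128-bit key are my assumptions for the example, not necessarily what PCAS uses.

```kotlin
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.KeyGenerator
import javax.crypto.SecretKey
import javax.crypto.spec.GCMParameterSpec

const val IV_SIZE_BYTES = 12   // assumed: the usual GCM nonce size
const val TAG_SIZE_BITS = 128  // assumed: full-length authentication tag

// Generates a fresh AES key (key size is an assumption for this sketch).
fun generateKey(): SecretKey =
    KeyGenerator.getInstance("AES").apply { init(128) }.generateKey()

// Returns IV || ciphertext || tag, using a new random IV for every message.
fun encryptParcel(key: SecretKey, plain: ByteArray): ByteArray {
    val iv = ByteArray(IV_SIZE_BYTES).also { SecureRandom().nextBytes(it) }
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, key, GCMParameterSpec(TAG_SIZE_BITS, iv))
    return iv + cipher.doFinal(plain)
}

// Reads the IV from the head of the parcel, then decrypts and authenticates the rest.
fun decryptParcel(key: SecretKey, parcel: ByteArray): ByteArray {
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    val spec = GCMParameterSpec(TAG_SIZE_BITS, parcel, 0, IV_SIZE_BYTES)
    cipher.init(Cipher.DECRYPT_MODE, key, spec)
    return cipher.doFinal(parcel, IV_SIZE_BYTES, parcel.size - IV_SIZE_BYTES)
}
```

Because GCM is authenticated, `decryptParcel` throws if a parcel was tampered with or encrypted with a different key, so a host can simply drop parcels it cannot decrypt.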
An interface for each of the three components above is defined in the common module and implemented natively on each platform.
This layer is broadly split into two:
A ledger is just a simple in-memory local database made up of blocks. Blocks represent the current state of the active hosts in a network. Each block is uniquely identified by a 4-tuple (service, profile, owner, peripheral).
```kotlin
data class Ledger(
    val self: HostInfo,
    val blocks: Set<Block> = emptySet()
)
```
```kotlin
data class Block(
    val service: Service,
    val profile: Service.Profile,
    val peripheral: Peripheral,
    val priority: Int,
    val timestamp: Long,
    val bondSteadyState: PeripheralBond.State,
    val owner: HostInfo,
    val canStreamData: Boolean,
    val canHandleDataStream: Boolean
)
```
"immutable" isn't technically correct. Blocks can be overwritten ONLY by their owner. Any host can prune their ledgers to remove blocks from inactive hosts.
A resilient multicast protocol is built on top of the transport layer. There are three types of messages:
Each host maintains its local ledger. The network protocol guarantees that eventually, all ledgers will be consistent.
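Here is a rough sketch of how a host might maintain its local ledger under the two rules stated earlier: a block can only be overwritten by its owner, and any host may prune blocks from inactive hosts. The types are stripped-down stand-ins (plain strings instead of the real service, profile, and peripheral types), and `upsert`, `prune`, and `BlockKey` are hypothetical names.

```kotlin
// Simplified stand-ins for the real types (assumptions for this sketch).
data class HostInfo(val id: String)

// The 4-tuple that uniquely identifies a block.
data class BlockKey(val service: String, val profile: String, val owner: HostInfo, val peripheral: String)

data class Block(
    val service: String,
    val profile: String,
    val peripheral: String,
    val owner: HostInfo,
    val priority: Int,
    val timestamp: Long
) {
    val key: BlockKey get() = BlockKey(service, profile, owner, peripheral)
}

data class Ledger(val self: HostInfo, val blocks: Set<Block> = emptySet()) {

    // Inserts or overwrites a block, but only when the sender is the block's owner.
    fun upsert(sender: HostInfo, block: Block): Ledger {
        if (block.owner != sender) return this
        return copy(blocks = blocks.filterNot { it.key == block.key }.toSet() + block)
    }

    // Drops blocks owned by hosts that are no longer considered active.
    fun prune(activeHosts: Set<HostInfo>): Ledger =
        copy(blocks = blocks.filter { it.owner == self || it.owner in activeHosts }.toSet())
}
```

Keeping the ledger as an immutable value that is replaced on every change makes it easy to trigger follow-up work each time the ledger changes.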
Currently, only Update messages are classified as essential.
Heartbeat messages are used as a form of NACK. A host detects synchronization issues from heartbeats and resends its current blocks.
While heartbeats are effective, the interval is too long to be relied on primarily for a highly interactive system like PCAS.
Reliable multicasting is an interesting problem. I explored two strategies:
Blindly resend essential messages x times with a delay of y + random jitter on each attempt.
Let's consider a simple model: if the probability of successfully delivering a message is fixed at 0.50 and each attempt is independent, there is a 0.97 probability that at least one message gets delivered in 5 attempts.
Napkin math
Let b = 1 on success; 0 on failure
Pr(b = 1) = 0.50
Pr(b = 0) = 1 - Pr(b = 1) = 0.50
(b is a Bernoulli random variable)
n = 5
z = n tries of b
(z is a Binomial random variable)
$$\Pr(1 \le z \le n) = \sum_{i=1}^{n} \binom{n}{i} \Pr(b = 1)^{i} \, \Pr(b = 0)^{n-i}$$
Could also be calculated as 1 - Pr(z = 0)
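Plugging the numbers into the shortcut form, as a quick check of the 0.97 figure (z = 0 means all n attempts failed):

$$\Pr(1 \le z \le n) = 1 - \Pr(z = 0) = 1 - \Pr(b = 0)^{n} = 1 - 0.5^{5} = 1 - 0.03125 = 0.96875 \approx 0.97$$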
In practice, delivery probabilities aren't fixed and attempts are not independent. Despite the shortcomings of this strategy, it has the following benefits:
c / second.
Essential messages have a monotonically increasing sequence number. The initial sequence number is 0. All hosts are expected to send an Ack message with the sequence number of the essential message. Retries are done with a truncated exponential backoff with jitter.
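For reference, a truncated exponential backoff with jitter can be sketched as below. This is the generic "full jitter" variant, not necessarily the exact one PCAS used; `backoffDelayMs`, the 100 ms base, and the 3,200 ms cap are all assumptions picked for the example.

```kotlin
import kotlin.math.min
import kotlin.random.Random

// Returns how long to wait before retry number `attempt` (0-based).
fun backoffDelayMs(
    attempt: Int,
    baseMs: Long = 100,
    capMs: Long = 3_200,
    random: Random = Random.Default
): Long {
    require(attempt >= 0 && baseMs > 0 && capMs > 0)
    // Clamp the exponent so the shift below can never overflow.
    val exponent = min(attempt, 20)
    // Exponential growth (base * 2^attempt), truncated at the cap.
    val ceilingMs = min(capMs, baseMs shl exponent)
    // Full jitter: pick uniformly in [0, ceiling] so hosts don't retry in lockstep.
    return random.nextLong(ceilingMs + 1)
}
```

The jitter matters more than the exact growth rate here: without it, every host that missed the same Ack would retry at the same instant.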
This strategy only sends fewer messages (more efficient) than the redundant strategy when the number of peers in a network is less than x. Some issues with this strategy:
TCP-style ACKs aren't scalable and run the risk of an ACK implosion.
I initially went with ACKs but will be using the simple redundant strategy instead.
The network protocol data unit is a Message. Messages are marshaled to the Protobuf format and passed to the transport layer. The inverse happens when a message is received.
When two or more hosts use the same 3-tuple (service, profile, peripheral), a contention occurs. Even if a host doesn't actively require a profile, it can still contend with another host for that profile.
This is the meat or vegetable (for my vegetarian friends) of the system.
A Bluetooth profile is a specification regarding an aspect of Bluetooth-based wireless communication between devices. It resides on top of the Bluetooth Core Specification and (optionally) additional protocols - source
Recap: With A2DP you can only listen but with higher audio quality. With headset profiles, you can talk and listen, but at a lower quality. Next time you are playing a song while a call comes in, observe how the audio quality drops.
Each time a change is made to the ledger, a resolver looks at all the current contentions and resolves them.
Resolutions are derived using a rank associated with each block.
```kotlin
val isConnected = bondSteadyState == PeripheralBond.State.CONNECTED

val hasPriority = priority != NO_PRIORITY

val maxPossibleConnectionAndInteractiveScore = 4 + 2
// Any device with a higher priority should always rank higher.
val priorityScore = (maxPossibleConnectionAndInteractiveScore + 1.0).pow(priority)

private val connectionScore: Int get() {
    // Connection should contribute more if we can't stream
    val trueValue = if(canStreamData) 2 else 4
    return if(isConnected) trueValue else 1
}

val interactiveScore = if(owner.isInteractive) 2 else 1

val timestampScore = log10(timestamp.toDouble())

val rank = priorityScore + connectionScore + interactiveScore + timestampScore
```
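To get a feel for the numbers, here is the same formula pulled out into a standalone function (the `rank` function and its flat parameter list are just for illustration). Since the priority score is 7 raised to the priority, a one-step difference in priority outweighs the combined connection and interactive scores, which can differ by at most 4.

```kotlin
import kotlin.math.log10
import kotlin.math.pow

// Standalone restatement of the rank formula, for experimentation only.
fun rank(
    priority: Int,
    isConnected: Boolean,
    canStreamData: Boolean,
    isInteractive: Boolean,
    timestamp: Long
): Double {
    val maxPossibleConnectionAndInteractiveScore = 4 + 2
    val priorityScore = (maxPossibleConnectionAndInteractiveScore + 1.0).pow(priority)
    val connectionScore = if (isConnected) (if (canStreamData) 2 else 4) else 1
    val interactiveScore = if (isInteractive) 2 else 1
    val timestampScore = log10(timestamp.toDouble())
    return priorityScore + connectionScore + interactiveScore + timestampScore
}

// Example inputs (assumed values): a phone with an incoming call vs. a tablet playing music.
fun phoneOutranksTablet(now: Long): Boolean {
    // Priority 6: disconnected and not interactive, yet 7^6 = 117,649 dominates.
    val phone = rank(6, isConnected = false, canStreamData = false, isInteractive = false, timestamp = now)
    // Priority 2: connected and interactive, but 7^2 + 4 + 2 is only 55 before the timestamp term.
    val tablet = rank(2, isConnected = true, canStreamData = false, isInteractive = true, timestamp = now)
    return phone > tablet
}
```

So the phone takes over the peripheral for the call even though the tablet is the one currently connected and in use.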
The rank is an estimate of the current importance of a block. Based on ranks, a contention object is created for each block a host has.
```kotlin
data class Contention(
    val selfBlock: Block,
    // This is another block with the same service, peripheral, and profile but a different owner that is deemed the apex based on its rank
    val peersApexBlock: Block?
)
```
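A contention object could be derived from the ledger roughly like this: for each of my own blocks, look at the peer blocks that share the same (service, profile, peripheral) and keep the highest-ranked one. The types are simplified and `buildContentions` is a hypothetical helper, so treat it as a sketch of the idea rather than the actual code.

```kotlin
// Simplified stand-ins; the rank is stored directly instead of being computed.
data class HostInfo(val id: String)

data class Block(
    val service: String,
    val profile: String,
    val peripheral: String,
    val owner: HostInfo,
    val rank: Double
)

data class Contention(val selfBlock: Block, val peersApexBlock: Block?)

// Builds one contention per block owned by `self`.
fun buildContentions(self: HostInfo, blocks: Set<Block>): List<Contention> =
    blocks.filter { it.owner == self }.map { mine ->
        val apex = blocks
            .filter { peer ->
                peer.owner != self &&
                    peer.service == mine.service &&
                    peer.profile == mine.profile &&
                    peer.peripheral == mine.peripheral
            }
            .maxByOrNull { it.rank } // null when nobody else wants this profile
        Contention(mine, apex)
    }
```

Only the apex matters: if I outrank the strongest contender, I outrank everyone else too.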
The contention object is then used to derive a resolution.
```kotlin
fun getResolution(contention: Contention): Resolution {
    return when {
        // No contenders found yet.
        contention.peersApexBlock == null -> {
            Resolution.Connect(contention.selfBlock, contention.selfBlock.rank)
        }

        // I have a higher rank: So connect to profile
        contention.selfBlock.rank > contention.peersApexBlock.rank -> {
            Resolution.Connect(contention.selfBlock, contention.selfBlock.rank)
        }

        // I have a lower rank: So disconnect from profile - If connected
        contention.selfBlock.rank < contention.peersApexBlock.rank -> {
            val rank = contention.peersApexBlock.rank
            // If possible stream data to the apex host
            if(contention.shouldStreamToApex()) {
                Resolution.Stream(contention.selfBlock, rank, contention.peersApexBlock.owner)
            } else {
                Resolution.Disconnect(contention.selfBlock, rank)
            }
        }

        // Nothing decided: Keep the system as-is.
        contention.selfBlock.rank == contention.peersApexBlock.rank -> {
            Resolution.Ambiguous(contention.selfBlock, contention.selfBlock.rank)
        }

        else -> throw IllegalStateException("Impossible!")
    }
}
```
Currently, only audio services are supported. Provision has been made to easily add other types of services.
All services have two key components:
Blocks are built from host state information. The relevant audio states are the current audio usages and the current peripheral bond state.
```kotlin
data class AudioProperty(val usages: Set<Usage>) {

    enum class Usage(
        val priority: Int,
        val profile: PeripheralProfile
    ) {
        UNKNOWN(1, PeripheralProfile.A2DP),
        // Unknown media playback. It could be music, movie soundtracks, etc.
        MEDIA_UNKNOWN(2, PeripheralProfile.A2DP),
        // Music playback, eg: Music streaming, local audio playback, etc.
        MUSIC(2, PeripheralProfile.A2DP),
        // Speech playback, eg: Podcasts, Audiobooks, etc
        SPEECH(2, PeripheralProfile.A2DP),
        // Soundtrack, typically accompanying a movie or TV program.
        MOVIE(4, PeripheralProfile.A2DP),
        // Game audio playback
        GAME(4, PeripheralProfile.A2DP),
        // Such as VoIP.
        VOICE_COMMUNICATION(5, PeripheralProfile.HEADSET),
        // Telephony call
        TELEPHONY_CALL(6, PeripheralProfile.HEADSET)
    }
}
```
```kotlin
data class PeripheralBond(
    val profile: PeripheralProfile,
    val hotState: State
) {

    enum class State {
        CONNECTED,
        CONNECTING,
        DISCONNECTED,
        DISCONNECTING;

        fun getSteadyState(): State {
            return when(this) {
                CONNECTED -> CONNECTED
                CONNECTING -> DISCONNECTED
                DISCONNECTED -> DISCONNECTED
                DISCONNECTING -> CONNECTED
            }
        }
    }
}
```
Each time a host state changes, a new block is created. The ledger layer listens for these changes and automatically updates the local ledger and sends the blocks to remote peers.
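One plausible way to turn the current audio usages into a block priority for a given profile is to take the highest priority among the usages that map to that profile. To be clear, this is my guess at the shape of it, not the actual block builder: `priorityFor` is a hypothetical helper, the value of `NO_PRIORITY` is assumed, and the enum is trimmed to a few entries.

```kotlin
enum class PeripheralProfile { A2DP, HEADSET }

// Trimmed copy of the usage enum, enough for the example.
enum class Usage(val priority: Int, val profile: PeripheralProfile) {
    MUSIC(2, PeripheralProfile.A2DP),
    GAME(4, PeripheralProfile.A2DP),
    VOICE_COMMUNICATION(5, PeripheralProfile.HEADSET),
    TELEPHONY_CALL(6, PeripheralProfile.HEADSET)
}

// Assumed sentinel for "this host doesn't currently need the profile".
const val NO_PRIORITY = 0

// Highest priority among the active usages that need the given profile.
fun priorityFor(profile: PeripheralProfile, usages: Set<Usage>): Int =
    usages.filter { it.profile == profile }.maxOfOrNull { it.priority } ?: NO_PRIORITY
```

With a rule like this, a host that is only playing music would still publish a headset block, just with no priority, which is why it can contend for a profile it doesn't actively need.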
ii. Resolution Handler
Each service gets to handle all resolutions from the resource allocation layer. For audio, this is actually where we connect or disconnect the audio profiles on a peripheral. A service can also choose to support streaming, in which case it will also handle that resolution here.
Multicast has issues: It requires all devices to be on the same network and it's blocked by some routers. PCAS was designed to be used in a "home network" where these issues are usually nonexistent.
I explored 2 other possible technologies:
Wi-Fi Aware and Wi-Fi Direct were not considered due to power consumption concerns. Google Nearby service was considered but quickly eliminated due to some unacceptable limitations.
This is easy using a service like FCM (it would be similar to Google Nearby Messaging API without the proximity part).
FCM and other push messaging services work using long-lived TCP connections. The device keeps a connection to a Google server open and waits for messages to be pushed over it.
BLE devices can broadcast advertisement packets unidirectionally. I will do a quick overview of BLE advertisement. You can read the Bluetooth Core Specification v4.0 for more, or scroll to the pros & cons section to understand why it wasn't picked.
Physical Layer
BLE uses the same 2.4 GHz ISM band as Classic Bluetooth and Wi-Fi.
It operates in the same spectrum range (2.400–2.4835 GHz) as Classic Bluetooth but has 40 channels of 2 MHz each, as opposed to the classic 79 channels of 1 MHz each.
Data is transmitted within a channel using Frequency Shift Keying.
The data rate is 1 Mbps (supporting 2 Mbps on Bluetooth 5.0).
Advertisement Packet
Advertising & Interference
BLE is robust, using frequency hopping to work around interference.
BLE uses 3 dedicated channels for advertising: 37, 38, 39 (channels are zero-indexed). As can be seen in the image below, these channels are spread across the 2.4GHz band to minimize interference problems.
A relevant study: Coexistence and Interference Tests on a Bluetooth Low Energy Front-End.
In a nutshell: A peripheral device broadcasts advertisement packets on at least 1 of the 3 channels, with a repetition period called the advertisement interval.
A scanning central device listens on these channels to detect advertisement packets.
Advertisement Interval & Scanning
Page 2223 of the Bluetooth Core Specification v4.0 explains advertisement intervals perfectly.
A scanning device listens on the advertisement channels for a duration called the scan window, which is repeated every scan interval.
Discovery latency
We can use three parameters (advertising interval, scan interval, and scan window) to build a probabilistic model for discovery latency. Any such model will be flawed in practice unless it accounts for the environment in which the devices will likely be used.
Shorter intervals and a longer scan window lead to faster discovery times while consuming more power.
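As a deliberately naive illustration of such a model: assume every advertising event is heard if and only if it happens to fall inside a scan window, that events are independent, and that there is no interference or packet loss. Then each event is detected with probability scan window / scan interval, and the number of events until discovery is geometric. All names below are made up, and real-world latencies will differ for exactly the reasons mentioned above.

```kotlin
import kotlin.math.floor
import kotlin.math.pow

data class ScanParams(
    val advIntervalMs: Double,
    val scanIntervalMs: Double,
    val scanWindowMs: Double
)

// Fraction of time the scanner is actually listening.
fun scanDutyCycle(p: ScanParams): Double {
    require(p.scanWindowMs > 0 && p.scanWindowMs <= p.scanIntervalMs)
    return p.scanWindowMs / p.scanIntervalMs
}

// Toy model: probability that at least one advertising event is heard within `withinMs`.
fun discoveryProbability(p: ScanParams, withinMs: Double): Double {
    val events = floor(withinMs / p.advIntervalMs).toInt()
    return 1.0 - (1.0 - scanDutyCycle(p)).pow(events)
}

// Toy model: expected latency is the advertising interval times the expected number of events.
fun expectedDiscoveryLatencyMs(p: ScanParams): Double =
    p.advIntervalMs / scanDutyCycle(p)
```

Even this crude version shows the trade-off: halving the advertising interval or doubling the scan window halves the expected latency, and both cost power.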
Power consumption
Ignore the general belief that advertisements are power-hungry. BLE advertisement is power efficient.
Have a look at this Android power consumption test.
A study by beacon software company Aislelabs reported that peripherals such as proximity beacons usually function for 1–2 years powered by a 1,000 mAh coin cell battery.
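To put that figure in perspective, here is the average current draw it implies, assuming the full 1,000 mAh is usable and ignoring self-discharge:

$$I_{\text{1 year}} = \frac{1000\ \text{mAh}}{365 \times 24\ \text{h}} \approx 0.114\ \text{mA}, \qquad I_{\text{2 years}} = \frac{1000\ \text{mAh}}{2 \times 365 \times 24\ \text{h}} \approx 0.057\ \text{mA}$$

That is roughly 57 to 114 µA on average.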