USB Device Stacks, on RTFM, part 2
Previously we talked about all the different kinds of descriptors which USB devices use to communicate their capabilities. This is important stuff because to write any useful USB device firmware we need to be able to determine how to populate our descriptors. However, having that data on the device is entirely worthless without an understanding of how it gets from the device to the host so that it can be acted upon. To understand that, let's look at the USB wire protocol.
Note, I'll again be talking mostly about USB2.0 low- and full-speed. I believe that high speed is approximately the same but with faster wires, except not quite that simple.
Down to the wire
I don't intend to talk about the actual electrical signalling, though it's not unreasonable for you to know that USB is a pair of wires forming a differentially signalled, bidirectional serial communications link. The host is responsible for managing all the framing and timing on the link, and for formatting the communications into packets.
There are a number of packet types which can appear on the USB link:
| Packet type | Purpose |
|---|---|
| Token Packet | When the host wishes to send a message to the Control endpoint to configure the device, read data IN, or write data OUT, it uses this to start the transaction. |
| Data(0/1) Packet | Following a Setup, In, or Out token, a Data packet is a transfer of data (in either direction). The 0 and 1 alternate to provide a measure of confidence against lost packets. |
| Handshake Packet | Following a data packet of some kind, the other end may ACK the packet (all was well), NAK the packet (report that the device cannot, temporarily, send/receive data, or that an interrupt endpoint isn't triggered), or STALL, which reports that the endpoint is halted or the request is unsupported, in which case the host needs to intervene. |
| Start of Frame | Every 1ms (full-speed) the host will send a SOF packet which carries a frame number. This can be used to help keep time on very simple devices. It also divides the bus into frames within which bandwidth is allocated. |
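To make the table a little more concrete, here's a rough Rust sketch of telling these packet types apart by their packet identifier (PID). The PID values are the ones defined in the USB 2.0 specification, where the low nibble of the PID byte carries the identifier and the high nibble carries its bitwise complement as a check. The names `Pid`, `PacketClass`, `from_byte` and `class` are mine, purely for illustration, and on real hardware this decoding is normally done for us in silicon.

```rust
/// Packet identifiers for the packet types in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pid {
    Out,
    In,
    Sof,
    Setup,
    Data0,
    Data1,
    Ack,
    Nak,
    Stall,
}

/// The four rows of the table above (illustrative grouping).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PacketClass {
    Token,
    Data,
    Handshake,
    StartOfFrame,
}

impl Pid {
    /// Decode a PID byte. Returns `None` if the check nibble does not
    /// match, or if the PID is not one of the nine handled here.
    pub fn from_byte(byte: u8) -> Option<Pid> {
        let pid = byte & 0x0F;
        let check = (byte >> 4) & 0x0F;
        if pid != (!check & 0x0F) {
            return None;
        }
        match pid {
            0x1 => Some(Pid::Out),
            0x9 => Some(Pid::In),
            0x5 => Some(Pid::Sof),
            0xD => Some(Pid::Setup),
            0x3 => Some(Pid::Data0),
            0xB => Some(Pid::Data1),
            0x2 => Some(Pid::Ack),
            0xA => Some(Pid::Nak),
            0xE => Some(Pid::Stall),
            _ => None,
        }
    }

    /// Which row of the table this PID belongs to.
    pub fn class(self) -> PacketClass {
        match self {
            Pid::Out | Pid::In | Pid::Setup => PacketClass::Token,
            Pid::Data0 | Pid::Data1 => PacketClass::Data,
            Pid::Ack | Pid::Nak | Pid::Stall => PacketClass::Handshake,
            Pid::Sof => PacketClass::StartOfFrame,
        }
    }
}
```

For instance, `Pid::from_byte(0x2D)` yields a Setup token, while a byte whose two nibbles are not complements of each other is rejected.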
As an example, when the host wishes to perform a control transfer, the following packets are transacted in turn:
- Setup Token - The host addresses the device and endpoint (OUT0)
- Data0 Packet - The host transmits a GET_DESCRIPTOR for the device descriptor
- Ack Packet - The device acknowledges receipt of the request
This marks the end of the first transaction. The device decodes the
GET_DESCRIPTOR request and prepares the device descriptor for transmission.
The transmission occurs as the next transactions on the bus. In this example,
we're assuming a 16 byte maximum packet size, for illustrative purposes, and
the device descriptor is 18 bytes long.
- In Token - The host addresses the device and endpoint (IN0)
- Data1 Packet - The device transmits the first 16 bytes of the descriptor
- Ack Packet - The host acknowledges the data packet
- In Token - The host addresses the device and endpoint (IN0)
- Data0 Packet - The device transmits the remaining 2 bytes of the descriptor; a packet shorter than the maximum size marks the end of the data
- Ack Packet - The host acknowledges the data packet
The second transaction is now complete, and the host has all the data it needs to proceed. Finally a status transaction occurs in which:
- Out Token - The host addresses the device and endpoint (OUT0)
- Data1 Packet - The host transmits a 0 byte data packet to indicate successful completion
- Ack Packet - The device acknowledges the completion, indicating its own satisfaction
And thus ends the full control transaction in which the host retrieves the device descriptor.
From a high level, we need only consider the activity which occurs at the point of the acknowledgement packets. In the above example:
- On the first ACK the device prepares IN0 to transmit the descriptor, readying whatever low level device stack there is with a pointer to the descriptor and its length in bytes.
- On the second ACK the low levels are still thinking.
- On the third ACK the transmission from IN0 is complete and the endpoint no longer expects to transfer data.
- On the fourth ACK the control transaction is entirely complete.
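The "pointer to the descriptor and its length" idea can be sketched in Rust roughly as follows. This is a minimal sketch, not how any particular hardware or crate does it: `InTransfer`, `Toggle` and `next_packet` are names I've made up for illustration. It splits a buffer into packets no longer than the endpoint's maximum packet size and alternates the Data1/Data0 toggle, starting from Data1 because the Setup stage used Data0. It deliberately ignores the case where a zero-length packet is needed to finish a transfer whose length is an exact multiple of the packet size.

```rust
/// Which data packet type carries the next chunk.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Toggle {
    Data0,
    Data1,
}

/// Hypothetical bookkeeping for the data stage of an IN control transfer.
pub struct InTransfer<'a> {
    remaining: &'a [u8],
    max_packet: usize,
    toggle: Toggle,
}

impl<'a> InTransfer<'a> {
    /// Start a transfer of `data`, sent in packets of at most `max_packet` bytes.
    pub fn new(data: &'a [u8], max_packet: usize) -> Self {
        assert!(max_packet > 0, "maximum packet size must be non-zero");
        InTransfer {
            remaining: data,
            max_packet,
            // The Setup stage always uses Data0, so the data stage starts on Data1.
            toggle: Toggle::Data1,
        }
    }

    /// Called when the endpoint is ready for more data (for example after
    /// an ACK). Returns the toggle and bytes of the next packet, or `None`
    /// once everything has been handed over.
    pub fn next_packet(&mut self) -> Option<(Toggle, &'a [u8])> {
        let data: &'a [u8] = self.remaining;
        if data.is_empty() {
            return None;
        }
        let len = data.len().min(self.max_packet);
        let (head, tail) = data.split_at(len);
        let toggle = self.toggle;
        self.remaining = tail;
        self.toggle = match toggle {
            Toggle::Data0 => Toggle::Data1,
            Toggle::Data1 => Toggle::Data0,
        };
        Some((toggle, head))
    }
}
```

Each call to `next_packet` corresponds to one In/Data/Ack transaction on the wire; once it returns `None`, the endpoint has nothing further to send and only the status transaction remains.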
Thinking at the low levels of the control interface
Before we can build a high level USB stack, we need to consider the activity which might occur at the lower levels. At the low levels, particularly of the device control interface, work has to be done at each and every packet. The hardware likely deals with the token packet for us, leaving the data packets for us to process, and the resultant handshake packets will likely be handled by the hardware in response to our processing of the data packets.
Since every control transaction is initiated by a setup token, let's look at the setup requests which can come our way...
| Field Name | Byte start | Byte length | Encoding | Meaning |
|---|---|---|---|---|
| bmRequestType | 0 | 1 | Bitmap | Describes the kind of request, and the target of it. See below. |
| bRequest | 1 | 1 | Code | The request code itself, meanings of the rest of the fields vary by bRequest |
| wValue | 2 | 2 | Number | A 16 bit value whose meaning varies by request type |
| wIndex | 4 | 2 | Number | A 16 bit value whose meaning varies by request type but typically encodes an interface number or endpoint. |
| wLength | 6 | 2 | Number | A 16 bit value indicating the length of the transfer to come. |
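As a rough illustration of that layout, here's how the eight bytes might be pulled apart in Rust. The 16 bit fields are little-endian on the wire. `SetupPacket` and `parse` are names I've chosen for this sketch, not part of any existing crate.

```rust
/// The eight bytes of a setup request, split into the fields from the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SetupPacket {
    pub bm_request_type: u8,
    pub b_request: u8,
    pub w_value: u16,
    pub w_index: u16,
    pub w_length: u16,
}

impl SetupPacket {
    /// Decode the raw data packet which follows a Setup token.
    pub fn parse(raw: &[u8; 8]) -> SetupPacket {
        SetupPacket {
            bm_request_type: raw[0],
            b_request: raw[1],
            // Multi-byte fields are sent least significant byte first.
            w_value: u16::from_le_bytes([raw[2], raw[3]]),
            w_index: u16::from_le_bytes([raw[4], raw[5]]),
            w_length: u16::from_le_bytes([raw[6], raw[7]]),
        }
    }
}
```

For the SET_ADDRESS request described below, the bytes `00 05 nn 00 00 00 00 00` parse to a bRequest of 0x05 and a wValue of 0x00nn.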
Since bRequest is essentially a switch which selects between the different
kinds of setup packet, here are the meanings of the fields for a couple of them...
| Field Name | Value | Meaning |
|---|---|---|
| bmRequestType | 0x80 | Data direction is IN (from device to host), recipient is the device |
| bRequest | 0x06 | GET_DESCRIPTOR (in this instance, the device descriptor is requested) |
| wValue | 0x0100 | Descriptor type in the high byte (1 means the device descriptor), descriptor index in the low byte |
| wIndex | 0x0000 | Irrelevant, there's only 1 device descriptor anyway |
| wLength | 18 | This is the length of a device descriptor (18 bytes, 0x12) |
| Field Name | Value | Meaning |
|---|---|---|
| bmRequestType | 0x00 | Data direction is OUT (from host to device), recipient is the device |
| bRequest | 0x05 | SET_ADDRESS (Set the device's USB address) |
| wValue | 0x00nn | The address for the device to adopt (max 127) |
| wIndex | 0x0000 | Irrelevant for address setting |
| wLength | 0 | There's no data transfer expected for this setup operation |
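The bmRequestType bitmap deserves a closer look: bit 7 is the data direction, bits 6 and 5 say whether the request is standard, class or vendor specific, and the low five bits name the recipient. Here's a hedged Rust sketch of decoding it and then switching on bRequest for the two requests above. All of the type and function names are my own invention for illustration, and only the two request codes discussed here are handled.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Direction {
    HostToDevice,
    DeviceToHost,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestKind {
    Standard,
    Class,
    Vendor,
    Reserved,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Recipient {
    Device,
    Interface,
    Endpoint,
    Other,
    Reserved,
}

/// Split bmRequestType into its three bit fields.
pub fn decode_request_type(bm_request_type: u8) -> (Direction, RequestKind, Recipient) {
    let direction = if (bm_request_type & 0x80) != 0 {
        Direction::DeviceToHost
    } else {
        Direction::HostToDevice
    };
    let kind = match (bm_request_type >> 5) & 0x03 {
        0 => RequestKind::Standard,
        1 => RequestKind::Class,
        2 => RequestKind::Vendor,
        _ => RequestKind::Reserved,
    };
    let recipient = match bm_request_type & 0x1F {
        0 => Recipient::Device,
        1 => Recipient::Interface,
        2 => Recipient::Endpoint,
        3 => Recipient::Other,
        _ => Recipient::Reserved,
    };
    (direction, kind, recipient)
}

/// The two standard requests discussed above; anything else is passed through.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StandardRequest {
    SetAddress { address: u8 },
    GetDescriptor { descriptor_type: u8, index: u8, length: u16 },
    Unhandled(u8),
}

/// Switch on bRequest and interpret wValue and wLength accordingly.
pub fn decode_standard(b_request: u8, w_value: u16, w_length: u16) -> StandardRequest {
    match b_request {
        0x05 => StandardRequest::SetAddress {
            // Addresses are seven bits wide (maximum 127).
            address: (w_value & 0x7F) as u8,
        },
        0x06 => StandardRequest::GetDescriptor {
            descriptor_type: (w_value >> 8) as u8,
            index: (w_value & 0xFF) as u8,
            length: w_length,
        },
        other => StandardRequest::Unhandled(other),
    }
}
```

A real stack would check that `decode_request_type` reports a standard request aimed at the device before calling `decode_standard`, and would answer anything it doesn't understand with a STALL.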
Most hardware blocks will implement an interrupt at the point that the Data
packet following the Setup packet has been received. This is typically called
receiving a 'Setup' packet and then it's up to the device stack low levels
to determine what to do and dispatch a handler. Otherwise an interrupt will
fire for the IN or OUT tokens and if the endpoint is zero, the low level
stack will handle it once more.
One final thing worth noting about SET_ADDRESS is that it doesn't take
effect until the completion of the zero-length "status" transaction following
the setup transaction. As such, the status request from the host will still
be sent to address zero (the default for new devices).
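One way to model that deferral in firmware is to remember the requested address and only apply it once the status transaction has completed. The following is a small Rust sketch under that assumption. `AddressState` and its methods are hypothetical names, and on real hardware the final step would also write the address into the peripheral.

```rust
/// Tracks the device address, deferring changes until the status stage is done.
pub struct AddressState {
    current: u8,
    pending: Option<u8>,
}

impl AddressState {
    /// A freshly reset device answers on address zero.
    pub const fn new() -> Self {
        AddressState {
            current: 0,
            pending: None,
        }
    }

    /// Called when a SET_ADDRESS setup packet arrives: remember the
    /// address from wValue, but do not switch yet.
    pub fn on_set_address(&mut self, w_value: u16) {
        self.pending = Some((w_value & 0x7F) as u8);
    }

    /// Called when the zero-length status transaction has been acknowledged.
    /// Returns the newly adopted address, if there was one waiting.
    pub fn on_status_complete(&mut self) -> Option<u8> {
        let adopted = self.pending.take();
        if let Some(address) = adopted {
            self.current = address;
        }
        adopted
    }

    /// The address the device currently responds to.
    pub fn current(&self) -> u8 {
        self.current
    }
}
```

Between `on_set_address` and `on_status_complete` the device still answers on its old address, which is exactly what the host expects for the status transaction.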
A very basic early "packet trace"
This is an example, and is not guaranteed to be the packet sequence in all cases. It's a good indication of the relative complexity involved in getting a fresh USB device onto the bus though...
When a device first attaches to the bus, the bus is in RESET state and so
the first event a device sees is a RESET which causes it to set its address
to zero, clear any endpoints, clear the configuration, and become ready for
control transfers. Shortly after this, the device will become suspended.
Next, the host kicks in and sends a port reset of around 30ms. After this, the host is ready to interrogate the device.
The host sends a GET_DESCRIPTOR to the device, whose address at this point is
zero. Using the information it receives from this, it can set up the host-side
memory buffers since the device descriptor contains the maximum packet size
which the device supports on endpoint zero.
The host is now ready to actually 'address' the device, and so it sends another reset to the device, again around 30ms in length.
The host sends a SET_ADDRESS control request to the device, telling it that
its new address is nn. Once the acknowledgement has been sent from the host
for the zero-data status update from the device, the device sets its internal
address to the value supplied in the request. From now on, the device shall
respond only to requests to nn rather than to zero.
At this point, the host will begin interrogating further descriptors, looking
at the configuration descriptors and the strings, to build its host-side
representation of the device. These will all be GET_DESCRIPTOR requests, with
the descriptor type in wValue selecting configuration or string descriptors,
and may continue for some time.
Once the host has satisfied itself that it knows everything it needs to about
the device, it will issue a SET_CONFIGURATION request which basically starts
everything up in the device. Once the configuration is set, interrupt
endpoints will be polled, bulk traffic will be transferred, isochronous
streams will begin to run, etc.
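From the device's point of view, the sequence above boils down to a small state machine. Here's a hedged Rust sketch of it using the state names from chapter 9 of the USB 2.0 specification (Default, Address, Configured). `DeviceState`, `BusEvent` and `step` are names made up for this sketch. It ignores suspend entirely, and wherever the specification leaves behaviour undefined it simply keeps the current state.

```rust
/// Where the device is in the enumeration process.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeviceState {
    /// Reset has been seen; the device answers on address zero.
    Default,
    /// SET_ADDRESS has completed with a non-zero address.
    Address(u8),
    /// SET_CONFIGURATION has selected a non-zero configuration.
    Configured { address: u8, configuration: u8 },
}

/// The events from the trace above which change the state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BusEvent {
    Reset,
    /// A SET_ADDRESS request whose status transaction has completed.
    AddressSet(u8),
    /// A completed SET_CONFIGURATION request.
    ConfigurationSet(u8),
}

/// Compute the next state. Combinations the specification leaves
/// undefined fall through to the last arm and change nothing.
pub fn step(state: DeviceState, event: BusEvent) -> DeviceState {
    match (state, event) {
        (_, BusEvent::Reset) => DeviceState::Default,
        (DeviceState::Default, BusEvent::AddressSet(0))
        | (DeviceState::Address(_), BusEvent::AddressSet(0)) => DeviceState::Default,
        (DeviceState::Default, BusEvent::AddressSet(a))
        | (DeviceState::Address(_), BusEvent::AddressSet(a)) => DeviceState::Address(a),
        (DeviceState::Address(a), BusEvent::ConfigurationSet(0)) => DeviceState::Address(a),
        (DeviceState::Address(a), BusEvent::ConfigurationSet(c)) => DeviceState::Configured {
            address: a,
            configuration: c,
        },
        (DeviceState::Configured { address, .. }, BusEvent::ConfigurationSet(0)) => {
            DeviceState::Address(address)
        }
        (DeviceState::Configured { address, .. }, BusEvent::ConfigurationSet(c)) => {
            DeviceState::Configured {
                address,
                configuration: c,
            }
        }
        (other, _) => other,
    }
}
```

Feeding it Reset, then AddressSet(nn), then ConfigurationSet(1) walks through the same Default, Address, Configured progression as the trace, and a further Reset drops straight back to Default.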
Okay, but how do we make this concrete?
So far, everything we've spoken about has been fairly abstract, or at least "soft". But to transfer data over USB does require some hardware. (Okay, okay, we could do it all virtualised, but there's no fun in that). The hardware I'm going to be using for the duration of this series is the STM32 on the blue-pill development board. This is a very simple development board which does (in theory at least) support USB device mode.
If we view the schematic for the blue-pill, we can see a very "lightweight" USB
interface which has a pullup resistor for D+. This is the way that a device
signals to the host that it is present, and that it wants to speak at
full-speed. If the pullup were on D- then it would be a low-speed device.
High speed devices need a little more complexity which I'm not going to go into
for today.
The USB lines connect to pins PA11 and PA12 which are the USB pins on the
STM32 on the board. Since USB is quite finicky, the STM32 doesn't let you
remap that function elsewhere, so this is all looking quite good for us so far.
The specific STM32 on the blue-pill is the STM32F103C8T6. By viewing its
product page on ST's website we can find the
reference manual for the part. Jumping to section 23 we learn that
this STM32 supports full-speed USB2.0 which is convenient given the past
article and a half. We also learn it supports up to eight endpoints active at
any one time, and offers double-buffering for our bulk and isochronous
transfers. It has some internal memory for packet buffering, so it won't use
our RAM bandwidth while performing transfers, which is lovely.
I'm not going to distill the rest of that section here, because there's a large amount of data which explains how the USB macrocell operates. However, useful things to note are:
- How IN, OUT and SETUP transfers work.
- How the endpoint buffer memory is configured.
- That all bus-powered devices MUST respond to suspend/resume properly.
- That the hardware will prioritise endpoint interrupts for us so that we only need deal with the most pressing item at any given time.
- There is an 'Enable Function' bit in the address register which must be set or we won't see any transactions at all.
- How the endpoint registers signal events to the device firmware.
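To illustrate the 'Enable Function' point, here's a tiny Rust sketch of composing a value for the device address register. The bit positions (EF in bit 7, the address in bits 6 to 0) are as I read them from the reference manual's description of USB_DADDR, so do check them against your own copy. The constant and function names are mine, and the sketch only computes the value; it doesn't touch any hardware.

```rust
/// 'Enable Function' bit of the device address register (assumed to be bit 7).
pub const DADDR_EF: u32 = 1 << 7;

/// Mask for the seven bit device address field (assumed to be bits 6 to 0).
pub const DADDR_ADD_MASK: u32 = 0x7F;

/// Compose the value to write to the device address register.
pub fn daddr_value(address: u8, enable: bool) -> u32 {
    let mut value = u32::from(address) & DADDR_ADD_MASK;
    if enable {
        value |= DADDR_EF;
    }
    value
}
```

After a bus reset we'd want `daddr_value(0, true)`, so that the macrocell answers on address zero, and later the address from SET_ADDRESS with the enable bit still set.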
Next time, we're going to begin the process of writing a very hacky setup routine to try and initialise the USB device macrocell so that we can see incoming transactions through the ITM. It should be quite exciting, but given how complex this will be for me to learn, it might be a little while before it comes through.