📄 usbdrvasm.s
/* Name: usbdrvasm.S
 * Project: AVR USB driver
 * Author: Christian Starkjohann
 * Creation Date: 2004-12-29
 * Tabsize: 4
 * Copyright: (c) 2005 by OBJECTIVE DEVELOPMENT Software GmbH
 * License: Proprietary, free under certain conditions. See Documentation.
 * This Revision: $Id: usbdrvasm.S 52 2005-04-12 16:57:29Z cs $
 */

/*
General Description:
This module implements the assembler part of the USB driver. See usbdrv.h
for a description of the entire driver.
Since almost all of this code is timing critical, don't change unless you
really know what you are doing! Many parts require not only a maximum number
of CPU cycles, but even an exact number of cycles!
*/

/* configs for io.h */
#define __SFR_OFFSET 0
#define _VECTOR(N)   __vector_ ## N   /* io.h does not define this for asm */
#include <avr/io.h> /* for CPU I/O register definitions and vectors */
#include "usbdrv.h" /* for common defs */

/* register names */
#define x1      r16
#define x2      r17
#define shift   r18
#define cnt     r19
#define x3      r20
#define x4      r21

#define nop2    rjmp    .+0 /* jump to next instruction */

    .text

    .global SIG_INTERRUPT0
    .type   SIG_INTERRUPT0, @function

SIG_INTERRUPT0:
;Software-receiver engine. Strict timing! Don't change unless you can preserve timing!
;interrupt response time: 4 cycles + insn running = 7 max if interrupts always enabled
;max allowable interrupt latency: 32 cycles -> max 25 cycles interrupt disable
;max stack usage: [ret(2), x1, SREG, x2, cnt, shift, YH, YL, x3, x4] = 11 bytes
usbInterrupt:
;order of registers pushed:
;x1, SREG, x2, cnt, shift, [YH, YL, x3]
    push    x1              ;2 push only what is necessary to sync with edge ASAP
    in      x1, SREG        ;1
    push    x1              ;2
;sync byte (D-) pattern LSb to MSb: 01010100 [1 = idle = J, 0 = K]
;sync up with J to K edge during sync pattern -- use fastest possible loops
;first part has no timeout because it waits for IDLE or SE1 (== disconnected)
#if !USB_CFG_SAMPLE_EXACT
    ldi     x1, 5           ;1 setup a timeout for waitForK
#endif
waitForJ:
    sbis    USBIN, USBMINUS ;1 wait for D- == 1
    rjmp    waitForJ        ;2
#if USB_CFG_SAMPLE_EXACT
;The following code represents the unrolled loop in the else branch. It
;results in a sampling window of 1/4 bit which meets the spec.
    sbis    USBIN, USBMINUS
    rjmp    foundK
    sbis    USBIN, USBMINUS
    rjmp    foundK
    sbis    USBIN, USBMINUS
    rjmp    foundK
    nop
    nop2
foundK:
#else
waitForK:
    dec     x1              ;1
    sbic    USBIN, USBMINUS ;1 wait for D- == 0
    brne    waitForK        ;2
#endif
;{2, 6} after falling D- edge, average delay: 4 cycles [we want 4 for center sampling]
;we have 1 bit time for setup purposes, then sample again:
    push    x2              ;2
    push    cnt             ;2
    push    shift           ;2
shortcutEntry:
    ldi     cnt, 1          ;1 pre-init bit counter (-1 because no dec follows, -1 because 1 bit already sampled)
    ldi     x2, 1<<USB_CFG_DPLUS_BIT    ;1 -> 8 edge sync ended with D- == 0
;now wait until SYNC byte is over. Wait for either 2 bits low (success) or 2 bits high (failure)
waitNoChange:
    in      x1, USBIN       ;1 <-- sample, timing: edge + {2, 6} cycles
    eor     x2, x1          ;1
    sbrc    x2, 0           ;1 | 2
    ldi     cnt, 2          ;1 | 0 cnt = numBits - 1 (because dec follows)
    mov     x2, x1          ;1
    dec     cnt             ;1
    brne    waitNoChange    ;2 | 1
    sbrc    x1, USBMINUS    ;2
    rjmp    sofError        ;0 two consecutive "1" bits -> framing error
;start reading data, but don't check for bitstuffing because these are the
;first bits. Use the cycles for initialization instead. Note that we read and
;store the binary complement of the data stream because eor results in 1 for
;a change and 0 for no change.
    in      x1, USBIN       ;1 <-- sample bit 0, timing: edge + {3, 7} cycles
    eor     x2, x1          ;1
    ror     x2              ;1
    ldi     shift, 0x7f     ;1 The last bit of the sync pattern was a "no change"
    ror     shift           ;1
    push    YH              ;2 -> 7
    in      x2, USBIN       ;1 <-- sample bit 1, timing: edge + {2, 6} cycles
    eor     x1, x2          ;1
    ror     x1              ;1
    ror     shift           ;1
    push    YL              ;2
    lds     YL, usbInputBuf ;2 -> 8
    in      x1, USBIN       ;1 <-- sample bit 2, timing: edge + {2, 6} cycles
    eor     x2, x1          ;1
    ror     x2              ;1
    ror     shift           ;1
    ldi     cnt, USB_BUFSIZE;1
    clr     YH              ;1
    push    x3              ;2 -> 8
    in      x2, USBIN       ;1 <-- sample bit 3, timing: edge + {2, 6} cycles
    eor     x1, x2          ;1
    ror     x1              ;1
    ror     shift           ;1
    ser     x3              ;1
    nop                     ;1
    rjmp    rxbit4          ;2 -> 8

shortcutToStart:            ;{,43} into next frame: max 5.5 sync bits missed
#if !USB_CFG_SAMPLE_EXACT
    ldi     x1, 5           ;2 setup timeout
#endif
waitForJ1:
    sbis    USBIN, USBMINUS ;1 wait for D- == 1
    rjmp    waitForJ1       ;2
#if USB_CFG_SAMPLE_EXACT
;The following code represents the unrolled loop in the else branch. It
;results in a sampling window of 1/4 bit which meets the spec.
    sbis    USBIN, USBMINUS
    rjmp    foundK1
    sbis    USBIN, USBMINUS
    rjmp    foundK1
    sbis    USBIN, USBMINUS
    rjmp    foundK1
    nop
    nop2
foundK1:
#else
waitForK1:
    dec     x1              ;1
    sbic    USBIN, USBMINUS ;1 wait for D- == 0
    brne    waitForK1       ;2
#endif
    pop     YH              ;2 correct stack alignment
    nop2                    ;2 delay for the same time as the pushes in the original code
    rjmp    shortcutEntry   ;2

; ################# receiver loop #################
; extra jobs done during bit interval:
; bit 6:    se0 check
; bit 7:    or, store, clear
; bit 0:    recover from delay [SE0 is unreliable here due to bit dribbling in hubs]
; bit 1:    se0 check
; bit 2:    se0 check
; bit 3:    overflow check
; bit 4:    se0 check
; bit 5:    rjmp

; stuffed* helpers have the functionality of a subroutine, but we can't afford
; the overhead of a call. We therefore need a separate routine for each caller
; which jumps back appropriately.

stuffed5:                   ;1 for branch taken
    in      x2, USBIN       ;1 <-- sample @ +1
    andi    x2, USBMASK     ;1
    breq    se0a            ;1
    andi    x3, 0xc0        ;1 (0xff03 >> 2) & 0xff
    ori     shift, 0xfc     ;1
    rjmp    rxbit6          ;2

stuffed6:                   ;1 for branch taken
    in      x1, USBIN       ;1 <-- sample @ +1
    andi    x1, USBMASK     ;1
    breq    se0a            ;1
    andi    x3, 0x81        ;1 (0xff03 >> 1) & 0xff
    ori     shift, 0xfc     ;1
    rjmp    rxbit7          ;2

; This is somewhat special because it has to compensate for the delay in bit 7
stuffed7:                   ;1 for branch taken
    andi    x1, USBMASK     ;1 already sampled by caller
    breq    se0a            ;1
    mov     x2, x1          ;1 ensure correct NRZI sequence [we can save andi x3 here]
    ori     shift, 0xfc     ;1
    in      x1, USBIN       ;1 <-- sample bit 0
    rjmp    unstuffed7      ;2

stuffed0:                   ;1 for branch taken
    in      x1, USBIN       ;1 <-- sample @ +1
    andi    x1, USBMASK     ;1
    breq    se0a            ;1
    andi    x3, 0xfe        ;1 (0xff03 >> 7) & 0xff
    ori     shift, 0xfc     ;1
    rjmp    rxbit1          ;2

;-----------------------------
rxLoop:
    brlo    stuffed5        ;1
rxbit6:
    in      x1, USBIN       ;1 <-- sample bit 6
    andi    x1, USBMASK     ;1
    breq    se0a            ;1
    eor     x2, x1          ;1
    ror     x2              ;1
    ror     shift           ;1
    cpi     shift, 4        ;1
    brlo    stuffed6        ;1
rxbit7:
    in      x2, USBIN       ;1 <-- sample bit 7
    eor     x1, x2          ;1
    ror     x1              ;1
    ror     shift           ;1
    eor     x3, shift       ;1 x3 is 0 at bit locations we changed, 1 at others
    st      y+, x3          ;2 the eor above reconstructed modified bits and inverted rx data
    ser     x3              ;1
rxbit0:
    in      x1, USBIN       ;1 <-- sample bit 0
    cpi     shift, 4        ;1
    brlo    stuffed7        ;1
unstuffed7:
    eor     x2, x1          ;1
    ror     x2              ;1
    ror     shift           ;1
    cpi     shift, 4        ;1
    brlo    stuffed0        ;1
rxbit1:
    in      x2, USBIN       ;1 <-- sample bit 1
    andi    x2, USBMASK     ;1
se0a:                       ; enlarge jump range to SE0
    breq    se0             ;1 check for SE0 more often close to start of byte
    eor     x1, x2          ;1
    ror     x1              ;1
    ror     shift           ;1
    cpi     shift, 4        ;1
    brlo    stuffed1        ;1
rxbit2:
    in      x1, USBIN       ;1 <-- sample bit 2
    andi    x1, USBMASK     ;1
    breq    se0             ;1
    eor     x2, x1          ;1
    ror     x2              ;1
    ror     shift           ;1
    cpi     shift, 4        ;1
    brlo    stuffed2        ;1
rxbit3:
    in      x2, USBIN       ;1 <-- sample bit 3
    eor     x1, x2          ;1
    ror     x1              ;1
    ror     shift           ;1
    dec     cnt             ;1 check for buffer overflow
    breq    overflow        ;1
    cpi     shift, 4        ;1
    brlo    stuffed3        ;1
rxbit4:
    in      x1, USBIN       ;1 <-- sample bit 4
    andi    x1, USBMASK     ;1
    breq    se0             ;1
    eor     x2, x1          ;1
    ror     x2              ;1
    ror     shift           ;1
    cpi     shift, 4        ;1
    brlo    stuffed4        ;1
rxbit5:
    in      x2, USBIN       ;1 <-- sample bit 5
    eor     x1, x2          ;1
    ror     x1              ;1
    ror     shift           ;1
    cpi     shift, 4        ;1
    rjmp    rxLoop          ;2
;-----------------------------

stuffed1:                   ;1 for branch taken
    in      x2, USBIN       ;1 <-- sample @ +1
    andi    x2, USBMASK     ;1
    breq    se0             ;1
    andi    x3, 0xfc        ;1 (0xff03 >> 6) & 0xff
    ori     shift, 0xfc     ;1
    rjmp    rxbit2          ;2

stuffed2:                   ;1 for branch taken
    in      x1, USBIN       ;1 <-- sample @ +1
    andi    x1, USBMASK     ;1
    breq    se0             ;1
    andi    x3, 0xf8        ;1 (0xff03 >> 5) & 0xff
    ori     shift, 0xfc     ;1
    rjmp    rxbit3          ;2

stuffed3:                   ;1 for branch taken
    in      x2, USBIN       ;1 <-- sample @ +1
    andi    x2, USBMASK     ;1
    breq    se0             ;1
    andi    x3, 0xf0        ;1 (0xff03 >> 4) & 0xff
    ori     shift, 0xfc     ;1
    rjmp    rxbit4          ;2

stuffed4:                   ;1 for branch taken
    in      x1, USBIN       ;1 <-- sample @ +1
    andi    x1, USBMASK     ;1
    breq    se0             ;1
    andi    x3, 0xe0        ;1 (0xff03 >> 3) & 0xff
    ori     shift, 0xfc     ;1
    rjmp    rxbit5          ;2

;################ end receiver loop ###############

overflow:                   ; ignore package if buffer overflow
    rjmp    rxDoReturn      ; enlarge jump range

;This is the only non-error exit point for the software receiver loop
;{4, 20} cycles after start of SE0, typically {10, 18} after SE0 start = {-6, 2} from end of SE0
;next sync starts {16,} cycles after SE0 -> worst case start: +4 from next sync start
;we don't check any CRCs here because there is no time left.
se0:                        ;{-6, 2} from end of SE0 / {,4} into next frame
    mov     cnt, YL         ;1 assume buffer in lower 256 bytes of memory