timeit.doc David Lindauer gclind01@ulkyvx.louisville.edu December 16, 1995 1) Getting started To get started with timeit type timeit It will display a message saying it has loaded resident. You may unload it at any time subject to the normal ISR loading rules by typeing timeit At which time timeit will detect it was loaded and unload itself. Whenever timeit is loaded you may type ALT-F10. timeit will then begin watching the BIOS keyboard input routines; when a program requests a key and that key is a carriage return timeit will initialize its timers to zero. The next access to the BIOS keyoard routine will will result in the timer value being output to the screen. This sequence may be repeated as often as necessary. The results should probably be accurate to the nearest 0.05ms. (this is a conservative guess) However there are some qualifications which will be discussed. 2) Code Design issues Timeit is designed to measure the time it takes for calculation-bound programs to run. measuring the time for I/O bound programs is a tricky process, since many I/O operations induce variable-length delays. With some modification timeit can be used for I/O bound programs; however it is expected that it would be taylored to a specific application. The primary limitation as it stands is that all interrupts except the timer 0 int (int 8 or IRQ 0) are completely disabled while the timing takes place. This means for example that accidental keyboard presses won't trigger a spurious interrupt that causes the timing to be off... and in particular the 'release' keycode that the keyboard generates when CR is pressed won't interfere with the timing. This design was chosen to make timing of calculation-bound programs as accurate as possible. Some of the other implications of this design are: Int 8 support is limited to what is absolutley necessary; thus the BIOS int 8 routine is not called. Among other things this means that while timeit is enabled the system clock will be 'frozen'. To get an accurate system time back you will have to reboot. This lack of support for int 8 takes out a lot of possible timing variations but again, if, a program tries to hook int 8 or int 1Ch while timeit is enabled the program will interfere with the timing or perhaps not work. Some programs additionally depend upon int 8 updateing BIOS variables... I know of one which hangs because critical timing information is not being supplied to it. If a program you are timing tries to use any other interrupt (hard drive, floppy drive, sound card, serial ports, etc..) then at best the program won't work and at worst the system will hang. This could be fixed with minor changes to the code but allowing other interrupts in puts in variations in timing that may have nothing to do with what you are trying to measure. Additionally I had to bypass the DOS I/O routines due to reentrancy problems. Timeit expects that the program you are measuring will use the BIOS to do input (DOS uses the BIOS) and output from timeit is directed through the BIOS. This means timeit will not have access to certain information such as the colors to draw text in, or screen sizes if you are not using a 25x80 text-mode screen. These are deemed unnecessary for the present needs. Since the keyboard interrupt is disabled during timing, the program can't rely on user input to terminate a timing loop. If the program issues the equivalent of BIOS int 16h function 1 (e.g. kbhit() in C) this is classified as a keyboard access and timing immediately stops. The program only registers intervals up to 65535 x54.9ms long. This is approximately 50-55 minutes. The program design requires a 386 or better processor (real mode) This design should allow accurate timing to the nearest .05ms. However, that is without taking into account things like pipelines and caches. I had problems with something along these lines... I used delay loops to test the timing. The loop: lp: dec eax jne lp resulted in timing values that varied by as much as 0.5% from each other from one run to the next. However the loop: lp: dec eax nop nop nop nop nop nop nop nop nop nop jne lp resulted in timing values which matched from one run to the next. I don't know if it is a pipeline problem (I have a 386sx) or if there is a restriction on when an int 8 interrupt can come in or what... but depending on the program structure (and the hardware design) you will get different results from timeit. Your best bet is to run it several times and see how the results tally with each other. Be aware also that things like memory allocation systems can introduce variable-length delays depending on design and load... so take that into account if you need to measure such a program. If I was serious about needing predictable times I would statically allocate everything, or at the very least do no allocation/deallocation in the code to be measured. 3) Timing considerations The program uses the IRQ 0 timer, in mode 2. In this mode the timer counts down from 65536 to 1 (65535 steps) and then generates an interrupt and reloads with 65536. The interrupt comes every 54.9 ms (1/18.2 times a second). When timing is initiated, the program resets the count to 65536 and starts counting interrupts. When timing is completed the program reads the current value in the timer registers and subtracts it from 65536 to get the number of counts which has elapsed since the last interrupt. At this point a minor approximation is made. The program assumes the timer has 65536 steps for convenience, instead of the 65535 steps it actually has. Thus this fractional portion (of the interval between interrupts) is off by : 1 - (65535/65536) The lower four bits (approximately) will later be truncated so this is a pretty good approximation. Next, the number of interrupts which came in is multiplied by 65536 and added to the fraction; this is the total amount of time (in 54.9ms units) which the routine took. At this point, an adjustment is made for the amount of overhead which the timer interrupt takes up. During load of timeit a measurement is taken for how long it takes to process the int 8 routine 256 times in a row. This value is multiplied by the number of times the int 8 routine was invoked during the timing, divided by 256, and subtracted from the total time calculated above. This final total is divided by 18.2 to translate to seconds and the result is printed out to five decimal places. The final place is best used for rounding. I haven't done any calculations on the overhead in the program that has to fit in places 5,6,7; using the 5th place for rounding is a fairly conservative guess as to the accuracy of the routines. I leave it to you to inspect the code and calculate the timings on whatever platform you are going to run the program on. ; ; Generic timing program ; ; D. Lindauer, dec 16, 1995 ; .model tiny ; This is a .com program .386 ; ; Constants used for querying and releasing TSR ; QUERY EQU 0AA00h CLEAR EQU 0AB00h .code ; ; This is a .com file ; org 100h start: jmp entry ; ; <<<<<<<<<<<<<<<<<<<<<<<<<< TSR variables >>>>>>>>>>>>>>>>>>>>>>>>>>>>> ; hotpress db 0 ; hotkey status old8h dd 0 ; old int 8 vector old15h dd 0 ; old int 15h vectoor old16h dd 0 ; old int 16h vector old80h dd 0 ; old int 80h vector ticks dw 0 ; tick count (54.9 ms intervals) endtime dw 0 ; Counter when end detected scaletime dw 0 ; Scale value (int 8 len * 256) p21h db 0 ; port 21 h image pa1h db 0 ; port 0a1h image elapsed db 10,"Elapsed time: ",'$' ; message ; ; <<<<<<<<<<<<<<<<<<<<<<<<< interrupts >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ; ; Clock tick timer (int 8h). System WILL lose time while this timer is active ; timerint: inc cs:[ticks] ; inc 54.9ms periods push ax ; Ack int mov al,20h ; out 20h,al ; pop ax ; iret ; ; TSR communication handler (int 80h) ; catcher: cmp ax,QUERY ; See if this is a query jne chk ; no mov byte ptr es:[di],1 ; Yes, inform caller iret chk: cmp ax,CLEAR ; See if is a request to unload TSR jne quit ; exit if not cld ; Clear dir push ds ; Load segs push es push cs pop ds sub ax,ax mov es,ax mov di,8h*4 ; Reset int 8 mov si,offset old8h movsw movsw mov di,15h*4 ; Reset int 15h mov si,offset old15h movsw movsw mov di,16h*4 ; Reset int 16h mov si,offset old16h movsw movsw mov di,80h*4 ; Reset int 80h mov si,offset old80h movsw movsw mov es,cs:[2ch] ; Free the environment mov ah,49h int 21h mov es,cx ; Free the program mov ah,49h int 21h pop es pop ds quit: iret ; Exit to caller ; ; Hotkey detection (int 15h) keypress: pushf cmp ah,4fh ; Check for keypress jne kpchain ; Chain if not cmp al,44h ; Check for F10 jne kpchain ; Chain if not push es ; Check for ALT key push ax mov ax,40h ; In BIOS data space mov es,ax test byte ptr es:[17h],8 pop ax pop es je kpchain ; Chain if not or cs:[hotpress],01h ; Flag we are waiting for carriage return popf clc ; Clear flag for BIOS to ignore this keypress retf 02 kpchain: popf ; Otherwise let old handler handle this jmp cs:[old15h] ; ; timer start/stop (int 16h) keyinput: push bp ; Make space on stack mov bp,sp sub sp,2 sti ; ints allowed (we are overriding the caller!) pushf mov [bp-1],ah ; Function code on stack test cs:[hotpress],2 ; See if timer runnng je ok and cs:[hotpress],0fdh ; Stop it if so call stoptimer ok: popf pushf ; Call BIOS kb handler (timer not running now) call cs:[old16h] pushf test cs:[hotpress],01h ; See if ALT-F10 pressed je kiexit ; Get out if not cmp byte ptr [bp-1],0 ; See if is function 0 or 10h je kiok cmp byte ptr [bp-1],10h jne kiexit ; Get out if not kiok: cmp al,0dh ; See if the key was CR jne kiexit ; Get out if not and cs:[hotpress],0feh ; no longer waiting for CR or cs:[hotpress],2 ; Timer is now on call starttimer ; Start the timer kiexit: popf mov sp,bp ; Clear stack pop bp retf 02 ; Return to caller ; Notice we have turned on interrupts ; at this point irregardless of what ; the caller was doing ; ; <<<<<<<<<<<<<<<<<<<<<<<<<<< Timer functions >>>>>>>>>>>>>>>>>>>>>>>>>>>> ; ; Start the timer ; starttimer: push ax ; ; First turn off all ints but int 8 ; in al,21h mov cs:[p21h],al ; Grab PIC 0 status or al,0feh out 21h,al ; reset pic 0 in al,0a1h mov cs:[pa1h],al ; Grab PIC 1 status or al,0ffh out 0a1h,al ; reset pic 1 sti mov word ptr cs:[ticks],0 ; Wait for 54.9ms int stlp: test word ptr cs:[ticks],7 je stlp cli ; Timer is zeroed mov word ptr cs:[ticks],0 ; pop ax ret ; ; Stop the timer and display the elapsed time ; stoptimer: push ax mov al,04h ; Latch timer 0 cli ; So we can't get any ticks now out 43h,al ; Time is frozen at this point mov di,offset endtime ; Grab the latched value call loadtimer ; pop ax pushad push ds push cs pop ds sub ebx,ebx ; Timer counts down so we have ; to subtract from 65536 sub bx,[endtime] movzx eax,[ticks] sti push eax ; scale * number of times int 8 called movzx ecx,[scaletime] mul ecx shr eax,8 mov esi,eax pop eax shl eax,16 ; ticks * 65536 + timer 0 end value add eax,ebx sub eax,esi ; - timing for int 8 mov ecx,10 ; Divide the whole thing by 18.2 mul ecx ; to get a period in seconds mov ecx,182 div ecx call displayit ; display the value mov al,[p21h] ; Restoer PIC 0 out 21h,al mov al,[pa1h] ; Restore PIC 1 out 0a1h,al pop ds popad ret ; ; grab the latched timer value ; loadtimer: push es push cs pop es cld in al,40h stosb in al,40h stosb pop es ret ; ; Set the timer mode ; setmode: out 43h,al sub al,al out 40h,al sub al,al out 40h,al ret ; ; <<<<<<<<<<<<<<<<<<<<<<<<<<<< Display routines >>>>>>>>>>>>>>>>>>>>>>>>>>> displayit: push eax mov bx,offset elapsed ; Write the elapsed message call writestring pop eax push eax ; Display integer part ; shr eax,16 ; Get integer mov bx,10 ; Divide value sub cx,cx ; Number of digits to print lp1: sub dx,dx ; for divide div bx ; Divide by 10 push dx ; Remainder to stack inc cx ; inc count or ax,ax ; Quit when no quotient jnz lp1 lp2: pop dx ; Get a digit from stack add dl,30h ; Make ASCII call writechar ; Display char loop lp2 ; Until stack is empty mov dl,'.' ; Put in a dot call writechar ; pop eax ; Get fractional part in ax ; ; Display fractional part mov cx,5 ; printing 5 digits mov bx,10 ; multiply value lp3: mul bx ; multiply by 10 add dl,30h ; Overflow is the digit in ASCII call writechar ; Write char loop lp3 ; Loop till done ret ; ; Display a string ; writestring: mov dl,cs:[bx] ; Get next char cmp dl,'$' ; Check for terminator je wsdone ; Quit if so inc bx ; Point at next char call writechar ; Write this char jmp writestring ; Loop wsdone: ret ; ; Display a char using BIOS ; writechar: push ax push bx push cx push dx cmp dl,10 ; Check for LF jne wcok ; Nope mov ah,3 ; Get cursor pos sub bh,bh ; int 10h ; jmp nextrow ; jump to row increment point wcok: mov ah,10 ; OTherwise output the character mov al,dl ; sub bx,bx ; mov cx,1 ; int 10h ; mov ah,3 ; Get cursor pos sub bh,bh ; int 10h ; inc dl ; inc column cmp dl,80 ; check if in range jc exit ; get out if so nextrow: inc dh ; OTherwise go to next row mov dl,0 ; column 0 cmp dh,25 ; See if off screen jc exit ; no, get out dec dh ; put us back on push dx ; scroll us up mov ax,601h mov bh,7 sub cx,cx mov dx,01950h int 10h pop dx exit: mov ah,2 ; Save new cursor pos sub bh,bh ; int 10h ; pop dx pop cx pop bx pop ax ret endtsr: ;<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>> ; ; Non-resident variables testvar DB 0 ; Used in querying for already resident banner db "timer tsr V1.0",10,13,10,13,'$' loadmsg db "TSR loaded",10,13,'$' unloadmsg db "TSR unloaded",10,13,'$' ;<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Program entry point >>>>>>>>>>>>>>>>>>>>>>> entry: mov dx,offset banner ; Display banner mov ah,9 int 21h sub ax,ax ; See if anything at int 80h mov es,ax mov ax,es:[80h*4] or ax,es:[80h*4+2] push cs pop es jz notloaded ; Nope, can't be loaded mov di,offset testvar ; Else query for residency mov ax,QUERY ; int 80h ; test byte ptr [testvar],0ffh ; Is it? jz notloaded ; No, load it mov al,36h ; Reset timer 0 to mode 3 call setmode ; mov ax,CLEAR ; unload TSR int 80h mov dx,offset unloadmsg ; Tell us we're unloaded mov ah,9 int 21h mov ah,4ch ; Exit to DOS int 21h ; ;Come here to load resident ; notloaded: mov al,34h ; Timer to mode 2 call setmode ; cld push ds ; Set segs sub ax,ax mov ds,ax mov si,8h*4 ; grab old int 8 and hook mov di,offset old8h movsw movsw mov word ptr [si-2],cs mov word ptr [si-4],offset timerint mov si,15h*4 ; grab old int 15h and hook mov di,offset old15h movsw movsw mov word ptr [si-2],cs mov word ptr [si-4],offset keypress mov si,16h*4 ; grab old int 16h and hook mov di,offset old16h movsw movsw mov word ptr [si-2],cs mov word ptr [si-4],offset keyinput mov si,80h*4 ; grab old int 80h and hook mov di,offset old80h movsw movsw mov word ptr [si-2],cs mov word ptr [si-4],offset catcher pop ds ; ; Nowe we are going to decide how long int 8 takes ; call starttimer ; start the timer mov cx,16 ; 16 loops of 16 int 8s ; Stop routine must divide by 256 q: int 8 ; Interrupts are disabled throughout this int 8 int 8 int 8 int 8 int 8 int 8 int 8 int 8 int 8 int 8 int 8 int 8 int 8 int 8 int 8 loop q mov al,04h ; Latch timer 0 out 43h,al mov di,offset scaletime ; Get scale time call loadtimer neg [scaletime] ; Adjust for it being a count-down mov al,[p21h] ; Reset PIC 0 out 21h,al mov al,[pa1h] ; reset PIC 1 out 0a1h,al mov dx,offset loadmsg ; Tell us we're loaded mov ah,9 int 21h mov dx,offset endtsr ; Calculate space to keep add dx,15 mov cl,4 shr dx,cl mov ah,31h ; Exit to DOS int 21h end start ;