readme.cv
...

David Schwartz <davids@webmaster.com> wrote:

>> > It's compliant
>>
>> That is really good.
>
>> Tomorrow (I have to go urgently now) I will try to
>> demonstrate the lost-signal "problem" of current
>> pthread-win32 and ACE-(variant w/o SignalObjectAndWait)
>> implementations: players start suddenly drop their balls :-)
>> (with no change in source code).
>
>Signals aren't lost, they're going to the main thread,
>which isn't coded correctly to handle them. Try this:
>
>  // Wait for players to stop
>  do {
>
>    pthread_cond_wait( &cndGameStateChange,&mtxGameStateLock );
>printf("Main thread stole a signal\n");
>
>  } while ( eGameState < BOTH_PLAYERS_GONE );
>
>I bet every time you think a signal is lost, you'll see that printf.
>The signal isn't lost, it was stolen by another thread.

well, you can probably lose your bet.. it was indeed stolen
by "another" thread but not the one you seem to think of.

I think that what actually happens is the following:

H:\SA\UXX\pt\PTHREADS\TESTS>tennis3.exe

PLAYER-A
PLAYER-B

----PLAYER-B: SPURIOUS WAKEUP!!!

PLAYER-A GONE
PLAYER-B GONE

GAME OVER

H:\SA\UXX\pt\PTHREADS\TESTS>

here you can see that PLAYER-B after playing his first
ball (which came via signal from PLAYER-A) just dropped
it down. What happened is that his signal to player A
was consumed as spurious wakeup by himself (player B).

The implementation has a problem:

================
waiting threads:
================

{ /** Critical Section

  inc cond.waiters_count

}

  /*
  /* Atomic only if using Win32 SignalObjectAndWait
  /*
  cond.mtx.release

  /*** ^^-- A THREAD WHICH DID SIGNAL MAY ACQUIRE THE MUTEX,
  /***      GO INTO WAIT ON THE SAME CONDITION AND OVERTAKE
  /***      ORIGINAL WAITER(S) CONSUMING ITS OWN SIGNAL!

  cond.sem.wait

Player-A after playing game's initial ball went into
wait (called _wait) but was pre-empted before reaching
wait semaphore.
He was counted as waiter but was not
actually waiting/blocked yet.

===============
signal threads:
===============

{ /** Critical Section

  waiters_count = cond.waiters_count

}

  if ( waiters_count != 0 )

    sem.post 1

  endif

Player-B after he received signal/ball from Player A
called _signal. The _signal did see that there was
one waiter blocked on the condition (Player-A) and
released the semaphore.. (but it did not unblock
Player-A because he was not actually blocked).
Player-B thread continued its execution, called _wait,
was counted as second waiter BUT was allowed to slip
through opened semaphore gate (which was opened for
Player-A) and received his own signal. Player B remained
blocked followed by Player A. Deadlock happened which
lasted until main thread came in and said game over.

It seems to me that the implementation fails to
correctly implement the following statement
from specification:

http://www.opengroup.org/onlinepubs/007908799/xsh/pthread_cond_wait.html

"These functions atomically release mutex and cause
the calling thread to block on the condition variable
cond; atomically here means "atomically with respect
to access by another thread to the mutex and then the
condition variable". That is, if another thread is
able to acquire the mutex after the about-to-block
thread has released it, then a subsequent call to
pthread_cond_signal() or pthread_cond_broadcast()
in that thread behaves as if it were issued after
the about-to-block thread has blocked."

Question: Am I right?

(I produced the program output above by simply
adding 'Sleep( 1 )':

================
waiting threads:
================

{ /** Critical Section

  inc cond.waiters_count

}

  /*
  /* Atomic only if using Win32 SignalObjectAndWait
  /*
  cond.mtx.release

Sleep( 1 ); // Win32

  /*** ^^-- A THREAD WHICH DID SIGNAL MAY ACQUIRE THE MUTEX,
  /***      GO INTO WAIT ON THE SAME CONDITION AND OVERTAKE
  /***      ORIGINAL WAITER(S) CONSUMING ITS OWN SIGNAL!

  cond.sem.wait

to the source code of pthread-win32 implementation:

http://sources.redhat.com/cgi-bin/cvsweb.cgi/pthreads/condvar.c?rev=1.36&content-type=text/x-cvsweb-markup&cvsroot=pthreads-win32

  /*
   * We keep the lock held just long enough to increment the count of
   * waiters by one (above).
   * Note that we can't keep it held across the
   * call to sem_wait since that will deadlock other calls
   * to pthread_cond_signal
   */
  cleanup_args.mutexPtr = mutex;
  cleanup_args.cv = cv;
  cleanup_args.resultPtr = &result;

  pthread_cleanup_push (ptw32_cond_wait_cleanup, (void *)&cleanup_args);

  if ((result = pthread_mutex_unlock (mutex)) == 0)
    {
Sleep( 1 ); // @AT

      /*
       * Wait to be awakened by
       *              pthread_cond_signal, or
       *              pthread_cond_broadcast, or
       *              a timeout
       *
       * Note:
       *      ptw32_sem_timedwait is a cancelation point,
       *      hence providing the
       *      mechanism for making pthread_cond_wait a cancelation
       *      point. We use the cleanup mechanism to ensure we
       *      re-lock the mutex and decrement the waiters count
       *      if we are canceled.
       */
      if (ptw32_sem_timedwait (&(cv->sema), abstime) == -1)
        {
          result = errno;
        }
    }

  pthread_cleanup_pop (1);  /* Always cleanup */

BTW, on my system (2 CPUs) I can manage to get
signals lost even without any source code modification
if I run the tennis program many times in different
shell sessions.

...

David Schwartz <davids@webmaster.com> wrote:
>terekhov@my-deja.com wrote:
>
>> well, it might be that the program is in fact buggy.
>> but you did not show me any bug.
>
>You're right. I was close but not dead on. I was correct, however,
>that the code is buggy because it uses 'pthread_cond_signal' even
>though not any thread waiting on the condition variable can do the
>job.
>I was wrong in which thread could be waiting on the cv but
>unable to do the job.

Okay, let's change 'pthread_cond_signal' to 'pthread_cond_broadcast'
but also add some noise from main() right before declaring the game
to be over (I need it in order to demonstrate another problem of
pthread-win32/ACE implementations - broadcast deadlock)...

...

It is my understanding of POSIX conditions,
that on correct implementation added noise
in form of unnecessary broadcasts from main,
should not break the tennis program. The
only 'side effect' of added noise on correct
implementation would be 'spurious wakeups' of
players (in fact they are not spurious,
players just see them as spurious) unblocked,
not by another player but by main before
another player had a chance to acquire the
mutex and change the game state variable:

...
PLAYER-B
PLAYER-A
---Noise ON...
PLAYER-B
PLAYER-A
...
PLAYER-B
PLAYER-A
----PLAYER-A: SPURIOUS WAKEUP!!!
PLAYER-B
PLAYER-A
---Noise OFF
PLAYER-B
---Stopping the game...
PLAYER-A GONE
PLAYER-B GONE
GAME OVER

H:\SA\UXX\pt\PTHREADS\TESTS>

On pthread-win32/ACE implementations the
program could stall:

...
PLAYER-A
PLAYER-B
PLAYER-A
PLAYER-B
PLAYER-A
PLAYER-B
PLAYER-A
PLAYER-B
---Noise ON...
PLAYER-A
---Noise OFF
^C

H:\SA\UXX\pt\PTHREADS\TESTS>

The implementation has problems:

================
waiting threads:
================

{ /** Critical Section

  inc cond.waiters_count

}

  /*
  /* Atomic only if using Win32 SignalObjectAndWait
  /*
  cond.mtx.release
  cond.sem.wait

  /*** ^^-- WAITER CAN BE PREEMPTED AFTER BEING UNBLOCKED...

{ /** Critical Section

  dec cond.waiters_count

  /*** ^^- ...AND BEFORE DECREMENTING THE COUNT (1)

  last_waiter = ( cond.was_broadcast &&
                    cond.waiters_count == 0 )

  if ( last_waiter )

    cond.was_broadcast = FALSE

  endif

}

  if ( last_waiter )

    /*
    /* Atomic only if using Win32 SignalObjectAndWait
    /*
    cond.auto_reset_event_or_sem.post /* Event for Win32
    cond.mtx.acquire

    /*** ^^-- ...AND BEFORE CALL TO mtx.acquire (2)

    /*** ^^-- NESTED BROADCASTS RESULT IN A DEADLOCK

  else

    cond.mtx.acquire

    /*** ^^-- ...AND BEFORE CALL TO mtx.acquire (3)

  endif

==================
broadcast threads:
==================

{ /** Critical Section

  waiters_count = cond.waiters_count

  if ( waiters_count != 0 )

    cond.was_broadcast = TRUE

  endif

}

if ( waiters_count != 0 )

  cond.sem.post waiters_count

  /*** ^^^^^--- SPURIOUS WAKEUPS DUE TO (1)

  cond.auto_reset_event_or_sem.wait /* Event for Win32

  /*** ^^^^^--- DEADLOCK FOR FURTHER BROADCASTS IF THEY
                HAPPEN TO GO INTO WAIT WHILE PREVIOUS
                BROADCAST IS STILL IN PROGRESS/WAITING

endif

a) cond.waiters_count does not accurately reflect
number of waiters blocked on semaphore - that could
result (in the time window when counter is not accurate)
in spurious wakeups organised by subsequent _signals
and _broadcasts. From standard compliance point of view
that is OK but that could be a real problem from
performance/efficiency point of view.

b) If subsequent broadcast happen to go into wait on
cond.auto_reset_event_or_sem before previous
broadcast was unblocked from cond.auto_reset_event_or_sem
by its last waiter, one of two blocked threads will
remain blocked because last_waiter processing code
fails to unblock both threads.

In the situation with tennisb.c the Player-B was put
in a deadlock by noise (broadcast) coming from main
thread.
And since Player-B holds the game state
mutex when it calls broadcast, the whole program
stalled: Player-A was deadlocked on mutex and
main thread after finishing with producing the noise
was deadlocked on mutex too (needed to declare the
game over)

(I produced the program output above by simply
adding 'Sleep( 1 )':

==================
broadcast threads:
==================

{ /** Critical Section

  waiters_count = cond.waiters_count

  if ( waiters_count != 0 )

    cond.was_broadcast = TRUE

  endif

}

if ( waiters_count != 0 )

Sleep( 1 ); //Win32

  cond.sem.post waiters_count

  /*** ^^^^^--- SPURIOUS WAKEUPS DUE TO (1)

  cond.auto_reset_event_or_sem.wait /* Event for Win32

  /*** ^^^^^--- DEADLOCK FOR FURTHER BROADCASTS IF THEY
                HAPPEN TO GO INTO WAIT WHILE PREVIOUS
                BROADCAST IS STILL IN PROGRESS/WAITING

endif

to the source code of pthread-win32 implementation:

http://sources.redhat.com/cgi-bin/cvsweb.cgi/pthreads/condvar.c?rev=1.36&content-type=text/x-cvsweb-markup&cvsroot=pthreads-win32

  if (wereWaiters)
    {
      /*
       * Wake up all waiters
       */

Sleep( 1 ); //@AT

#ifdef NEED_SEM

      result = (ptw32_increase_semaphore( &cv->sema, cv->waiters )
                 ? 0
                 : EINVAL);

#else /* NEED_SEM */

      result = (ReleaseSemaphore( cv->sema, cv->waiters, NULL )
                 ? 0
                 : EINVAL);

#endif /* NEED_SEM */

    }

  (void) pthread_mutex_unlock(&(cv->waitersLock));

  if (wereWaiters && result == 0)
    {
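
The lost-signal interleaving described earlier in this thread (a waiter that is counted but not yet blocked, and a signalling thread that then passes through its own semaphore post) can be replayed without real threads. What follows is only a minimal single-threaded sketch in C under stated assumptions: `model_cond`, `model_wait_enter`, `model_wait_try_pass`, `model_signal` and `replay_lost_signal` are hypothetical names invented for illustration and are not part of pthreads-win32 or ACE, and the semaphore is modelled as a plain counter rather than a Win32 object.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the semaphore-based condition variable from the
 * pseudocode in this thread. Not pthreads-win32 code. */
typedef struct {
    int waiters_count;  /* threads counted as waiters (blocked or not yet) */
    int sem_count;      /* pending posts on the modelled semaphore */
} model_cond;

/* First half of _wait: the waiter is counted, then releases the mutex. */
static void model_wait_enter(model_cond *c)
{
    c->waiters_count++;
}

/* Second half of _wait: the thread reaches sem.wait.
 * Returns true if it passes at once, false if it would block. */
static bool model_wait_try_pass(model_cond *c)
{
    if (c->sem_count > 0) {
        c->sem_count--;
        c->waiters_count--;
        return true;
    }
    return false;
}

/* _signal: post once if anyone is counted as a waiter. */
static void model_signal(model_cond *c)
{
    if (c->waiters_count != 0)
        c->sem_count++;
}

/* Replays the interleaving from the thread.
 * Returns true when Player-B consumes its own signal and Player-A blocks. */
static bool replay_lost_signal(void)
{
    model_cond c = {0, 0};

    model_wait_enter(&c);   /* A: counted, then pre-empted before sem.wait */
    model_signal(&c);       /* B: sees one waiter, posts the semaphore */
    model_wait_enter(&c);   /* B: calls _wait, counted as second waiter */
    assert(c.waiters_count == 2);

    bool b_passed = model_wait_try_pass(&c);  /* B slips through the gate */
    bool a_passed = model_wait_try_pass(&c);  /* A arrives late and blocks */

    return b_passed && !a_passed;
}
```

Calling `replay_lost_signal()` from a test or a small `main` walks through exactly the ordering that the added `Sleep( 1 )` makes likely on a real system. The model deliberately leaves out the mutex and the broadcast path, so it says nothing about the nested-broadcast deadlock.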