Scheduler BUG 6-11-02:

The RTLinux scheduler in version 3.1 is re-entrant when it suspends a thread. The procedure is as follows. To suspend a thread other than pthread_self() (with pthread_suspend_np()), it sends a signal (RTL_SIGNAL_SUSPEND) to that thread. The scheduler then gives the CPU to that thread because it has a pending signal. The handler for that signal marks the thread as suspended and calls the scheduler again. If this process is repeated, as in the example sched_bug.c, every repetition pushes another call to the scheduler onto the thread's stack. If the thread stays suspended and nobody wakes it up (blocked on a mutex, suspended with pthread_suspend_np, etc.), the stack is exhausted in a finite time.

In version 3.0 the scheduler was not re-entrant and this error does not occur. A good test is to change the stack size and observe that the number of iterations grows in proportion to the stack size increase. The same problem occurs with user signal handlers.

Proposed solution:

The solution was to return to the scheduler policy of version 3.0, which is non-re-entrant.


Mutex BUG 4-12-02:

When do_signal receives the signal RTL_SIGNAL_SUSPEND, it sets t->abort to zero. So when it later calls do_abort(t), the call has no effect.

In our example, the blocked thread was blocked on a mutex. Meanwhile, the mutex owner thread sends signals to it (pthread_suspend_np, pthread_wakeup_np, pthread_wakeup_np).

What happens is the following:

1.- The blocked thread blocks on the mutex (calling rtl_wait_sleep in the pthread_mutex_lock loop) and is queued in the mutex wait queue.

2.- At this point, the mutex owner sends the following signals:

    2.1.- RTL_SIGNAL_WAKEUP: the blocked thread takes the CPU to handle it. It calls do_abort, and rtl_wait_abort takes it out of the mutex wait queue. It then loops again, calls rtl_wait_sleep, and is queued in the mutex wait queue.

    2.2.- Next, the blocked thread receives RTL_SIGNAL_SUSPEND, which sets t->abort to zero, marks the blocked thread as suspended and calls the scheduler.

    2.3.- Finally, the blocked thread receives RTL_SIGNAL_WAKEUP again. This time the call to do_abort has no effect (because RTL_SIGNAL_SUSPEND set t->abort to zero), so the blocked thread is not removed from the mutex wait queue. The code continues anyway (rtl_wait_sleep returns), loops in pthread_mutex_lock and calls rtl_wait_sleep again. This queues the blocked thread in the mutex wait queue although it was already queued (not exactly the same entry, since the queued struct is local to rtl_wait_sleep). At this point the mutex wait queue contains the following: head -> thread1 -> head. This is the bug: a doubly linked wait queue in which the head and the tail store the same waiter (the blocked thread).

3.- When thread 0 calls pthread_mutex_unlock and walks the mutex wait queue to wake up the blocked threads, it runs an infinite loop and the user loses control of the machine (Linux never gets to run).

Proposed solution:

Possibly, setting t->abort to zero when handling RTL_SIGNAL_SUSPEND is meant for future compatibility, or it is a simple mistake. Currently the abort field of the thread structure is only used for mutexes and semaphores. One solution is to not set it to zero and to execute do_abort when handling the RTL_SIGNAL_SUSPEND signal.


timespec_add_ns BUG 22-11-02:

The macro timespec_add_ns available in include/rtl_time.h is implemented as:

    #define old_timespec_add_ns(t,n) do { \
            (t)->tv_nsec += n;  \
            timespec_normalize(t); \
    } while (0)

and timespec_normalize is implemented as:

    #define timespec_normalize(t) {\
            if ((t)->tv_nsec >= NSECS_PER_SEC) { \
                    (t)->tv_nsec -= NSECS_PER_SEC; \
                    (t)->tv_sec++; \
            } else if ((t)->tv_nsec < 0) { \
                    (t)->tv_nsec += NSECS_PER_SEC; \
                    (t)->tv_sec--; \
            } \
          }

What happens if the result of (t)->tv_nsec += n; is larger than two seconds? timespec_normalize corrects by at most one second, so this leads to an invalid time specification whose tv_nsec field is greater than NSECS_PER_SEC (1000*1000*1000). Overflow can also happen if the result reaches 2^31 (2147483648).

The alternative solution is to implement timespec_add_ns as:

    #define TWOSECONDS (NSECS_PER_SEC*2)

    #define timespec_add_ns(t,n) do { \
      long long aux=(t)->tv_nsec+(n);\
      \
      if ((aux > TWOSECONDS) || (aux < -TWOSECONDS)) /*check overflow*/ {\
        (t)->tv_nsec +=((n) % NSECS_PER_SEC) ; \
        (t)->tv_sec += ((n) / NSECS_PER_SEC); \
      } else {  (t)->tv_nsec=aux; }\
      \
      timespec_normalize(t); \
    }  while (0)

The file timespec_add_ns in the directory examples/bug tests both implementations.
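
The arithmetic behind the timespec_add_ns problem can be tried outside the kernel with a small sketch. Everything named here (struct demo_timespec, old_add_ns, demo_timespec_add_ns) is hypothetical and is not RTLinux code. old_add_ns mirrors the one-step normalization of the old macro, using a 64-bit intermediate so that only the normalization problem shows. demo_timespec_add_ns is one possible variant that always splits n into whole seconds and a remainder, assuming tv_nsec starts in the valid range.

```c
/* Sketch only: all names below are hypothetical, not part of RTLinux. */
#define NSECS_PER_SEC 1000000000LL

struct demo_timespec {
    long tv_sec;
    long tv_nsec;   /* valid when in [0, NSECS_PER_SEC) */
};

/* Mirrors the old macro: add n, then correct by at most one second. */
static void old_add_ns(struct demo_timespec *t, long long n)
{
    long long nsec = t->tv_nsec + n;

    if (nsec >= NSECS_PER_SEC) {
        nsec -= NSECS_PER_SEC;
        t->tv_sec++;
    } else if (nsec < 0) {
        nsec += NSECS_PER_SEC;
        t->tv_sec--;
    }
    t->tv_nsec = (long)nsec;   /* may still be out of range */
}

/* Variant: split n first, so one correction step is always enough. */
static void demo_timespec_add_ns(struct demo_timespec *t, long long n)
{
    /* C99 division truncates toward zero; % keeps the sign of n. */
    long long sec  = n / NSECS_PER_SEC;
    long long nsec = t->tv_nsec + n % NSECS_PER_SEC;

    /* With a valid tv_nsec, nsec lies strictly between -1e9 and 2e9. */
    if (nsec >= NSECS_PER_SEC) {
        nsec -= NSECS_PER_SEC;
        sec++;
    } else if (nsec < 0) {
        nsec += NSECS_PER_SEC;
        sec--;
    }
    t->tv_sec += (long)sec;
    t->tv_nsec = (long)nsec;
}
```

For instance, starting from {0, 900000000} and adding 1500000000 ns, old_add_ns leaves tv_nsec at 1400000000, which is not a valid value, while demo_timespec_add_ns yields {2, 400000000}. Splitting n unconditionally also avoids having to pick a threshold, so sums between -2 s and -1 s end up in range as well.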
