
How did the multiprocessor system handle CPU crashes?

April 26, 2017
Tags: CPU crashes multiprocessor system


2 Answers


Surprisingly enough, very badly. When one CPU crashed, all the CPUs crashed.

“The philosophy of the 11/74 was high availability, not high reliability. As such, from a philosophical viewpoint, we wanted crash dumps of all the CPUs to catch software problems.”

“Pragmatically speaking, continuing would be difficult. The crashing CPU is in the kernel, owning at least $EXECL in all likelihood, and perhaps some other spin locks. Of course, any lock it owned was owned to protect an atomic transaction, and the crash caused some decay.”

“The fork list may not be intact, the Pool may not be intact, device states may be inconsistent, the context of the running task on the crashed CPU (which could be MCR or F11ACP) is lost in what may have been an atomic transaction inside the component (remember $LOCKL?), and a host of other problems may exist. [These] will simply cascade into a mass of wreckage where a crash dump ought to be.”

Source: Brian S. McCarthy (July 2005)
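The pragmatic argument in the quote can be illustrated with a toy model. The sketch below is not RSX-11M-PLUS code: the SpinLock class, the move_entry helper, and the two lists are hypothetical stand-ins, assumed only to show what happens when a CPU dies halfway through a lock-protected transaction. The lock stays owned, so a surviving CPU could never acquire it, and the protected data is left half-updated.

```python
class SpinLock:
    """Toy stand-in for a kernel spin lock (hypothetical, not the real $EXECL)."""

    def __init__(self, name):
        self.name = name
        self.owner = None  # id of the CPU holding the lock, or None if free

    def try_acquire(self, cpu):
        # One attempt only; a real CPU would loop ("spin") until this succeeds.
        if self.owner is None:
            self.owner = cpu
            return True
        return False

    def release(self, cpu):
        assert self.owner == cpu, "only the owner may release the lock"
        self.owner = None


class CpuCrash(Exception):
    """Models a CPU halting in the middle of kernel code."""


def move_entry(lock, cpu, src, dst, crash_midway=False):
    """Move one entry from src to dst as a two-step transaction under lock."""
    if not lock.try_acquire(cpu):
        return False
    entry = src.pop(0)          # step 1: unlink the entry from the source list
    if crash_midway:
        # The CPU dies here: the lock is never released and step 2 never runs.
        raise CpuCrash(f"CPU {cpu} crashed while holding {lock.name}")
    dst.append(entry)           # step 2: link the entry into the destination
    lock.release(cpu)
    return True


def demo():
    lock = SpinLock("EXECL")
    pending, done = ["a", "b", "c"], []
    try:
        move_entry(lock, 0, pending, done, crash_midway=True)
    except CpuCrash:
        pass
    survivor_got_lock = lock.try_acquire(1)   # CPU 1 tries to carry on
    entries_left = len(pending) + len(done)   # started with 3 entries
    return lock.owner, survivor_got_lock, entries_left


if __name__ == "__main__":
    print(demo())
```

In this model the surviving CPU cannot take the lock, and one of the three entries has vanished from both lists. Forcibly clearing the lock would let CPU 1 proceed, but only on data that is already inconsistent, which is the cascade the quote warns about.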
