Go on, admit it, you've thought about it yourself. Wouldn't it be satisfying to set your computer alight? Sadly, that is not what this article is about. 'Burning in' is the term used to describe the process of testing new managed server hardware for faults before putting it to use in a live environment. This is done by running stress-testing software for an extended period of time.

Whenever we get new server hardware, we always do a complete burn-in to ensure that it is up to our high standards. If the hardware fails at any point, we send it back to the supplier.

The actual process is easy, although setting it up isn't.

Memory

First, when the new server is turned on, we boot it from the network, which allows us to boot multiple machines at once without needing 20+ bootable disks. The first test we run is the well-known Memtest (you'll find it on Google). It thoroughly checks the computer's memory and runs for about a day.

If the computer passes Memtest, it is restarted and booted into a custom Red Hat kickstart install. This installs a bare Red Hat environment and the Cerberus Test Control System, special software that runs numerous tests on all the hardware in the system.

CPU

Cerberus performs several tasks to test the CPU. It compiles the Linux kernel over and over again, runs complicated mathematical problems (how long does it take you to work out if 3214235409234472020393848453 is prime?), and runs some code specifically written to run the CPU at its hottest.

Hard Drive

Cerberus writes large volumes of data to the hard drives over and over again to ensure that the drive platters are functional. It also deletes and moves files, and checks the disks for errors.

If after a week the server is still running (not smoking) and hasn't crashed, it is considered good enough for use as a production machine. If it fails the tests anywhere along the way, it is packed up and returned to be replaced. Web servers that have survived this process will certainly survive anything you can throw at them.

You would normally expect this level of testing to be completed by the hardware manufacturers, so these tests shouldn't show up any faults. In our experience of testing hundreds of machines, however, we regularly find faults, and we do send components back.

The reason it is so important to perform this level of testing on computers that will be used as servers is that the uptime demands are so high. The slightest faults will cause outages and downtime. Once a web server is deployed, never again will you have the opportunity to take it offline and perform such detailed testing. Even if it were to crash, there is always a demand that it be put back online as quickly as possible, not left offline whilst thorough diagnostics are completed.
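
As a rough illustration of the hard drive test described above, here is a minimal Python sketch of a write-then-verify loop. It is not how Cerberus itself works; the function name write_and_verify and its parameters are made up for this example, and a real burn-in would write far more data for far longer.

```python
import hashlib
import os
import tempfile


def write_and_verify(directory, size_bytes=1 << 20, passes=3, chunk=64 * 1024):
    """Write random data to a scratch file, read it back, compare checksums.

    Hypothetical helper for illustration only. Returns the number of passes
    whose read-back checksum did not match what was written.
    """
    failures = 0
    for _ in range(passes):
        written = hashlib.sha256()
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            # Write random blocks, hashing them as we go.
            with os.fdopen(fd, "wb") as f:
                remaining = size_bytes
                while remaining > 0:
                    block = os.urandom(min(chunk, remaining))
                    f.write(block)
                    written.update(block)
                    remaining -= len(block)
                f.flush()
                os.fsync(f.fileno())  # ask the OS to push the data to the device

            # Read the file back and hash what we get.
            # Caveat: the OS may serve this from its cache rather than the
            # platters, so a serious test would use much larger files.
            read_back = hashlib.sha256()
            with open(path, "rb") as f:
                for block in iter(lambda: f.read(chunk), b""):
                    read_back.update(block)

            if read_back.digest() != written.digest():
                failures += 1
        finally:
            os.remove(path)  # delete the scratch file before the next pass
    return failures
```

Calling write_and_verify("/some/scratch/dir") on a healthy disk should return 0; any other value means at least one pass read back different data from what was written.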