a simple problem with a simple solution

Posted by Jason on January 12, 2010

Today at work, we ran into one of those puzzles you read about but rarely experience in real life.

We have a relatively new row in our server room at work, closely surrounded on three sides by walls. On one of the short ends, there is a plastic barrier you can walk through (think of the thick plastic strips that cover the opening of a walk-in freezer).

Over time we have slowly filled the cabinets in the row with servers, generally working from the far end towards the center, then filling the other end. We ended up with a mostly full row.

Now in this row there are APC In-Row chillers, inserted between the cabinets, which pull air in from the back (the ‘hot aisle’) and push cold air out the front (the ‘cold aisle’). They measure the temperature coming out of the back of the cabinets and into the front of the cabinets to ensure they are cooling enough for the demand.

Something changed over the break: the temperature in the server room jumped several degrees. And last week, while installing some new servers in the remaining cabinet, we noticed the chillers suddenly had a hard time keeping the row cool. I’d enter from either side, and the chiller fans would immediately speed up.

We couldn’t explain the increased temperature elsewhere in the server room or the strange behavior of the fans. Furthermore, today we noticed the fans were running at full speed all day.

My coworker Chris finally figured it out this afternoon.

He opened the door of the remaining cabinet, the one with the most open space. The door happens to have a temperature sensor attached to it, so when the door was open the sensor was in the surrounding cool air and the chillers slowed down. Close the door, and the chillers speed up. He figured out that this cabinet was the only opening left in the row, so the hot air behind the row was coming forward through the cabinet and directly into the temperature sensor. Of course the chillers thought the row was super hot: they were measuring the cabinet exhaust temperature, not the intake temperature.

Chris shoved an unused ceiling tile in the cabinet, covering the large opening behind the sensor. The chillers immediately settled down.

It turned out to be a simple problem with a simple solution.