10 January
2012

Fimm: filesystem glitch on login node

The login node had a temporary filesystem failure. It appears to be working normally after a reboot.


Posted by borisw at 11:05 | Comments (0) | Trackbacks (0)
03 January
2012

Hexagon: system crash 25.12.2011

We had to restart hexagon due to multiple seastar heartbeat failures in cabinets c10 and c12. This was probably related to power problems during the recent extreme weather.
The crash happened on 25.12.2011 at 23:30.


Posted by oltu at 11:47 | Comments (0) | Trackbacks (0)
22 December
2011

Fimm: job scheduling problems

Maui irregularities persist

Hello,
The Maui job scheduler on fimm is still behaving strangely. Jobs are being scheduled to random nodes, which can break jobs already running on those nodes. Please check the results of completed jobs and expect irregular job cancellations over the next few days.

We are working on resolving the problem and will let you know when job scheduling is back to normal.


Posted by borisw at 14:14 | Comments (0) | Trackbacks (0)
20 December
2011

Fimm: maui down

Hi,

Update: 11:00

The Maui job scheduler on fimm has been taken down due to a problem.
We are working on resolving it and will keep you updated.

Update: 13:20

We restarted Maui and some other processes. Some of your jobs were killed by the restart; please check your job status and resubmit if necessary.

We are sorry for the inconvenience.


Posted by saerda at 10:34 | Comments (0) | Trackbacks (0)
14 December
2011

Hexagon: scheduled maintenance, Dec 19th

Hexagon will undergo scheduled maintenance on Monday, December 19th. The approximate time slot is from 10:00 to 14:00.

We need to replace two PDUs in the failed cabinets, as well as some CPUs and memory.

Update: 14:50 The machine is available again.


Posted by oltu at 13:53 | Comments (0) | Trackbacks (0)
12 December
2011

Hexagon: thunderstorm power failure

Hexagon shut down automatically due to a brief power interruption during a thunderstorm. We are diagnosing the problem.

Update: 22:00 The machine is up again.


Posted by eithor at 21:07 | Comments (0) | Trackbacks (0)