Ultimate Guide To Folding@home

Twice now since I've been using F@H, it's crashed taking hours of CPU time with it. The first time I lost about 71% of a work unit, due to a communications error with the server:

"[06:21:45] Completed 355000 out of 500000 steps (71)
[06:33:02] CoreStatus = 1 (1)
[06:33:02] Client-core communications error: ERROR 0x1
[06:33:02] Deleting current work unit & continuing...
[06:33:22] - Preparing to get new work unit...
[06:33:24] + Attempting to get work packet
[06:33:24] - Connecting to assignment server
[06:33:27] + Could not connect to Assignment Server
[06:33:27] + Could not connect to Assignment Server 2
[06:33:27] + Couldn't get work instructions.
[06:33:27] - Error: Attempt #1 to get work failed, and no other work to do.
Waiting before retry."

Which prompted me to stop using it. I retried it the other day, got one WU completed. My PC crashed again today, and no surprise, F@H lost its WU again:

"[16:33:30] Completed 480000 out of 500000 steps (96)"

:(
 
Originally posted by 4th gen@16 March 2004 - 17:22
Twice now since I've been using F@H, it's crashed taking hours of CPU time with it. The first time I lost about 71% of a work unit, due to a communications error with the server:

"[06:21:45] Completed 355000 out of 500000 steps (71)
[06:33:02] CoreStatus = 1 (1)
[06:33:02] Client-core communications error: ERROR 0x1
[06:33:02] Deleting current work unit & continuing...
[06:33:22] - Preparing to get new work unit...
[06:33:24] + Attempting to get work packet
[06:33:24] - Connecting to assignment server
[06:33:27] + Could not connect to Assignment Server
[06:33:27] + Could not connect to Assignment Server 2
[06:33:27] + Couldn't get work instructions.
[06:33:27] - Error: Attempt #1 to get work failed, and no other work to do.
Waiting before retry."

Which prompted me to stop using it. I retried it the other day, got one WU completed. My PC crashed again today, and no surprise, F@H lost its WU again:

"[16:33:30] Completed 480000 out of 500000 steps (96)"

:(
The first bit (up to [06:33:02]) indicates some sort of file system error, not a communication error with the server. The "client-core" in that message refers to the F@H client talking to its own computation core, not to the network. Basically it could not write the results to the disk, presumably because it thought the file was corrupt. Rather than send bad results back to the source, it deleted them.

The next part (up to [06:33:27]) is an attempt to get another WU. First of all it has to connect to an assignment server, which then routes the request to a Work Unit server. As the error messages show, it could not even connect to the assignment server.
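If you want to keep an eye on this sort of thing, here is a rough Python sketch that scans a log for the last recorded progress and any error lines. It only assumes the log lines look like the ones quoted above; the `summarize` helper and the text patterns are my own illustration, not anything that ships with the F@H client.

```python
import re

# Sample lines copied from the log quoted earlier in this thread.
LOG = """\
[06:21:45] Completed 355000 out of 500000 steps (71)
[06:33:02] CoreStatus = 1 (1)
[06:33:02] Client-core communications error: ERROR 0x1
[06:33:02] Deleting current work unit & continuing...
[06:33:24] - Connecting to assignment server
[06:33:27] + Could not connect to Assignment Server
[06:33:27] + Could not connect to Assignment Server 2
"""

# Matches progress lines such as "Completed 355000 out of 500000 steps".
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps")


def summarize(log_text):
    """Hypothetical helper: return (last progress percent, problem lines)."""
    percent = None
    problems = []
    for line in log_text.splitlines():
        match = PROGRESS.search(line)
        if match:
            done, total = int(match.group(1)), int(match.group(2))
            percent = 100 * done // total  # integer percent of steps done
        elif "error" in line.lower() or "could not connect" in line.lower():
            problems.append(line)
    return percent, problems


percent, problems = summarize(LOG)
print(f"Last recorded progress: {percent}%")
for line in problems:
    print("Problem:", line)
```

The percentage is just steps done divided by total steps, so 355000 out of 500000 gives the 71 shown in brackets in the log.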

Both of these events occurring together suggest you have some inherent instability in your system, which is causing the crash. I'm guessing that you are using the FAT32 file system. When your system crashes with FAT32, the last part of the file will be lost (just like with Kazaa). So F@H thinks your results are corrupt and discards them (again, like Kazaa downloads).

Suggestions:
Convert to NTFS if you are using FAT32. It won't cure the crash but it may stop you from losing your work.
Sort out your system instability - I've found from experience that raising Vcore slightly (0.025V is usually enough) can cure this problem, even though the system is theoretically running at the correct settings. Watch your temps though.
 
It was just a thought.

What sort of crash is it? BSOD, reboot or just freeze?

I used to find that my system would just freeze when running F@H, until I raised Vcore. Other than that it was usually alright, except if it got fairly hot after about an hour of game playing, and then the same thing would happen. Cured that problem too.
 
Originally posted by lynx@16 March 2004 - 20:52
It was just a thought.

What sort of crash is it? BSOD, reboot or just freeze?

I used to find that my system would just freeze when running F@H, until I raised Vcore. Other than that it was usually alright, except if it got fairly hot after about an hour of game playing, and then the same thing would happen. Cured that problem too.
Freeze in both F@H cases. It was frozen from when I left uni until I got home (I was checking the PC using VNC in uni)

I don't really want to up the Vcore, as I'm running at 46 °C at full load at the moment (Athlon 2700+, aluminium HS). I might give it a try if I get more instability problems though, cheers :)
 
Originally posted by Livy@16 March 2004 - 18:38
...... and since summerlinda may not be as active they only have 3 days worth of data?........

Not as active?
Man, I'm folding like crazy, 24/7, I need a break, it's exhausting ;)
 
Just got my first 2500 part WU. My other 4 up until now have been 500 parts, so this one is gonna take a long, long time...

Just out of interest, how long (on average) does it take you chaps to complete a 500 part WU, and what CPU have you got? :unsure:
 