output file missing
| Author | Message |
|---|---|
|
Joined: 12 Jan 09 Posts: 36 Credit: 1,075,543 RAC: 0
|
I had the following occur on one work unit today: messages saying the output files were missing. I don't know of any errors that occurred, as there are no error messages, but here are the lines that did appear in the message tab.

6/2/2009 9:36:27 AM GPUGRID Started download of p395000-IBUCH_pYEpYIk1_2105-9-LICENSE
6/2/2009 9:36:28 AM GPUGRID Finished download of p395000-IBUCH_pYEpYIk1_2105-9-LICENSE
6/2/2009 9:36:28 AM GPUGRID Started download of p395000-IBUCH_pYEpYIk1_2105-9-COPYRIGHT
6/2/2009 9:36:29 AM GPUGRID Finished download of p395000-IBUCH_pYEpYIk1_2105-9-COPYRIGHT
6/2/2009 9:36:29 AM GPUGRID Started download of p395000-IBUCH_pYEpYIk1_2105-9-p395000-IBUCH_pYEpYIk1_2105-8-10-RND1107_1
6/2/2009 9:36:31 AM GPUGRID Finished download of p1160000-IBUCH_phYIphYI_rpdb_2905-5-psf_file
6/2/2009 9:36:31 AM GPUGRID Started download of p395000-IBUCH_pYEpYIk1_2105-9-p395000-IBUCH_pYEpYIk1_2105-8-10-RND1107_2
6/2/2009 9:36:32 AM GPUGRID Starting p1160000-IBUCH_phYIphYI_rpdb_2905-5-10-RND4180_0
6/2/2009 9:36:33 AM GPUGRID Starting task p1160000-IBUCH_phYIphYI_rpdb_2905-5-10-RND4180_0 using acemd version 664
6/2/2009 9:36:37 AM GPUGRID Finished download of p395000-IBUCH_pYEpYIk1_2105-9-p395000-IBUCH_pYEpYIk1_2105-8-10-RND1107_1
6/2/2009 9:36:37 AM GPUGRID Started download of p395000-IBUCH_pYEpYIk1_2105-9-p395000-IBUCH_pYEpYIk1_2105-8-10-RND1107_3
6/2/2009 9:36:43 AM GPUGRID Finished download of p395000-IBUCH_pYEpYIk1_2105-9-p395000-IBUCH_pYEpYIk1_2105-8-10-RND1107_2
6/2/2009 9:36:43 AM GPUGRID Started download of p395000-IBUCH_pYEpYIk1_2105-9-pdb_file
6/2/2009 9:36:44 AM GPUGRID Finished download of p395000-IBUCH_pYEpYIk1_2105-9-p395000-IBUCH_pYEpYIk1_2105-8-10-RND1107_3
6/2/2009 9:36:44 AM GPUGRID Started download of p395000-IBUCH_pYEpYIk1_2105-9-psf_file
6/2/2009 9:37:01 AM GPUGRID Finished download of p395000-IBUCH_pYEpYIk1_2105-9-pdb_file
6/2/2009 9:37:01 AM GPUGRID Started download of p395000-IBUCH_pYEpYIk1_2105-9-par_file
6/2/2009 9:37:06 AM GPUGRID Finished download of p395000-IBUCH_pYEpYIk1_2105-9-par_file
6/2/2009 9:37:06 AM GPUGRID Started download of p395000-IBUCH_pYEpYIk1_2105-9-p395000
6/2/2009 9:37:07 AM GPUGRID Finished download of p395000-IBUCH_pYEpYIk1_2105-9-p395000
6/2/2009 9:37:11 AM GPUGRID Finished download of p395000-IBUCH_pYEpYIk1_2105-9-psf_file
6/2/2009 9:37:12 AM GPUGRID Starting p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0
6/2/2009 9:37:12 AM GPUGRID Starting task p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0 using acemd version 664
6/2/2009 10:41:37 AM GPUGRID Computation for task p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0 finished
6/2/2009 10:41:37 AM GPUGRID Output file p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0_1 for task p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0 absent
6/2/2009 10:41:37 AM GPUGRID Output file p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0_2 for task p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0 absent
6/2/2009 10:41:37 AM GPUGRID Output file p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0_3 for task p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0 absent
6/2/2009 10:41:39 AM GPUGRID Started upload of p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0_0
6/2/2009 10:41:42 AM GPUGRID Finished upload of p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0_0

The other work unit I had running at the time completed without issue. |
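For anyone who wants to pick the "absent" lines out of a longer message log, here is a minimal Python sketch. It assumes only the wording seen in the messages above ("Output file ... for task ... absent"); the function name `find_absent_outputs` is made up for illustration and is not part of BOINC.

```python
import re
from collections import defaultdict

# Matches the wording seen in the message tab above:
# "Output file <file name> for task <task name> absent"
ABSENT_RE = re.compile(r"Output file (\S+) for task (\S+) absent")


def find_absent_outputs(log_text):
    """Return {task_name: [missing output file names]} found in log_text.

    Hypothetical helper for illustration; works whether the log is one
    entry per line or pasted as a single run of text.
    """
    missing = defaultdict(list)
    for file_name, task_name in ABSENT_RE.findall(log_text):
        missing[task_name].append(file_name)
    return dict(missing)


if __name__ == "__main__":
    sample = (
        "6/2/2009 10:41:37 AM GPUGRID Output file "
        "p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0_1 for task "
        "p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0 absent "
        "6/2/2009 10:41:37 AM GPUGRID Output file "
        "p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0_2 for task "
        "p395000-IBUCH_pYEpYIk1_2105-9-10-RND1107_0 absent"
    )
    for task, files in find_absent_outputs(sample).items():
        print(task, "is missing", len(files), "output file(s)")
```

Grouping by task makes it easy to see whether one work unit lost several output files or several work units each lost one.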
|
Joined: 12 Jan 09 Posts: 36 Credit: 1,075,543 RAC: 0
|
Update: I just found this over at the BOINC client forums. It's part of the change log for the just-released 6.6.33 version. I've been running 6.6.28, so perhaps this is what I saw. If so, maybe it's fixed now.

----------------
I want to stress one change though, especially for the CUDA users:

- client: fixed nasty bug that caused GPU jobs to crash on startup when they're preempting another GPU job. The problem was as follows:
  * job A is chosen to preempt job B
  * we tell job B to quit, and initialize job A but don't start it; however, we set its scheduler state to SCHEDULED (rather than UNINITIALIZED)
  * job B exits, and we start job A. Since its state is not UNINITIALIZED, we don't set up its slot dir.
  * job A runs in an empty slot dir, doesn't find its files, and bombs out.
- client: add <slot_debug> option (prints messages about allocation of slots, creating/removing files in slot dirs).

-----------------------
I'll install the new version and watch to see if it happens again. |
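The sequence in that change log entry can be illustrated with a toy model. This is only a sketch of the described logic, not the BOINC client's actual code: the class and function names (`Job`, `prepare_preempting_job`, `start_job`, `simulate`) are hypothetical, and the slot directory is represented by a plain list.

```python
from enum import Enum, auto


class State(Enum):
    UNINITIALIZED = auto()
    SCHEDULED = auto()


class Job:
    def __init__(self, name):
        self.name = name
        self.state = State.UNINITIALIZED
        self.slot_files = []  # stands in for the slot directory contents


def prepare_preempting_job(job, buggy):
    # The job is initialized but not started while the preempted job quits.
    # Per the change log, the bug was marking it SCHEDULED at this point.
    job.state = State.SCHEDULED if buggy else State.UNINITIALIZED


def start_job(job):
    # The slot dir is only set up for jobs still marked UNINITIALIZED.
    if job.state is State.UNINITIALIZED:
        job.slot_files = ["input_files"]
    job.state = State.SCHEDULED
    # A job that finds an empty slot dir fails immediately.
    return "running" if job.slot_files else "crashed at startup"


def simulate(buggy):
    job_a = Job("A")  # job A has been chosen to preempt job B
    prepare_preempting_job(job_a, buggy)
    # ... job B exits here, then the client starts job A ...
    return start_job(job_a)


if __name__ == "__main__":
    print("with bug:   ", simulate(buggy=True))
    print("without bug:", simulate(buggy=False))
```

In this model the only difference between the two runs is the state recorded while job A is waiting, which is enough to decide whether its slot dir is ever populated.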
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
It is not related. The bug addressed in the change would be experienced as a zero-time crash: the task starts and IMMEDIATELY dies, with no run time on the clock at all. The scenario is that you have a GPU Grid task running and then download a new task with an earlier report deadline. The currently running task is stopped, the new task is started, and it dies immediately. I don't think we see this issue here, because work is issued differently than at SaH, where you can be running tasks with deadlines 2 weeks hence and then download tasks with a deadline in a week ... those will preempt the running tasks ... {edit} In your case I would be looking at the "standard" causes: other applications running, games, heat, drivers, imps, trolls, mice, and other evil spirits ... :) |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
"output file missing" is not the error. You got some error (probably "Incorrect function. (0x1) - exit code 1 (0x1)" here), the GPU-Grid app terminated itself and didn't write all result files - because it didn't get to actually calculating these results. MrS Scanning for our furry friends since Jan 2002 |