Experimental Python tasks (beta)
Author | Message |
---|---|
Joined: 11 Sep 08 Posts: 18 Credit: 1,551,929,462 RAC: 0 |
"I'm creating some experimental tasks for the Python app (made Beta). They are Linux and CUDA specific and serve in preparation for future batches." What is the minimum card type for this app? My 980 Ti doesn't load WUs. |
Joined: 4 Aug 14 Posts: 266 Credit: 2,219,935,054 RAC: 0 |
"I'm creating some experimental tasks for the Python app (made Beta). They are Linux and CUDA specific and serve in preparation for future batches." In "GPUGRID Preferences", ensure you select "Python Runtime (beta)" and "Run test applications?" Your GPU, driver and OS should run these tasks fine. |
Joined: 11 Sep 08 Posts: 18 Credit: 1,551,929,462 RAC: 0 |
"I'm creating some experimental tasks for the Python app (made Beta). They are Linux and CUDA specific and serve in preparation for future batches." Thanks, I just forgot "Run test applications" :) |
Joined: 4 Jun 15 Posts: 19 Credit: 8,813,058,416 RAC: 78,330 |
All of these now seem to error out after computation has finished, on several computers: `<message> upload failure: <file_xfer_error> <file_name>2p95312000-RAIMIS_NNPMM-0-1-RND8920_1_0</file_name> <error_code>-131 (file size too big)</error_code> </file_xfer_error> </message>` What causes this, and how can it be fixed? |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 295,172 |
"What causes this, and how can it be fixed?" I've just posted instructions in the "Anaconda Python 3 Environment v4.01 failures" thread (Number Crunching). Read through the whole post. If you don't understand anything, or you don't know how to do any of the steps I've described - back away. Don't even attempt it until you're sure. You have to edit a very important, protected file - and that needs care and experience. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
"What causes this, and how can it be fixed?" This really needs to be fixed server side (or it would be nice if it were configurable via cc_config, but that doesn't look to be the case either). Stopping and starting the client is a recipe for instant errors, and where successful, this process will need to be repeated every time you download new tasks. Not really a viable option unless you want to babysit the system all day. |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 295,172 |
"Stopping and starting the client is a recipe for instant errors, and where successful, this process will need to be repeated every time you download new tasks. Not really a viable option unless you want to babysit the system all day." By itself, it's fairly safe - provided you know and understand the software on your own system well enough. But you do need to have that experience and knowledge, which is why I put the caveats in. I agree about having to re-do it for every new task, but I'd like to get my APR back up to something reasonable - and I'm happy to help nudge the admins one more step along the way to a fully-working, 'set and forget' application. |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 295,172 |
|
Joined: 4 Jun 15 Posts: 19 Credit: 8,813,058,416 RAC: 78,330 |
"What causes this, and how can it be fixed?" Exactly so. I don't know about others, but I have no time to sit and watch my hosts working. A host works for 10 hours to get a task done, and then it all turns out to be a waste of time and energy because of this file size limitation. This is somewhat frustrating. |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 295,172 |
Opt out of the Beta test programme if you don't want to encounter those problems. But as it happens, I haven't had a single over-run since they cancelled the one I highlighted in the post before yours. |
Joined: 4 Jun 15 Posts: 19 Credit: 8,813,058,416 RAC: 78,330 |
"Opt out of the Beta test programme if you don't want to encounter those problems." Yes, I agree - something has changed. It looks like the last full-length (successful) computation on my hosts that produced an output file that was too large was WU 26900019, which ended 29 Dec 2020, 15:00:52 UTC after 31,056 seconds of run time. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
I see some new Python tasks have gone out. However, they seem to be erroring for everyone. https://www.gpugrid.net/results.php?userid=552015&offset=0&show_names=0&state=0&appid=31 It seems to always error with this "os" not defined error. GPU load 0% Environment |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
now seeing this:
and this: `09:57:32 (340085): wrapper (7.7.26016): starting` |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
Just had my first two successful completions. It doesn't look like it ran any GPU work though; the GPU was never loaded. It just unpacked the WU, ran the setup, then exited. Marked as complete with no error. Only ran for about 45 seconds. https://www.gpugrid.net/result.php?resultid=32570561 |
Joined: 23 Dec 09 Posts: 189 Credit: 4,798,881,008 RAC: 311 |
"Just had my first two successful completions. It doesn't look like it ran any GPU work though; the GPU was never loaded. It just unpacked the WU, ran the setup, then exited. Marked as complete with no error. Only ran for about 45 seconds." Did you have to update conda for the two successful tasks? I received a few new WUs, but all errored. I will not have access to this computer until tomorrow. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
"Just had my first two successful completions. It doesn't look like it ran any GPU work though; the GPU was never loaded. It just unpacked the WU, ran the setup, then exited. Marked as complete with no error. Only ran for about 45 seconds." I didn't make any changes to my system between failed tasks and successful tasks. AFAIK the project is sending conda packaged into these WUs, so it doesn't matter what you have installed; it contains everything you should need. Looks like testrun93+ ish are OK, but test runs in the 80s and lower all fail with some form of error like the errors I listed above. |
Joined: 4 Jun 15 Posts: 19 Credit: 8,813,058,416 RAC: 78,330 |
All of these Python WUs seem to fail. A pair of examples with different problems: http://www.gpugrid.net/result.php?resultid=32583864 http://www.gpugrid.net/result.php?resultid=32583210 |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
Some succeed, but very few. Out of the 94 Python tasks I've received recently, only 4 succeeded. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
I see some new tasks going out. Still broken. https://www.gpugrid.net/result.php?resultid=32584011 `11:06:39 (1387708): /usr/bin/flock exited; CPU time 281.233647` |
Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 998,578 |
"Some succeed, but very few. Out of the 94 Python tasks I've received recently, only 4 succeeded." 65 received / 64 errored / 1 successful is my current balance. |
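
Two details in this thread are easy to check with a short script: the upload failure message quoted earlier and the success tallies in the last few posts. What follows is a minimal Python sketch, not anything supplied by the project. The function names `parse_upload_failure` and `success_rate` are hypothetical, and the sample text is the error fragment exactly as it was posted, which may differ from what a real client log contains.

```python
import re
import xml.etree.ElementTree as ET

# Upload failure text copied from the post earlier in this thread.
UPLOAD_FAILURE = (
    "<message> upload failure: <file_xfer_error> "
    "<file_name>2p95312000-RAIMIS_NNPMM-0-1-RND8920_1_0</file_name> "
    "<error_code>-131 (file size too big)</error_code> "
    "</file_xfer_error> </message>"
)


def parse_upload_failure(text):
    """Return a list of (file_name, error_code, reason) tuples found in text."""
    failures = []
    # Extract each <file_xfer_error> element and parse it on its own, so the
    # surrounding log text does not have to be well-formed XML.
    pattern = r"<file_xfer_error>.*?</file_xfer_error>"
    for fragment in re.findall(pattern, text, re.DOTALL):
        element = ET.fromstring(fragment)
        name = element.findtext("file_name", default="").strip()
        raw_code = element.findtext("error_code", default="").strip()
        # The code field looks like "-131 (file size too big)".
        match = re.match(r"(-?\d+)\s*(?:\((.*)\))?", raw_code)
        code = int(match.group(1)) if match else None
        reason = match.group(2) if match and match.group(2) else ""
        failures.append((name, code, reason))
    return failures


def success_rate(succeeded, received):
    """Percentage of received tasks that completed successfully."""
    return 100.0 * succeeded / received if received else 0.0


if __name__ == "__main__":
    for name, code, reason in parse_upload_failure(UPLOAD_FAILURE):
        print(f"{name}: error {code} ({reason})")
    # Tallies reported by two posters in this thread.
    print(f"4 of 94 succeeded: {success_rate(4, 94):.1f}%")
    print(f"1 of 65 succeeded: {success_rate(1, 65):.1f}%")
```

Parsing each `<file_xfer_error>` element separately is a deliberate choice in this sketch: text copied from a task's output page is rarely a complete XML document, so matching the element first and parsing only that fragment avoids errors from the surrounding text.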
©2025 Universitat Pompeu Fabra