Author |
Message |
|
# Using CUDA device 0
Cuda error in file 'deviceQuery.cu' in line 59 : out of memory.
http://www.ps3grid.net/result.php?resultid=51269
http://www.ps3grid.net/result.php?resultid=50907
http://www.ps3grid.net/result.php?resultid=50789
http://www.ps3grid.net/result.php?resultid=50903
I know how to trigger this error :(
These WUs were (re)started by BOINC while I was playing TF2. The NVIDIA driver can't allocate enough memory for the CUDA application when a graphics-heavy game is running :(
I think BOINC should check the available video memory before starting this kind of task. |
|
|
|
What OS, BOINC version, and driver version do you have? Also, what type of video card are you running? |
|
|
GDF (Volunteer moderator, Project administrator, Project developer, Project tester, Volunteer developer, Volunteer tester, Project scientist)
Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0
|
> # Using CUDA device 0
> Cuda error in file 'deviceQuery.cu' in line 59 : out of memory.
> http://www.ps3grid.net/result.php?resultid=51269
> http://www.ps3grid.net/result.php?resultid=50907
> http://www.ps3grid.net/result.php?resultid=50789
> http://www.ps3grid.net/result.php?resultid=50903
> I know how to trigger this error :(
> These WUs were (re)started by BOINC while I was playing TF2. The NVIDIA driver can't allocate enough memory for the CUDA application when a graphics-heavy game is running :(
> I think BOINC should check the available video memory before starting this kind of task.
This is the default behavior in CUDA. Graphics takes precedence over a CUDA program, so if a game requests more memory, the CUDA application terminates to release it.
gdf |
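The check suggested above (look at the free video memory before starting a task) could be sketched roughly like this. It is only an illustration, not something BOINC or the PS3GRID application is known to do: `can_start_gpu_task`, its parameters and the idea of a reserve are all hypothetical, and the free-memory figure is passed in as a plain number. In a real CUDA program that figure could come from `cudaMemGetInfo` in current runtimes.

```c
#include <stddef.h>

/*
 * Hypothetical pre-start check (not part of BOINC or the project application).
 * free_bytes:    free video memory reported by the driver, supplied by the caller
 * needed_bytes:  memory the CUDA task expects to allocate (assumed to be known)
 * reserve_bytes: headroom left for the desktop or a game (an assumption)
 * Returns 1 if the task should be started, 0 if it should be deferred.
 */
int can_start_gpu_task(size_t free_bytes, size_t needed_bytes, size_t reserve_bytes)
{
    /* Checked first so the subtraction below cannot wrap around. */
    if (free_bytes < reserve_bytes)
        return 0;
    return (free_bytes - reserve_bytes) >= needed_bytes;
}
```

For example, with 200 MB free, a 64 MB reserve and a task needing 150 MB, the function returns 0, so the client would wait instead of letting the task fail.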
|
|
|
> What OS, BOINC version, and driver version do you have? Also, what type of video card are you running?
XP 32-bit / BOINC 6.3.10 / driver 177.84
GeForce 8800 GTS 512 |
|
|
|
I had an 8800GT running just fine but had to re-install Vista Ultimate from scratch (HD foo-faa).
New HD.
Installed the Cuda drivers (Vista 64) just like before. Installed 6.3.10 just like before.
Now I get this message with the w/u's.
Cuda error in file 'deviceQuery.cu" Line 59 Initialization error
Any ideas???? |
|
|
GDF
|
> I had an 8800GT running just fine but had to re-install Vista Ultimate from scratch (HD foo-faa).
> New HD.
> Installed the Cuda drivers (Vista 64) just like before. Installed 6.3.10 just like before.
> Now I get this message with the w/u's.
> Cuda error in file 'deviceQuery.cu" Line 59 Initialization error
> Any ideas????
Something must be different from before.
Did you install BOINC with protected mode turned off?
Have you installed the latest service packs and drivers?
g |
|
|
|
Today I got the same message as reported here.
But I didn't change anything on this computer, only the drivers (177.92).
I'm not using this computer; I'm at work!
BOINC is not installed with the service option or secure mode.
6.3.10, XP x64, 177.92
So what is happening?
Jim PROFIT |
|
|
|
I have the same error on 3 WUs on my P5bv / 4 GB DDR2-8500 with:
XP32 / 6.3.10 / driver 177.92 on NVIDIA 8600GTS 256 MB ---> G84
I suggest looking at line 59 of deviceQuery.cu to see what kind of parameters (memory size, ...) it uses, and putting them in the error message to give more information for debugging.
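To illustrate the suggestion above, an error line could carry the relevant numbers. This is a sketch only: `format_gpu_error` and its arguments are made up for illustration, the message layout merely imitates the line quoted in this thread, and nothing here is taken from the real deviceQuery.cu.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: build an error line that also says how much memory
 * was requested and how much was free. Returns the value of snprintf.
 */
int format_gpu_error(char *buf, size_t buf_size,
                     const char *file, int line, const char *reason,
                     size_t requested_bytes, size_t free_bytes)
{
    /* Casts keep the format string valid on both 32- and 64-bit builds. */
    return snprintf(buf, buf_size,
                    "Cuda error in file '%s' in line %d : %s "
                    "(requested %llu bytes, %llu bytes free).",
                    file, line, reason,
                    (unsigned long long)requested_bytes,
                    (unsigned long long)free_bytes);
}
```

A line built this way would show at a glance whether the card was really full or whether a tiny allocation failed for some other reason.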
|
|
|
|
After downloading the CUDA SDK to see what the file looks like, I think deviceQuery.cu should be rewritten completely.
I suggest:
#include <stdio.h>
#include <cuda_runtime.h>
#include <cutil.h> // CUDA_SAFE_CALL and CUT_EXIT from the CUDA SDK
// Program main
int
main( int argc, char** argv)
{
int deviceCount=0, dev=0;
CUDA_SAFE_CALL(cudaGetDeviceCount(&deviceCount));
if (deviceCount == 0)
printf("There is no device supporting CUDA\n");
else
{
if (deviceCount == 1)
printf("There is 1 device supporting CUDA\n");
else
printf("There are %d devices supporting CUDA\n", deviceCount);
for (dev = 0; dev < deviceCount; ++dev) {
// One struct reused per device; a variable-length array is not standard C++.
cudaDeviceProp deviceProp;
CUDA_SAFE_CALL(cudaGetDeviceProperties(&deviceProp, dev));
if (dev == 0) { // see CUDA Reference Manual 1.1.1, cudaGetDeviceCount
// Version 9999.9999 marks the emulation device, i.e. no real CUDA hardware.
if (deviceProp.major == 9999 && deviceProp.minor == 9999)
printf("The device does not support CUDA.\n");
}
printf("\n Device %d: \"%s\"\n", dev, deviceProp.name);
printf(" Major revision number: %d\n", deviceProp.major);
printf(" Minor revision number: %d\n", deviceProp.minor);
// size_t fields are cast so the format matches on 32- and 64-bit builds.
printf(" Total amount of global memory: %llu bytes\n", (unsigned long long)deviceProp.totalGlobalMem);
#if CUDART_VERSION >= 2000
printf(" Number of multiprocessors: %d\n", deviceProp.multiProcessorCount);
// 8 cores per multiprocessor holds for compute capability 1.x devices.
printf(" Number of cores: %d\n", 8 * deviceProp.multiProcessorCount);
#endif
printf(" Total amount of constant memory: %llu bytes\n", (unsigned long long)deviceProp.totalConstMem);
printf(" Total amount of shared memory per block: %llu bytes\n", (unsigned long long)deviceProp.sharedMemPerBlock);
printf(" Total number of registers available per block:%d\n", deviceProp.regsPerBlock);
printf(" Warp size: %d\n", deviceProp.warpSize);
printf(" Maximum number of threads per block: %d\n", deviceProp.maxThreadsPerBlock);
printf(" Maximum sizes of each dimension of a block: %d x %d x %d\n",
deviceProp.maxThreadsDim[0],
deviceProp.maxThreadsDim[1],
deviceProp.maxThreadsDim[2]);
printf(" Maximum sizes of each dimension of a grid: %d x %d x %d\n",
deviceProp.maxGridSize[0],
deviceProp.maxGridSize[1],
deviceProp.maxGridSize[2]);
printf(" Maximum memory pitch: %llu bytes\n", (unsigned long long)deviceProp.memPitch);
printf(" Texture alignment: %llu bytes\n", (unsigned long long)deviceProp.textureAlignment);
// clockRate is reported in kHz.
printf(" Clock rate: %.2f GHz\n", deviceProp.clockRate * 1e-6f);
#if CUDART_VERSION >= 2000
printf(" Concurrent copy and execution: %s\n", deviceProp.deviceOverlap ? "Yes" : "No");
#endif
}
}
printf("\nTest PASSED\n");
CUT_EXIT(argc, argv);
} |
|
|
|
Are you running any other applications that might be taking GPU memory? (Games or other 3D applications for example.)
MJH
|
|
|
Sandro
Joined: 19 Aug 08 Posts: 22 Credit: 3,660,304 RAC: 0
|
A member of our team, Planet3DNow, also reports errored WUs with similar problems:
<core_client_version>6.3.10</core_client_version>
<![CDATA[
<message>
Unzulässige Funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
Cuda error in file 'deviceQuery.cu' in line 59 : unknown error.
</stderr_txt>
]]>
It's a 9800GTX at default settings, so no OC.
http://www.ps3grid.net/show_host_detail.php?hostid=8340
Can you tell us what happened with these WUs, and how to avoid these errors?
|
|
|
GDF
|
> A member of our team, Planet3DNow, also reports errored WUs with similar problems:
> <core_client_version>6.3.10</core_client_version>
> <![CDATA[
> <message>
> Unzulässige Funktion. (0x1) - exit code 1 (0x1)
> </message>
> <stderr_txt>
> # Using CUDA device 0
> Cuda error in file 'deviceQuery.cu' in line 59 : unknown error.
> </stderr_txt>
> ]]>
> It's a 9800GTX at default settings, so no OC.
> http://www.ps3grid.net/show_host_detail.php?hostid=8340
> Can you tell us what happened with these WUs, and how to avoid these errors?
This seems to be a device driver problem.
g
|
|
|
|
I've been out of town for a few days. I piled up a few errors across a bunch of 6.43 tasks and one 6.45 task:
http://www.ps3grid.net/results.php?hostid=8288
All with the same error message...
<core_client_version>6.3.10</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
Cuda error in file 'deviceQuery.cu' in line 59 : out of memory.
Win X64 pro
177.84 driver
8800GT No OC on GPU
Boinc 6.3.10
I am over my quota for today. But I have rebooted the box. I don't game on this box, and run 1 other Boinc Project alongside PS3Grid.
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
Ok, I picked up a new wu, and it is running 00:08 seconds at 18% finished....
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
This just started happening to my machine (XP64, 6gb, 6.3.10, 177.92, GTX 280). It is a dedicated cruncher. It successfully crunched 4 tasks under 6.45, then 7 of these errors in a row. I am rebooting to see if that helps.
____________
Reno, NV
Team: SETI.USA
|
|
|
|
I will have to wait a day, to see if rebooting helped. The failures maxed out my daily quota.
____________
Reno, NV
Team: SETI.USA
|
|
|
|
After the reboot. The wu finished with no errors. I'll keep an eye on the memory usage....
http://www.ps3grid.net/result.php?resultid=58599
But the run time increased from ~54,000 to 65,000 seconds.
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
> This just started happening to my machine (XP64, 6gb, 6.3.10, 177.92, GTX 280). It is a dedicated cruncher. It successfully crunched 4 tasks under 6.45, then 7 of these errors in a row. I am rebooting to see if that helps.
Rebooting seems to have solved the problem, at least for now. I'll post if it happens again.
____________
Reno, NV
Team: SETI.USA
|
|
|
|
> This just started happening to my machine (XP64, 6gb, 6.3.10, 177.92, GTX 280). It is a dedicated cruncher. It successfully crunched 4 tasks under 6.45, then 7 of these errors in a row. I am rebooting to see if that helps.
> Rebooting seems to have solved the problem, at least for now. I'll post if it happens again.
4 days later, it happened again. 8 tasks errored out in a row. I will reboot again.
____________
Reno, NV
Team: SETI.USA
|
|
|
Kokomiko
Joined: 18 Jul 08 Posts: 190 Credit: 24,093,690 RAC: 0
|
> 4 days later, it happened again. 8 tasks errored out in a row. I will reboot again.
Have you checked the size of boincmgr.exe in Task Manager? I had the problem that it keeps growing in memory while it is open, up to 1.5 GB in one day. If you close it, BOINC runs without problems. I close and restart the BOINC manager daily, so I no longer have the problem. I am waiting for 6.3.11 and hope this is fixed there.
|
|
|
STE\/E
Joined: 18 Sep 08 Posts: 368 Credit: 4,173,502,885 RAC: 27,150,888
|
I had the same "Cuda error in file 'deviceQuery.cu' in line 59 : out of memory" this morning on 4 or 5 WUs in a row. I rebooted and the next WU started running OK.
What happens is this: if you have a time ticking down in the Manager window, the Manager's memory consumption increases with every second it ticks down. Switching the Manager window to the Transfers tab (or some other tab) when not viewing the WUs stops the increase, and after some time the memory usage actually went down ... |
|
|
|
What do you mean by "manager window?" Tasks or messages maybe?
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
|
My winx64 box has burped this '...out of memory error..', with the associated corrupt wu's, twice since I first posted. I do not see any huge increase in memory usage before or after the problem. I am only looking in 'Task Manager'.
using 6.3.10
Should I be looking elsewhere?
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
I noticed this problem only on XP-64 and 2K3-64.
But on Vista, i don't have this problem!
Jim PROFIT |
|
|
STE\/E
|
> What do you mean by "manager window?" Tasks or messages maybe?
> MrS
I mean the main Projects window. If you are attached to a lot of projects, some of them will try to connect even if the project is set to No New Work and suspended; the time counts down until the project tries to connect again, and then the cycle repeats ...
|
|
|
|
It happened again. This is after upgrading to 6.3.11. So that was not a fix.
Edit: I'm not saying that 6.3.11 was supposed to fix it. Just that it doesn't.
____________
Reno, NV
Team: SETI.USA
|
|
|
|
I guess the next thing is to 'shut down' Boincmgr in Taskmanager, and see what happens....
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
FWIW, I have not seen boincmgr memory usage increase. Not with .10 or .11.
____________
Reno, NV
Team: SETI.USA
|
|
|
STE\/E
|
I just looked at one box and boincmgr.exe was using 1,100,777 of memory. I shut BOINC down and restarted it, and boincmgr.exe was only using 10,584, so it must increase in some cases ... This isn't the first time I've seen it that high either ... |
|
|
|
I never saw boincmgr using more than 12-13k of memory. But something is messing up my wu's.....
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it. |
|
|
|
As far as I understand, the excessive memory usage of BOINC manager only affects 64-bit systems. I don't know exactly which ones, but 32-bit versions are certainly not affected.
And this is probably not directly related to the error, which is referring to GPU memory and not to system memory.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
STE\/E
|
> As far as I understand, the excessive memory usage of BOINC manager only affects 64-bit systems. I don't know exactly which ones, but 32-bit versions are certainly not affected.
> And this is probably not directly related to the error, which is referring to GPU memory and not to system memory.
> MrS
All my systems are 64-bit, so that might explain why I'm seeing more than some people are ...
|
|
|
|
> All my systems are 64-bit, so that might explain why I'm seeing more than some people are ...
I'm using Vista x64 on 2 systems, and I have not seen the boinc manager go daft with memory usage.
The 2nd of my 2 systems has been running for near 5 days, and boincmgr.exe is using 3,564K.
While my main PC that has been running for about 9 hours show boincmgr.exe using 4,924K
Perhaps it is a Windows XP x64 bug, as since I changed from XP to Vista PS3GRID has been more stable for me, but that could be due to anything, from new o.s install, different drivers, to me changing Anti-Virus from NOD32 to Bitdefender.
However I don't ever remember seeing the boincmgr.exe memory problem with XP x64 either :(
Using Boinc v6.3.10 on Both systems, with the v178.13 (only installed tonight, on the 1st system) and v177.92 Nvidia drivers (on the 2nd system).
____________
Down with the Kredit Kops!!! |
|
|
|
I too am using win 64 bit. And started getting memory errors a little ways back. I have closed boincmgr and the wu's are fine (so far).
I just opened boincmgr this morning, and it only showed ~3k of memory usage, but I closed it out (end process) again anyway. I will let it run like this for a few days, or until it errors out a wu again.
Win x64
177.84 driver
2 gig ram
boinc 6.3.10
8800gt (no oc)
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
Seems that closing Boincmgr in Taskmanager did not stop my computer memory errors. It ran for a few days and today I got the same memory error.
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
> Seems that closing Boincmgr in Taskmanager did not stop my computer memory errors. It ran for a few days and today I got the same memory error.
Same here. I had quit boincmgr (File -> Exit), and still got the error last night. Clearly, it has nothing to do with boincmgr running (or not).
____________
Reno, NV
Team: SETI.USA
|
|
|
|
I'm still getting aborted wu's with the 'out of memory' error. It seems to happen after 2-3 days, reboot, wait 2-3 days, etc...
http://www.ps3grid.net/results.php?hostid=8288
Win XP 64
6.3.14 (upgraded a week ago)
177.84 driver
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
Same error here... http://www.ps3grid.net/results.php?hostid=15279
WinXP64
178.08
boinc 6.3.14
It ran fine on the first try, but when I connected remotely (BoincView or BOINC Manager from a different PC) things went wrong. http://www.ps3grid.net/result.php?resultid=83224
Any ideas?
Trying 6.3.10 now and waiting for my quota to reset. |
|
|
|
Has anyone on Vista run into this? I know that Vista has redone the graphics API, and it also makes some system memory addressable by the video card, which increases the working space available to video applications. The graphics drivers also don't carry legacy code, which makes them more efficient. I wonder if these changes are the reason Vista doesn't run into the same problems as XP? |
|
|
|
This also happened to me with 6.3.10...
I had hoped that .14 would solve that issue.
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
I'm getting this again... It seems to happen after 2-3 days.
Going to try Vista.
Maybe the developers can check whether memory is being leaked. |
|
|
Kokomiko
|
Today I got 9 WUs crashed on XP Pro 64 with my GTX260²:
Cuda error in file 'deviceQuery.cu' in line 59 : out of memory.
Now I get no new WUs for today, also not after reboot, detach and re-attach.
Any hint to avoid this situation?
|
|
|
|
> Today I got 9 WUs crashed on XP Pro 64 with my GTX260²:
> Cuda error in file 'deviceQuery.cu' in line 59 : out of memory.
> Now I get no new WUs for today, also not after reboot, detach and re-attach.
> Any hint to avoid this situation?
There is no known root cause, or corrective action at this time. You will have to wait 24 hours for your daily quota to be reset, in order to get more tasks. Bottom line: When this problem happens, it takes a machine down for at least a day.
____________
Reno, NV
Team: SETI.USA
|
|
|
|
I'm wondering: is the GPU memory actually blocked due to some memory leak?
Maybe test it like that: run 3DMark 2006 (should need some mem) after boot, note the score and rerun after you got this error. If the scores are identical (+/- 1%) I'd say the GPU memory is still available and something else is going on. There are also tools which can show you the utilization of GPU mem, I just don't know if they're any good.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
Kokomiko
|
> ... run 3DMark 2006 (should need some mem) after boot, note the score and rerun after you got this error. If the scores are identical ...
3DMark06 does not run under XP 64-bit; it needs an incompatible DirectX version.
|
|
|
|
try upgrading DirectX9 to the latest August 2008 build:
http://www.microsoft.com/downloads/details.aspx?familyid=2DA43D38-DB71-4C1B-BC6A-9B6652CD92A3&displaylang=en |
|
|
Kokomiko
|
I've used RivaTuner instead of 3DMark06 to check the memory used on the graphics card under XP 64-bit. The RivaTuner installation includes a plugin named Vidmem.dll that you can use. After a fresh reboot the card uses 70.13 MB of video RAM. After 12 hours of runtime it uses 138.26 MB. I quit the BOINC manager and restarted it, and the card then used 206.32 MB. I checked again after a second restart of the BOINC manager, and the card used 274.82 MB. You can provoke the memory fault by restarting the BOINC manager repeatedly; you will then surely get the message:
Cuda error in file 'deviceQuery.cu' in line 59 : out of memory
after some restarts, or at the latest after 3 or 4 days of running time.
My tip for XP 64 users: restart XP 64 often while using CUDA, or change to Vista 64 or Linux, which don't have this problem.
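The growth in these readings can be worked through with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: the function names are made up, it assumes every step leaks about the same amount, and the 896 MB capacity used in the example is an assumption (the thread does not state the card's memory size).

```c
#include <stddef.h>

/* Average growth per step between the first and last reading, in MB.
 * Assumes n >= 2 and that the readings are in the order they were taken. */
double mean_growth_mb(const double *readings_mb, size_t n)
{
    return (readings_mb[n - 1] - readings_mb[0]) / (double)(n - 1);
}

/* How many more steps of the same size fit before the card is full.
 * capacity_mb is supplied by the caller; growth_mb must be greater than 0. */
int steps_until_full(double capacity_mb, double used_mb, double growth_mb)
{
    if (used_mb >= capacity_mb)
        return 0;
    return (int)((capacity_mb - used_mb) / growth_mb);
}
```

With the four readings above (70.13, 138.26, 206.32 and 274.82 MB) the average growth is about 68 MB per step, so on a card with an assumed 896 MB roughly nine more steps would fit before an allocation fails, which is at least in line with the "after some restarts" observation.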
|
|
|
|
> I've used RivaTuner instead of 3DMark06 to check the memory used on the graphics card under XP 64-bit. The RivaTuner installation includes a plugin named Vidmem.dll that you can use. After a fresh reboot the card uses 70.13 MB of video RAM. After 12 hours of runtime it uses 138.26 MB. I quit the BOINC manager and restarted it, and the card then used 206.32 MB. I checked again after a second restart of the BOINC manager, and the card used 274.82 MB. You can provoke the memory fault by restarting the BOINC manager repeatedly; you will then surely get the message:
> Cuda error in file 'deviceQuery.cu' in line 59 : out of memory
> after some restarts, or at the latest after 3 or 4 days of running time.
> My tip for XP 64 users: restart XP 64 often while using CUDA, or change to Vista 64 or Linux, which don't have this problem.
My hunch about a memory leak seems to be true. Does Berkeley know about this bug?
Anyway, I changed my system to Vista and apparently have no more out-of-memory issues, but tasks took 600-700 secs longer, going from 24 ms/step to 25 ms/step.
I thought Vista was 2000 secs faster?
|
|
|
|
Cool, finally we could grab the problem by the horns! :D
> I thought vista is 2000secs faster?
That was my guess with the first reasonable performance data in. But quite a bit has changed since then, client and driver-wise.. and I could have been wrong from the beginning. Though the comparison would probably only be valid between XP32 and Vista anyway, because XP64 uses a different driver than XP32.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
|
I have been rebooting my machine daily, to try to avoid this problem. It seemed to help for a while. It didn't happen for a couple of weeks anyway. Then last night, it happened again. It's with 6.3.19. Now I have to wait 24 hours until I can get more tasks to crunch.
____________
Reno, NV
Team: SETI.USA
|
|
|
|
> I have been rebooting my machine daily, to try to avoid this problem. It seemed to help for a while. It didn't happen for a couple of weeks anyway. Then last night, it happened again. It's with 6.3.19. Now I have to wait 24 hours until I can get more tasks to crunch.
Do you leave tasks in memory, or are they removed when suspended/switched to waiting?
I don't know, but if they are removed, that may help.
Another note: this may be related.
There are some recent improvements going into the next client(s) to help detect memory leaks.
I'm not sure exactly what they are doing on this.
See changeset
16357
client: include precompiled header in rr_sim.cpp so memory leak detection will work.
|
|
|
|
Well, I got hit hard again by these errors. About a week ago I reported them in another thread and was directed here. After reading the suggestions here, I rebooted the offending machine and it played nicely until today. During that period I also upgraded the BOINC client to 6.3.21, which finally assigns the correct number of tasks to CPU and GPU cores. Next I'll try upgrading the graphics drivers to see if that sorts out this problem.
Greetings,
____________
|
|
|
|
I got hit by memory errors again on 2 wu's.
6.3.21
XP 64
177.84 driver
<core_client_version>6.3.21</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
Cuda error in file 'deviceQuery.cu' in line 59 : out of memory.
I performed a reboot. Hope that fixes it for a while..
EDIT: After the reboot, I received 2 new tasks, and 1 is running fine now...
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
Sorry, now I see there is already a thread about this.
I have a GTX280/9800GT (no SLI/not overclocked) and I am using the 6.3.21 client with XP Pro 32-bit, and I have had this for the second time in a few days. Before, I was using 6.3.19 and never had this problem!
My system is not running 24/7; it only crunches when I am at home. Normally it takes me 2-3 days to finish.
It seems there is no solution for now?
<core_client_version>6.3.21</core_client_version>
<![CDATA[
<message>
Unzulässige Funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 1
# Device 0: "GeForce GTX 280"
# Clock rate: 1404000 kilohertz
# Device 1: "GeForce 9800 GT"
# Clock rate: 1620000 kilohertz
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# Device 0: "GeForce GTX 280"
# Clock rate: 1404000 kilohertz
# Device 1: "GeForce 9800 GT"
# Clock rate: 1620000 kilohertz
Cuda error: Kernel [reduce4_kernel] failed in file 'reduction.cu' in line 143 : unspecified launch failure.
<core_client_version>6.3.21</core_client_version>
<![CDATA[
<message>
Unzulässige Funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce GTX 280"
# Clock rate: 1404000 kilohertz
# Device 1: "GeForce 9800 GT"
# Clock rate: 1620000 kilohertz
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [angle_kernel] failed in file 'bonded.cu' in line 547 : unspecified launch failure.
____________
|
|
|
|
Your error does not show the "out of memory" line, which this thread is about ;)
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
|
Sorry! But is there anyone who had this error or knows about it??
____________
|
|
|
Stony666
Joined: 15 Apr 08 Posts: 8 Credit: 3,512,241,737 RAC: 4,903,845
|
Hi,
I have a system running a GTX 9800+ 512MB, with Vista64 and BOINC 6.3.21...
I have had this error since Oct 28th:
<core_client_version>6.3.21</core_client_version>
<![CDATA[
<message>
Unzulässige Funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
Cuda error in file 'deviceQuery.cu' in line 59 : initialization error.
</stderr_txt>
]]>
I have updated to the latest NVIDIA driver (178.08), updated BOINC to the latest version and, yesterday, installed a bigger PSU (550W).
The error is still the same. No problems before with BOINC 6.3.19 and application 6.42.
What I have tested is restarting BOINC before getting each WU (settings changed to fetch only 0.05 days of work, so I don't hit the limit of 4 WUs per day...), and in most cases the WU then completes.
This can only be a temporary solution!
Does anybody have an idea about this?
|
|
|
|
Do you mean you're getting it with every unit (if you don't baby-sit BOINC) or do you get the error occasionally?
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
Stony666
|
You can have a look at it :)
http://www.ps3grid.net/results.php?hostid=9664
The successful results are by baby-sitting GPUGrid. |
|
|
Jayargh
Joined: 21 Dec 07 Posts: 47 Credit: 5,252,135 RAC: 0
|
This is really not all that different from what I posted in the 6.3.21 thread and never got a response on: my Linux 8.04 Q9550 9800GT will not d/l work under 6.3.21. I have to revert to 6.3.19, get work and then re-install 6.3.21 to run it. This is not a permanent solution. I also get major screen flicker in the message tab. In 6.3.21 I get the normal out-of-work message... only Cell processor work available. |
|
|
|
@Marodeur
- can you upgrade to 178.24, which is the latest one?
- 6.42 .. do you mean 6.45? It's been weeks since we switched from 6.45 to 6.48
@Jay
Are you referring to Marodeur's recent posts or something else? Because his problem actually seems very different from yours. He's having crashes upon WU startup, which can be avoided by restarting BOINC before each WU. You are talking about not getting work and screen flicker which, with all due respect, seem like some strange Linux issues.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
Jayargh
|
Sorry ETA the similarity is having to babysit Boinc. |
|
|
|
This computer is just a BOINC farm. There is no other software on it, just drivers and BOINC.
Win XP x64
178.24
6.3.19
6.48
As always!
2GB RAM + 280GTX(1GB)
It's crunching 24/7. No changes at all in the last week. Everything was fine until yesterday evening.
Today, right before I woke up:
<core_client_version>6.3.19</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce GTX 280"
# Clock rate: 1296000 kilohertz
Cuda error in file '..\cuda/cutil.h' in line 298 : out of memory.
Memory usage: host: bytes device: bytes
Assertion failed: 0, file ..\cuda/cutil.h, line 298
OR
<core_client_version>6.3.19</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
Cuda error in file 'deviceQuery.cu' in line 59 : out of memory.
</stderr_txt>
]]>
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
</stderr_txt>
]]>
The last 15 tasks... !!! :/
http://www.gpugrid.net/results.php?hostid=15721
And now my daily quota is of course 4 :/ I don't know why; I haven't changed ANYTHING on this computer (hardware or software). I could understand it if I had changed something, but this computer was crunching correctly the whole night... And I'll repeat: it's just a BOINC farm, so please don't write about SLI, different drivers or anything like that... This computer has been crunching correctly for the last weeks.
Why did it happen? Does anyone know?
____________
|
|
|
|
I guess if anyone can say anything helpful regarding your problem it would be GDF. The problem appeared with 6.48-WUs, so it's not related to the new application. Edit: oh.. did you reboot?
Unrelated: I've seen that your other host uses 6.3.19 and occasionally gets the error that BOINC quit (likely on a file transfer). I haven't had this error any more since I switched to 6.3.21. Give it a try, it seems very well-behaved.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
|
This appears to be a problem with Win64. Here are the latest failures with 6.3.21 and app 6.52:
<core_client_version>6.3.21</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
Cuda error in file 'deviceQuery.cu' in line 59 : out of memory.
</stderr_txt>
]]>
<core_client_version>6.3.21</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# Using CUDA device 3
# Device 0: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 1: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 2: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 3: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 3
# Device 0: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 1: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 2: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 3: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Using CUDA device 3
# Device 0: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 1: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 2: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 3: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
Cuda error in file '..\cuda/cutil.h' in line 298 : out of memory.
Memory usage: host: bytes device: bytes
Assertion failed: 0, file ..\cuda/cutil.h, line 298
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
</stderr_txt>
]]>
<core_client_version>6.3.21</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# Using CUDA device 1
# Device 0: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 1: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 2: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 3: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 1
# Device 0: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 1: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 2: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 3: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Using CUDA device 1
# Device 0: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 1: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 2: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 3: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
Cuda error in file '..\cuda/cutil.h' in line 298 : out of memory.
Memory usage: host: bytes device: bytes
Assertion failed: 0, file ..\cuda/cutil.h, line 298
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
</stderr_txt>
]]>
<core_client_version>6.3.21</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# Using CUDA device 2
# Device 0: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 1: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 2: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 3: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 2
# Device 0: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 1: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 2: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 3: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Using CUDA device 2
# Device 0: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 1: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 2: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 3: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
Cuda error in file '..\cuda/cutil.h' in line 298 : out of memory.
Memory usage: host: bytes device: bytes
Assertion failed: 0, file ..\cuda/cutil.h, line 298
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
</stderr_txt>
]]>
<core_client_version>6.3.21</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 1: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 2: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 3: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# Device 0: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 1: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 2: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Device 3: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
# Using CUDA device 0
Cuda error in file 'deviceQuery.cu' in line 59 : out of memory.
</stderr_txt>
]]>
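For anyone comparing dumps like the ones above, here is a minimal sketch of how the CUDA failure lines could be pulled out of a saved stderr text. It only assumes the line format visible in these logs; the function name `find_cuda_errors` and the regular expression are my own and not part of BOINC or the GPU-Grid app.

```python
import re

# Matches the failure line exactly as it appears in the stderr dumps in this thread.
# The pattern is an assumption based on those samples, not an official format.
CUDA_ERROR = re.compile(
    r"Cuda error in file '(?P<file>[^']+)' in line (?P<line>\d+) : (?P<reason>.+?)\.?\s*$"
)

def find_cuda_errors(stderr_text):
    """Return a (file, line, reason) tuple for each CUDA error line found."""
    errors = []
    for raw in stderr_text.splitlines():
        match = CUDA_ERROR.search(raw.strip())
        if match:
            errors.append((match["file"], int(match["line"]), match["reason"]))
    return errors

sample = """# Using CUDA device 0
Cuda error in file 'deviceQuery.cu' in line 59 : out of memory.
"""
print(find_cuda_errors(sample))
```

Grouping the results by file would separate tasks that fail right at device query (`deviceQuery.cu`) from those that fail later during allocation (`cutil.h`).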
|
|
|
|
I guess if anyone can say anything helpful regarding your problem it would be GDF. The problem appeared with 6.48-WUs, so it's not related to the new application. Edit: oh.. did you reboot?
Unrelated: I've seen your other host uses 6.3.19 and occasionally gets this error that BOINC quit (likely on a file transfer). I don't have this error any more since I switched to 6.3.21. Give it a try, it seems very well-behaved.
MrS
I'll switch to 6.3.21 ASAP. Unfortunately some of my comps are unavailable to me now (and for the near future). And I'm without WUs :(.
Anyway thx :)
But it seems (to me) to be a problem with the app. Tell me guys, not enough memory? Which memory: the 2GB RAM not used for anything else (just Milkyway)? Or the 1GB RAM on the 280GTX? Impossible...
____________
|
|
|
|
But it seems (to me) to be a problem with the app. Tell me guys, not enough memory? Which memory: the 2GB RAM not used for anything else (just Milkyway)? Or the 1GB RAM on the 280GTX? Impossible...
I agree, it cannot be that someone is actually using that memory for something else. But I wouldn't say "problem with app". The thing is, GPU-Grid asks the driver for memory, usually gets allocated some space and is fine.. but if the driver decides for some reason that this amount of space is not available, then you get this "out of mem" error. I'd say the error lies somewhere in the range of app / driver / windows. Which, unfortunately, is a very large range..
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
Stony666
Joined: 15 Apr 08 Posts: 8 Credit: 3,512,241,737 RAC: 4,903,845
|
@Marodeur
- can you upgrade to 178.24, which is the latest one?
- 6.42 .. do you mean 6.45? It's been weeks since we switched from 6.45 to 6.48
Hi,
yes, it was 6.45 :)
I have upgraded the drivers for my card to 178.24.
I got 2 WUs in the morning. One of them is running at the moment. Hope to see, that the 2nd WU starts without the memory error.
btw. the combo 6.3.19 and 6.45 runs for more than a month without an error on my box...
One other thing I see is that the total calculation time has changed from about 38000 per WU to 48000. Is that a slowdown by the application or more calculations?
And one last thing - THANKS for the help!! :)
|
|
|
|
It's a slowdown because of CPU <1 on Windows. The speed of my GTX 260 on Vista 64 also went down from 33 ms/step to 47 ms/step only because of CPU <1...
Speed on Linux 64 is the same as before.
I already reported that in an email to GDF 4 days ago, but got no reply... :?
____________
pixelicious.at - my little photoblog |
|
|
Stony666
|
It seems to me that my problem is fixed now :)
Two new workunits were started without an intervention from me.
The last driver update to 178.24 was the winner.
Great help, thx again! |
|
|
|
That's good to hear! we should really keep in mind that 178.08 is a *Doh* for GPU-Grid.
Regarding the speed difference: actually you only have a few WUs with ~40ks, whereas most others took about 48ks (+/- a lot). If you run 3+1 instead of 4+1 it might be worth it, if you'd get consistent 40ks afterwards.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
|
It's a slowdown because of CPU <1 on Windows. The speed of my GTX 260 on Vista 64 also went down from 33 ms/step to 47 ms/step only because of CPU <1...
Speed on Linux 64 is the same as before.
I already reported that in an email to GDF 4 days ago, but got no reply... :?
I've reported my Windows XP slowdown also.
I'm not sure if it is a NVIDIA problem or a Windows problem.
GDF's reply was something to this effect (not his exact words):
he will be traveling to the US on November 15th for the supercomputing conference trade show in Texas, and he will take these issues up with the NVIDIA people in person when he is there.
We just have to wait and see what the outcome is. |
|
|
|
I have been rebooting my machine daily, to try to avoid this problem. It seemed to help for a while. It didn't happen for a couple of weeks anyway. Then last night, it happened again. It's with 6.3.19. Now I have to wait 24 hours until I can get more tasks to crunch.
And again, this time with 6.3.21. And another day of crunching lost.
____________
Reno, NV
Team: SETI.USA
|
|
|
|
Me too....another day lost crunching toast, er...another day lost toasting crunchy bagels, hmmmm...
WinXP Pro64
6.3.21
177.84
Is there a solution on the horizon??
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
And again, this time with 6.3.21. And another day of crunching lost.
Same here. Unfortunately. Is there a way to monitor mem. usage on GFX cards - something like process explorer - as I suspect a major memory leak?
BR,
____________
|
|
|
|
Is there a way to monitor mem. usage on GFX cards - something like process explorer - as I suspect a major memory leak?
Search for Rivatuner at the end of this thread.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
GDF
|
So, nvidia said that they are aware of a leaking problem, but could never reproduce the problem.
Can anyone affected by this issue specify:
OS: (WIN XP, Vista, Linux,etc)
Bit: 32 or 64
Driver: Nvidia driver version
or any other information useful to them to debug it
thanks.
GDF |
|
|
|
I never got an "out of memory" error and Riva Tuner does not show an increase of used GPU memory in my case.
However, I do have a memory leak which was not there a few months ago. The only major change was going from ATI to nVidia, so I suspect a connection.
After 1 - 2 weeks my system "lost" about 1 GB of my 2 GB and I reach the point where it starts to hurt, so I have to reboot. Closing all apps (which I could find) didn't help. Also it is not related to gaming, as I first suspected. This time I was away from the Pc for a few days and it happened nevertheless. Is there a way to spot which app is responsible for this? Maybe some trick using Process Explorer?
I hope I'm not hijacking the thread, as any suggestions for further diagnostic given to me might also help the people with GPU errors.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
Kokomiko
Joined: 18 Jul 08 Posts: 190 Credit: 24,093,690 RAC: 0
|
So, nvidia said that they are aware of a leaking problem, but could never reproduce the problem.
Can anyone affected by this issue specify:
OS: (WIN XP, Vista, Linux,etc)
Bit: 32 or 64
Driver: Nvidia driver version
or any other information useful to them to debug it
Yes. Here it is only present on XP 64 bit with BOINC 64 bit. Every WU needs approximately 70 MB of video memory. The problem has been present since I started using this system for PrimeGrid/GPUGrid. I've never seen it on my XP 32 bit or Vista 64 bit PCs. When I forget to reboot my XP 64 bit PC every 2 days, every WU crashes from the third day on, because I no longer have 70.2 MB free to start a new WU.
I have this problem with every CUDA-enabled driver on my GTX260 card, and had it before on the 8800GT card that I replaced.
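To put rough numbers on the description above, a small sketch follows. It assumes (and this is only an assumption) that the full 70.2 MB of each WU stays allocated until the next reboot. The 400 MB of free video memory after boot is a made-up figure for illustration, and `tasks_before_failure` is a hypothetical helper name.

```python
def tasks_before_failure(free_after_boot_mb, per_task_mb):
    """Count how many tasks can start if each one permanently keeps per_task_mb."""
    if per_task_mb <= 0:
        raise ValueError("per_task_mb must be positive")
    # Every start needs per_task_mb free, so floor division counts the full slots.
    return int(free_after_boot_mb // per_task_mb)

# 70.2 MB per WU is the figure from the post; 400 MB free is hypothetical.
print(tasks_before_failure(400, 70.2))
```

With real numbers from a monitoring tool, the same calculation would show how many WUs fit between reboots.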
____________
|
|
|
|
So, nvidia said that they are aware of a leaking problem, but could never reproduce the problem.
Can anyone affected by this issue specify:
OS: (WIN XP, Vista, Linux,etc)
Bit: 32 or 64
Driver: Nvidia driver version
or any other information useful to them to debug it
Yes. Here it is only present on XP 64 bit with BOINC 64 bit. Every WU needs approximately 70 MB of video memory. The problem has been present since I started using this system for PrimeGrid/GPUGrid. I've never seen it on my XP 32 bit or Vista 64 bit PCs. When I forget to reboot my XP 64 bit PC every 2 days, every WU crashes from the third day on, because I no longer have 70.2 MB free to start a new WU.
I have this problem with every CUDA-enabled driver on my GTX260 card, and had it before on the 8800GT card that I replaced.
I agree. It seems to be an XP 64 bit running 64 bit Boinc issue...
WinXP Pro64 SP2
6.3.21
177.84
eVGA 8800GT
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
Would you mind checking with 32 Bit BOINC under XP 64? I doubt it will help, but better be safe than sorry. Or did anyone already test this?
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
GDF
|
This could actually work.
They said that it is an issue with 32, 64 bit.
gdf |
|
|
|
I have had no issues with out of memory on 64bit server 2008 and windows 7 running the 64bit BOINC. I have used the latest drivers (even beta) soon after they come out, same with the BOINC client. I am currently running driver 180.42 beta with BOINC 6.3.21. Tomorrow I will upgrade both the driver and the client to the latest. I have left server 2008 running for a month straight with no reboot, and still no issues. I also have 4GB ram, DXdiag says that approx 2,298MB is allocated for video use. I have a 8800 GTS 512mb card.
I have RivaTuner installed, but I don't know how to see how much memory is being used for an application. |
|
|
|
TankMaster, the problem seems to affect WinXP 64 but not Vista/7/Server2008. I think they use different drivers. Regarding RivaTuner and mem usage:
"Install RivaTuner, click on this strange "button with a triangle" next to the line which tells you the physical details of your card, then search for the hardware monitoring in the list of symbols, which pops up and if it asks, tell it to "handle the plugins automatically". This should enable the vid mem module.. at least it did for me :)"
Just ask if you need any further help.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
STE\/E
Joined: 18 Sep 08 Posts: 368 Credit: 4,173,502,885 RAC: 27,150,888
|
All my GPU Capable Boxes are 64-Bit Windows XP-Pro, I made it a Ritual several weeks ago to Stop BOINC and Re-Boot the Box every 1.5 Days. If I wait 2 Days I get stung by the out of memory error once in a while, so 1.5 days is best for me anyway. |
|
|
|
PoorBoy, would you mind checking [if the error is still present] with 32 Bit BOINC under XP 64?
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
STE\/E
|
Okay, I'm changing 1 Box over now to 32-Bit but I just Re-Booted a few hours ago so it may take at least 2-3 days before I know or not ...
PS: I don't know what the problem is but the 32-Bit GPU Client refuses to install, I tried where I wanted it installed to & tried where it wanted to install to numerous times. It looks like it's installing but theres nothing there where it should be, I'm lucky I didn't lose the whole Directory as many times as I tried.
I tried on another box & it was a no go too ... |
|
|
|
Mhh, thanks for trying, anyway!
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
STE\/E
|
I'm asking about it over in the Boinc Dev Forum now to see what the deal is on that ... |
|
|
|
Hi PoorBoy,
I had an old 32 bit Boinc (5.10.X) on my box. So I tried an upgrade to 6.3.21 (32 bit), and it werked.
I just finished loading a GPU task and some other Boinc wu's. I performed a re-boot, and will let it run on it's own for a few days....
WinXP pro x64
6.3.21 (32 bit)
8800GT
177.84 driver
I'm using the existing x64 video driver...as the 32 bit driver will not install due to the OS...
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
STE\/E
|
Hi Bender, I did finally just get a 32-Bit Client to install on 1 Box but it was v5.10.45 which isn't a Cuda Version, strange thing is though it's running 3 GPU Wu's as I type this and none of the regular Wu's, it's only a 8800GT OC on that Box & not a 8800GTx3 OC ... hahaha ... Weird ???
I'm going to shut BOINC down now on that box & see if it will install the Proper 32-Bit Cuda Client v6.3.21 ... Laterz
PS: Got the 32-Bit Version of 6.3.21 to install in the Directory's I wanted the files in so it's running okay now so now we have 2 people trying the 32-Bit Client out on a 64-Bit OS & checking to see if they get the Memory Error or not. 3 GPU Wu's got Toasted in the Process though but I got new Wu's. They probably Erred out because they ran under v5.10.45 for a few Minutes, I didn't think to Suspend them thinking they wouldn't even start up with that version of client.
I Rebooted & will let the Box run for 4-5 Days like that, if it doesn't get a Memory Error by then it Probably won't. I can let it run longer if need be as long as I'm running the regular Wu's I'm running now because they run just as fast with a 32-Bit Client as they do with a 64-Bit Client it looks like ... |
|
|
|
LOL.. 5.x doesn't know anything about CUDA, but it seems it's still able to run the code, because you have the CUDA drivers. Interesting :D
@PoorBoy-Edit: to my knowledge ABC is the only project which really benefits from 64 Bit mode. Quite logical, since basically all other projects use floating point instead of integer, where most calculations are double precision anyway (i.e. 64 Bit external, 80 Bit internal).
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
STE\/E
|
LOL.. 5.x doesn't know anything about CUDA, but it seems it's still able to run the code, because you have the CUDA drivers. Interesting :D
Yes, I figured it was so confused with the v5.10.45 Client it didn't know what to do so it ran the GPU Wu's instead of the regular Wu's it should have run, and thats why they probably erred out when I installed the v6.3.21 Client I figure ...
@PoorBoy-Edit: to my knowledge ABC is the only project which really benefits from 64 Bit mode. Quite logical, since basically all other projects use floating point instead of integer, where most calculations are double precision anyway (i.e. 64 Bit external, 80 Bit internal).
MrS
The Sieve Wu's at Prime Grid use a 64-Bit OS to their Benefit, Cosmology, MilkyWay, there may be others too but not sure. You also get extra Credit using a 64-Bit OS @ those Projects ... :)
On second thought I don't know if those Projects actually benefit from using a 64-Bit OS but the Wu's sure do run faster thus gaining you more Credit per time spent Processing.
Example > When all my Boxes were 32-Bit I was only able to get about 45,000 to 50,000 Credits Per Day, after I changed them all to 64-Bit I was able to get 85,000 to 95,000 Credits Per Day. So it's worth the switch & you can always still run a 32-Bit Client if need be at the Projects that require you to run that to get any Wu's from them ...
|
|
|
|
TankMaster, the problem seems to affect WinXP 64 but not Vista/7/Server2008. I think they use different drivers. Regarding RivaTuner and mem usage:
"Install RivaTuner, click on this strange "button with a triangle" next to the line which tells you the physical details of your card, then search for the hardware monitoring in the list of symbols, which pops up and if it asks, tell it to "handle the plugins automatically". This should enable the vid mem module.. at least it did for me :)"
Just ask if you need any further help.
MrS
All I see is core/shader/mem speeds, core/ambient temps, fan RPM/duty cycle, and supply voltage. No mem usage. |
|
|
|
All I see is core/shader/mem speeds, core/ambient temps, fan RPM/duty cycle, and supply voltage. No mem usage.
Then you have to customize that yourself like I had to do, too.
In the hardware monitoring, click on the "setup"-button in the lower right corner.
Here, go to "plugins" in the lower left corner.
Then search for "vidmem.dll" and give it a hook.
When you click OK now, three more monitorings should appear.
____________
Member of BOINC@Heidelberg and ATA!
|
|
|
STE\/E
|
All I see is core/shader/mem speeds, core/ambient temps, fan RPM/duty cycle, and supply voltage. No mem usage.
Then you have to customize that yourself like I had to do, too.
In the hardware monitoring, click on the "setup"-button in the lower right corner.
Here, go to "plugins" in the lower left corner.
Then search for "vidmem.dll" and give it a hook.
When you click OK now, three more monitorings should appear.
I don't know what Version of RivaTuner you're using DoctorNow, but the Version I'm Using (2.0 Final) doesn't have a "setup-button" in the lower right corner. In fact it doesn't have a "setup-button" under any of the Tabs.
|
|
|
|
All I see is core/shader/mem speeds, core/ambient temps, fan RPM/duty cycle, and supply voltage. No mem usage.
Then you have to customize that yourself like I had to do, too.
In the hardware monitoring, click on the "setup"-button in the lower right corner.
Here, go to "plugins" in the lower left corner.
Then search for "vidmem.dll" and give it a hook.
When you click OK now, three more monitorings should appear.
Thx! Found out. However: "Videomemory usage monitoring is not available under Vista due to Vista videomemory virtualization." So that's why it wasn't available. |
|
|
|
In the hardware monitoring panel there is a setup button. I didn't do any dll stuff there, just selecting "vid mem usage".. on XP.
And thanks for the update on 64 Bit crunching.. seems like I lost track of these developments ;)
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
STE\/E
|
Just had the "Cuda error in file 'deviceQuery.cu' in line 59 : out of memory" bug bite 1 of my Box's & it's only been about 36 hours since I Re-Booted them all. So I guess every day & 1/2 isn't enough. I'm going to start trying every 24 hours if possible & see If I still get the Error ...
Luckily I caught it in time before it could suck up it's daily Quota of Wu's & then I wouldn't be able to get anymore on that box for 24 Hr's or until the next Server Day, I Rebooted it & got 4 more Wu's though ... :)
Actually I don't think theres any set time for this Error to happen, I know I've had it happen in less than 24 Hours after Rebooting but generally it takes about 2 day's to occur.
In the hardware monitoring panel there is a setup button. I didn't do any dll stuff there, just selecting "vid mem usage".. on XP.
I don't see a Hardware Monitoring Panel either, you're not talking about the Power User Panel are you ??? I see things related to Memory usage in there, but when activating them I don't see any new ones show up where I can actually see the usage. |
|
|
|
The box which errored out again, was that the one with 32 Bit BOINC or still 64?
Regarding RivaTuner: sorry, I don't know what you mean. I made a screen shot and circled the relevant buttons in, that should help. BTW, I think the interface of RivaTuner is really awful, the functionality is hidden in a strange way..
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
STE\/E
|
The box which errored out again, was that the one with 32 Bit BOINC or still 64?
It was a 64-Bit Box that was Rebooted about 36 hours ago, I keep track when I Reboot them so I do it on a regular schedule which like I said is going to a daily one instead of a 1.5 Day one ...
Regarding RivaTuner: sorry, I don't know what you mean. I made a screen shot and circled the relevant buttons in, that should help. BTW, I think the interface of RivaTuner is really awful, the functionality is hidden in a strange way..
MrS
I agree, I don't actually use RivaTuner but wanted to see what you guys were talking about so I installed it on 1 Box. I also don't have that Button you've circled, I looked there earlier thinking that's where it was supposed to be but it's not there for some reason.
PS: I figured it out, I was using the 2.0 Final Version of RivaTuner & there was a 2.20 Version available. I downloaded that version & installed it and the Buttons appeared with that Version ... |
|
|
|
Regarding RivaTuner: sorry, I don't know what you mean. I made a screen shot and circled the relevant buttons in, that should help. BTW, I think the interface of RivaTuner is really awful, the functionality is hidden in a strange way..
MrS
Yeah, did all that, but memory usage is not in the list for Vista/Server2008/Win7 due to the new way they manage video memory. |
|
|
|
Hi PoorBoy,
I had an old 32 bit Boinc (5.10.X) on my box. So I tried an upgrade to 6.3.21 (32 bit), and it werked.
I just finished loading a GPU task and some other Boinc wu's. I performed a re-boot, and will let it run on it's own for a few days....
WinXP pro x64
6.3.21 (32 bit)
8800GT
177.84 driver
I'm using the existing x64 video driver...as the 32 bit driver will not install due to the OS...
Early readings:
Sunday - video mem 121.7 mb
Monday - video mem 181.9 mb
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it. |
|
|
|
@Bender: so it seems the 32 Bit BOINC will not help? Which would mean the fault is likely in the driver.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
|
Yes MrS, It looks like 32 bit client may have the same problem running on an XPpro x64 machine...
Readings:
Sunday Noonish - video mem 121.70 mb
Monday AM - video mem 181.90 mb
Monday PM - video mem 242.02 mb
I have completed 2 wu's so far, and the 3rd is at 14%. Seems like video mem is not being released at wu completion??
No errors yet...
WinXP pro x64
6.3.21 (32 bit)
8800GT
177.84 driver
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
Noob here.
I may not have time to babysit the installation, but on:
* WinXP AMD64
* GF 8800GT
* 178.28 driver
* 6.3.21 BOINC (x86_64)
<core_client_version>6.3.21</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
Cuda error in file 'deviceQuery.cu' in line 59 : out of memory.
</stderr_txt>
]]>
Problems occurring straight from boot. |
|
|
|
Jure.. straight from boot is new. Are you running anything else graphics intensive along with GPU-Grid? Maybe a minimized game or something?
Edit: I don't know about render programs which use OpenGL.. maybe they could also take the entire vid mem.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
STE\/E
|
Yes MrS, It looks like 32 bit client may have the same problem running on an XPpro x64 machine...
Readings:
Sunday Noonish - video mem 121.70 mb
Monday AM - video mem 181.90 mb
Monday PM - video mem 242.02 mb
I have completed 2 wu's so far, and the 3rd is at 14%. Seems like video mem is not being released at wu completion??
No errors yet...
WinXP pro x64
6.3.21 (32 bit)
8800GT
177.84 driver
Hi Bender, are you using something else besides RivaTuner to monitor your Video Memory usage ??? Mine never changes according to RivaTuner on my 8800GT OC, it's been @ 181.96 for the last 4-5 Hours. That does kinda coincide with your "Monday AM - video mem 181.90 mb" usage though.
The work unit it's running now won't be done until about 6:00AM EST so I'll have to check it again after it finishes & see if the Memory Usage goes up or not ... |
|
|
|
Hi PB,
Nope, I'm using RivaTuner 2.2.
Without delving too deep into the logfile. Some video mem is getting frozen after every completed wu (~60 mb?). If it increments again when this wu finishes sometime in the early am on Tuesday....
I'm going to let it crunch away until it fails...if it fails that is..;)
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
STE\/E
|
Yup, the next Wu on the 8800GT OC finished early this morning and when I checked RivaTuner the Video Memory Usage had gone up to 242.21mb to go right along with Benders "Monday PM - video mem 242.02 mb" ... :)
PS: Something else I discovered was that even just shutting BOINC Down & Restarting it again will increase the Video Memory Usage just as much as running a whole Wu will.
On the 8800GT OC the Video Memory usage goes up about 60mb with each finished Wu, I shut down BOINC on the 32-Bit Box & Restarted it to see if the Video Memory got cleared to a lower figure & the Video Memory Usage had gone up another 60mb instead. I did it again & it went up another 60mb even though just a few minutes had passed.
This probably explains why sometimes it takes less than a day for the errors to crop up again after re-booting a Box. I may be messing around with BOINC, shutting it off & restarting it for some other reason, and boom, all of a sudden I get the out of memory errors ... |
|
|
STE\/E
|
@Bender: so it seems the 32 Bit BOINC will not help? Which would mean the fault is likely in the driver.
MrS
Which Driver do you mean ET, the Video Driver ... ???
|
|
|
|
The 3rd wu just finished, and again, the rivatuner logfile shows a 60 mb jump in video mem 'in use'...at the same time the wu finished.
Readings:
Sunday Noonish - video mem 121.70 mb
Monday AM - video mem 181.90 mb
Monday PM - video mem 242.02 mb
Tuesday AM - video mem 302.15 mb
I think I've gone far enough. I set the project to 'no new tasks'. If this wu completes without going nuclear, I'm going to switch back to the 64 bit client.
Oh well, It's off to werk I go...
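For anyone who wants to check the arithmetic on the readings above, here is a minimal Python sketch. It only uses the four numbers from the post. The 512 MB total is an assumption made for illustration (the card's memory size isn't stated here), and the helper names `per_wu_growth` and `wus_until_full` are made up for this example.

```python
# Readings copied from the post above (MB of video memory "in use" per RivaTuner).
readings_mb = [121.70, 181.90, 242.02, 302.15]


def per_wu_growth(readings):
    """Return the increase between consecutive readings, rounded to 0.01 MB."""
    return [round(later - earlier, 2) for earlier, later in zip(readings, readings[1:])]


def wus_until_full(current_mb, growth_mb, total_mb):
    """Return how many more whole increments fit before usage would exceed total_mb."""
    return int((total_mb - current_mb) // growth_mb)


growth = per_wu_growth(readings_mb)
average_growth = sum(growth) / len(growth)

# ASSUMPTION: a 512 MB card. This figure is not given in the post.
ASSUMED_TOTAL_MB = 512.0
remaining = wus_until_full(readings_mb[-1], average_growth, ASSUMED_TOTAL_MB)

print("growth per wu (MB):", growth)
print("average growth (MB): %.2f" % average_growth)
print("increments left before the assumed total is exceeded:", remaining)
```

This is an upper bound at best: the running wu and the desktop need video memory of their own, so the out of memory error would be expected to show up earlier than the count suggests.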
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
@PoorBoy: yes, the vid driver.
@Bender: yep, sounds far enough.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
|
Could anyone try the 180.60 driver and see if the problem is solved?
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
GDF
|
Hi,
the bug appears to be in the driver, and shows up when a 32 bit application is run on a 64 bit Windows machine. We will try to find a workaround, but no promises. We might end up distributing the 64 bit application for Windows as well.
gdf |
|
|
|
I've had 180.60 on my Winx64 box since last night. Just started its 1st wu. We will see...
driver 180.60
60.21 mb video usage
Boinc 6.3.21 (64 bit)
8800GT
I already don't like 180.60, as my fan speed control is gone from Ntune.
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
Is there an nTune upgrade available? Generally I don't like nTune that much, it .. kind of works, but a lot of stuff doesn't work.
When I change GPU clocks in the NV control panel (enabled by nTune, I think) it almost all the time says "invalid clocks, not applied" but it applies the settings (most of the time) anyway. I'd call this software beta.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
STE\/E
|
I've had 180.60 on my Winx64 box since last night. Just started its 1st wu. We will see...
driver 180.60
60.21 mb video usage
Boinc 6.3.21 (64 bit)
8800GT
I already don't like 180.60, as my fan speed control is gone from Ntune.
I tried the 180.60 too but went back to the older driver because the Fan Speed Controller was gone in both nTune & EVGA Precision Tune too ...
|
|
|
|
Hi PoorBoy
I just fixed the fan problem (WinXP 64, 8800GT, using 180.60), with RivaTuner (2.20)...
http://www.overclock.net/nvidia/180529-how-permanently-set-8800-fan-speed.html
Try this and see if it werks. I played with the fan duty cycle to find the lowest temp (56 C) with the lowest duty cycle (90%) for my box, and saved that.
Checked the box to allow it to re-load RivaTuner (and the fan setting) at startup, and it worked after a re-boot.
@MrS: current nTune (I think) is 5.05.54...
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
Sorry, seems like I mixed up nTune and this 80+ MB "nVidia system tools". I uninstalled the first one, because it was fairly useless (for me), and kept the second one, which is the one I was complaining about.. very limited functionality to start with, and most of it doesn't work most of the time. But never mind, this is just getting off topic..
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
|
I just finished my 1st wu using the 180.60 driver. The 'video usage' increased by about 58.7 mb on completion.
Start of 1st wu - 60.21 mb
Finish 1st wu - 118.90 mb
WinXP 64
180.60
6.3.21 (64 bit)
____________
Consciousness: That annoying time between naps......
Experience is a wonderful thing: it enables you to recognize a mistake every time you repeat it.
|
|
|
|
So the problem is still there. Thanks for testing!
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
|
Everyone affected by the problem please take a look here! Seems like the problem is finally solved :)
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|