Computer is overcommitted - Unofficial BOINC Wiki

Computer is overcommitted

From Unofficial BOINC Wiki

General

Message Type: Information Message

This message is caused by the computer having too much work on hand. With so much work queued, some of it might not be completed before its Result Deadline. You can skip the rest of this explanation if you don't want to get into the deep technical details.

Version information

In BOINC Versions 4.44 and greater, this Message is a debugging-only message, so it should not be seen in the normal course of events.

This Message is obsolete in BOINC Versions 5.6.0 and greater.

Detailed Explanation (Version 4.45)

If, for any Work Unit, the cumulative remaining processing time of all Work Units whose Result Deadlines are earlier than or equal to its own is greater than 80% of the time left until its Result Deadline, the CPU Scheduler will enter Earliest Deadline First mode.

Another way to say that: for each Work Unit in the queue, add its estimated remaining time to that of every Work Unit due before it. If the sum is greater than 80% of the time remaining before its Result Deadline, the Work Unit might miss its deadline, and the CPU Scheduler will enter Earliest Deadline First mode to try to prevent the problem.
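The test described above can be sketched in a few lines of Python. This is only an illustration of the rule as worded here, not the BOINC client's actual code; the function names and the tuple layout are assumptions made for the sketch, and the sample values are the ones used in the example that follows.

```python
def cumulative_fractions(results):
    """Return (name, own_fraction, cumulative_fraction) per result, ordered by deadline.

    `results` is a list of (name, cpu_hours_remaining, hours_to_deadline) tuples.
    This layout is hypothetical and chosen only for this sketch.
    """
    rows = []
    for name, remaining, to_deadline in sorted(results, key=lambda r: r[2]):
        # Sum the work of this result and of every result due at or before its deadline.
        total = sum(r for _n, r, d in results if d <= to_deadline)
        rows.append((name, remaining / to_deadline, total / to_deadline))
    return rows


def is_overcommitted(results, threshold=0.8):
    """True if any cumulative fraction is greater than the threshold (80% in the text)."""
    return any(cum > threshold for _name, _own, cum in cumulative_fractions(results))


# Times in hours, taken from the example table in this section.
queue = [("A", 10, 100), ("B", 90, 120), ("C", 1000, 5000)]
for name, own, cum in cumulative_fractions(queue):
    print(f"{name}: own={own:.2f} cumulative={cum:.2f}")
print("overcommitted:", is_overcommitted(queue))
```

The sketch uses a strict "greater than" comparison, as the rule is worded; a fraction of exactly 0.80 would therefore not count as trouble here.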

Example(s)

(Times are in hours.)

result | CPU time remaining | remaining time to deadline | required time fraction | cumulative required time fraction | trouble?
A      | 10                 | 100                        | 0.10                   | 0.10 (10 / 100)                   | no
B      | 90                 | 120                        | 0.75                   | 0.83 ((10+90) / 120)              | yes
C      | 1000               | 5000                       | 0.20                   | 0.22 ((10+90+1000) / 5000)        | no

In the above example the CPU Scheduler will be in Earliest Deadline First mode because of Work Unit B. It will process Work Unit A and then Work Unit B until the situation is resolved, but it may not process Work Unit B to completion in Earliest Deadline First mode. Let's look at the same example 10 hours later, assuming that Earliest Deadline First was in force for the entire time and that the time estimates were perfect. Work Unit A has just finished, having received 100% of the CPU time.

result | CPU time remaining | remaining time to deadline | required time fraction | cumulative required time fraction | trouble?
B      | 90                 | 110                        | 0.82                   | 0.82                              | yes
C      | 1000               | 4990                       | 0.20                   | 0.22                              | no

Notice that there is still trouble but it has been reduced. We will remain in Earliest Deadline First mode. Now again 10 hours later:

result | CPU time remaining | remaining time to deadline | required time fraction | cumulative required time fraction | trouble?
B      | 80                 | 100                        | 0.80                   | 0.80                              | yes
C      | 1000               | 4980                       | 0.20                   | 0.22                              | no

Now the CPU has just about worked its way out of trouble. In another hour it will be temporarily out of trouble.

result | CPU time remaining | remaining time to deadline | required time fraction | cumulative required time fraction | trouble?
B      | 79                 | 99                         | 0.798                  | 0.798                             | no
C      | 1000               | 4979                       | 0.20                   | 0.22                              | no

During the next hour Work Unit C will be processed, as it has almost certainly accrued a very large short-term debt.

result | CPU time remaining | remaining time to deadline | required time fraction | cumulative required time fraction | trouble?
B      | 79                 | 98                         | 0.806                  | 0.806                             | yes
C      | 999                | 4978                       | 0.20                   | 0.22                              | no

This starts a cycle of about 4 hours of processing Work Unit B followed by an hour of processing Work Unit C.
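The "about 4 hours" figure can be checked with a short calculation. The sketch below assumes whole-hour steps and, following the tables above, treats a fraction of exactly 0.80 as still being in trouble; k is the number of hours spent on Work Unit B starting from 79 hours of work and 98 hours to the deadline.

```latex
\frac{79 - k}{98 - k} < 0.8
\;\Longleftrightarrow\; 79 - k < 78.4 - 0.8\,k
\;\Longleftrightarrow\; 0.2\,k > 0.6
\;\Longleftrightarrow\; k > 3
```

So the fourth hour is the first one that brings the fraction below 0.80 (75 / 94 is about 0.798). One hour spent on Work Unit C then raises it to 75 / 93, about 0.806, and the pattern repeats.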


Detailed Explanation (Version 4.72)

If a simulation shows that continuing to process the current Work Units in round-robin fashion, following the user's Resource Shares, would cause any Work Unit to finish later than 90% of the time remaining until its Result Deadline, the CPU Scheduler will enter Earliest Deadline First mode.

Example(s)

(Times are in hours.)

result | CPU time remaining | remaining time to deadline | resource share | status
A1     | 10                 | 100                        | 100            | running
A2     | 35                 | 280                        | 100            | ready to run
B1     | 2                  | 10                         | 100            | paused
B2     | 14                 | 110                        | 100            | ready to run

With round-robin scheduling (and "switch between applications every x minutes" set to 60), we would alternate between one hour on Work Unit A1 and one hour on Work Unit B1. B1 will complete four hours from now, well within its 10-hour deadline. We then begin processing B2, still alternating with A1.

result | CPU time remaining | remaining time to deadline | resource share | status
A1     | 8                  | 96                         | 100            | running
A2     | 35                 | 276                        | 100            | ready to run
B2     | 14                 | 106                        | 100            | paused

After another fifteen hours (8 for A1 and 7 for B2), Work Unit A1 will be completed, again well within its 96-hour deadline. We then begin alternating between A2 and B2.

result | CPU time remaining | remaining time to deadline | resource share | status
A2     | 35                 | 261                        | 100            | running
B2     | 7                  | 91                         | 100            | paused

After yet another fifteen hours (8 for A2 and 7 for B2), Work Unit B2 will be completed, well within its 91-hour deadline. We then process A2 non-stop.

result | CPU time remaining | remaining time to deadline | resource share | status
A2     | 27                 | 246                        | 100            | running

After 27 hours, A2 will complete, well within its 246-hour deadline. This simulation shows that there is no need to enter Earliest Deadline First mode.

However, if we change the Resource Share for Project A to 500 instead of 100...

result | CPU time remaining | remaining time to deadline | resource share | status
A1     | 10                 | 100                        | 500            | running
A2     | 35                 | 280                        | 500            | ready to run
B1     | 2                  | 10                         | 100            | paused
B2     | 14                 | 110                        | 100            | ready to run

With round-robin scheduling (and "switch between applications every x minutes" set to 60), we would alternate between five hours on Work Unit A1 and one hour on Work Unit B1. B1 will complete twelve hours from now, past its 10-hour deadline. This simulation shows that there is a need to enter Earliest Deadline First mode, which will complete Work Unit B1 two hours from now.
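Both walkthroughs can be approximated with a small round-robin simulation. This is a minimal sketch, not the BOINC Daemon's code: the function names, the tuple layout, the rule that a project's turn lasts (its share / the smallest share) times the switch interval, and the order in which projects take turns are all assumptions made for illustration. Intermediate completion times may therefore differ by an hour or two from the walkthrough above.

```python
from collections import deque


def simulate_round_robin(tasks, shares, switch_hours=1.0):
    """Return {task_name: completion time in hours} under a simple weighted round-robin.

    `tasks` is a list of (name, project, cpu_hours_remaining, hours_to_deadline)
    tuples in queue order; `shares` maps project -> Resource Share.
    Both layouts are hypothetical and used only for this sketch.
    """
    queues = {}
    for name, project, remaining, _deadline in tasks:
        queues.setdefault(project, deque()).append([name, remaining])
    smallest = min(shares[p] for p in queues)

    clock = 0.0
    finished = {}
    while any(queues.values()):
        for project, queue in queues.items():
            if not queue:
                continue  # this project has no work left
            # Assumed rule: the turn length scales with the project's share.
            turn = switch_hours * shares[project] / smallest
            task = queue[0]
            worked = min(turn, task[1])
            clock += worked
            task[1] -= worked
            if task[1] <= 0:
                finished[task[0]] = clock
                queue.popleft()
    return finished


def needs_edf(tasks, shares, safety=0.9):
    """True if any task would finish later than `safety` times its time to deadline."""
    finished = simulate_round_robin(tasks, shares)
    return any(finished[name] > safety * deadline
               for name, _project, _remaining, deadline in tasks)


tasks = [("A1", "A", 10, 100), ("A2", "A", 35, 280),
         ("B1", "B", 2, 10), ("B2", "B", 14, 110)]
print(needs_edf(tasks, {"A": 100, "B": 100}))
print(needs_edf(tasks, {"A": 500, "B": 100}))
```

With equal shares the sketch finishes B1 after 4 hours and finds no deadline at risk; with Project A at 500 it finishes B1 after 12 hours, past its 10-hour deadline, which is the case where Earliest Deadline First mode is needed.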

The Version 4.72 BOINC Daemon takes a few milliseconds to run this simulation every time "something changes" that could affect the Work Units currently in your local queue finishing before their Deadlines.

Example Log(s)

Computer Is Overcommitted

(1) 2005-06-13 23:08:09 [Einstein@Home] Started upload of H1_0343.5__0343.8_0.1_T25_Fin1_1_0
(2) 2005-06-13 23:08:11 [             ] Computer is overcommitted
(3) 2005-06-13 23:08:11 [             ] New work fetch policy: no work fetch allowed.
(4) 2005-06-13 23:08:11 [Einstein@Home] Finished upload of H1_0343.5__0343.8_0.1_T25_Fin1_1_0
(5) 2005-06-13 23:08:11 [Einstein@Home] Throughput 153258 bytes/sec
(6) 2005-06-13 23:08:12 [Einstein@Home] Sending scheduler request to
                                        http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
(7) 2005-06-13 23:08:13 [Einstein@Home] Scheduler request to 
                                        http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

Line-By-Line Explanation

  1. Started upload of '(file)'
  2. Computer is overcommitted
    • A check of the current Work Buffer size shows that there is plenty of work on hand, possibly enough to make some Deadlines difficult to meet.
  3. New work fetch policy: no work fetch allowed.
    • So there is no need to fetch additional work.
  4. Finished upload of '(file)'
    • And we finished the upload.
  5. Throughput 'x' bytes/sec
    • With a pretty darn good throughput!
  6. Sending scheduler request to '(url)'
  7. Scheduler request to '(url)' succeeded
    • And we successfully reported the Result.
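If you want to find this message in a saved log, a small script along the following lines could help. It is only a sketch: the line layout (timestamp, bracketed project name, message text) is inferred from the sample above rather than from any BOINC specification, the optional "(n)" prefix matches the line numbers added to the sample for explanation, and the function name is hypothetical.

```python
import re

# Layout inferred from the sample log: "YYYY-MM-DD HH:MM:SS [project] message",
# optionally preceded by an explanatory "(n)" line number.
LOG_LINE = re.compile(
    r"^(?:\(\d+\)\s+)?(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[([^\]]*)\] (.*)$"
)


def find_overcommit_times(lines):
    """Return the timestamps of every 'Computer is overcommitted' message."""
    times = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        # Continuation lines (such as a wrapped URL) do not match and are skipped.
        if match and match.group(3).strip() == "Computer is overcommitted":
            times.append(match.group(1))
    return times


sample = [
    "(1) 2005-06-13 23:08:09 [Einstein@Home] Started upload of H1_0343.5__0343.8_0.1_T25_Fin1_1_0",
    "(2) 2005-06-13 23:08:11 [             ] Computer is overcommitted",
    "(3) 2005-06-13 23:08:11 [             ] New work fetch policy: no work fetch allowed.",
]
print(find_overcommit_times(sample))
```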

Other Related Messages

  • None.

--Bill Michael 21:15, 26 Aug 2005 (CDT) based on original work by, and further explanations from:

--John McLeod VII 14:35, 4 Jun 2005 (EDT)
