Reporting Process
From Unofficial BOINC Wiki
General
This is the process whereby a Participant's Computer completes the processing of a Work Unit and then returns the information and Credit claims back to the Project.
This is a two-stage process that works as follows:
- The Work Unit is completely processed by the Science Application. The Result Data File, which contains all of the information of interest to the Project, is uploaded to the Project's Data Server.
Note: The network setting, if set to "Disable BOINC Network Access", will prevent this stage from happening until the Participant re-enables network access.
Once the upload has been completed, the Work Unit's status in the BOINC Manager's Work Tab will be set to "Ready to report".
- In the second stage the fact that the Work Unit has completed processing is "reported" to the Project's Scheduling Server. This is the point where the Participant's Claimed Credit is established in the Project's database.
The reason for the two-step policy is that the first stage only has to upload a file to a directory on the Data Server, while the second stage requires a connection to, and contact with, the BOINC Database. Because it is easier to report a list of files, the process was divided in this manner so that each part can be performed independently of the other.
If both actions were performed within a single transaction, any failure, for example under heavy server load, would invalidate the entire process. Since the file upload is a very simple transaction, it can easily be performed independently of other activities.
The second stage is also nicely "atomic", and it lets us "report" more than one Result at the same time.
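The two stages can be pictured with a small simulation. This is only a sketch under stated assumptions: the `Result` and `Client` classes, their fields, and the state names are hypothetical and do not come from the BOINC source code. It shows that uploads happen one at a time without touching the database, while a single report call covers every uploaded Result.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    name: str
    # Hypothetical states: processing -> ready_to_report -> reported
    state: str = "processing"


@dataclass
class Client:
    network_enabled: bool = True
    results: list = field(default_factory=list)

    def upload(self, result):
        """Stage 1: send the Result Data File to the Data Server (no database)."""
        if not self.network_enabled:
            return False  # nothing happens until network access is re-enabled
        result.state = "ready_to_report"
        return True

    def report_all(self):
        """Stage 2: one Scheduler RPC reports every uploaded Result at once."""
        if not self.network_enabled:
            return []
        batch = [r for r in self.results if r.state == "ready_to_report"]
        for r in batch:
            r.state = "reported"
        return [r.name for r in batch]


client = Client(results=[Result("wu_1"), Result("wu_2"), Result("wu_3")])
client.upload(client.results[0])
client.upload(client.results[1])
reported = client.report_all()  # wu_3 is still processing and is not reported
```

Because the stages are independent, a failed report leaves the uploaded files untouched and can simply be retried later.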
Scheduler RPC Triggers
RPCs to the Project's Scheduler are triggered for a few reasons. In most cases allowing the BOINC Client Software to automatically decide when a Scheduler RPC is needed is the best option.
- More work is needed. When more work is needed for the same Project the Scheduler RPC to request more work will also Report completed work.
- Ready To Report work is less than 24 hours from the Deadline.
- Ready To Report work has been Ready To Report longer than the Work Buffer size. This rule was first implemented in Version 4.43, but that implementation was buggy.
- A Trickle message needs to be sent, as in Climateprediction.net (CPDN).
- Manually requested by the Participant.
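The triggers listed above can be sketched as a single decision function. This is an illustration only, not the BOINC Client's actual logic: the function name, its parameters, and the order in which the rules are checked are all assumptions made for the example.

```python
def scheduler_rpc_reason(needs_work, ready, buffer_days,
                         trickle_pending=False, manual_request=False):
    """Return why a Scheduler RPC would be made, or None if none is needed.

    ready is a list of (hours_to_deadline, hours_since_ready) tuples, one per
    Result in the "Ready to report" state. All names here are hypothetical.
    """
    if manual_request:
        return "manual request"
    if needs_work:
        # The work request also reports any completed work.
        return "more work needed"
    if trickle_pending:
        return "trickle message"
    for hours_to_deadline, hours_since_ready in ready:
        if hours_to_deadline < 24:
            return "deadline within 24 hours"
        if hours_since_ready > buffer_days * 24:
            return "ready longer than work buffer"
    return None


# One Result finished 2 hours ago, deadline 5 days away, 3-day Work Buffer.
reason = scheduler_rpc_reason(needs_work=False, ready=[(120, 2)], buffer_days=3)
```

In this example `reason` is `None`: the Result simply waits until one of the rules fires.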
Reporting Process Design Considerations
The original design was to fill the Work Buffer up to a high water-mark and not refill it until its content dropped below the low water-mark. Both the high and low water-mark were configurable by the Participant as Preference Settings. In this design, there could be many saved-up Results before the BOINC Daemon finally asked for more work, so grouping everything together into one RPC made sense.
Later (possibly around v4.05; the exact version is uncertain), the design was changed so that there was only one cache setting, and the BOINC Client Software asked for 2x this Work Buffer size setting. This again meant that multiple Results could be reported each time the BOINC Daemon asked for more work.
In v4.20, the BOINC Client Software stopped asking for 2x the Work Buffer size setting, which gives the current behavior: you crunch Result #1, start on Result #2, and ask for more work during Result #2. This pattern is not followed all the time, but it is basically the current behavior, except for Participants with a non-permanent connection.
With the introduction of the new CPU Scheduler in v4.35, it became possible for days or weeks to pass between the time a Result was finished and the next time the BOINC Client Software asked for more work. Results will of course still be reported under the rule "if less than 24h from deadline", but due to the potentially very long wait it was decided to add a new rule: "report result if N days since finished, where N is the cache setting". With this new functionality, however, a bug also snuck in, meaning v4.45 reported after each Result instead of waiting as it should have done.
v5.2.x just removes the bug introduced in v4.4x, and you're now back to v4.2x-behavior, with normally one RPC per Result due to asking for more work when the buffer size is small (less than or near the run time of the Result processing). When the Work Buffer is much larger than the run time of a Result you may get multiple Results to "report" in the RPC, but normally only 1 or 2 Results per RPC.
Why the variation? A small example: Let's say you've just crunched many "fast" Results, taking 1.5h each, and the expected "To completion" in the BOINC Client Software has also dropped to 1.5h. If you've got a 5-day Work Buffer, that means 80 Results. If you now complete a "normal" Result taking 2h, the expected "To completion" also increases to 2h, meaning the BOINC Daemon now thinks you've got 6.58 days in the Work Buffer (the 79 remaining Results at 2h each). Until the expected run time of the Work Buffer again drops below 5 days, you'll not ask for more work, and you can therefore finish many Results before reporting again.
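The arithmetic in this example can be checked with a few lines of code. This is a minimal sketch: the helper names are made up for illustration, and it assumes the per-Result estimate stays at 2h for every remaining Result, which the real client would keep adjusting.

```python
def results_to_fill(buffer_size_days, hours_per_result):
    """Number of Results needed to fill a buffer of the given size."""
    return int(buffer_size_days * 24 / hours_per_result)


def buffer_days(result_count, hours_per_result):
    """Estimated days of work represented by the queued Results."""
    return result_count * hours_per_result / 24


queued = results_to_fill(5, 1.5)           # 80 Results at 1.5h each
remaining = queued - 1                     # one "normal" 2h Result just finished
estimate = buffer_days(remaining, 2.0)     # about 6.58 days at the new estimate

# Count how many more Results finish before the estimate drops below 5 days.
finished_before_next_request = 0
while buffer_days(remaining, 2.0) >= 5:
    remaining -= 1
    finished_before_next_request += 1
```

Under these assumptions, 20 further Results complete before the client asks for work again, and all of them wait to be reported unless another trigger fires first.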
I haven't looked too closely into the workings of the Scheduling Server, but a quick look indicates that each RPC causes 3 db-reads and 1 db-write, for looking up the Participant, host and team information and for writing host information. This is done once per connection.
On top of this, there is a db-read and db-write for each Result sent-out or "reported".
This means that if you add an extra RPC just to report a Result, as v4.45 was doing due to a bug, there will be 8 db-reads and 4 db-writes per Result.
If, on the other hand, the system works as designed, with the BOINC Daemon waiting to report until next time there is a need to ask for more work, you will need 5 db-reads and 3 db-writes per Result.
If you move the reporting to the upload-server, you will still have 4 reads and 2 writes for assigning a Result. In addition, the upload-server will have at minimum 1 read and 1 write per Result, and it must also look up user-info, host-info and team-info to make sure it's not an impostor returning a Result, meaning 3 more reads. Finally, since Claimed Credit normally relies on a possibly changed Benchmark, updating host-info as well can be a good idea, meaning 1 more db-write.
That brings you back to 8 db-reads and 4 db-writes per Result, which is the worst-case database-load scenario in the current system.
It is possible that you can get away without looking up team-info and choose not to update host-info, which would mean 7 db-reads and 3 db-writes.
In other words, waiting to report until the next time the client asks for more work gives a lower database load than you can get by making the reporting of a Result part of the result-file upload.
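The per-Result totals above can be tallied mechanically from the stated costs. The sketch below only restates the estimates given in the text (3 reads and 1 write per Scheduler connection, 1 read and 1 write per Result sent out or reported); the variable and function names are invented for the example and say nothing about the real server code.

```python
RPC_COST = (3, 1)      # (reads, writes) per Scheduler connection
RESULT_COST = (1, 1)   # (reads, writes) per Result sent out or reported


def total(*costs):
    """Sum (reads, writes) pairs component-wise."""
    return tuple(sum(part) for part in zip(*costs))


# Extra RPC just to report: one RPC to send the Result out, one to report it.
separate_report_rpc = total(RPC_COST, RESULT_COST, RPC_COST, RESULT_COST)

# As designed: a single RPC reports one Result and sends out the next.
combined_rpc = total(RPC_COST, RESULT_COST, RESULT_COST)

# Reporting at upload time: assigning the Result still costs an RPC, then the
# upload-server adds its own read/write, 3 identity look-ups, 1 host update.
upload_reporting = total(RPC_COST, RESULT_COST, (1, 1), (3, 0), (0, 1))

# Same, but skipping the team look-up and the host update.
upload_reporting_minimal = total(RPC_COST, RESULT_COST, (1, 1), (2, 0))
```

The four totals come out as (8, 4), (5, 3), (8, 4) and (7, 3) reads and writes respectively, matching the figures quoted in the text.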
Moving "reporting" to the time when the file upload occurs also has other weaknesses, such as:
1) Makes it more difficult to run multiple upload-servers, since each one now needs database-access. For example, the Climateprediction.net (CPDN) project runs multiple upload-servers with two in the UK and another in Switzerland. All of these would require direct access to a common database.
2a) To solve this the Upload Server can either "Ok" everything and have delayed db-update (but this will break the re-issue of "lost" results projects can choose to use); or
2b) the Upload Server must update the database and the BOINC Client must wait on an "Ok" before closing the connections.
3) If, for example, a download-error occurs, you must either duplicate the reporting code in both the Scheduling Server and the upload-handler, or add an unnecessary load to the upload-server just to report the error.
4) Opening a db-connection before a result is fully uploaded can greatly increase the load on db-server, and if upload-server starts dropping connections you'll basically kill the db. Therefore, you need to finish a result-upload before opening db.
5) If the BOINC System uses 2b, there is a 25% chance in the SETI@Home Project that an uploaded Result was the 4th Result. If for any reason the client uploading this last Result did NOT get the "Ok", it means the result-file is deleted from disk, and because of #4 you need to re-upload the full result-file just to get the "Ok" from the server.
Normally, #5 isn't a problem in SETI@Home since the Results are so small, but you can expect some angry dialup-users if they're forced to repeat an upload of maybe 30 minutes to 1 hour just to get an "Ok".
So, to sum it up, a re-design of the BOINC System to make reporting part of the Result upload process will give higher db-load, higher upload-server-load, and clients spending longer on uploads, meaning the upload-server can handle fewer results per day.
Also, you will either break the re-issue of "ghost-results", get some angry dialup-users, or program the "db-killer" described in #4.
The only gain is less load on Scheduling Server, since the upload-server is handling the reporting instead and thus gets the higher load.
As for the project, reporting immediately after upload will often not change anything, since the other Results for the same Work Unit can be stuck in a 10-day Work Buffer, meaning that reporting now or a couple of hours later will not make the Work Unit validate any faster.

