Scheduling Server Timing and Retry Policies

From Unofficial BOINC Wiki

General

Each Scheduler RPC reports Results, gets work, or both. The BOINC Client Software's Scheduler RPC Policy has several components: when to make a Scheduler RPC, which Project to contact, which Scheduling Server for that project, how much work to ask for, and what to do if the RPC fails.

The Scheduler RPC Policy has the following goals:

Long Term Debt

The BOINC Client Software maintains, for each Project, a sum of the CPU Time it has devoted to that Project. These sums are adjusted so that their average over all Projects is zero. The resulting Long Term Debt is a measure of how much work the BOINC Client Software "owes" each Project; in general, work should be requested from the Project with the greatest Long Term Debt.
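
The zero-average adjustment and the "greatest debt" rule can be sketched as follows. This is only an illustration: the function names, project names, and debt values are hypothetical, and the snippet assumes the per-Project debts have already been computed elsewhere.

```python
from statistics import mean

def normalize_debts(debts):
    """Shift every debt so that the average over all projects is zero."""
    avg = mean(debts.values())
    return {name: debt - avg for name, debt in debts.items()}

def project_to_ask(debts):
    """Return the name of the project with the greatest normalized debt."""
    normalized = normalize_debts(debts)
    return max(normalized, key=normalized.get)

# Hypothetical debts, in seconds of CPU time.
debts = {"project_a": 10.0, "project_b": 20.0, "project_c": 30.0}
print(normalize_debts(debts))
print(project_to_ask(debts))
```

Because the same amount is subtracted from every Project, the adjustment does not change which Project has the greatest debt; it only keeps the values centered on zero.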

Minimum RPC time

The BOINC Client Software maintains a minimum RPC time for each Project. This is the earliest time at which a Scheduler RPC to that Project should be made (if it is zero, an RPC can be made immediately). The minimum RPC time can be set for various reasons:

Scheduler RPC Sessions

Communication with the Project's Schedulers is organized into sessions, each of which may involve many RPCs. There are two types of sessions:

  • Get-work sessions, whose goal is to get a certain amount of work. Results may be reported as a side-effect.
  • Report-result sessions, whose goal is to Report Results. Work may be fetched as a side-effect.

The internal logic of Scheduler Sessions is encapsulated in the class SCHEDULER_OP. The class is implemented as a state machine, but its logic, expressed as a sequential process, looks roughly like this:

get_work_session() {
    while estimated work < high water mark
        P = project with greatest debt and min_rpc_time < now
        for each scheduler URL of P
            attempt an RPC to that URL
            if no error break
        if some RPC succeeded
            P.nrpc_failures = 0
        else
            P.nrpc_failures++
            P.min_rpc_time = exponential_backoff(P.nrpc_failures)
            if P.nrpc_failures mod MASTER_FETCH_PERIOD = 0
                P.fetch_master_flag = true
    for each project P with P.fetch_master_flag set
        read and parse master file
        if error
            P.nrpc_failures++
            P.min_rpc_time = exponential_backoff(P.nrpc_failures)
        if got any new scheduler urls
            P.nrpc_failures = 0
            P.min_rpc_time = 0
}

report_result_session(project P) {
    for each scheduler URL of project
        attempt an RPC to that URL
        if no error break
    if some RPC succeeded
        P.nrpc_failures = 0
    else
        P.nrpc_failures++
        P.min_rpc_time = exponential_backoff(P.nrpc_failures)
}
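
The failure handling shared by both session types can be sketched as below. The pseudocode does not define exponential_backoff, so this is one plausible reading rather than the BOINC implementation: the delay doubles with each consecutive failure up to a cap. The constants, the Project class, the record_session_result helper, and the extra `now` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Illustrative values only; these are assumptions, not BOINC's actual constants.
BACKOFF_BASE_SECONDS = 60
BACKOFF_MAX_SECONDS = 4 * 60 * 60
MASTER_FETCH_PERIOD = 10

@dataclass
class Project:
    nrpc_failures: int = 0
    min_rpc_time: float = 0.0      # 0 means an RPC may be made immediately
    fetch_master_flag: bool = False

def exponential_backoff(nrpc_failures, now):
    """Return the earliest time of the next RPC; the delay doubles per failure, up to a cap."""
    delay = min(BACKOFF_BASE_SECONDS * 2 ** (nrpc_failures - 1), BACKOFF_MAX_SECONDS)
    return now + delay

def record_session_result(project, succeeded, now):
    """Update a project's retry state after trying all of its scheduler URLs."""
    if succeeded:
        project.nrpc_failures = 0
        return
    project.nrpc_failures += 1
    project.min_rpc_time = exponential_backoff(project.nrpc_failures, now)
    # Every MASTER_FETCH_PERIOD consecutive failures, schedule a master file fetch.
    if project.nrpc_failures % MASTER_FETCH_PERIOD == 0:
        project.fetch_master_flag = True

p = Project()
record_session_result(p, succeeded=False, now=1000.0)
print(p.nrpc_failures, p.min_rpc_time)
```

Under these assumed constants the first failure delays the next RPC by 60 seconds, the second by 120, and so on until the cap is reached.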

The logic for initiating Scheduler Sessions is embodied in the scheduler_rpcs->poll() function, which behaves roughly as follows:

if a scheduler RPC session is not active
    if estimated work is less than low-water mark
        start a get-work session
    else if some project P has overdue results
        start a report-result session for P
        if P is the project with greatest resource debt
            the RPC request should ask for enough work to bring us up
            to the high-water mark
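
The same decision can be written as a small pure function. This is a sketch under assumptions: choose_session, its parameters, and the tuple it returns are hypothetical names for illustration and do not correspond to the actual scheduler_rpcs->poll() interface.

```python
def choose_session(session_active, estimated_work, low_water_mark,
                   overdue_projects, greatest_debt_project):
    """Return (session_type, project, also_request_work), or None to do nothing."""
    if session_active:
        # Only one scheduler RPC session runs at a time.
        return None
    if estimated_work < low_water_mark:
        # A get-work session picks its own projects, so no project is named here.
        return ("get_work", None, True)
    if overdue_projects:
        project = overdue_projects[0]
        # Also ask for work only if this project has the greatest debt.
        return ("report_result", project, project == greatest_debt_project)
    return None

print(choose_session(False, 2.0, 5.0, [], "project_a"))
print(choose_session(False, 8.0, 5.0, ["project_b"], "project_a"))
```

Keeping the decision separate from the RPC machinery makes the priority order explicit: an active session blocks everything, low work takes precedence, and overdue results are handled next.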


UCB Source

Copyright ©

  • 2005 University of California
  • 2005 Paul D. Buck

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
