Project Configuration File
From Unofficial BOINC Wiki
General
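A BOINC Powered Project is described by a configuration file named config.xml in the Project Directory. The general layout of config.xml is shown below. Elements enclosed in square brackets are optional; the brackets themselves are not part of the file.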
<boinc>
<config>
<host> project.hostname.ip </host>
<db_name> databasename </db_name>
<db_host> database.host.ip </db_host>
<db_user> database_user_name </db_user>
<db_passwd> database_password </db_passwd>
<shmem_key> shared_memory_key </shmem_key>
<download_url> http://A/URL </download_url>
<download_dir> /path/to/directory </download_dir>
<download_dir_alt> /path/to/directory </download_dir_alt>
<uldl_dir_fanout> N </uldl_dir_fanout>
<upload_url> http://A/URL </upload_url>
<upload_dir> /path/to/directory </upload_dir>
<cgi_url> http://A/URL </cgi_url>
<stripchart_cgi_url> http://A/URL </stripchart_cgi_url>
<log_dir> /path/to/directory </log_dir>
[ <disable_account_creation/> ]
[ <show_results/> ]
[ <one_result_per_user_per_wu/> ]
[ <workload_sim>0|1</workload_sim> ]
[ <max_wus_to_send> N </max_wus_to_send> ]
[ <min_sendwork_interval> N </min_sendwork_interval> ]
[ <daily_result_quota> N </daily_result_quota> ]
[ <ignore_delay_bound/> ]
[ <locality_scheduling/> ]
[ <locality_scheduling_wait_period> N </locality_scheduling_wait_period> ]
[ <min_core_client_version> N </min_core_client_version> ]
[ <choose_download_url_by_timezone/> ]
[ <cache_md5_info/> ]
[ <min_core_client_version_announced> N </min_core_client_version_announced> ]
[ <min_core_client_upgrade_deadline> N </min_core_client_upgrade_deadline> ]
[ <choose_download_url_by_timezone> N </choose_download_url_by_timezone> ]
[ <cache_md5_info> N </cache_md5_info> ]
[ <nowork_skip> N </nowork_skip> ]
[ <sched_lockfile_dir> path </sched_lockfile_dir> ]
<!-- optional; defaults as indicated: -->
<project_dir> ../ </project_dir> <!-- relative to location of 'start' -->
<bin_dir> bin </bin_dir> <!-- relative to project_dir -->
<cgi_bin_dir> cgi-bin </cgi_bin_dir> <!-- relative to project_dir -->
</config>
<daemons>
<daemon>
<cmd> feeder -d 3 </cmd>
[ <host> hostname.ip </host> ]
[ <disabled> 1 </disabled> ]
</daemon>
<daemon>
...
</daemon>
</daemons>
<tasks>
<task>
<cmd> get_load </cmd>
<output> get_load.out </output>
<period> 5 min </period>
[ <host> host.ip </host> ]
[ <disabled> 1 </disabled> ]
[ <always_run> 1 </always_run> ]
</task>
<task>
<cmd> echo "HI" | mail root@example.com </cmd>
<output> /dev/null </output>
<period> 1 day </period>
</task>
<task>
...
</task>
</tasks>
</boinc>
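As an illustration, the template above might be filled in as follows. This is only a sketch: every hostname, path, URL, credential, and number is a hypothetical placeholder (including the format of the shmem_key value and the fan-out of 1024), and only the unbracketed elements of the template are shown.

```xml
<boinc>
  <config>
    <!-- All values below are hypothetical placeholders, not defaults. -->
    <host> server1.example.com </host>
    <db_name> example_project </db_name>
    <db_host> db.example.com </db_host>
    <db_user> example_db_user </db_user>
    <db_passwd> change-me </db_passwd>
    <!-- Must be unique on the host; the value format here is an assumption. -->
    <shmem_key> 0x1111abcd </shmem_key>
    <download_url> http://server1.example.com/download </download_url>
    <download_dir> /home/boincadm/projects/example/download </download_dir>
    <download_dir_alt> /home/boincadm/projects/example/download_old </download_dir_alt>
    <uldl_dir_fanout> 1024 </uldl_dir_fanout>
    <upload_url> http://server1.example.com/example_cgi/file_upload_handler </upload_url>
    <upload_dir> /home/boincadm/projects/example/upload </upload_dir>
    <cgi_url> http://server1.example.com/example_cgi/ </cgi_url>
    <stripchart_cgi_url> http://server1.example.com/stripchart/ </stripchart_cgi_url>
    <log_dir> /home/boincadm/projects/example/log </log_dir>
  </config>
  <daemons>
    <daemon>
      <cmd> feeder -d 3 </cmd>
    </daemon>
  </daemons>
  <tasks>
    <task>
      <cmd> get_load </cmd>
      <output> get_load.out </output>
      <period> 5 min </period>
    </task>
  </tasks>
</boinc>
```

Unlike the template, this fragment is well-formed XML, so it can be checked with any XML parser before the project is started.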
Configuration Element Definitions
The general project configuration elements are:
| host | Name of the project's main host, as given by Python's socket.gethostname(). BOINC Server-Side Daemon Programs and Tasks run on this host by default. |
| db_name | Database name |
| db_host | Database host machine |
| db_user | Database user name |
| db_passwd | Database password |
| shmem_key | ID of scheduler shared memory. Must be unique on host. |
| download_url | URL of data server for download |
| download_dir | absolute path of download directory |
| download_dir_alt | absolute path of old download directory (see Hierarchical Upload/Download Directories) |
| upload_url | URL of file upload handler |
| uldl_dir_fanout | fan-out factor of upload and download directories (see Hierarchical Upload/Download Directories) |
| upload_dir | absolute path of upload directory |
| cgi_url | URL of scheduling server |
| stripchart_cgi_url | URL of stripchart server |
| log_dir | Path to the directory where the assimilator, feeder, transitioner and cgi output logs are stored. This allows you to change the default log directory path. If set explicitly, you can also use the 'grep logs' features on the administrative pages. Note: enabling 'grep logs' with very long log files can hang your server, since grepping multi-gigabyte files can take a long time. If you enable this feature, be sure to rotate the logs so that they do not grow too large. |
| sched_lockfile_dir | Directory where scheduler lockfiles are stored. Must be writable by the Apache user. |
| workload_sim | Do a simulation, based on current client workload, in deciding whether a job's deadline can be met. |
Web Site Options
The following elements control features that you may or may not want available to Participants.
| disable_account_creation | If present, disallow account creation |
| show_results | Enable web site features that show results (per user, host, etc.) |
Controlling Results
The following settings control the way in which Results are scheduled, sent, and assigned to Participants and the Participant's Hosts.
| one_result_per_user_per_wu | If present, send at most one Result of a given Work Unit to a given Participant. This is useful for checking the accuracy and validity of Results, because it ensures that the Results for a given Work Unit are generated by different Participants. If you have a Validator that compares different Results for a given Work Unit to ensure that they are equivalent, you should probably enable this. Otherwise, you may end up validating Results from a given Participant against other Results from the same Participant. |
| max_wus_to_send | The maximum number of Results sent per Scheduler RPC. This helps prevent a malfunctioning Participant's Host from getting too many Results and trashing them. Set it large enough that a host which connects to the net only at intervals has enough work to keep it occupied between connections. |
| min_sendwork_interval | The minimum number of seconds to wait after sending Results to a given Participant's Host before new Results are sent to the same Host. This helps prevent Hosts with download or application problems from trashing many Results by returning them as Error Results. Do not set it so long that a host goes idle after completing its work, before getting new work. |
| daily_result_quota | The maximum number of Results (per CPU) sent to a given Participant's Host in a 24-hour period. This helps prevent hosts with download or application problems from returning lots of error results. Set it large enough that a host does not go idle in a 24-hour period, and can download enough work to keep it busy if disconnected from the net for a few days. The number of CPUs counted is capped at four. |
| ignore_delay_bound | By default, results are not sent to hosts too slow to complete them within the delay bound. If this flag is set, this rule is not enforced. |
| locality_scheduling | When possible, send work that uses the same files that the Participant's Host already has. This is intended for projects which have large Work Unit Data Files, where many different Work Units use the same data file. In this case, to reduce download demands on the Data Server, it may be advantageous to retain the data files on the hosts, and send them work for the files that they already have. See Locality Scheduling. |
| locality_scheduling_wait_period | This element only has an effect when used in conjunction with the previous Locality Scheduling element. It tells the Scheduler to use 'trigger files' to inform the Project that more work is needed for specific files. The period is the number of seconds which the Scheduler will wait to see if the Project can create additional work. Together with project-specific Daemons or scripts this can be used for 'just-in-time' Work Unit creation. See Locality Scheduling. |
| min_core_client_version | If the scheduler gets a request from a Participant's Host with a version number less than this, it returns an error message and doesn't do any other processing. |
| choose_download_url_by_timezone | When the Scheduler sends work to a Participant's Host, it replaces the download URL appearing in the data and executable file descriptions with the download URL closest to the host's timezone. The Project must provide a two-column file called 'download_servers' in the Project Root Directory, listing all Download Servers that may be substituted when work is sent to Participants' Hosts. The first column is an integer giving the server's offset in seconds from UTC. The second column is the server URL, in a format such as http://einstein.phys.uwm.edu. The download servers must have identical file hierarchies and contents, and the path to data files and executables must start with '/download/...', as in 'http://einstein.phys.uwm.edu/download/123/some_file_name'. |
| cache_md5_info | When creating work, keep a record (in files called foo.md5) of the file length and md5 sum of data files and executables. This can greatly reduce the time needed to create work, if (1) these files are re-used, and (2) there are many of these files, and (3) reading the files from disk is time-consuming. |
| min_core_client_version_announced | Announce a new version of the BOINC Daemon, which in the future will be the minimum required version. In conjunction with the next tag, you can warn Participants running a version below this to upgrade by a specified deadline. Example value: 419. |
| min_core_client_upgrade_deadline | Use in conjunction with the previous tag. The value is a Unix epoch time, as returned by time(2), by which hosts must have updated their BOINC Daemon. Before this time, they will receive messages warning them to upgrade. After this time, they may be shut out of the Project. |
| nowork_skip | If the Scheduling Server has no work, it replies to RPCs without doing any database access (e.g., without looking up the user or host record). This reduces the Database load, but Preferences are not updated when users click on Update. Use this setting if your Database Server is overloaded. |
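The fragment below sketches how several of these elements might be combined inside <config>. All numbers are hypothetical and chosen only for illustration, except 419, which is the example version value given in the table above; suitable values depend on how long your Work Units run and how often your Participants' Hosts connect.

```xml
<config>
  <!-- Database, URL, and directory elements omitted from this sketch. -->

  <!-- Flag element: its presence alone enables the feature. -->
  <one_result_per_user_per_wu/>

  <!-- Hypothetical limit: at most 10 Results per Scheduler RPC. -->
  <max_wus_to_send> 10 </max_wus_to_send>

  <!-- Hypothetical limit: wait at least 600 seconds between sends to one Host. -->
  <min_sendwork_interval> 600 </min_sendwork_interval>

  <!-- Hypothetical quota of 50 Results per CPU per day. Because the CPU
       count is capped at four, a Host with eight CPUs would still receive
       at most 4 * 50 = 200 Results in 24 hours. -->
  <daily_result_quota> 50 </daily_result_quota>

  <!-- Announce version 419 as the future minimum, with a deadline of
       1136073600 (2006-01-01 00:00:00 UTC, a hypothetical date). -->
  <min_core_client_version_announced> 419 </min_core_client_version_announced>
  <min_core_client_upgrade_deadline> 1136073600 </min_core_client_upgrade_deadline>
</config>
```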
Project Tasks and Daemons
Tasks are periodic, short-running jobs. <cmd> and <period> are required. <output> names the file that receives the command's output; by default it is COMMAND_BASE_NAME.out. Commands are run in the <bin_dir> directory, which is a path relative to <project_dir>, and their output is written to <log_dir>.
The BOINC Server-Side Daemon Programs are continuously-running programs. The process ID is recorded in the <pid_dir> directory and the process is sent a SIGHUP in a DISABLE operation.
Both Tasks and BOINC Server-Side Daemon Programs can run on a different Host (specified by the <host> element). The default is the project's main host, which is specified by the <host> element in <config>. A BOINC Server-Side Daemon Program or Task can be turned off by adding the <disabled> element. There may also be some tasks you wish to run via cron regardless of whether or not the project is enabled (for example, a script that logs the current CPU load of the host machine). You can do so by adding the <always_run> element (<disabled> takes precedence over <always_run>).
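The following sketch shows how the <host>, <disabled>, and <always_run> elements might be used together. The feeder and get_load commands come from the template above; example_assimilator, example_cleanup, and server2.example.com are hypothetical names introduced only for this illustration.

```xml
<daemons>
  <!-- Runs on the project's main host (the default). -->
  <daemon>
    <cmd> feeder -d 3 </cmd>
  </daemon>
  <!-- Hypothetical daemon assigned to a second host and currently turned off. -->
  <daemon>
    <cmd> example_assimilator -d 3 </cmd>
    <host> server2.example.com </host>
    <disabled> 1 </disabled>
  </daemon>
</daemons>
<tasks>
  <!-- Load logging that should keep running even while the project is disabled. -->
  <task>
    <cmd> get_load </cmd>
    <output> get_load.out </output>
    <period> 5 min </period>
    <always_run> 1 </always_run>
  </task>
  <!-- Hypothetical daily task; with no <output> element, the output file
       falls back to the default described above. -->
  <task>
    <cmd> example_cleanup </cmd>
    <period> 1 day </period>
  </task>
</tasks>
```

If a task carried both <disabled> and <always_run>, it would not run, since <disabled> takes precedence.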
Copyright ©
- 2005 University of California
- 2005 Paul D. Buck
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.

