Hive future
The following features are missing or need improvement in future releases:
Security
- GetPlugins currently returns all plugins from the server, which exposes all uploaded assemblies. If plugin confidentiality is relevant, this method should be removed so that only GetPlugin(s)ById and GetPlugin(s)ByHash remain available.
- Slave-user: Each Hive slave uses the same username and password. A slave is allowed to download jobs. When a slave downloads a job, the server should check whether the job is assigned to this slave or to one of its parent slave groups (not implemented yet). Even with this check, an attacker could still fake the ID of another slave (if it is known) and gain access to its jobs.
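The assignment check described for the slave-user could be sketched as follows. This is only an illustration in Python, not Hive code: the `Slave` and `Job` types, their fields, and `may_download` are hypothetical names, and the real server-side model may look different.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Slave:
    # Hypothetical model: a slave knows the IDs of all its parent slave groups.
    id: str
    parent_group_ids: Set[str] = field(default_factory=set)


@dataclass
class Job:
    # Hypothetical model: IDs of the slaves or slave groups the job is assigned to.
    id: str
    assigned_resource_ids: Set[str] = field(default_factory=set)


def may_download(slave: Slave, job: Job) -> bool:
    """Return True if the job is assigned to the slave or one of its parent groups."""
    allowed = {slave.id} | slave.parent_group_ids
    return bool(allowed & job.assigned_resource_ids)
```

Note that this check still trusts the slave ID sent with the request, so it does not prevent the ID-faking attack mentioned above as long as all slaves share one login.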
Statistics
- Further measures to include (as total sums, also keep deleted jobs in DeletedJobStatistics):
  - Globally: FinishedJobs, WaitingJobs, FailedJobs, AbortedJobs, TransferringJobs, PausedJobs
  - Per user: total jobdata-size (MB)
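As a rough sketch of how these measures could be computed, the Python snippet below counts jobs per state and sums the jobdata size per user. The job records (dicts with `state`, `user`, and `size_bytes` keys) are an assumed layout for illustration only. To keep deleted jobs in the totals, they would simply remain part of the input.

```python
from collections import Counter

# State names taken from the list above; the record layout is an assumption.
STATES = ("Finished", "Waiting", "Failed", "Aborted", "Transferring", "Paused")


def global_job_counts(jobs):
    """Count jobs per state; states without jobs are reported as 0."""
    counts = Counter(job["state"] for job in jobs)
    return {state: counts.get(state, 0) for state in STATES}


def jobdata_mb_per_user(jobs):
    """Sum the jobdata size per user and convert bytes to MB."""
    totals = {}
    for job in jobs:
        totals[job["user"]] = totals.get(job["user"], 0) + job["size_bytes"]
    return {user: size / (1024 * 1024) for user, size in totals.items()}
```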
Server performance
- An increasing number of slaves puts pressure on the server, which leads to longer response times and some deadlock situations. Ideas to resolve:
  - Increase the heartbeat interval (maybe dynamically as the number of slaves gets higher). Remember to increase the SlaveHeartbeatTimeout in the web.config too.
  - Make GetWaitingJobs faster by using a stored procedure, or use a job queue instead of querying the whole job table.
- Large jobs (>15MB) sometimes result in database timeouts, especially if several of them are uploaded concurrently. Ideas to resolve:
  - Use Filestream as db-type instead of Varbinary, as it is supposed to be faster for large data blobs.
  - As streaming is not an option (no security, no encryption), using a chunking channel could work (http://msdn.microsoft.com/en-us/library/aa717050.aspx).
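The dynamic heartbeat interval from the first idea above could be derived from the number of connected slaves, for example as in the following Python sketch. The constants and function names are hypothetical and would need tuning. The point is only that the timeout has to grow together with the interval, as noted for SlaveHeartbeatTimeout.

```python
# All values below are assumptions for illustration, not Hive defaults.
BASE_INTERVAL_S = 10.0           # minimum heartbeat interval in seconds
SERVER_HEARTBEATS_PER_S = 20.0   # heartbeat rate the server is assumed to handle


def heartbeat_interval(num_slaves):
    """Pick an interval so the total heartbeat rate stays within the server budget."""
    return max(BASE_INTERVAL_S, num_slaves / SERVER_HEARTBEATS_PER_S)


def heartbeat_timeout(interval_s, missed_allowed=3):
    """A slave counts as timed out after `missed_allowed` missed heartbeats."""
    return interval_s * missed_allowed
```

With these assumed values, 100 slaves keep the base interval of 10 seconds, while 1000 slaves are slowed down to one heartbeat every 50 seconds, with a matching timeout of 150 seconds.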
Scheduling
Some ideas for a scheduler:
- 3 levels of priorities:
  - Job priority (fixed at upload)
  - User priority (fixed)
  - Time (dynamic: f(Now-Uploaded))
- Those 3 priority values are aggregated (average or (weighted) sum) to form the final priority by which the jobs are ordered.
- Fast-slaves-first: Faster slaves get the jobs first, slow slaves later. This would require:
  - Performance-index: Let each slave calculate a benchmark job before it is used.
  - Job-queues per slave: Right now every slave that sends a heartbeat gets a job (if one is available). One queue per slave would allow the server to actively assign jobs to slaves. Such a queue could also ease performance issues and race conditions.
- Re-scheduling: Sometimes fast slaves finish their jobs while slow slaves are still calculating. In those cases it might be reasonable to pause the jobs on the slow slaves and have them calculated on the faster slaves.
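The aggregation of the three priority levels could be sketched like this in Python. Everything here is an assumption for illustration: priorities are taken to be normalized to the range 0 to 1, f(Now-Uploaded) is modeled as a linear ramp that saturates after 24 hours, and the aggregation is a weighted average.

```python
from datetime import datetime, timedelta


def time_priority(uploaded, now, full_after=timedelta(hours=24)):
    """f(Now - Uploaded): grows linearly from 0 to 1 while the job waits."""
    waited = (now - uploaded) / full_after  # timedelta / timedelta gives a float
    return min(max(waited, 0.0), 1.0)


def final_priority(job_prio, user_prio, uploaded, now, weights=(1.0, 1.0, 1.0)):
    """Weighted average of job priority, user priority and time priority."""
    parts = (job_prio, user_prio, time_priority(uploaded, now))
    return sum(w * p for w, p in zip(weights, parts)) / sum(weights)


def order_jobs(jobs, now):
    """Sort jobs so that the highest final priority comes first."""
    return sorted(
        jobs,
        key=lambda j: final_priority(j["job_prio"], j["user_prio"], j["uploaded"], now),
        reverse=True,
    )
```

Because the time component increases while a job waits, low-priority jobs eventually move up in the order instead of starving.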
Last modified on 11/10/11 16:27:18