#1233 closed enhancement (done)

Hive-3.4 development

Reported by:	cneumuel	Owned by:	ascheibe
Priority:	medium	Milestone:	HeuristicLab 3.3.6
Component:	Hive.General	Version:	3.3.6
Keywords:		Cc:	ascheibe

Description (last modified by ascheibe)

General notes

Server

~~Refactor domain objects and db-schema~~
- ~~Split info-objects and data-objects (like Job and JobData)~~
~~Data Access Layer (more consistent method names, more compact code, inspired by OKB)~~
~~Split transaction and db-context handling~~
~~Allow uploading of plugins for a job (or hiveexperiment)~~
Make WCF service completely stateless. Put all remaining state-information into the database (latestHeartbeats, latestConsistencyCheck, newlyAssignedJobs (remove completely and solve by adding a heartbeat))
~~StateLog: Log state transitions of jobs.~~
~~Statistics~~
- ~~Measure core capacity and utilization every minute~~
- ~~Measure CPU and memory capacity and utilization every minute~~
- Reliably measure the execution time spent on hive per user / in total. Also measure speedup values (maybe also per minute). Keep deleted jobs in the database (flag them); only delete JobData, plugins, etc.
- ~~Number of experiments / jobs (per user). Job per slave~~
- ~~Calculate overall productivity per job (waiting time vs. computation time)~~
Scheduler
- Consider waiting time to avoid starvation
- Users should have priorities
- A user should be able to manage priorities only in the scope of his own experiments
- Childjobs should automatically have the priorities of their parent jobs
- Precomputed job-queue
~~Fix wrong timestamps in statelog on services.heuristiclab.com~~

Slave

~~Adapt Slave for new Server~~
~~Refactor Slave (easier communication between core and executor)~~
~~Tests~~
~~Console Client~~
~~Windows Service Client~~
~~Installer for Slave~~
~~Windows Tray Icon for Slave~~
~~HL App Client~~
~~Sort out problem with uploaded, modified assemblies which aren't downloaded to the slave; Add GUIDs to PluginCache~~
~~Heartbeat interval should be controllable by the server~~
~~Creation of a unique Id for a machine which does not change if the config is deleted~~
~~Correct total physical memory available for a slave (ConfigManager)~~
~~Test sandboxing and security of appdomains. If any assemblies can be uploaded by users, this becomes very important.~~
~~React on SayHello action (call Hello service method)~~
~~Send cpu utilisation with every heartbeat~~
~~Log exceptions to Windows Event Log~~
~~FreeCores needs to be decremented right after a CalculateJob message has been received. Otherwise a slave reports free cores which are already reserved for new jobs.~~
~~PluginTemp directory should be cleaned up from time to time (or on startup)~~
~~SlaveCommListener in Slave.Tests should not be used in ConsoleClient~~
Heartbeats are massively delayed, because the heartbeat-method locks on engines (in GetExecutionTimeOfAllJobs) and the same lock is taken in StartJobInAppDomain. This causes a slave-heartbeat-timeout (1 minute), and thus a reset and reassignment of all jobs.
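
The lock contention described in the last item can be illustrated with a minimal Python sketch. It is not the slave's actual code: `JobRegistry`, `start_job` and `heartbeat_snapshot` are hypothetical names, and the idea shown (do the slow start-up work outside the shared lock, hold the lock only for the bookkeeping) is one possible way to keep the heartbeat from waiting.

```python
import threading


class JobRegistry:
    """Hypothetical stand-in for the slave's bookkeeping of running jobs."""

    def __init__(self):
        self._lock = threading.Lock()
        self._execution_times = {}  # job id -> execution time in seconds

    def start_job(self, job_id, slow_setup):
        # The slow part (e.g. creating an appdomain) runs WITHOUT the lock,
        # so a concurrent heartbeat is not blocked while a job starts.
        slow_setup()
        with self._lock:
            self._execution_times[job_id] = 0.0

    def heartbeat_snapshot(self):
        # Hold the lock only long enough to copy the current values.
        with self._lock:
            return dict(self._execution_times)
```

With this split, a heartbeat that arrives while a job is still starting returns immediately and simply does not list that job yet.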

Experiment Manager

~~Show jobs in treeview. Would greatly save screen space and navigation-clicks~~
- ~~to be enhanced (event wiring)~~
~~Sort HiveExperiments alphabetically~~
~~Plugin-Upload (optional)~~
~~Experiment Sharing~~
~~Appropriate numbering of Runs~~
~~Use Service-Call pattern from OKB (or PPOV-Cockpit)~~
~~Show StateLog - use Gantt Chart like view~~
~~Pause and stop single jobs~~
~~Paused jobs should not be integrated into experiment, so results are not lost. Parameters of paused jobs should be changeable (and used when resumed).~~
Deleting jobs after adding them (neither the remove button, nor the del key, nor the context menu entry succeeds in deleting a job (experiment) that has just been dragged in)

Hive Engine

HiveEngine jobs should have a HiveExperiment, which is marked, so a user cannot see it in HiveExperimentManager. However it should be visible in Administration GUI. If a Hive Engine crashes and cannot delete the experiment, this should be detected by the server and it should be automatically deleted.
~~Improve HiveEngine View (list of jobs, with status etc.)~~
~~Stabilize~~

Administration

Missing WebService Methods:

~~GetAllHiveExperiments~~
GetUsers
GetUserStatistics
~~GetJobsBySlave -> GetJobsByResourceId~~
GetGlobalStatistics (for Statistics TabPage)
~~GetScheduleForResource (+ Add/Update/Delete)~~

TODOS:

~~convert HeuristicLab.Calender to a plugin~~
~~use svcutil~~
~~write partial classes for dtos and implement IContent~~
~~build Observable Collections for Users/Slaves/Groups~~
~~add ContentViews for Users and SlaveGroups~~
~~show some fancy statistics~~
~~add Save Button~~
~~integrate HeuristicLab.Services.Hive.Common-3.4 in Server~~
~~get rid of HiveItem etc. on Server~~

Meeting protocols

Architects meeting (16.06.2011)

DataAccess:

~~TransactionManager with interface again~~
~~remove AssignedResourcesId in AssignedResources, use JobId+ResourceId as primary keys~~
remove CreateHiveDatabaseApplication. the db schema should not be developed dbml first, since dbml does not support most sql-server features. instead the sql-server schema should be designed first and the dbml should be generated.
~~UptimeCalendar should be named DowntimeCalendar~~
~~DataAccess layer and Dao classes should be removed, access to linq to sql should happen directly in server-implementation.~~

Server

~~Lifecycle should be named differently. maybe EventHandler, EventManager.~~
~~put magic numbers into config~~
- ~~timeout in Lifecycle~~
- ~~ApplicationConstants~~
- ~~look for magic numbers in hive client~~
~~GetWaitingJobs should be implemented as a stored procedure and should also assign a job to a slave. it should make sure no race conditions occur if it is called concurrently.~~
- ascheibe: moved back to next HL release
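
The requirement that fetching a waiting job and assigning it to a slave must be one atomic step can be sketched in Python as follows. This is an in-memory illustration only, not the stored procedure discussed above; `WaitingTasks` and `claim_next` are hypothetical names, and a lock stands in for the database-side atomicity.

```python
import threading
from collections import deque


class WaitingTasks:
    """Hypothetical in-memory queue; a real server would do this in the database."""

    def __init__(self, task_ids):
        self._lock = threading.Lock()
        self._waiting = deque(task_ids)
        self.assigned = {}  # task id -> slave id

    def claim_next(self, slave_id):
        # Picking the task and recording the assignment happen under one lock,
        # so two concurrent callers can never receive the same task.
        with self._lock:
            if not self._waiting:
                return None
            task_id = self._waiting.popleft()
            self.assigned[task_id] = slave_id
            return task_id
```

Each heartbeat would call `claim_next` once; a return value of `None` means no job is waiting.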

HiveExperiment

~~rename: HiveExperiment -> Job, Job -> Task~~
~~HiveExperimentPermissions~~
- ~~the GrantedUserId could be removed~~
  - ascheibe: GrantedUserId is part of the PK and can't be removed. GrantedByUserId is not necessary and could be removed, but it still could be interesting information?
  - ascheibe: When talking to swagner it was decided that we leave it because it could be interesting in the future.
- ~~only Full and Read permissions are necessary (Read: just read!, Full: control, delete, grant permissions)~~
~~remove LastAccessed and IsHiveEngine. there should be a category field instead.~~

Remarks for the future (cneumuel) (28.06.2011)

Security

GetPlugins currently returns all plugins from the server. This exposes all uploaded assemblies. When confidentiality for plugins is relevant this method should be removed and only GetPlugin(s)ById and GetPlugin(s)ByHash should be available.
Slave-user: Each hive slave uses the same username and password. A slave is allowed to download jobs. When a slave downloads a job it should be checked if the job is assigned to this slave or a parent-slave-group (not implemented yet). However it is still possible for an attacker to fake the ID of another slave (if it is known) and get access to jobs.
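
A minimal Python sketch of the assignment check mentioned above (is the job assigned to this slave or to one of its parent slave groups). The function name and the two data structures are assumptions made for illustration; the sketch does not address the problem of a faked slave ID.

```python
def is_assigned(slave_id, job_resource_ids, parent_of):
    """Return True if the job is assigned to the slave or to any parent group.

    job_resource_ids: set of resource ids the job is assigned to (hypothetical).
    parent_of: dict mapping a resource id to its parent group id (hypothetical).
    """
    current = slave_id
    seen = set()  # guards against cycles in the group hierarchy
    while current is not None and current not in seen:
        if current in job_resource_ids:
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False
```

For example, with `parent_of = {"slave-1": "group-a"}` a job assigned to `"group-a"` may be downloaded by `"slave-1"`, but not by a slave outside that group.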

Statistics

Further measures to include (as total sums, also keep deleted jobs in DeletedJobStatistics):
- Globally: FinishedJobs, WaitingJobs, FailedJobs, AbortedJobs, TransferringJobs, PausedJobs
- Per user: total jobdata-size (MB)

Server performance

Increasing number of slaves puts pressure on the server with increasing response times and some deadlock-situations. Ideas to resolve:
- Increase heartbeat-interval (maybe dynamic when the number of slaves gets higher). Remember to increase the SlaveHeartbeatTimeout in the web.config too.
- Make GetWaitingJobs faster by using a stored procedure, or use a job-queue instead of querying the whole job-table.
Large jobs (>15MB) sometimes result in database-timeouts, especially if multiple of them are uploaded concurrently. Ideas to resolve:
- Use Filestream as db-type instead of Varbinary as it is supposed to be faster for large data-blobs.
- As streaming is not an option (no security, encryption), using a chunking channel could work (http://msdn.microsoft.com/en-us/library/aa717050.aspx).
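
The chunking idea can be sketched in Python as below. This only illustrates splitting a large payload into numbered chunks and reassembling it on the other side; it is not the WCF chunking channel from the linked article, and the function names and chunk size are assumptions.

```python
def split_into_chunks(payload: bytes, chunk_size: int):
    """Split payload into (index, data) pairs of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (offset // chunk_size, payload[offset:offset + chunk_size])
        for offset in range(0, len(payload), chunk_size)
    ]


def reassemble(chunks):
    """Rebuild the payload from (index, data) pairs arriving in any order."""
    ordered = sorted(chunks, key=lambda chunk: chunk[0])
    return b"".join(data for _, data in ordered)
```

Each chunk could then be sent as a separate, normally secured message, which keeps every single request small.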

Scheduling
Some ideas for a scheduler:

3 levels of priorities:
- Job priority (fixed at upload)
- User priority (fixed)
- Time (dynamic: f(Now-Uploaded))
Those 3 priority values are aggregated (average, (weighted-)sum) to represent the final priority by which the jobs are ordered.
Fast-slaves-first: Faster slaves get the jobs first, slow slaves later. This would require:
- Performance-index: Let each slave calculate a benchmark-job before it is used.
- Job-queues per slave: Right now every slave that sends a heartbeat gets a job (if one is available). One queue per slave would allow the server to actively assign jobs to slaves. Such a queue could also ease performance issues and race conditions.
Re-scheduling: Sometimes fast slaves finish their jobs and slow slaves are still calculating. In those cases it might be reasonable to pause the jobs and have them calculated on the faster slaves.
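
The aggregation of the three priority levels could look like the following Python sketch, using a weighted sum. All names, the weights and the scaling of the waiting time (`minutes_per_point`) are assumptions for illustration; the notes above do not fix any of them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class QueuedJob:
    name: str
    job_priority: float   # fixed at upload
    user_priority: float  # fixed per user
    uploaded: datetime


def final_priority(job, now, w_job=1.0, w_user=1.0, w_time=1.0,
                   minutes_per_point=10.0):
    # Time component: f(Now - Uploaded), here simply linear in the waiting time.
    waiting_minutes = (now - job.uploaded).total_seconds() / 60.0
    time_priority = waiting_minutes / minutes_per_point
    return (w_job * job.job_priority
            + w_user * job.user_priority
            + w_time * time_priority)


def order_queue(jobs, now):
    # Highest final priority first.
    return sorted(jobs, key=lambda job: final_priority(job, now), reverse=True)


now = datetime(2011, 6, 28, 12, 0)
jobs = [
    QueuedJob("fresh-high", 3, 1, now - timedelta(minutes=5)),
    QueuedJob("old-low", 1, 1, now - timedelta(minutes=60)),
]
ordered = [job.name for job in order_queue(jobs, now)]
# ordered == ["old-low", "fresh-high"]
```

Because the time component keeps growing while a job waits, a low-priority job eventually overtakes newer high-priority jobs, which is the starvation protection asked for in the scheduler notes.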

Change History (204)

comment:1 Changed 14 years ago by cneumuel

Status changed from new to accepted

comment:2 Changed 14 years ago by cneumuel

Version changed from 3.3 to branch

comment:3 Changed 13 years ago by cneumuel

Summary changed from Refactore Hive Project Structure to Hive-3.4 development

comment:4 Changed 13 years ago by cneumuel

Description modified (diff)

comment:5 Changed 13 years ago by ascheibe

Description modified (diff)

comment:6 Changed 13 years ago by cneumuel

Description modified (diff)

comment:7 Changed 13 years ago by ascheibe

Description modified (diff)

comment:8 Changed 13 years ago by ascheibe

Description modified (diff)

comment:9 Changed 13 years ago by cneumuel

Description modified (diff)

comment:10 Changed 13 years ago by cneumuel

Cc ascheibe added

comment:11 Changed 13 years ago by ascheibe

Description modified (diff)

comment:12 Changed 13 years ago by cneumuel

Description modified (diff)

comment:13 Changed 13 years ago by cneumuel

Description modified (diff)

comment:14 Changed 13 years ago by cneumuel

Description modified (diff)

comment:15 Changed 13 years ago by cneumuel

Description modified (diff)

comment:16 Changed 13 years ago by cneumuel

Description modified (diff)

comment:17 Changed 13 years ago by cneumuel

Description modified (diff)

comment:18 Changed 13 years ago by cneumuel

Description modified (diff)

comment:19 Changed 13 years ago by cneumuel

Description modified (diff)

comment:20 Changed 13 years ago by ascheibe

Description modified (diff)

comment:21 Changed 13 years ago by ascheibe

Description modified (diff)

comment:22 Changed 13 years ago by cneumuel

Description modified (diff)

comment:23 Changed 13 years ago by cneumuel

Description modified (diff)

comment:24 Changed 13 years ago by ascheibe

Description modified (diff)

comment:25 Changed 13 years ago by cneumuel

Description modified (diff)

comment:26 Changed 13 years ago by cneumuel

Description modified (diff)

comment:27 Changed 13 years ago by ascheibe

Description modified (diff)

comment:28 Changed 13 years ago by ascheibe

Description modified (diff)

comment:29 Changed 13 years ago by ascheibe

Description modified (diff)

comment:30 Changed 13 years ago by ascheibe

Description modified (diff)

comment:31 Changed 13 years ago by ascheibe

Description modified (diff)

comment:32 Changed 13 years ago by ascheibe

Description modified (diff)

comment:33 Changed 13 years ago by ascheibe

Description modified (diff)

comment:34 Changed 13 years ago by ascheibe

Description modified (diff)

comment:35 Changed 13 years ago by ascheibe

Description modified (diff)

r5633 added Appointment/Schedule ws and dao methods

comment:36 Changed 13 years ago by cneumuel

r5636

updated jobstates documentation
enhanced ganttChart
fixed setting of jobstates
added option to force lifecycle-trigger (mainly for testing purposes)

comment:37 Changed 13 years ago by cneumuel

Description modified (diff)

r5637 added treeview for hive jobs in experiment manager

Last edited 13 years ago by cneumuel (previous) (diff)

comment:38 Changed 13 years ago by ascheibe

r5638 worked on Administration UI

comment:39 Changed 13 years ago by cneumuel

r5675 improved treeview for hive jobs

Last edited 13 years ago by cneumuel (previous) (diff)

comment:40 Changed 13 years ago by ascheibe

Description modified (diff)

r5676 worked on Administration UI

comment:41 Changed 13 years ago by ascheibe

r5677 some minor ui fixes for slave

comment:42 Changed 13 years ago by cneumuel

r5708 changed the way transactions are handled

Last edited 13 years ago by cneumuel (previous) (diff)

comment:43 Changed 13 years ago by ascheibe

Description modified (diff)

r5711

use SlaveComm Endpoint from app.config
various further slave bugfixes/cleanups
added preliminary icon for hive slave ui and some slave ui improvements
added resource deletion to admin ui
fix service exception thrown if there is no EventLog

comment:44 Changed 13 years ago by cneumuel

Description modified (diff)

comment:45 Changed 13 years ago by cneumuel

Description modified (diff)

comment:46 Changed 13 years ago by cneumuel

r5718

fixed statelog when time on server differs from slave or client
fixed wrong creation of childjobs in experiment manager
made ganttchartview the default view for statelogs

comment:47 Changed 13 years ago by ascheibe

r5721 worked on slave and slave service installer

comment:48 Changed 13 years ago by ascheibe

Description modified (diff)

r5778

log uncaught exceptions to an eventlog if available
fixed job pause bug

comment:49 Changed 13 years ago by cneumuel

Description modified (diff)

r5779

implemented pause, stop for single jobs
introduced Command property for jobs (to distinguish between state and command (abort vs. aborted))
improved behaviour of ItemTreeView (double click opens new window, selected item stays marked)
fixed bugs in StateLogGanttChartListView and HiveJobView
fixed cloning of client-side dtos

comment:50 Changed 13 years ago by ascheibe

r5780 various improvements on the service installer and slave tray icon

comment:51 Changed 13 years ago by ascheibe

r5782

fixed job pause bug... again
general Executor improvements

comment:52 Changed 13 years ago by cneumuel

r5786

implemented correct numbering of BatchRuns
improvements in ExperimentManager
fixed bug in server (jobs were scheduled multiple times)
added exception handling for task in slave
improved timeout handling of jobs (LifecycleManager)

comment:53 Changed 13 years ago by cneumuel

Description modified (diff)

comment:54 Changed 13 years ago by cneumuel

r5787 made deleting and creating directories for PluginTemp more robust

comment:55 Changed 13 years ago by ascheibe

r5789

added autostart for tray icon to installer
machine unique id now includes the machine name
core: check if job already exists on slave
already finished jobs now fail and are sent back

comment:56 Changed 13 years ago by ascheibe

r5790 don't save the unique machine id

comment:57 Changed 13 years ago by cneumuel

Description modified (diff)

r5793

implemented correct downloading of paused jobs. it's now also possible to change parameters and resume an algorithm
removed Prepare() calls in ExperimentManager and in slave, as it prevents correct resuming of paused jobs
made events in ItemTreeView be invoked in the correct thread
reduced log output in ExperimentManager

comment:58 Changed 13 years ago by ascheibe

r5795 various slave and slave tray icon improvements

comment:59 Changed 13 years ago by cneumuel

r5797

ItemTreeView robustifications
compactified the layout in HiveJobView

comment:60 Changed 13 years ago by ascheibe

r5826 slave ui now receives status information and displays it in doughnut chart

comment:61 Changed 13 years ago by cneumuel

r5955

separated ExperimentMangerClient (OKB-Style, contains business logic) and HiveExperiment (mainly only contains information)
fixed redundant cloning methods in dtos
added simple statistics in HiveExperiment which the user can see before downloading an experiment
added db-delete cascade for slaves and statelogs - now slaves can be safely deleted

comment:62 Changed 13 years ago by cneumuel

r5958 initial port of HiveEngine

Last edited 13 years ago by cneumuel (previous) (diff)

comment:63 Changed 13 years ago by cneumuel

r6000 :)

added GetPlugin service method
fixed minor issues with double plugins in database
worked on HiveEngine
fixed wrong role name for Hive User
fixed bug in group assignment of slaves

comment:64 Changed 13 years ago by ascheibe

r6004

fix pause/stop bug when serializing big experiments
use proper newlines
use GetPlugin(..) instead of GetPlugins()

comment:65 Changed 13 years ago by cneumuel

r6006

changed relationship between Job and HiveExperiment. There is no more HiveExperiment.RootJobId, instead there is Job.HiveExperimentId.
one HiveExperiment can now have multiple Experiments.
TreeView supports multiple root nodes
HiveEngine creates a HiveExperiment for each set of jobs, so jobs cannot be without a parent experiment anymore (no more loose jobs)
updated ExperimentManager binaries

comment:66 Changed 13 years ago by ascheibe

r6008

increase timeout when sending (for sending large jobs/lots of plugins)
handle failed GetPluginDatas() properly

comment:67 Changed 13 years ago by gkronber

GetPluginDatas() is a strange identifier. Plural of data is data.

comment:68 Changed 13 years ago by cneumuel

r6033

created baseclass for jobs (ItemJob) from which OperatorJobs and EngineJobs derive
created special view for OptimizerJobs which derives from a more general view
removed logic from domain class HiveExperiment and moved it into RefreshableHiveExperiment
improved ItemTreeView
corrected plugin dependencies
fixed bug in database trigger when deleting HiveExperiments
added delete cascade for Plugin and PluginData
lots of fixes

comment:69 Changed 13 years ago by cneumuel

Description modified (diff)

comment:70 Changed 13 years ago by cneumuel

Description modified (diff)

comment:71 Changed 13 years ago by ascheibe

r6100

Executor now sends all exceptions to the ExperimentManager as NetNamedPipe communication won't be possible in a Sandbox due to security constraints
count stopped and aborted jobs correctly
send correct status when a job is stopped by the ExperimentManager
try to log unhandled exceptions to gui if no EventLog is available
don't crash if job is sent more than once by server

comment:72 Changed 13 years ago by ascheibe

Description modified (diff)

r6101

don't lock engines for so long in StartJobInAppDomain
move SlaveCommListener to ConsoleClient
delete orphaned job folders at startup

comment:73 Changed 13 years ago by ascheibe

r6107

simplify PreparePlugins
send more exceptions to ExperimentManager

comment:74 Changed 13 years ago by cneumuel

r6110

renamed engines to executors
changed locking in StartJobInAppDomain
avoid destruction of proxy object after 5 minutes for Slave.Core
added JobStarted event and fixed ExecutionStateChanged and ExecutionTimeChanged
slaves which are moved to another slavegroup will pause their jobs now, if they must not calculate them

comment:75 Changed 13 years ago by cneumuel

r6111 improved the way jobs are downloaded by ExperimentManager and HiveEngine

comment:76 Changed 13 years ago by ascheibe

r6112

HeartbeatManager: don't sleep while starting jobs
Executor: make Start() blocking
shutdown properly if an uncaught exception is thrown

comment:77 Changed 13 years ago by ascheibe

r6116

SlaveTrayIcon: don't try to kill TrayIcons from other users
split installer to fix config installer bug for users who did not run the installer

comment:78 Changed 13 years ago by ascheibe

r6166 forgot to check in HL icon for installers

comment:79 Changed 13 years ago by ascheibe

r6167

increased send/receive timeout
renamed hive binding name

comment:80 Changed 13 years ago by cneumuel

r6168

removed Job-dto objects from slave core (since it stores outdated objects)
added command textbox to HiveJobView
improved the way the control buttons behave in HiveJobView
improved job control (pause and stop is also possible when job is not currently calculating)
improved gantt chart view (last state log entry is also displayed)
unified code for downloading jobs between experiment manager and hive engine

comment:81 Changed 13 years ago by ascheibe

r6175 temporary switch to privileged sandboxing until communication between core and executor works with sandboxing

comment:82 Changed 13 years ago by cneumuel

r6178

added semaphores to ensure an appdomain is never unloaded when the start method has not finished
HiveEngine uploading and downloading of jobs works and is displayed in the view

comment:83 Changed 13 years ago by ascheibe

r6203

dropped dependency of Core from Executor
enabled sandboxing
moved most parts of Job handling from Core to SlaveJob to simplify locking
optimized how UsedCores is handled
SlaveStatusInfo is now thread-safe and counts jobs more correctly

comment:84 Changed 13 years ago by ascheibe

r6204 don't crash on shutdown

comment:85 Changed 13 years ago by cneumuel

r6212 created HiveEngine.Views plugin

comment:86 Changed 13 years ago by ascheibe

r6216

make UsedCores more reliable
some cosmetic fixes

comment:87 Changed 13 years ago by cneumuel

r6219 improved exception handling for hive experiments

comment:88 Changed 13 years ago by ascheibe

r6225

Slave UI now uses tab pages
balloon tips are displayed on receiving new jobs

comment:89 Changed 13 years ago by cneumuel

Description modified (diff)

r6229

added basic statistics recording (once per minute) for
- executiontime per user
- usedcores, usedmemory per slave

comment:90 Changed 13 years ago by ascheibe

r6230

don't set every view as default in slave ui
fixed bug in PluginCache where files got accessed by multiple threads

comment:91 Changed 13 years ago by ascheibe

r6248

don't set job failed if JobNotFoundException is thrown
disable AboutView for all items
avoid NullRefException in SendFinishedJob

comment:92 Changed 13 years ago by ascheibe

r6257

added UAC self elevation for start/stop of windows service
added slave states and simplified ui commands

comment:93 Changed 13 years ago by ascheibe

r6263

added view for displaying jobs
improved slave ui

comment:94 Changed 13 years ago by ascheibe

Description modified (diff)

comment:95 Changed 13 years ago by cneumuel

Description modified (diff)

r6267

extended statistics recording:
- execution times of users are captured
- execution times and start-to-finish times of finished jobs are captured (to compute hive overhead)
- data of deleted jobs is automatically captured in DeletedJobStatistics
changed ExecutionTime type in database from string to float (milliseconds are stored instead of TimeSpan.ToString())
added IsPrivileged field to job to indicate if it should be executed in a privileged sandbox
added CpuUtilization field to slave to be able to report cpu utilization
added GetJobsByResourceId to retrieve all jobs which are currently being calculated in a slave(-group)
TransactionManager now allows to use serializable transactions (used for lifecycle trigger)

comment:96 Changed 13 years ago by cneumuel

Description modified (diff)

r6269 added CpuUtilization to heartbeats

comment:97 Changed 13 years ago by cneumuel

r6357

refactoring of slave core
created JobManager, which is responsible for managing jobs without knowing anything about the service. this class is easier to test than slave core
lots of cleanup
created console test project for slave

comment:98 Changed 13 years ago by cneumuel

r6362 changed roles authentication to use an AuthenticationManager instead of method attributes. this makes unit tests easier.

comment:99 Changed 13 years ago by cneumuel

Description modified (diff)

r6369

added consideration of appointments in heartbeats
code cleanup

comment:100 Changed 13 years ago by ascheibe

Description modified (diff)

r6371

code cleanups for slave review
added switch between privileged and unprivileged sandbox
removed childjob management because it's not used

comment:101 Changed 13 years ago by ascheibe

r6372 changed year to 2011

comment:102 Changed 13 years ago by cneumuel

r6373

moved ExperimentManager into separate plugin
moved Administration into separate plugin

comment:103 Changed 13 years ago by cneumuel

r6381

locking for childHiveJobs in OptimizerHiveJob to avoid multi-threaded access issues
added IsPrivileged to gui
minor changes

comment:104 Changed 13 years ago by ascheibe

r6407

implemented usage of checksums for comparing assemblies
re-added CreateHiveDatabaseApplication.cs to project
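
Comparing assemblies by checksum can be sketched in Python as below. The commit message does not say which hash function is used, so SHA-256 and the helper names are assumptions for illustration only.

```python
import hashlib
from typing import Optional


def checksum(data: bytes) -> str:
    # SHA-256 is an assumption; the ticket does not name the hash function.
    return hashlib.sha256(data).hexdigest()


def needs_download(local_assembly: Optional[bytes], server_checksum: str) -> bool:
    """True if the assembly is missing locally or differs from the server copy."""
    if local_assembly is None:
        return True
    return checksum(local_assembly) != server_checksum
```

A modified assembly with an unchanged name and version then still yields a different checksum and is downloaded again.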

comment:105 Changed 13 years ago by abeham

r6418

fixed references to absolute path references

comment:106 Changed 13 years ago by cneumuel

r6419, r6420

created events when statelog changed
fixed memory leak in hiveengine
extended timeout for long running transactions and database contexts (when jobdata is stored)
replaced random guids in database with sequential guids for performance reasons
minor fixes and cleanups
updated hive binaries
updated statistics

comment:107 Changed 13 years ago by abeham

r6422

synchronized config file with that from trunk

comment:108 Changed 13 years ago by abeham

Description modified (diff)

Added TODO point regarding the deletion of jobs in the experiment manager

comment:109 Changed 13 years ago by ascheibe

r6426 removed useLocalPlugins

comment:110 Changed 13 years ago by cneumuel

r6431 - applied some review comments

General:

changed Log to ThreadSafeLog
added license information to all files
added assembly descriptions
using blocks before namespace

HeuristicLab.Services.Hive.DataAccess:

made TransactionManager static
removed DaoException
removed TimeSpanExtensions
renamed prepareHiveDatabase.sql to Prepare Hive Database.sql
created Initialize Hive Database.sql

comment:111 Changed 13 years ago by cneumuel

r6435

some cleanup in HiveEngine
using ThreadSafeLog instead of synchronized methods

comment:112 Changed 13 years ago by ascheibe

r6437 Admin UI:

some bugfixes
removed dummy stuff

comment:113 Changed 13 years ago by cneumuel

r6444

stability improvements for HiveExperiment and HiveEngine
parallelized upload of jobs
enabled cancellation of job upload
reduced the amount of double-assignment of jobs by an additional check in HeartbeatManager
tried to tackle the amount of deadlocks by automatically rerunning transactions
some fixes
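
Automatically rerunning a transaction after a deadlock can be sketched in Python as follows. `DeadlockError`, `run_with_retries` and the attempt limit are hypothetical; the sketch only shows the retry pattern, not the server's TransactionManager.

```python
class DeadlockError(Exception):
    """Hypothetical error raised when the database picks this transaction as deadlock victim."""


def run_with_retries(transaction, max_attempts=3):
    """Run transaction(); rerun it on DeadlockError, up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and report the deadlock to the caller
```

The transaction body has to be safe to run more than once, since a failed attempt is rolled back and repeated from the start.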

comment:114 Changed 13 years ago by cneumuel

Description modified (diff)

comment:115 Changed 13 years ago by cneumuel

Description modified (diff)

comment:116 Changed 13 years ago by cneumuel

Description modified (diff)

comment:117 Changed 13 years ago by ascheibe

r6451 Admin UI:

added subgroups
groups can now have calendars
calendar bugfixes

comment:118 Changed 13 years ago by cneumuel

Description modified (diff)

r6452

renamed UptimeCalendar and Appointment to Downtime
added service methods to delete plugins and get plugin by hash
reverted TransactionManager change: made it non-static and added interface
moved magic numbers to application settings

comment:119 Changed 13 years ago by ascheibe

r6456

fixed Admin Views plugin dependencies
use settings instead of magic numbers and strings

comment:120 Changed 13 years ago by cneumuel

r6457

added methods for granting and revoking hive experiment permissions
added unit tests for hive experiment permissions
added a status webpage to show some statistics and the current status

comment:121 Changed 13 years ago by cneumuel

r6458 visualization of statistics on status page (requires MS charting controls) (http://services.heuristiclab.com/Hive-3.4/Status.aspx)

comment:122 Changed 13 years ago by cneumuel

r6463

created user interface for experiment sharing
created UserManager which provides access to the users
inserted a lot of security and authorization checks serverside
minor fixes in experiment manager

comment:123 Changed 13 years ago by ascheibe

r6464

some Admin UI bugfixes

Slave:

fixed bug when Pause is called immediately after Calculate
send exceptions when something goes wrong in Pause or Stop

comment:124 Changed 13 years ago by cneumuel

r6465 show owner of experiments in listview

comment:125 Changed 13 years ago by cneumuel

r6479

finished experiment sharing
added role for executing privileged jobs
refreshing experiments in experimentManager does not delete already downloaded jobs
moved some properties from HiveExperiment into RefreshableHiveExperiment

comment:126 Changed 13 years ago by cneumuel

r6481

added web.config from services
added some help-sql scripts

comment:127 Changed 13 years ago by cneumuel

Description modified (diff)

comment:128 Changed 13 years ago by ascheibe

r6492 Admin UI:

don't completely rebuild treeview on drag and drop
some bugfixes

comment:129 Changed 13 years ago by ascheibe

r6521 catch exception when querying execution times

comment:130 Changed 13 years ago by ascheibe

r6546: Don't call Clear() on ThreadSafeLog in log_MessageAdded. This doesn't work with the changes made in r6536 (LockRecursionPolicy.SupportsRecursion). Instead use maxLogCount of Core.Log to limit the number of log messages kept in memory.

comment:131 Changed 13 years ago by cneumuel

Solved problems with large jobs (>15MB) by adding the following node to the <requestFiltering> section in %windir%\System32\inetsrv\config\applicationHost.config on the server:

<requestLimits maxAllowedContentLength="2147483647" />

Limit should theoretically be 2GB, tested with 150MB.

source: http://cutesoft.net/forums/thread/42292.aspx

Last edited 13 years ago by cneumuel (previous) (diff)

comment:132 Changed 13 years ago by ascheibe

r6595 removed unused config section

comment:133 Changed 13 years ago by ascheibe

r6683 replaced the CreateHiveDatabaseApplication HL App with a Windows Forms application

comment:134 Changed 13 years ago by ascheibe

r6688 some renaming to be more consistent with OKB

comment:135 Changed 13 years ago by ascheibe

r6689 some more renaming to be more consistent with OKB

comment:136 Changed 13 years ago by ascheibe

r6696

some cleanups
removed unused code

comment:137 Changed 13 years ago by ascheibe

Description modified (diff)

r6698

implemented review comments
more cleanups

comment:138 Changed 13 years ago by ascheibe

r6700 changed version in project files, AssemblyInfo and plugin files back to 3.3

comment:139 Changed 13 years ago by ascheibe

r6701 renamed folders from 3.4 to 3.3

comment:140 Changed 13 years ago by ascheibe

r6703

really removed 3.4 folders
added skeleton for a Slave HL App
added missing license headers and AssemblyInfo frames
fixed merging of config files

comment:141 Changed 13 years ago by ascheibe

r6704 added plugin frame file to Slave HL App

comment:142 Changed 13 years ago by ascheibe

r6712

got db scripts up-to-date
renamed db related stuff back to 3.3
fixed a bug in the Status page that occurred when the db is empty

comment:143 Changed 13 years ago by ascheibe

r6717

moved DTO's to Services.Hive project
removed Services.Hive.Common project
some cleanups
added DTO's for enums

comment:144 Changed 13 years ago by ascheibe

Description modified (diff)

comment:145 Changed 13 years ago by ascheibe

Description modified (diff)

comment:146 Changed 13 years ago by ascheibe

Description modified (diff)

comment:147 Changed 13 years ago by ascheibe

r6721 Review comments: renamed Job to Task

comment:148 Changed 13 years ago by ascheibe

r6722 deleted Services.Hive.Common and DBCreator project as they are not needed anymore

comment:149 Changed 13 years ago by ascheibe

r6723 Review comments: renamed HiveExperiment to Job

comment:150 Changed 13 years ago by ascheibe

Description modified (diff)

comment:151 Changed 13 years ago by ascheibe

r6725 more renaming for more consistency

comment:152 Changed 13 years ago by ascheibe

r6727 the last bunch of renames, hopefully

comment:153 Changed 13 years ago by ascheibe

r6730 added Hive Slave HL App client

comment:154 Changed 13 years ago by ascheibe

r6731 updated plugin dependencies and cleaned project references

comment:155 Changed 13 years ago by ascheibe

r6734 some minor improvements in the Slave UI and Administrator UI

comment:156 Changed 13 years ago by ascheibe

r6743

fixed a bug in the Slave UI
finished renaming Webservice and Dao methods to be consistent with Job/Task naming
some cosmetic changes and project dependencies cleanups

comment:157 Changed 13 years ago by ascheibe

r6744

renamed last couple of folders
fixed an installer bug
now with more license headers and less magic numbers

comment:158 Changed 13 years ago by ascheibe

r6747 updated urls to Hive-3.3 in config files

comment:159 Changed 13 years ago by ascheibe

r6756 adapted Administrator UI to be more like OKB

comment:160 Changed 13 years ago by ascheibe

r6757 deleted unused files

comment:161 Changed 13 years ago by ascheibe

r6761 finished refactoring Hive Administrator UI

comment:162 Changed 13 years ago by ascheibe

r6762 removed UpdateControl, it's no longer used

comment:163 Changed 13 years ago by ascheibe

r6764

fixed a bug in the hive server where it kept sending PauseAll commands
pop up the slave when a user clicks the balloon tip
set width of jobsView columns correctly

comment:164 Changed 13 years ago by cneumuel

Owner changed from cneumuel to ascheibe
Status changed from accepted to assigned

comment:165 Changed 13 years ago by ascheibe

r6768 removed unused fields from Job

comment:166 Changed 13 years ago by ascheibe

Description modified (diff)

comment:167 Changed 13 years ago by ascheibe

r6791 implemented Experiment Manager review comments

comment:168 Changed 13 years ago by ascheibe

r6792 Hive Experiment Manager is now called Hive Job Manager

comment:169 Changed 13 years ago by ascheibe

r6793 renamed Experiment Manager Folder

comment:170 Changed 13 years ago by ascheibe

r6823

implemented administrator ui review comments
implemented slave ui review comments

comment:171 Changed 13 years ago by ascheibe

r6834

implemented last couple of slave ui review comments
added global runs view to the job manager
some minor ui improvements

comment:172 Changed 13 years ago by ascheibe

r6863 adapted code to new ThreadSafeLog

comment:173 Changed 13 years ago by ascheibe

r6864 show message boxes instead of exceptions if the username or password is wrong

comment:174 Changed 13 years ago by ascheibe

r6872

updated project settings to work with the trunk restructuring changes
added HowToCompile.txt with instructions on how to compile hive

comment:175 Changed 13 years ago by ascheibe

r6873 removed DayView project because it's already in trunk

comment:176 Changed 13 years ago by ascheibe

r6886

updated server and client configuration files
removed outdated binaries

comment:177 Changed 13 years ago by ascheibe

r6892 fixed project references of test projects

comment:178 Changed 13 years ago by ascheibe

r6893 server can now control the slave heartbeat interval

comment:179 Changed 13 years ago by abeham

r6894: fixed reference to Calendar.DayView

comment:180 Changed 13 years ago by ascheibe

r6896 adapted relative assembly references and output paths to reflect repository structure

comment:181 Changed 13 years ago by ascheibe

r6897

fixed bug that occurred when setting the Heartbeat Interval for a group of slaves
slave now reports HbInterval correctly on first start

comment:182 Changed 13 years ago by ascheibe

r6898

added Slave installation instructions
updated names of the windows service, event log, etc.

comment:183 Changed 13 years ago by ascheibe

r6899 fixed a small graphical glitch in the administrator ui

comment:184 Changed 13 years ago by ascheibe

r6904 removed unnecessary dependencies

comment:185 Changed 13 years ago by ascheibe

r6905 added binaries for the Job Manager for HeuristicLab 3.3.5

comment:186 Changed 13 years ago by ascheibe

r6906 added Job Manager only configuration file

comment:187 Changed 13 years ago by ascheibe

r6910 fixed a bug where the slave didn't report the cpu utilization correctly

comment:188 Changed 13 years ago by abeham

A run collection view (e.g. box plot) opened on the run collection of a hive job that is still receiving runs through auto-refresh is not updating when new runs are available.

comment:189 Changed 13 years ago by ascheibe

r6924 fixed bug in PermissionView in JobManager

comment:190 Changed 13 years ago by ascheibe

r6925 fixed bug in PermissionView "bugfix"

comment:191 Changed 13 years ago by ascheibe

r6926 updated jobmanager binaries

comment:192 Changed 13 years ago by ascheibe

r6940

fixed naming of binding configurations
fixed small bug in Status page

comment:193 Changed 13 years ago by ascheibe

r6941

corrected total execution time in hive shown on status page
Job Manager: don't poll already finished jobs

comment:194 Changed 13 years ago by ascheibe

r6943 fixed a small bug in the Job Manager and the Administrator UI

comment:195 Changed 13 years ago by ascheibe

r6945 slave: catch more errors and log them to the windows event log

comment:196 Changed 13 years ago by ascheibe

r6946

disable logging of the user statistics on the server because of high run time demands
also show the timestamps of arrived messages in the slave UI

comment:197 Changed 12 years ago by ascheibe

r6958 fixed an arithmetic overflow when gathering statistics on job deletion

comment:198 Changed 12 years ago by ascheibe

r6971 prevent appdomain leases from timing out if communication with server is interrupted

comment:199 Changed 12 years ago by ascheibe

r6972 added indices to DB script

comment:200 Changed 12 years ago by ascheibe

r6984

The Hive Engine probably won't make it into 3.3.6, so I moved it to the MetaOpt branch because the Hive-3.4 branch should not be used anymore. The Hive Engine will then be released together with MetaOpt.
Updated MetaOpt to compile to new trunk binary directory and reference assemblies in this folder.

comment:201 Changed 12 years ago by ascheibe

Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.6
Status changed from assigned to reviewing

comment:202 Changed 12 years ago by ascheibe

Status changed from reviewing to readytorelease

I'm setting this ticket to readytorelease because Hive is now in trunk. The list of missing features can be found here. The Hive integration in trunk is tracked with ticket #1672.

comment:203 Changed 12 years ago by abeham

Version changed from branch to 3.3.5

comment:204 Changed 12 years ago by swagner

Resolution set to done
Status changed from readytorelease to closed
Version changed from 3.3.5 to 3.3.6
