Opened 13 years ago
Closed 13 years ago
#1672 closed enhancement (done)
Hive trunk integration
Reported by: | ascheibe | Owned by: | ascheibe |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.6 |
Component: | Hive.General | Version: | 3.3.6 |
Keywords: | | Cc: | |
Description
This ticket is for tracking the trunk integration of HeuristicLab Hive (#1233).
Change History (68)
comment:1 Changed 13 years ago by ascheibe
- Status changed from new to assigned
comment:2 Changed 13 years ago by ascheibe
comment:3 Changed 13 years ago by ascheibe
- removed unused files
- added missing license headers
comment:4 Changed 13 years ago by ascheibe
r6979 fixed plugin dependencies
comment:5 Changed 13 years ago by ascheibe
- added the Hive Services and Slave projects
- added missing svn ignores
comment:6 Changed 13 years ago by ascheibe
r6985 added documentation for Hive
comment:7 Changed 13 years ago by ascheibe
- removed unused files
- changed the plugin cache path of the Slave HL App so that HL doesn't discover Hive assemblies
- cleaned up config files
- incremented version number of installers to 3.3.6
- removed Execution time on Hive from Status page because it can't be calculated without the user statistics
comment:8 Changed 13 years ago by ascheibe
- removed dead code
- added Hive assembly references to Tests project
- fixed problems found by tests
comment:9 Changed 13 years ago by ascheibe
r6995 fixed typo which led to false assembly paths in the Slave App
comment:10 Changed 13 years ago by ascheibe
r6997 added missing invoke
comment:11 Changed 13 years ago by ascheibe
r6998 Changed again how plugin discovery works because of Hive. The plugin and working directories must be movable away from the original slave working directory. This is needed for the Slave App and, in the future, for the Windows service, because we don't want it to run as the LocalSystem user. I have removed setting the PrivateBinPath and am now setting the ApplicationBase instead. This doesn't affect HL (because ApplicationBase is set to pluginDir by default anyway) but makes Hive work. Setting the PrivateBinPath doesn't work with moved plugin and working directories because (from MSDN): "Private assemblies are deployed in the same directory structure as the application. If the directories specified for PrivateBinPath are not under ApplicationBase, they are ignored."
comment:12 Changed 13 years ago by ascheibe
r7009 moved the Hive services parts to services project as suggested by swagner
comment:13 Changed 13 years ago by ascheibe
- Owner changed from ascheibe to abeham
- Status changed from assigned to reviewing
comment:14 Changed 13 years ago by ascheibe
r7014 increased heartbeat (HB) interval to 20 seconds
comment:15 Changed 13 years ago by ascheibe
- increased max. object graph size which can be serialized to allow downloading of big jobs
- removed more magic numbers
- increased job polling interval
comment:16 Changed 13 years ago by ascheibe
r7029 fixed a small bug when refreshing permissions
comment:17 Changed 13 years ago by ascheibe
r7032 reverted change that exceptions are thrown in the job manager
comment:18 Changed 13 years ago by ascheibe
r7045 switched more service calls to IsolationLevel=ReadUncommitted transactions to prevent DB deadlocks
comment:19 Changed 13 years ago by ascheibe
r7046 added Hive and Benchmarking project dependencies to HeuristicLab-3.3 project
comment:20 Changed 13 years ago by ascheibe
- use transactions for status page
- removed speed-up charts because they only work with user statistics
comment:21 Changed 13 years ago by ascheibe
- removed duplicate events
- got rid of compiler warnings
comment:22 Changed 13 years ago by ascheibe
- disabled drag and drop for Hive Jobs
- fixed adding and deleting of Hive Tasks
- show Statelog after a Job is downloaded
comment:23 Changed 13 years ago by ascheibe
- Permissions can now be deleted
- fixed overlay icons for permissions
- fixed overlay icons in job list
comment:24 Changed 13 years ago by ascheibe
- allow drag and drop only for new jobs
- prepare optimizers on drag and drop
comment:25 Changed 13 years ago by ascheibe
- added a default job name
- added check that a job name is set before upload
comment:26 Changed 13 years ago by ascheibe
r7078 added missing invoke
comment:27 Changed 13 years ago by ascheibe
r7103 cleaned up namespaces of the jobmanager
comment:28 Changed 13 years ago by ascheibe
r7104 fixed graphical glitches and corrected tab order in the job manager
comment:29 Changed 13 years ago by abeham
Are there any more changes coming to this ticket? If so, please take it again, it's currently in reviewing state.
comment:30 Changed 13 years ago by ascheibe
- Owner changed from abeham to ascheibe
- Status changed from reviewing to assigned
comment:31 Changed 13 years ago by ascheibe
- speed up download of tasks by avoiding unnecessary service calls
- display download progress correctly
comment:32 Changed 13 years ago by ascheibe
- removed magic numbers for upload retries
- speed up job downloading by placing deserializing/downloading semaphores correctly
- increased max. number of parallel downloads/deserializations
- added more status messages when downloading to make it clearer what's actually happening
- renamed some variables
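As a rough illustration of the semaphore placement mentioned above, the following Python sketch bounds downloads and deserializations with two separate semaphores, so that a slow deserialization never holds a download slot. It is an analogy only: the Hive client is written in C#, and every name here (`download_task`, `deserialize_task`, `fetch_all`, the two limits) is hypothetical rather than taken from the HeuristicLab source.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical limits; the real values live in the Hive client code.
MAX_PARALLEL_DOWNLOADS = 2
MAX_PARALLEL_DESERIALIZATIONS = 2

download_slots = threading.Semaphore(MAX_PARALLEL_DOWNLOADS)
deserialize_slots = threading.Semaphore(MAX_PARALLEL_DESERIALIZATIONS)


def download_task(task_id):
    """Stand-in for a service call that returns serialized task data."""
    return f"payload-{task_id}".encode()


def deserialize_task(raw):
    """Stand-in for turning the serialized bytes back into an object."""
    return raw.decode()


def fetch(task_id):
    # Hold a download slot only while downloading ...
    with download_slots:
        raw = download_task(task_id)
    # ... and a deserialization slot only while deserializing.
    with deserialize_slots:
        return deserialize_task(raw)


def fetch_all(task_ids):
    # More workers than slots: the semaphores, not the pool, set the limits.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(fetch, task_ids))
```

Releasing the download semaphore before acquiring the deserialization semaphore is the point of the sketch: wrapping both steps in one semaphore would let CPU-bound deserialization throttle network-bound downloads.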
comment:33 Changed 13 years ago by ascheibe
r7131 try to stop the slave service before uninstalling it
comment:34 Changed 13 years ago by ascheibe
- Owner changed from ascheibe to abeham
- Status changed from assigned to reviewing
- fixed name of slave windows service
reviewing comments:
- reduced MaxParallelDownloads to 2 so as not to completely overload the CPUs
- renamed ServiceLocator to HiveServiceLocator
comment:35 Changed 13 years ago by abeham
- Added missing configurations to Clients.Hive-3.3 project
comment:36 Changed 13 years ago by ascheibe
r7135 some fixes for the slave tray ui:
- removed some more magic numbers
- fixed reconnecting to the Windows service when it was stopped
- added more time for stopping/starting the Windows service so that no exception is thrown because of a timeout
comment:37 Changed 13 years ago by abeham
EngineHiveTask:
- please clean up the GetAsTaskData method, there's an uncommented lock. Is it necessary, and why was it there in the first place?
OptimizerHiveTask:
- GetNewRunName and GetRunNumber: idx will be 3 and not -1 if the string cannot be found
TaskData:
- is contained in the file JobData.cs, should be renamed to TaskData.cs
PersistenceUtil:
- would be nicer to use the using pattern for MemoryStream
JobResultPoller:
- Why is stopRequested a property: private bool stopRequested { get; set; }?
There are some singletons (e.g. HiveClient) which are not fully thread-safe (although for HiveClient this is probably not a problem). The MSDN implementation pattern for singletons suggests using either double-checked locking or static initialization (private static readonly HiveClient instance = new HiveClient();). Interestingly, the article states that the statically initialized instance is also created lazily, because the field is private and accessed only from Instance.
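For readers unfamiliar with double-checked locking, here is a minimal sketch of the idea. It is written in Python purely as an analogy to the C# pattern referred to above; `HiveClientSketch` and its `instance` method are hypothetical names, not the HeuristicLab implementation.

```python
import threading


class HiveClientSketch:
    """Hypothetical stand-in for a singleton such as HiveClient."""

    _instance = None
    _lock = threading.Lock()

    @classmethod
    def instance(cls):
        # First check without the lock keeps the common path cheap.
        if cls._instance is None:
            with cls._lock:
                # Second check: another thread may have created the
                # instance while this one was waiting for the lock.
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance
```

The second check inside the lock is what prevents two threads that both passed the first check from each creating an instance.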
I'm a bit confused that the HiveServiceLocator.Instance property provides a public setter!?
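The idx remark on OptimizerHiveTask can be reproduced in a few lines. The sketch below assumes the code adds a fixed offset of 4 to the result of a substring search, which would explain the value 3; that is a guess and not the actual HeuristicLab source, and the marker string and function names are hypothetical. Python's `str.find`, like .NET's `String.IndexOf`, returns -1 when the substring is missing.

```python
MARKER = "Run "  # hypothetical marker; 4 characters long


def run_number_buggy(name):
    # find() returns -1 when MARKER is absent, so idx becomes 3, not -1,
    # and the "not found" check below never fires.
    idx = name.find(MARKER) + len(MARKER)
    if idx == -1:
        return None
    return name[idx:]


def run_number_fixed(name):
    # Test the raw search result before adding the offset.
    pos = name.find(MARKER)
    if pos == -1:
        return None
    return name[pos + len(MARKER):]
```

With a name that lacks the marker, the buggy variant silently returns a slice starting at index 3, while the fixed variant reports that nothing was found.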
comment:38 Changed 13 years ago by ascheibe
r7142 implemented reviewing comments
comment:39 Changed 13 years ago by ascheibe
Thanks for your comments. Concerning the EngineHiveTask: I don't think the locks are necessary. The GetAsTaskData is only used when uploading tasks and one task is only uploaded once, so this should be ok.
comment:40 Changed 13 years ago by ascheibe
r7144 stop result polling before deleting a job
comment:41 Changed 13 years ago by ascheibe
r7146 admin ui
- improved treeview
- added more icons and tooltips
comment:42 Changed 13 years ago by ascheibe
comment:43 Changed 13 years ago by ascheibe
r7156 fixed starting, pausing and stopping of jobs and tasks
comment:44 Changed 13 years ago by ascheibe
r7157 improved event logging in the Hive service
comment:45 Changed 13 years ago by ascheibe
r7158 another event logging fix
comment:46 Changed 13 years ago by abeham
I just tested downloading a running job (1250 tasks). The download worked, but the JobManager never left the populating / updating display state. I then switched the view to another job and back to the just-downloaded job: the application locked up without using any CPU. I gave you permission to view the job; it's called "QAPLIB SA parameter range (lipa20b - tai50b)". I'm pretty sure it'll still be running tomorrow, so you can test for yourself.
comment:47 Changed 13 years ago by ascheibe
r7162 throw exceptions in Job Manager so that we can see if something went wrong
comment:48 Changed 13 years ago by ascheibe
r7164 set taskDataInvalid when deserializing fails
comment:49 Changed 13 years ago by ascheibe
r7165 build the task tree first and then display it. This should be lighter on the CPU.
comment:50 Changed 13 years ago by ascheibe
- don't serialize the results 2 times before uploading
- made slave a little bit more robust
comment:51 Changed 13 years ago by ascheibe
r7171 communication with ui should be more stable now
comment:52 Changed 13 years ago by ascheibe
r7177 fixed setting of priorities in the Job Manager
comment:53 Changed 13 years ago by ascheibe
r7178 disabled checking if there are parent tasks which should be calculated because this case doesn't exist at the moment
comment:54 Changed 13 years ago by ascheibe
r7182 switched to HeuristicLab Log and LogView for the Slave UI
comment:55 Changed 13 years ago by ascheibe
- possible fix for the slave hang problem: don't host the service on the thread it was created on
- added a trigger for deleting slavestatistics when statistics are deleted
comment:56 Changed 13 years ago by ascheibe
- increased times between life cycles on the server
- some smaller performance improvements on the server
comment:57 Changed 13 years ago by ascheibe
- increased timeout for slaves/tasks to 3 minutes
- moved the cleanup functionality to its own Windows service. This will hopefully increase performance because it was done within the heartbeat calls up until now.
comment:58 Changed 13 years ago by ascheibe
- fixed a typo
- increased transferring timeout
comment:59 Changed 13 years ago by ascheibe
r7192 some small job manager ui fixes
comment:60 Changed 13 years ago by ascheibe
- tooltip now shows the name and the id of a task
- the date is now shown on the x-axis for runs spanning multiple days
comment:61 Changed 13 years ago by abeham
Deleting a job while it's being downloaded causes HL to crash
comment:62 Changed 13 years ago by ascheibe
Thanks for the bug report!
r7200: don't allow deleting jobs while they are uploading/downloading
comment:63 Changed 13 years ago by ascheibe
r7217 fixed compiler warning
comment:64 Changed 13 years ago by ascheibe
r7218 renamed some jobs to tasks
comment:65 Changed 13 years ago by ascheibe
r7219 renamed wrongly named folder
comment:66 Changed 13 years ago by ascheibe
r7222 don't crash if there are no child hive tasks
comment:67 Changed 13 years ago by abeham
- Owner changed from abeham to ascheibe
- Status changed from reviewing to readytorelease
The last test with several computers joining was successful and no further bugs have been identified.
comment:68 Changed 13 years ago by swagner
- Resolution set to done
- Status changed from readytorelease to closed
- Version changed from 3.3.5 to 3.3.6
r6976 integrate the Hive client projects into trunk (Hive Job Manager and Administrator)