Still Checking the Forum Out
Registered: 1452248375 Posts: 2
#1
Hi fellow members,

I have a question about how you handle a problem we have started to see since increasing our patching levels. I am sure someone out there has run into something similar.

Historically we patched our most critical servers manually, which took a long time. With zero-day attacks and ransomware becoming more common, a decision was made to ensure WSUS patches are applied within a reasonable time across our whole server/client estate. We do this using WSUS and other third-party patching tools, which install updates and reboot servers early on a Monday morning (between 2am and 5am).

Since introducing this we have run into some problems. The main one is that when servers are rebooted automatically as part of patching, an application that relies on a database does not always make a successful connection if the application server was rebooted before the database server. In addition, if services don't start up correctly after a reboot, the application fails for the end user, and we sometimes need to start those services manually even though they should start automatically following a reboot.

We use VMware and use vApps to tie together applications, database servers and other related back-end servers, but this only helps when the reboot is part of planned maintenance on the vSphere farm. It doesn't help when the reboots happen as part of patching or Windows/application updates. I know there is a VM monitoring option within VMware, but I am unsure whether it would help in this situation or whether it could cause more issues.

We need to find a way to control the order in which certain highly critical services start up after a patching reboot, otherwise we will need to go back to manual patching, which is something we don't really want to do. Does anyone else have this issue? If so, do you have any ideas on how we could address it?
So far, our best option for easily identifying when services are down is either to put our NOC in the DMZ and adjust the firewall rules so that this server can detect services that are down in the internal network, or to invest in a cloud NOC offering and monitor that way. Any suggestions would be most appreciated. Thanks
Registered: 1454887308 Posts: 598
#2
The same problem with VMware & reboots affects workstations as well as servers. I had a custom machine built with Win 10 Pro installed with the intention of using it for 3 or 4 VMs. I paid $1,500 & I'm sorry that I bought it. It became one big pain with the updates & reboots. I'm also looking for a solution.
Registered: 1451939938 Posts: 103
#3
My perspective on this is a little limited since I've focused my career on just Microsoft SharePoint for the past few years. But it is relevant because for all of our production and the bigger non-prod systems we have, we do a SQL server back-end with separate servers for the SharePoint services (web front-end, app, search, etc.).

So in my world ... I basically let all the automated patching happen if it's a non-prod system, then I just come back and make sure that the SharePoint server is rebooted cleanly one last time after the SQL server has done its maintenance. For PROD, I'm still being a bit overcautious ... I let the automated systems (we're using SCCM) patch the SharePoint servers but not the SQL server. The automated patching happens in the early a.m., and then in the late afternoon I come back and power all the SharePoint servers down, patch the SQL server manually, and power the SharePoint servers back up.

I have to say I've been pleasantly surprised that with Win2K8 R2, SharePoint 2010, SQL 2012, and SCCM, none of our non-prod environments has really had a problem with just allowing SCCM to do all the patching, SQL server included. I may eventually let PROD go that way too, but for now I feel better doing it the hard way.

Soo ... the best suggestion I have is to get your SQL servers all patched around the same time, then have the patching happen on your paired systems afterwards, so that their reboots happen after SQL is up and stable. Then, if there are still problems, script a final reboot to happen after your patching is finished. Rebooting fixes everything! (Sarcastic, but sadly true much of the time!)
Registered: 1451592353 Posts: 279
#4
Well, you can certainly orchestrate the whole process using whatever automation tools you have at hand. If you happen to have licensed vSphere Orchestrator, you can use that; otherwise just use a PowerShell script run from a machine unaffected by the process.
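To give an idea of what such an orchestration script could look like, here is a rough sketch. It is written in Python rather than PowerShell purely for illustration, and it assumes each tier has a TCP port that signals readiness (for example 1433 for SQL Server). The `Tier` structure, the host names and the `start` callback are hypothetical placeholders, not part of any VMware or Microsoft API.

```python
import socket
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tier:
    """One layer of the application stack (hypothetical structure)."""
    name: str
    host: str
    port: int  # TCP port assumed to indicate readiness, e.g. 1433 for SQL Server


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_up(tier: Tier, deadline_s: float, interval_s: float = 5.0,
                  probe: Callable[[str, int], bool] = port_is_open) -> bool:
    """Poll the tier's port until it answers or deadline_s seconds have passed."""
    end = time.monotonic() + deadline_s
    while True:
        if probe(tier.host, tier.port):
            return True
        if time.monotonic() >= end:
            return False
        time.sleep(interval_s)


def start_in_order(tiers: List[Tier], start: Callable[[Tier], None],
                   deadline_s: float = 600.0, interval_s: float = 5.0,
                   probe: Callable[[str, int], bool] = port_is_open) -> List[str]:
    """Start each tier only after the previous one is reachable.

    Returns the names of the tiers confirmed up. Stops at the first tier
    that does not come up, so dependants are not started against a dead
    back end.
    """
    confirmed = []
    for tier in tiers:
        start(tier)  # site-specific: restart the VM, start a service, etc.
        if not wait_until_up(tier, deadline_s, interval_s, probe):
            break
        confirmed.append(tier.name)
    return confirmed
```

You would list the tiers database first, then application, then web, and pass in whatever `start` action fits your environment. An open port is only a rough readiness signal; a real check might need to run a test query or hit a health URL instead.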
__________________ Evgenij Smirnov My personal blog (German): http://www.it-pro-berlin.de/ My stuff on PSGallery: https://www.powershellgallery.com/profiles/it-pro-berlin.de/
Registered: 1455526466 Posts: 215
#5
>> We need to somehow find a solution to ensure that we can control the order of certain high critical services are started up after a patching reboot otherwise we will need to go back to manually patching which is something we don't really want to do.

Not quite the same, but it might give some inspiration:
- We use the *Startup Delay* option on our VMs (Hyper-V). For the DCs it is set to 0 seconds; for the member servers it is set to 120 or 180 or...
- You could use the Recovery options on a service: with "Run a program" you could run a script that does some checks and then restarts the service.
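A script for the "Run a program" recovery action might look something like the sketch below. It is in Python purely for illustration, and it assumes the built-in Windows `sc.exe start` command is used to start the service, with exit code 0 treated as success. The function names are made up for this example, and the dependency check and the command runner are passed in as parameters so the logic can be tried out without touching a real service.

```python
import socket
import subprocess
import time
from typing import Callable, Sequence


def dependency_ready(host: str, port: int, timeout: float = 2.0) -> bool:
    """Check that the back end (e.g. the database) accepts TCP connections."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_command(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def recover_service(service: str, check: Callable[[], bool],
                    run: Callable[[Sequence[str]], int] = run_command,
                    attempts: int = 10, wait_s: float = 30.0) -> bool:
    """Wait for the dependency check to pass, then start the service.

    Returns True if the start command reported success, False if the
    dependency never became ready or the start command failed.
    """
    for attempt in range(attempts):
        if check():
            # Assumption: exit code 0 from sc.exe means the start was accepted.
            return run(["sc.exe", "start", service]) == 0
        if attempt < attempts - 1:
            time.sleep(wait_s)
    return False
```

With hypothetical names, a call could be `recover_service("MyAppService", lambda: dependency_ready("sql01.example.local", 1433))`. The point of the check is that the service is only restarted once the thing it depends on is actually answering, instead of restarting blindly and failing again.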
Hope this helps.
__________________ Pieter Demeulemeester
Associate Troublemaker Apprentice
Registered: 1451575798 Posts: 885
#6
Patch the servers monthly.
Patch 50% of the DB servers (if you have 2 or more) on the third Tuesday. Patch the other 50% on the fourth Tuesday. This gives you time to see what the patches break. Schedule a reboot of the app servers that talk to the DBs for 05:00 in the morning or some other suitable time. Patch the app servers a week later, same reboot schedule. DBs get 1 reboot a month. App servers get 2 reboots a month. Make sure backups happen before each reboot.
__________________ Have you tried turning it off and walking away? The next person can fix it!
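For anyone who wants to generate the dates for the staggered schedule in #6, here is a small sketch in Python. The group names are made up, and "a week later" for the app servers is read here as one week after the fourth Tuesday, which is an assumption since the post doesn't say which date it counts from.

```python
import calendar
from datetime import date, timedelta


def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Return the n-th (1-based) given weekday of a month, e.g. the third Tuesday."""
    first = date(year, month, 1)
    # Days from the 1st to the first occurrence of the wanted weekday.
    offset = (weekday - first.weekday()) % 7
    day = 1 + offset + 7 * (n - 1)
    if day > calendar.monthrange(year, month)[1]:
        raise ValueError("month has no such weekday occurrence")
    return date(year, month, day)


def patch_schedule(year: int, month: int) -> dict:
    """Build the patch dates for one month (group names are hypothetical)."""
    third = nth_weekday(year, month, calendar.TUESDAY, 3)
    fourth = nth_weekday(year, month, calendar.TUESDAY, 4)
    return {
        "db_group_a": third,   # first half of the DB servers
        "db_group_b": fourth,  # second half of the DB servers
        # Assumption: app servers are patched a week after the second DB group.
        "app_servers": fourth + timedelta(days=7),
    }
```

For January 2016 this gives 19 January and 26 January for the two DB groups and 2 February for the app servers. Note that under this reading the app server date can fall into the following month.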