Mark Minasi's Tech Forum
nikolas.e

Senior Member
Registered:
Posts: 103
Reply with quote  #1 
Okay, first of all let me just say that I am very tired just from thinking about how nervous I was all day.

We have a production server, an HP StorageWorks X1600 G2, as our NAS storage. It currently holds almost 5 TB of data and is configured with 7 x 2 TB disks in RAID 5.

The first thing I must add is that the array does not have a hot spare set up (bear with me, I didn't build the network; I joined the company 3 months ago). I will ask for some more info about hot spares, since some people report they are good and others that they are bad.

Today was the day one of the 7 disks sent me a predictive failure warning. Since this server has a high I/O load all day, my job was a little more difficult than with other servers, because I had to decide: should I replace the disk and let the rebuild take care of the rest during work hours, or wait until off hours to do the task?


The steps I followed, based on what I know so far:

1: Since 5 TB of data will take days to back up, I had to verify that the latest backup, which finished yesterday, was consistent. Verification was successful: I recovered a few hundred MB to a different location.

2: Thank god we had the exact model of disk ready to replace the faulty one. My first thoughts were: is it a false alarm, or is it really going to fail? I did a few minutes of research on the internet into similar issues, but I trusted that HP Management was telling the truth.

3: I replaced the disk and noticed immediately that the rebuild had started. It started at 9:30 AM and finished a while ago, at 11:20 PM. That was a hell of a long time to rebuild the disk. What worried me so much is that RAID 5 can handle 1 disk failure, but what happens if a second disk fails? I say that because I was worried about whether the server could handle the I/O load or not. I guess for the moment I am lucky, since it rebuilt the disk and I can see no warnings for the array.
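
For a sense of scale, the rebuild window in step 3 can be turned into an average rebuild rate. This is only a rough sketch: it assumes "2 TB" means 2 x 10^12 bytes and that the controller rebuilt the whole disk, neither of which is stated in the post. The variable names are made up for the example.

```python
from datetime import datetime

# Start and end times from the post. The calendar date is arbitrary;
# only the difference between the two times matters.
start = datetime(2000, 1, 1, 9, 30)
end = datetime(2000, 1, 1, 23, 20)

# Assumption: a "2 TB" disk holds 2 * 10^12 bytes and was rebuilt in full.
disk_bytes = 2 * 10**12

elapsed = end - start
elapsed_seconds = elapsed.total_seconds()
rate_mb_s = disk_bytes / elapsed_seconds / 10**6

print(f"Rebuild took {elapsed} ({elapsed_seconds / 3600:.1f} h)")
print(f"Average rebuild rate: about {rate_mb_s:.0f} MB/s")
```

Under those assumptions the rebuild ran for a little under 14 hours at roughly 40 MB/s, which is plausible for an array that is also serving production I/O during the rebuild.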

I am glad all went well. If you were in the same place (and I am sure many of you have been in worse scenarios), what additional steps would you have followed?

Would you have done the same and replaced the disk during work hours?
Since this happened, should I add a new disk and set it up as a hot spare for the array?

I feel like I wrote an essay, but anyway, I am just expressing how I felt today, since honestly I was really nervous.

Thanks


__________________
Just call me the 1000Questionsguy
0
donoli

Senior Member
Registered:
Posts: 459
Reply with quote  #2 
Quote:
Should I replace the disk and let the rebuild take care of the rest during work hours, or wait until off hours to do the task?


I don't think that I would replace the other drives at this time.  I wouldn't replace them during working hours either.  Are they Western Digital drives, by any chance?

Quote:
If you where in the same place

Quote:
Today was the day were one of the 7 disks sent a warning


Off topic:  English may not be your first language so please don't take offense.  You have the words where & were confused.  Swap them.
0
wobble_wobble

Associate Troublemaker Apprentice
Registered:
Posts: 741
Reply with quote  #3 
My 2 cents.
Replace the disk when you have the replacement, especially in a RAID set that big.
You'll drop from approx 2888 Disk IOPS to 240 Disk IOPS with the Array in a degraded state/ rebuild state.
Best to get it back as soon as possible.

I hate my phone and auto-correct.
Replace the disk when you have the replacement, especially in a RAID set that big.
You'll drop from approx 288 Disk IOPS to 240 Disk IOPS with the Array in a degraded state/ rebuild state.

__________________
Have you tried turning it off and walking away? The next person can fix it!

0
nikolas.e

Senior Member
Registered:
Posts: 103
Reply with quote  #4 
Quote:
Originally Posted by donoli


I don't think that I would replace the other drives at this time.  I wouldn't replace them during working hours either.  Are they Western Digital drives, by any chance?




Off topic:  English may not be your first language so please don't take offense.  You have the words where & were confused.  Swap them.



No worries. Yes, it is not my first language, so sometimes I tend to make mistakes in writing. Thanks for pointing it out.

The disk was a Seagate.

__________________
Just call me the 1000Questionsguy
0
nikolas.e

Senior Member
Registered:
Posts: 103
Reply with quote  #5 
Quote:
Originally Posted by wobble_wobble
My 2 cents.
Replace the disk when you have the replacement, especially in a RAID set that big.
You'll drop from approx 2888 Disk IOPS to 240 Disk IOPS with the Array in a degraded state/ rebuild state.
Best to get it back as soon as possible.



Thank you for the info. My decision was to get it back up as soon as possible, though yes, I was worried about the I/O load.

How do you know this: "You'll drop from approx 2888 Disk IOPS to 240 Disk IOPS with the Array in a degraded state/ rebuild state"?



__________________
Just call me the 1000Questionsguy
0
donoli

Senior Member
Registered:
Posts: 459
Reply with quote  #6 
Quote:
Disk was Seagate


I always liked Seagate drives. I don't know if they changed since the merger with Maxtor. First I heard that Maxtor bought Seagate. Then I heard that Seagate bought Maxtor.
0
wobble_wobble

Associate Troublemaker Apprentice
Registered:
Posts: 741
Reply with quote  #7 
Quote:
Originally Posted by nikolas.e


Thank you for the info. My decision was to get it back up as soon as possible, though yes, I was worried about the I/O load.

How do you know this: "You'll drop from approx 2888 Disk IOPS to 240 Disk IOPS with the Array in a degraded state / rebuild state"?



Apologies, my phone did some mad autocorrect.
You'll drop from approx 280 Disk IOPS to 240 Disk IOPS with the Array in a degraded state

bloody technooblogy
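
The post does not say how the 280 and 240 figures were worked out. One common back-of-the-envelope estimate, sketched below, divides the raw back-end IOPS by the cost of the read/write mix, with each RAID 5 write costing four back-end I/Os. The per-disk IOPS and the write fraction here are assumptions picked for illustration (they were chosen so the result lands on the figures quoted above), and `raid5_effective_iops` is a hypothetical helper, not part of any tool.

```python
def raid5_effective_iops(disks, per_disk_iops, write_fraction, write_penalty=4):
    """Estimate front-end IOPS for a RAID 5 set under a fixed read/write mix.

    Each front-end read costs one back-end I/O. Each front-end write costs
    `write_penalty` back-end I/Os (read data, read parity, write data,
    write parity).
    """
    read_fraction = 1.0 - write_fraction
    backend_cost_per_io = read_fraction + write_penalty * write_fraction
    return disks * per_disk_iops / backend_cost_per_io


# Assumed values, not taken from the thread.
PER_DISK_IOPS = 80      # rough figure for a 7.2k RPM SATA disk
WRITE_FRACTION = 1 / 3  # one write for every two reads

healthy = raid5_effective_iops(7, PER_DISK_IOPS, WRITE_FRACTION)
degraded = raid5_effective_iops(6, PER_DISK_IOPS, WRITE_FRACTION)

print(f"healthy (7 disks):  about {healthy:.0f} IOPS")
print(f"degraded (6 disks): about {degraded:.0f} IOPS")
```

The sketch only removes one disk's worth of I/O. It ignores the extra reads needed to reconstruct data from the missing disk and the rebuild traffic itself, so real degraded performance is likely to be lower than this estimate.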


__________________
Have you tried turning it off and walking away? The next person can fix it!

0
donoli

Senior Member
Registered:
Posts: 459
Reply with quote  #8 
Have you considered disabling auto correct?
0
wobble_wobble

Associate Troublemaker Apprentice
Registered:
Posts: 741
Reply with quote  #9 
Autocorrect: yes, disabled.
Also reset the dictionary.
But it keeps coming back.
I do have a lot of Samsung apps and their updates disabled, though.
I got a Samsung change pushed maybe 2 or 3 months ago, and that was the start of the major mistypes.

Now I'm looking at letting all the Samsung apps I've been blocking from updating go ahead and update.

Can't face another Windows phone....



__________________
Have you tried turning it off and walking away? The next person can fix it!

0
Wes

Senior Member
Registered:
Posts: 190
Reply with quote  #10 
Super dangerous to have raid5 with physical disks that large.  Hope you are in the process of moving the data to a different solution asap!
0
nikolas.e

Senior Member
Registered:
Posts: 103
Reply with quote  #11 
Quote:
Originally Posted by Wes
Super dangerous to have raid5 with physical disks that large.  Hope you are in the process of moving the data to a different solution asap!



Hi Wes. May I ask what you would suggest in my case? A new NAS with a different RAID option?

__________________
Just call me the 1000Questionsguy
0
Wes

Senior Member
Registered:
Posts: 190
Reply with quote  #12 
Hi Nikolas,

If you're going to continue to use this array with such large disks, you'll probably want to switch to RAID 10 as soon as you can coordinate moving the data off and back again (and pick up some disks to round out the capacity).  Your chances of hitting a URE (unrecoverable read error) with disks this size during a rebuild operation are too high for comfort...
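
To put a rough number on that risk, here is a small sketch of the usual URE estimate. It assumes a spec-sheet rate of one unrecoverable read error per 10^14 bits read (a figure often quoted for desktop-class SATA disks, not something stated in this thread) and treats bit errors as independent. `p_ure_during_rebuild` is a hypothetical helper written for this example.

```python
import math


def p_ure_during_rebuild(surviving_disks, disk_bytes, ure_per_bit):
    """Probability of at least one unrecoverable read error while reading
    every bit of every surviving disk, assuming independent bit errors."""
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^n, computed in a numerically stable way for tiny p.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


# From the thread: 7 x 2 TB in RAID 5, so a rebuild reads 6 disks in full.
# Assumption: 2 TB = 2 * 10^12 bytes, URE rate = 1 per 10^14 bits.
p = p_ure_during_rebuild(6, 2 * 10**12, 1e-14)
print(f"Chance of at least one URE during rebuild: {p:.0%}")
```

With those assumptions the estimate comes out at a bit over 60%. With a rate of one error per 10^15 bits it drops to about 9%. Spec-sheet rates are worst-case bounds and the independence assumption is crude, so treat this as an illustration of why array size matters rather than a prediction for this particular server.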
0
cj_berlin

Senior Member
Registered:
Posts: 177
Reply with quote  #13 
Quote:
Originally Posted by Wes
Your chances of hitting a URE with disks this size during a rebuild operation are too high for comfort...



BAARF! BAARF! BAARF!

__________________
Evgenij Smirnov

My personal blog (German): http://www.it-pro-berlin.de/
My stuff on PSGallery: https://www.powershellgallery.com/profiles/it-pro-berlin.de/
0
Wes

Senior Member
Registered:
Posts: 190
Reply with quote  #14 
Quote:
Originally Posted by cj_berlin



BAARF! BAARF! BAARF!


Lol! I love it. However funnily enough there are situations where raid5 is once again viable - SSDs! But I would definitely join BAARF with the caveat that we're talking about spinning rust...
0
nikolas.e

Senior Member
Registered:
Posts: 103
Reply with quote  #15 
Quote:
Originally Posted by Wes
Hi Nikolas,

If you're going to continue to use this array with such large disks, you'll probably want to switch to RAID 10 as soon as you can coordinate moving the data off and back again (and pick up some disks to round out the capacity).  Your chances of hitting a URE with disks this size during a rebuild operation are too high for comfort...



Good morning Wes. Thank you very much for the info. It will not be an easy task, since this is a production server and it works weekends too. Transferring this much data off, breaking the RAID 5, adding disks to reach an even number, building the RAID 10 and transferring the data back will not be easy because of the size of the data.
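
As a rough guide to how many disks such a move would need, the sketch below compares nominal usable capacity for RAID 5 and RAID 10 with 2 TB disks. It ignores formatting overhead and any controller limits on the X1600 G2, and the helper names are made up for this example.

```python
import math


def raid5_usable_tb(disks, disk_tb):
    """RAID 5 gives up one disk's worth of capacity to parity."""
    return (disks - 1) * disk_tb


def raid10_usable_tb(disks, disk_tb):
    """RAID 10 mirrors every disk, so half the raw capacity is usable."""
    if disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return disks // 2 * disk_tb


def raid10_disks_needed(target_tb, disk_tb):
    """Smallest even disk count whose RAID 10 capacity covers target_tb."""
    return 2 * math.ceil(target_tb / disk_tb)


current_tb = raid5_usable_tb(7, 2)                   # today's array
same_capacity = raid10_disks_needed(current_tb, 2)   # match current usable space
just_the_data = raid10_disks_needed(5, 2)            # hold the ~5 TB in use

print(f"RAID 5, 7 x 2 TB: {current_tb} TB usable")
print(f"RAID 10 disks to match that: {same_capacity}")
print(f"RAID 10 disks to hold 5 TB: {just_the_data} "
      f"({raid10_usable_tb(just_the_data, 2)} TB usable)")
```

On these nominal figures, matching the current 12 TB of usable space would take 12 disks, while 6 disks would hold the roughly 5 TB in use today with very little headroom.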



BAARF! BAARF! BAARF! 

Now I know the meaning of BAARF! [smile]

__________________
Just call me the 1000Questionsguy
0