skreamer 1 #1 December 12, 2001

This is a question for all the techies (but Merrick's contributions will be taken under advisement!)

My first two weeks at the new job have been hectic to say the least. I would value your input (even Merrick's) on the following situation:

Last week one of our HDDs failed, no problem because we have RAID 5. 4 days later our Database (Options) on this server (NT4) fucks up in a big way. This is a real bastard because we are in the middle of our busiest sales month and can't take/process customer orders, i.e. the shit's been hitting the fan in a big way. Anyway, our out-sourced DBA people come in (3 in total so far, one spent the night here last night), trying to break in, unload, reload etc. etc.

Here is the problem though: these bastards are trying to claim that one of the HDDs failing could've corrupted the DB. I have gone on record as saying 'Bullshit and they can get fucked'. We're using RAID 5 and for 4 days after the ONE HDD failed the server was still running fine. So, am I correct? I am asking because I know absolutely zero about databases.

I am sure these guys are looking for an out because they know a court case is pending for our lost business, lost customer confidence etc. I have a strong suspicion these guys aren't telling us everything because they have had a total of 3 of their DBAs on site (basically on-site for the last 36 hours); if it was our fault this would definitely NOT be the case. The review meeting is tomorrow and they are going to be trying to shift some blame on to us (systems).

If Zennie or anybody else could give me some feedback/advice on this I'd appreciate it (tomorrow's meeting promises to be pretty interesting....)

Will

PS I have already received the new HDD but now I have to wait for these clowns to finish up and make a full backup, and only then can I bung the new drive in. (Funny moment was when I told management what would happen if we lost a second HDD before RAID 5 was restored, boy did they crap themselves! hee hee)
indyz 1 #2 December 12, 2001

Edit: I removed my analysis of the problem because I've decided that I don't have enough info to be accurate and cover all of the possibilities. All I can say with confidence: even RAID 5 can fail, and software guys screw the pooch sometimes, too.

--Brian
"The sky is lovely, dark and deep.
But I have promises to keep,
And miles to go before I pull"
jfields 0 #3 December 12, 2001

Skreamer, did you forget to feed the gerbils inside your computer? You know that when they die, you start losing data.

In general, I would tend to agree with you. The consultants' claim looks awfully suspicious, considering the delay between the HDD failure and the database failure.

A few questions:
Hardware RAID card or software-based RAID?
NT4 with what service pack?
What database package?

As a side note, I'll share a really annoying HDD/database issue I had at work. Maybe it will give you some ideas.

A SCSI hard drive in a mirror set went bad. I replaced it, and everything seemed fine for like 2 days. Then the other drive in the mirror went bad. I had doubts, but replaced it too. (Both under warranty.) The replacement drives they shipped me BOTH failed, so I contacted the manufacturer. They shipped out a fresh pair. At this point, the system was totally out of commission. Then I took a look at the SCSI cable, and noticed some discoloration in part of it. It hooked to the drives, then looped up to the top of the case and back down. At the very top was a yellowish area on an otherwise grey cable. Heat buildup in the very top of the case had caused slow degradation of the cable.

Before I noticed that, I had a hell of a time diagnosing the problem. The cable caused sporadic problems that generally went unnoticed thanks to the error-checking. But when I was reduced to a single drive and had the drive-intensive procedure of re-establishing the mirror, the problems seemed to burst out and hose everything. Thankfully, throughout this mess, I had BACKUPS, BACKUPS, BACKUPS. I hope your problem gets resolved with a minimum of hassle.

Justin
My Homepage
skreamer 1 #4 December 12, 2001

Quote: Hardware RAID card or software-based RAID? NT4 with what service pack? What database package?

Hardware RAID
SP6a
Progress

I appreciate (NOW!) that running a database on RAID 5 is not a good idea and the performance sucks. Tomorrow I will suggest we implement mirroring instead. BUT in the meantime I still can't accept that one of the HDDs failing in a RAID 5 configuration can corrupt a DB 4 (FOUR) days later. These DBA twats are also saying that they can't find any conclusive evidence in their logs of what caused the corruption. I really suspect that they are just trying to cover their own asses and shift the blame. Oh well, my 2nd week here and it looks like it's rumble time already! So much for the honeymoon period!

/s
freeflir29 0 #5 December 12, 2001

Get a really big hammer....it always works for me!

"and I'm not easily impressed...Ooohh look...a blue car!" -Homer Simpson
jfields 0 #6 December 12, 2001

Skreamer, I'd have doubts about the consultants' story also.

RAID 5 definitely degrades the performance of a database. What some mid-sized places do is run the DB on a non-redundant stripe set of, say, 5 drives. You have a lot more speed in I/O, because different drive heads can be working simultaneously. The downside is that one single drive failure hoses the entire database. So running two complete striped servers gives some redundancy. Instead of mirrored drives, you have mirrored servers. But you have to buy twice the hardware, so many people go with a RAID 5 solution instead, and deal with the performance loss.

Damn, I'm having an intelligent e-mail dialogue with Skreamer. What is the world coming to?

Justin
My Homepage
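To put rough numbers on the stripe-versus-RAID-5 trade-off described above, here is a back-of-envelope sketch using the common rule of thumb that a small random write costs 1 disk I/O on a plain stripe, 2 on a mirror, and 4 on RAID 5 (read old data, read old parity, write new data, write new parity). The function name, the 100 IOPS per drive figure, and the 50% write mix are illustrative assumptions, not measurements from anyone's server in this thread, and a controller with a write cache can behave quite differently.

```python
# Rule-of-thumb disk I/Os needed per small random write, by RAID level.
WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid5": 4}

def effective_iops(level, drives, iops_per_drive, write_fraction):
    """Estimate host-visible random IOPS for a given read/write mix.

    All inputs are hypothetical planning numbers, not benchmarks.
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw = drives * iops_per_drive          # total I/Os the spindles can deliver
    penalty = WRITE_PENALTY[level]         # extra back-end I/Os per host write
    read_fraction = 1.0 - write_fraction
    return raw / (read_fraction + write_fraction * penalty)

# Five drives at an assumed 100 IOPS each, half reads and half writes.
stripe_iops = effective_iops("raid0", 5, 100, 0.5)
raid5_iops = effective_iops("raid5", 5, 100, 0.5)
print(f"stripe: {stripe_iops:.0f} IOPS, RAID 5: {raid5_iops:.0f} IOPS")
```

Under these assumptions the write-heavy workload is where RAID 5 hurts; with `write_fraction=0` the two layouts come out the same in this simple model.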
jfields 0 #7 December 12, 2001

Quote: Get a really big hammer....it always works for me!

We weren't talking about your urological remedies, Clay.

Justin
My Homepage
skreamer 1 #8 December 12, 2001

Quote: and software guys screw the pooch

hehehee, you said 'screw the pooch', heehehehehhee

(is that better Justin?)
skreamer 1 #9 December 12, 2001

Quote: urological

Oh YAY, scrabble boy is back.....
jfields 0 #10 December 12, 2001

*I* didn't say it Skreamer, but I'll try to ignore the whole concept for now. Did you check your server for furballs?

Justin
My Homepage

PS So much for our intelligent conversation.
skreamer 1 #11 December 12, 2001

What's a server? Is that like a waiter or summink?
jfields 0 #12 December 12, 2001

Quote: What's a server? Is that like a waiter or summink?

Nice Clay imitation.

Justin
My Homepage
skreamer 1 #13 December 12, 2001

lol

But Clay is soooo technical, he even said so himself... lol

Apparently his company asked him to set up a mail server so he went out and bought every employee a modem and set them up with a Hotmail account....

(PS you so know this is going to be another one of THOSE threads....)

/s

PS 11h30PM and I can only start running the back-up NOW... and my boss from hell wants me in an hour early tomorrow morning in case there are any issues (and no freakin' overtime for all this either!!! BASTARDS!!!)
jfields 0 #14 December 12, 2001

Quote: mail server so he went out and bought every employee a modem

I don't think so. He just got really, really fast at making paper airplanes.

Justin
My Homepage
indyz 1 #15 December 13, 2001

Ahh! Justin is one of those bastards who has an Internet Explorer only homepage. Us *nix users really don't like that (and we're gonna rule the world some day).

--Brian
"The sky is lovely, dark and deep.
But I have promises to keep,
And miles to go before I pull"
Kris 0 #16 December 13, 2001

If it's a DPT or Mylex hardware RAID properly set up with ECC memory, write-caching disabled and a backup battery, then I call BS on the drive failure causing the problem. Adaptec RAID, well, they're crap (not true XOR, etc...).

I run a DPT Millennium U160 w/ 256MB of ECC across 5 36GB LVD Barracudas in a RAID-5 on my home system and I have blown a drive. It immediately swapped over to the hot spare and kept on chugging while it rebuilt.

Sounds like coincidence and that the DB just threw up for the hell of it.

I would run a deep integrity check on the RAID volume just to be sure. RAID 5 won't help you if there's something schizo with the RAID adapter itself.

Just my $0.02 as an ex-Seagate monkey^H^H^H^H^H^Hemployee.

Kris
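For anyone following along who wonders why a single dead drive should not, on its own, change the data a RAID 5 array hands back: each stripe carries one parity block that is the XOR of its data blocks, so any one missing block can be recomputed from the survivors. The toy sketch below shows only that arithmetic. It is not how any particular controller (DPT, Mylex, Adaptec or otherwise) is implemented, and the function names and sample block contents are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (toy layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def rebuild_missing(stripe, failed_index):
    """Recompute the block on the failed drive from all surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)

# Three hypothetical data blocks plus parity, i.e. a four-drive stripe.
stripe = make_stripe([b"ORDR", b"CUST", b"ITEM"])

# Pretend the drive holding block 1 has died and rebuild its contents.
recovered = rebuild_missing(stripe, failed_index=1)
print(recovered)
```

The same XOR only works with exactly one block missing per stripe, which is why a second drive failure before the rebuild finishes loses the array, and why a misbehaving adapter or cable that writes bad blocks is a separate problem that parity alone does not guard against.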
jfields 0 #17 December 13, 2001

Brian,

I finally got sick of maintaining two separate websites, just so a few people with old or crappy browsers could view it. The logs were way (80%+) IE. The folks with decent alternatives (recent Mozilla, etc.) can see most of the site, but it just doesn't completely work. It won't look like it should, but oh well. When it comes to cross-platform compatibility on my personal page, I'm just a lazy bastard.

Justin
My Homepage
wildblue 7 #18 December 13, 2001

Quote: I appreciate (NOW!) that running a database on RAID 5 is not a good idea and the performance sucks.

Eh? Who? If it's hardware based RAID you should have better performance. If you're really worried about performance, you can always make it a RAID 0 set (striping without parity) and mirror that. Or better yet, make a RAID 5 set of RAID 0 sets (RAID 50?) Or a RAID 5 of RAID 5s (RAID 100? - I forget the names when you start getting that carried away)

I ain't happy, I'm feeling glad
I got sunshine, in a bag
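Since the nested-level names get confusing fast, a quick reference may help. As the names are conventionally used, a mirrored pair of stripe sets is RAID 0+1, a stripe across mirrored pairs is RAID 10, and RAID 50 is a RAID 0 stripe across two or more RAID 5 groups. The sketch below only counts how many drives' worth of space is left for data under each layout. The function name and the drive counts are illustrative assumptions, and real controllers may impose their own minimums.

```python
def usable_drives(level, drives, groups=1):
    """Number of drives' worth of capacity left for data (toy calculator).

    `groups` only matters for raid50: the number of RAID 5 groups striped together.
    """
    if level == "raid0":
        return drives                      # no redundancy, every drive holds data
    if level in ("raid10", "raid01"):
        if drives < 2 or drives % 2:
            raise ValueError("mirrored layouts need an even number of drives")
        return drives // 2                 # every block is stored twice
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return drives - 1                  # one drive's worth of parity
    if level == "raid50":
        if groups < 2 or drives % groups or drives // groups < 3:
            raise ValueError("RAID 50 needs 2+ equal groups of 3+ drives")
        return drives - groups             # one parity drive's worth per group
    raise ValueError(f"unknown level: {level}")

# Compare layouts built from six hypothetical identical drives.
for level, kwargs in [("raid0", {}), ("raid10", {}), ("raid5", {}), ("raid50", {"groups": 2})]:
    print(level, usable_drives(level, 6, **kwargs))
```

Capacity is only one axis; the layouts also differ in how many simultaneous drive failures they survive and in write cost, which this sketch deliberately leaves out.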
jfields 0 #19 December 13, 2001

Quote: Or better yet, make a RAID 5 set of RAID 0 sets (RAID 50?) Or a RAID 5 of RAID 5s (RAID 100? - I forget the names when you start getting that carried away)

Welcome to the land of the paranoid. Of course, we have mirrored drives, in mirrored systems, all with full tape backups, and additional data-only backups to CDRs.

Justin
My Homepage
skreamer 1 #20 December 13, 2001

Quote: Eh? Who? If it's hardware based RAID you should have better performance.

Found this, when I was researching our little situation - clearly advises against RAID 5.
RemiAndKaren 0 #21 December 13, 2001

All I can say is:

Quote: (OT) Corrupt DB - SCSI HDD failure (RAID 5)

did you REALLY need to specify this was Off Topic? lol

Remi
Muff 914
jfields 0 #22 December 13, 2001

Skreamer,

Quote: Eh? Who? If it's hardware based RAID you should have better performance.

Maybe this was said in comparison to software RAID, in which case it would be true. I don't think it is stating that hardware RAID 5 has better performance than RAID 0 disk striping. That is clearly not true.

Believe it or not, I actually learned some of this stuff at work in between posts on DZ.com.

Justin
My Homepage