Resonance Cachecade

Published August 15th, 2011 by Barney Desmond

We don’t normally post about hardware wankery, but this little piece of shininess appeared for free in some of the newer Dell servers we’ve been ordering, and it actually sounds like it’s not an awful hack.

Cachecade is an LSI technology (Dell PERC cards are rebranded LSI gear) that adds a read-cache tier to the RAID logic, in the form of solid-state disks. While SSDs are still too expensive for mass-scale primary storage, they’re cheap enough that you can burn a few hundred bucks and get 50GB worth of faster reads.

The real benefit of this style of read-cache should be for random block reads, where SSDs proverbially drop excrement over rotational media from a great height. The jury is still out for us – we’ve just started using cachecade on a couple of VM hypervisors and a customer DB server, but we’re hoping to see some noticeable impact even on a qualitative basis.
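For a rough feel of why the cache hit ratio matters so much for random reads, here's a back-of-envelope sketch. The latencies are assumed round numbers (0.1 ms for an SSD, 8 ms for a spinning disk), not measurements from our gear, and the function name is made up purely for illustration.

```python
def effective_read_latency_ms(hit_ratio, ssd_ms, hdd_ms):
    """Average latency of one random read when `hit_ratio` of reads hit the SSD cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Weighted average: hits are served by the SSD, misses fall through to the disks.
    return hit_ratio * ssd_ms + (1.0 - hit_ratio) * hdd_ms


# Assumed, illustrative latencies; NOT measured on our hardware.
SSD_MS = 0.1
HDD_MS = 8.0

if __name__ == "__main__":
    for hit_ratio in (0.0, 0.5, 0.9, 0.99):
        latency = effective_read_latency_ms(hit_ratio, SSD_MS, HDD_MS)
        print(f"hit ratio {hit_ratio:.0%}: ~{latency:.2f} ms per random read")
```

Under those assumptions the misses dominate: even with half the reads coming from cache, the average only roughly halves, so how much of the working set fits on the SSD decides whether the cache pays off.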

In truth, the performance improvements will be difficult for us to quantify on our own workloads. You can apparently get this functionality if you purchase the new LSI® MegaRAID® CacheCade™ Pro 2.0, but I’d bet that it’s not exposed through something sane (like SNMP) and you’ll be forced to use the perennially-awful MegaCLI tool to get at the data.


RAIDing USB flash disks – not just a silly stunt

Published September 29th, 2009 by Barney Desmond

We’ve seen it all before:

hay guyz, check this out, I got a bunch of old 64mb thumb drives and made a RAID out of them! now i can put all my pr0n on there roffle lolololll

RAIDed floppies? It’s been done. RAIDed tapes? Yo dawg, that’s an enterprise storage solution! Let’s talk seriously now.

I have a fileserver that my family uses; it’s just a box with a couple of pairs of hard drives in it (RAID-1, thank you very much. None of this starving-student crap with an oddball assortment of drives in RAID-0). Given that the box is used exclusively for serving up SMB shares, the OS installation is tiny.

I could’ve gone with something really stripped down and optimised, but that would require effort; sysadmins are allergic to unnecessary effort. Instead I just installed Ubuntu jaunty via netinst. Laugh all you want, but I have better things to do, like sleep.

Close-up of chikage's OS drives


The old system was whining about missing one half of its RAID-1, so I decided to splurge on a pair of 4GB USB flash disks – the princely sum of $22 for the pair. I set up the md software RAID volumes ahead of time, which were happily picked up by the Ubuntu installer – a 512MiB /boot partition and the rest handed off for LVM to manage.

I could bore you with a bunch of details, but who cares about that.

  • Does it work? Yes, albeit a bit slower during bootup – total boot time from power-button to login prompt is 90 seconds.
  • Does the RAID work? Nicely, thank you. You can yank a drive out and it’ll keep ticking along.
  • Is there enough capacity? Plenty, the OS filesystem is 44% full.
  • Won’t swapping kill it? Yes, maybe eventually. The system has 1GiB of RAM, more than enough when you consider it’s only really using about 100MiB. At least there’s a chance both drives won’t fail at exactly the same time, so I can replace one.
  • Am I taking backups? Of course! If it toasts itself it’s no big deal.
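If you'd rather not find out about a yanked drive by accident, a minimal sketch like the one below can flag degraded md arrays by reading the member-status brackets in /proc/mdstat (`[UU]` is healthy, `[U_]` is missing a disk). The sample text and the function name are made up for illustration; this isn't the script running on my box.

```python
import re

# Illustrative /proc/mdstat contents; device names and sizes are made up.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
      3405248 blocks [2/2] [UU]

md0 : active raid1 sda1[0]
      524224 blocks [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status brackets show a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)  # remember which array the next lines describe
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if current and status:
            if "_" in status.group(1):  # an underscore marks a failed or absent disk
                degraded.append(current)
            current = None
    return degraded


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real machine you'd feed it the contents of /proc/mdstat instead of the sample string, and have cron email you whenever the list isn't empty.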

What next? Hmm, if I splash out I could buy another pair of flash disks and kick it up to RAID-10 for a performance boost!

Posted in FTW


Tales of Hardware – IBM x3650

Published March 10th, 2009 by Barney Desmond

All the servers Anchor buys are from Supermicro. Most people won’t have heard of them, but they’re a sizeable hardware vendor that also does some OEM gear. Supermicro certainly doesn’t carry the mindshare of other big brands like HP, Dell, et al., but we chose them because their stuff is reliable and affordable – we focus on the things that actually matter, rather than some enterprise-y idea of sticking with big brands that you trust – “no one ever got fired for buying IBM” they say.

Actually, hold that thought for a moment.
Read the rest of this entry »


Safely handling RAID failure

Published March 9th, 2009 by Davy Jones

With hard discs being by far the most common point of failure in servers, RAID does wonders for protection against loss of data.

With a RAID array in normal operation we’re in a pretty safe place. We know that we can suffer failure of a drive without loss of data or disruption of service. Once a drive has failed, however, we’re in a slightly more precarious position. Loss of another drive or damage to the remaining drive could easily cause major problems. At this point the only thing that can protect you against data loss if you make a mistake is your backups – you did configure backups, didn’t you?

Restoring a damaged RAID array is a task that requires extra caution. 

On our range of dedicated servers and VPSes, it’s one of those things that just happens automatically, and the client usually only finds out after the problem has been fixed. For our co-location customers, however, it’s a task where we often find ourselves lending a helping hand.

With this in mind we’ve started to put together a series of articles discussing the steps we take to restore a Linux RAID array after hard disc failure and to recover from a Windows software RAID failure. We hope you find them useful.


A tale of two drives

Published October 9th, 2008 by Barney Desmond

It’s no secret that we’d rather be working on Linux than Windows here at Anchor. Windows is, by and large, much more annoying to actually get anything done on, but it also just breaks in opaque and unexplained ways. O Windowes, let me count the ways in which you are broken! This is one such problem we ran into yesterday.

Hard drive failure is a fact of life when you run servers, by sheer virtue of the fact that you have hundreds of them. To mitigate the risk and reduce unscheduled downtime, we use Windows’ built-in software RAID feature. It’s not an enterprise solution, but it gets the job done. What’s important is staying online and not losing data.

Did I mention that trying to monitor a Windows box is a nightmare? A colleague of mine wrote a script that lets us keep a watchful eye on Windows RAID volumes, and it’s a lifesaver. A recently-deployed machine got a broken mirror, which we were able to act on immediately. We removed the dodgy mirror and prepared a replacement (we always have plenty of spares, of course). Allow me now to re-enact this scene…

Windows (sounding almost efficient): The driver has detected that device \Device\Harddisk1\DR9 has predicted that it will fail

Sysadmin: Thanks, Windows, I’ll get right on that. You didn’t say whether that was SMART, or just voodoo, but whatever, it’s good to know.

The bad drive is removed and a replacement installed in the hotswap drive bay

Sysadmin: Okay, Windows, do your stuff. “Scan for new hardware”, please.

A pause.

Sysadmin: Ahem, Windows, “Scan for new hardware” and find my drive.

Windows: ‘Ey there, chaps. Do what now, you say? AIEEEEGRH!!

The server stops responding entirely, necessitating a touch of the reset button

Needless to say, we’re rather unimpressed, and have to call the customer to let them know why it’s just dropped offline.

A quick check of the logs is in order. It’s also frustrating that there’s no sane way to scroll through log entries in Windows with something like a text editor, or to “tail” a log as it’s updated in realtime.

09:36 – The previous system shutdown at 9:21:23 AM on 8/10/2008 was unexpected.

Okay, it went down at 09:21, which is correct. Now if we look back in time a little…

09:21 – dmio: Harddisk1 write error at block 1953524618 due to disk removal

*sigh* And this is after the disk was removed cleanly…
