VMware announce the new VMware 4 named vSphere

Published February 26th, 2009 by Paul De Audney

VMware have just announced that the new version of VMware ESX will be called vSphere.

Some of the announced features are:

  • 64-bit kernel and console operating system (COS)
  • clustered VirtualCenter Servers
  • ESX host profile management
  • cross-host virtual networking
  • 8-way virtual SMP
  • virtual machine fault tolerance across multiple hosts (the famous Continuous Availability presented last year)
  • VMs and media library
  • alarms on physical hardware faults
  • access control on storage resources
  • configuration change tracking
  • full support for SATA local storage

So it seems VMware are catching up to Xen with some of the features. There will be interesting times ahead in the virtualization space, with the recent release of Citrix XenServer for free.

With an updated kernel and 64-bit COS, end users should see more hardware end up on the compatibility list, which is good news for those who want to use some of the latest and greatest hardware.

Additionally, third-party vSwitches are going to be supported. Cisco have demoed their Nexus 1000V with vSphere.


Firewalling VMware ESX for console access

Published February 23rd, 2009 by Barney Desmond

One of Anchor’s more recent product offerings is VMware-based virtual private servers. As one of my colleagues has already detailed, we take extra measures to secure the VMware host server to reduce the possibility of a compromise.

Our VPS offering uses VMware ESX, which runs on bare metal and doesn’t have a host operating system. This isn’t the full story – according to documentation it boots a Red Hat Enterprise Linux 3 system, then loads the vmkernel, which is where the real work is done. One of the nice things about this approach is that there’s a userspace environment in which to run support software, like good monitoring components.

We ran into an odd problem recently with an ESX host server on a dedicated network segment, namely that we couldn’t view the console for VM guests. Nothing would happen for about 30 seconds, then the VMware Infrastructure Client (VIC) would report a connection failure.

Most people using VMware now have probably used the vanilla VMware Server once or twice. It’s pretty easy to understand, and because it runs on top of your usual OS, firewalling it is as simple as opening holes for legitimate clients to connect to TCP port 902. That’s not the case with VMware ESX, and without reading the manuals it’s not immediately obvious what the problem is.

As it turns out, the VIC is attempting to connect to TCP port 903 on the host server. If you assume that the VMware hypervisor is just a modified Linux system (which is what it looks like, but isn’t quite), you should be able to see a listening port in the output of netstat -tnlp, but you can’t.

[root@miyuki root]# netstat -tnlp
Proto Local Address       Foreign Address     State       PID/Program name
tcp   0.0.0.0:5666        0.0.0.0:*           LISTEN      2684/nrpe
tcp   127.0.0.1:32771     0.0.0.0:*           LISTEN      1807/cimserver
tcp   0.0.0.0:5988        0.0.0.0:*           LISTEN      1807/cimserver
tcp   0.0.0.0:5989        0.0.0.0:*           LISTEN      1807/cimserver
tcp   127.0.0.1:8005      0.0.0.0:*           LISTEN      1652/webAccess
tcp   0.0.0.0:902         0.0.0.0:*           LISTEN      1598/xinetd
tcp   0.0.0.0:199         0.0.0.0:*           LISTEN      1489/snmpd
tcp   0.0.0.0:8009        0.0.0.0:*           LISTEN      1652/webAccess
tcp   0.0.0.0:80          0.0.0.0:*           LISTEN      1765/vmware-hostd
tcp   0.0.0.0:8080        0.0.0.0:*           LISTEN      1652/webAccess
tcp   0.0.0.0:22          0.0.0.0:*           LISTEN      1498/sshd
tcp   127.0.0.1:8889      0.0.0.0:*           LISTEN      1839/openwsmand
tcp   0.0.0.0:2265        0.0.0.0:*           LISTEN      1732/osirisd
tcp   0.0.0.0:443         0.0.0.0:*           LISTEN      1765/vmware-hostd
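If you have to check a listing like this across several hosts, a few lines of Python can do the eyeballing for you. This is only a rough sketch: the sample text is a short excerpt of the listing above, and the function name listening_ports is made up for illustration.

```python
import re

# Short excerpt of the netstat -tnlp listing above.
NETSTAT_SAMPLE = """\
tcp   0.0.0.0:902         0.0.0.0:*           LISTEN      1598/xinetd
tcp   0.0.0.0:22          0.0.0.0:*           LISTEN      1498/sshd
tcp   0.0.0.0:443         0.0.0.0:*           LISTEN      1765/vmware-hostd
"""

def listening_ports(netstat_text):
    """Return {port: program} for each IPv4 LISTEN line in netstat-style text."""
    ports = {}
    for line in netstat_text.splitlines():
        fields = line.split()
        if "LISTEN" not in fields:
            continue
        # The local address is the first field shaped like a.b.c.d:port.
        local = next((f for f in fields if re.fullmatch(r"[\d.]+:\d+", f)), None)
        if local is None:
            continue  # e.g. an IPv6 listener, which this sketch ignores
        port = int(local.rsplit(":", 1)[1])
        program = fields[-1].split("/", 1)[-1]
        ports[port] = program
    return ports

ports = listening_ports(NETSTAT_SAMPLE)
for wanted in (902, 903):
    print(wanted, ports.get(wanted, "no listener visible in the service console"))
```

Port 902 shows up as xinetd, while 903 is simply absent, which matches what we saw by hand.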

As a little more reading revealed, the VIC makes an extra connection to port 903 for console data. This marks a significant change from the earlier model of passing everything through the same connection, and the reason for such a change is unclear.

We’ll assume there’s some performance benefit to be had there. What we find more interesting/important is that port 903 is entirely “under the radar”, as it’s implemented in vmkernel. The other traffic on the box is subject to our standard iptables rules as far as we can tell, including port 902, which is used for a lot of other management-client to host-server interaction.
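Since netstat on the host can’t tell you anything about port 903, the practical test is to try connecting from the client side of the firewall. Here is a minimal Python sketch of that check. The function names and the default port pair are our own choices for illustration, and a successful connection only proves the port is reachable from where you ran the script, nothing more.

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False

def check_console_ports(host, ports=(902, 903)):
    """Map each port to whether it is reachable from this machine."""
    return {port: can_connect(host, port) for port in ports}
```

Calling check_console_ports with the name of your own ESX host from a machine that runs the VIC should show fairly quickly whether 903 is being dropped somewhere in between, without waiting for the client to time out.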

tcpdump is also oblivious to port 903, so we’re guessing VMware passes traffic through the netfilter stack when it’s deemed to be convenient or necessary. If we had a spare ESX host sitting around doing nothing, I’d be interested to see if the packet counters shown in the output of ifconfig are also affected.


Gettin’ down and jiggy with good site design

Published February 22nd, 2009 by Barney Desmond

As a web-hosting company we see a lot of websites. Plenty of them are a little… hard on the eyes; they lack a certain je ne sais quoi. If only everyone followed this guy’s good advice.

http://www.youtube.com/watch?v=a0qMe7Z3EYg


Tracing I/O usage on Linux

Published February 19th, 2009 by oliver

I/O subsystems are a whole industry of their own, and many libraries could be (and probably have been) written on the subject already. The particular sub-topic I’m talking about today is when you are faced with a machine that you suspect to be suffering from heavy I/O load, and you want to find the culprit.

Sadly, this is an area where Windows has the upper hand. Using the Performance Monitor, you can quite easily determine which process is using the largest chunk of your disk I/O. On Linux things can be a little harder, but not all is lost.

If you are fortunate enough to be experiencing the problem on a machine running at least a 2.6.20 kernel and with Python 2.5 or later available, you can run IOTop. This prints out I/O usage data in a similar format to the standard “top” command, and it looks something like this:

IOTop Output - picture sourced from http://guichaz.free.fr/iotop/iotop_big.png

Sadly, at the time I needed to diagnose I/O, the machine I was using had neither a 2.6.20 kernel nor Python 2.5, so I was forced to seek other methods of tracing the I/O. Cue blktrace. This hooks into the kernel’s debug filesystem to gather I/O stats and presents a fairly raw trace of what’s going on. You can download the source from here or find RPM packages for recent RHEL at the RPMForge Repository.

While it is possible to use blktrace directly, there is also a helper script, btrace, which shortcuts a lot of the most commonly used options and output formatting. You will need to mount debugfs on /sys/kernel/debug, and then you are ready to roll!

root@blarg:~# mount -t debugfs none /sys/kernel/debug
root@blarg:~# btrace /dev/sda
  8,0    0        1     0.000000000  2884  A   W 60060711 + 8 <- (8,1) 60060648
  8,0    0        2     0.000000244  2884  Q   W 60060711 + 8 [kjournald]
  8,0    0        3     0.000005278  2884  G   W 60060711 + 8 [kjournald]
  8,0    0        4     0.000006933  2884  P   N [kjournald]
  8,0    0        5     0.000007515  2884  I   W 60060711 + 8 [kjournald]
  8,0    0        6     0.000010068  2884  A   W 60093063 + 8 <- (8,1) 60093000
  8,0    0        7     0.000010263  2884  Q   W 60093063 + 8 [kjournald]
  8,0    0        8     0.000011588  2884  G   W 60093063 + 8 [kjournald]
  8,0    0        9     0.000012072  2884  I   W 60093063 + 8 [kjournald]

OK so there’s not much I/O happening on my workstation, but on the machine I was diagnosing recently, the output of btrace spewed out hundreds of lines per second, many of them referring to a process running the mutt mail program. It turned out one of the users had approximately 90,000 emails in one folder that mutt was constantly rescanning since the machine didn’t have a recent enough version of mutt to support header caching.
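When btrace is scrolling past at hundreds of lines per second, it helps to tally the output rather than read it. The following is a rough Python sketch that counts queued requests (the Q action) per process name. The regular expression is written against the sample output above and is an assumption about the default format, and the function name is ours; treat the size total as "whatever the + N field reports" rather than an exact byte count.

```python
import re
from collections import Counter

# The sample btrace output from above.
BTRACE_SAMPLE = """\
  8,0    0        1     0.000000000  2884  A   W 60060711 + 8 <- (8,1) 60060648
  8,0    0        2     0.000000244  2884  Q   W 60060711 + 8 [kjournald]
  8,0    0        3     0.000005278  2884  G   W 60060711 + 8 [kjournald]
  8,0    0        4     0.000006933  2884  P   N [kjournald]
  8,0    0        5     0.000007515  2884  I   W 60060711 + 8 [kjournald]
  8,0    0        6     0.000010068  2884  A   W 60093063 + 8 <- (8,1) 60093000
  8,0    0        7     0.000010263  2884  Q   W 60093063 + 8 [kjournald]
  8,0    0        8     0.000011588  2884  G   W 60093063 + 8 [kjournald]
  8,0    0        9     0.000012072  2884  I   W 60093063 + 8 [kjournald]
"""

# device, cpu, sequence, timestamp, pid, action, rwbs, sector + size [process]
LINE_RE = re.compile(
    r"^\s*\d+,\d+\s+\d+\s+\d+\s+[\d.]+\s+(?P<pid>\d+)\s+(?P<action>\S+)\s+"
    r"(?P<rwbs>\S+)\s+(?P<sector>\d+) \+ (?P<size>\d+) \[(?P<proc>[^\]]+)\]"
)

def queued_io_by_process(trace_text):
    """Count Q (queued) events and sum their size field, keyed by process name."""
    requests = Counter()
    sizes = Counter()
    for line in trace_text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group("action") == "Q":
            requests[m.group("proc")] += 1
            sizes[m.group("proc")] += int(m.group("size"))
    return requests, sizes

requests, sizes = queued_io_by_process(BTRACE_SAMPLE)
for proc, count in requests.most_common():
    print(proc, count, sizes[proc])
```

Only Q lines are counted so that each request is tallied once, rather than once per stage it passes through. On a busy machine, the process at the top of the list is your first suspect.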

The emails were archived away and the I/O problem was resolved. Whereas previously we could only guess at what was causing the I/O load, blktrace squarely points the finger at the problem process. On later machines IOTop would have been even more straightforward. Both are valuable additions to the sysadmin toolkit.

Posted in FTW


Awesome Linux tool of the day: dstat

Published February 17th, 2009 by Paul De Audney

What is dstat?  dstat is a versatile replacement for vmstat, iostat and ifstat.

Dstat allows you to view all of your system resources instantly. For example, you can compare disk usage in combination with interrupts from your HDD controller, or compare the network bandwidth numbers directly with the disk throughput (in the same interval).

No blog post pimping a cool tool would be complete without the obligatory screen shot.

dstat default output

Posted in FTW


Another great reason to run Postfix as your MTA

Published February 16th, 2009 by Barney Desmond

All of our managed Linux servers here at Anchor use Postfix, written by Wietse Venema, as their mail server. Postfix is easy to configure, works out of the box, is written with security in mind, is actively maintained, and is very fast. These are all very good reasons to stick with Postfix, but I’ve just found another one for all the programmers out there:

http://dotat.at/writing/exim-turing.conf

From Tony Finch’s homepage:

I realised recently that Exim is Turing-equivalent so I decided to write a little demo which includes an informal description of how to translate a Turing machine into an Exim configuration, and an example configuration that implements combinator reduction like my IOCCC winner mentioned above.


Inode shortage reaches critical levels

Published February 16th, 2009 by Barney Desmond

A customer got in touch with us recently saying they couldn’t upload files via FTP due to insufficient disk space, but there was plenty of free space apparent when they logged in and checked. We don’t normally manage their server, but we said we’d take a look.

root@aria:~# touch /srv/www/newfile
touch: cannot touch `/srv/www/newfile': No space left on device

After logging in and taking a look around, the problem became apparent.

root@aria:~# df -h
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/nayuki-root  1016M  536M  429M  56% /
/dev/mapper/nayuki-usr    4.0G  1.3G  2.6G  34% /usr
/dev/mapper/nayuki-var    4.0G  1.3G  2.6G  33% /var
/dev/mapper/nayuki-www     50G   41G  6.4G  87% /srv/www

There’s plenty of disk space…

root@aria:~# df -i
Filesystem                Inodes   IUsed   IFree IUse% Mounted on
/dev/mapper/nayuki-root    65536    7244   58292   12% /
/dev/mapper/nayuki-usr    262144   96830  165314   37% /usr
/dev/mapper/nayuki-var    262144   19680  242464    8% /var
/dev/mapper/nayuki-www   3276800 3276800       0  100% /srv/www

But no spare inodes!

Inodes are a filesystem structure used to store metadata about files (we use ext3, the default, so the following details may not necessarily apply to other filesystems). This of course takes up space on the disk, so inodes are preallocated when the filesystem is first created. The practical consequence is that you can’t create any more files once you run out of inodes. Not all filesystems work in the exact same way, so the system gives you a general error about being out of space.

That out-of-space error is usually accurate – to be honest, I’ve never actually seen a filesystem run out of inodes before. It’s something of a nuisance that you can’t allocate extra inodes on an existing filesystem. The best short-term solution we could offer the customer was to remove or migrate some files.

If you use Linux’s Logical Volume Manager (LVM) subsystem, you can pull some trickery and grow an existing filesystem, but this doesn’t let you create an arbitrary number of spare inodes. If you’re going to have a massive number of files, the filesystem needs to be created from scratch with lots of empty inodes.
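To put some numbers on that, here is a back-of-the-envelope calculation in Python using the figures from the df output above. It assumes the rounded 50G and 41G mean roughly 50 GiB and 41 GiB, so the results are approximate, and the 25% headroom in inodes_needed is an arbitrary choice for illustration.

```python
GIB = 2**30

def bytes_per_inode(fs_bytes, inode_count):
    """Filesystem size divided by the number of preallocated inodes."""
    return fs_bytes / inode_count

def inodes_needed(fs_bytes, avg_file_bytes, headroom=1.25):
    """Inodes required to fill the filesystem with files of this average size."""
    return int(fs_bytes / avg_file_bytes * headroom)

# Figures for /srv/www taken from the df -h and df -i output above.
size = 50 * GIB
used = 41 * GIB
inodes = 3_276_800

ratio = bytes_per_inode(size, inodes)  # one inode per this many bytes
avg_file = used / inodes               # average space consumed per inode in use

print("bytes per inode:", ratio)
print("average bytes per file:", round(avg_file))
print("inodes to ask for next time:", inodes_needed(size, avg_file))
```

The filesystem was created with about one inode per 16 KiB, but the files on it average less than that, so the inodes ran out while there was still free space. When creating a replacement filesystem, mke2fs lets you set the bytes-per-inode ratio (-i) or the inode count directly (-N); check the man page for your version before relying on either.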

Chances are you’ll never run into this, but if you deal with systems that have a lot of small files (e.g. mail servers, possibly public-facing web servers) it’s something to keep in mind when you set things up. Although it’s not something we’d run into before, we were able to identify the problem and provide a quick solution thanks to our extensive general knowledge of our systems.
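If you want your monitoring to catch this before a customer does, the same numbers that df -i prints are available from Python through os.statvfs. A minimal sketch follows. The function names and the 90% threshold are our own choices for illustration, and some filesystems report zero total inodes, which is treated here as "no limit to check".

```python
import os

def inode_usage(path):
    """Return (total, free, percent_used) inodes for the filesystem holding path."""
    st = os.statvfs(path)
    total, free = st.f_files, st.f_ffree
    if total == 0:
        # Some filesystems do not preallocate inodes and report zero here.
        return total, free, None
    return total, free, 100.0 * (total - free) / total

def low_on_inodes(path, threshold_percent=90.0):
    """True if inode usage on path's filesystem is at or above the threshold."""
    _, _, percent = inode_usage(path)
    return percent is not None and percent >= threshold_percent

total, free, percent = inode_usage("/")
print("inodes:", total, "free:", free, "used %:", percent)
```

Hooking low_on_inodes into whatever check framework you already run would turn a mysterious "No space left on device" into an alert well ahead of time.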

Useful commands (ext2/3 filesystems on Linux):

df -h
     disk space utilisation on filesystems
df -i
     inode utilisation on filesystems
dumpe2fs -h /dev/nayuki/www
     get info on the filesystem (specify the path to the block device)

Anchor’s New Colocation Fit Out – Stage Two

Published February 11th, 2009 by Barney Desmond

Our new colocation space will be ready to go very soon! In the last couple of days we’ve had the new racks installed and the basic network infrastructure connected. The power rails in each rack will be powered up on Friday, and the network hardware will be installed. We expect to have live equipment in there in less than a week.

This will mostly be a photo post; the photos speak for themselves. To keep things interesting we’ve got photos of some of the high-level building infrastructure. This is the heavy-duty, redundancy-everywhere stuff that keeps you up and running, guaranteed. If you’re interested, follow the link on each photo; there’s a little more detail on what you’re looking at.

You can see most of the new floorspace and racks in the photos there. Once it’s online, we’ll have doubled our entire operating capacity. I’d say we’re growing rather nicely.

Photos are licensed under the Creative Commons Attribution-Share Alike 3.0 Unported Licence.


The 800lb Gorilla Knows Where Your Website Lives

Published February 11th, 2009 by matt

If you run a website for commercial purposes, you know that the only way it’s going to provide you with benefit is if people actually visit it. Regardless of what sort of site it is (brochure, company promotion, online store, etc), if nobody’s actually loading the pages, your site may as well not exist. One way or another, you need to drive people to your site.

There are many different ways to get people to visit your site, and different strategies work for different sites. TV advertising, for example, has become popular in the last couple of years, to entice people to just visit the site. Other traditional forms of advertising, as well as online banner or text ads, are also popular. Putting your website name on your stationery and vehicles, and promoting your website to your existing customers works well in certain markets, too.

However, I am fairly confident that regardless of what industry you’re in, search engines make up a fair portion of your incoming traffic. More than that, though, it’s almost certain that one search engine in particular is driving the majority of your search engine-sourced traffic: Google.

For example, the operators of the popular developer question-and-answer forum Stack Overflow recently published some statistics on their sources of traffic:

Currently, 83% of our total traffic is from search engines, or rather, one particular search engine:

Search Engine      Visits
Google          3,417,919
Yahoo               9,779
Live                5,638
Search              2,961
AOL                 1,274
Ask                 1,186
MSN                 1,177
Altavista             202
Yandex                191
Seznam                103
[...] Google delivers 350x the traffic to Stack Overflow that the next best so-called “search engine” does. Three hundred and fifty times!

These numbers are almost certainly skewed more heavily towards Google than your average website’s would be, because the sort of people who benefit from Stack Overflow (software developers) are also the sort of people most likely to use Google over another search engine. But even if 100 times more people in the general population used a different search engine (and let’s face it, that’s not particularly likely), Google would still account for three-and-a-half times the incoming traffic of the next best search engine.

As a result of this massive traffic skew, if your business relies on search engine traffic, the main search engine you need to be targeting is Google. While there are seemingly endless parades of shonky search engine optimisers who will submit your website to “thousands of search engines”, the simple fact is that if all these thousands of search engines are providing you with the same proportion of traffic as “Seznam” (the 10th search engine on Stack Overflow’s top 10 list), then you’d need to be listed on over 33,000 search engines to match Google’s traffic contribution. Or, you could just make sure Google likes you, instead, for far less effort and expense.
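If you’d like to check the arithmetic behind those figures, it only takes a few lines of Python. The visit counts are copied from the Stack Overflow table above; the variable names are just for illustration.

```python
# Visit counts from the Stack Overflow table above.
visits = {
    "Google": 3_417_919,
    "Yahoo": 9_779,
    "Live": 5_638,
    "Search": 2_961,
    "AOL": 1_274,
    "Ask": 1_186,
    "MSN": 1_177,
    "Altavista": 202,
    "Yandex": 191,
    "Seznam": 103,
}

google = visits["Google"]
runner_up = max(count for name, count in visits.items() if name != "Google")

# How many times more traffic Google sends than the next best engine.
google_vs_next = google / runner_up

# The same ratio if the runner-up had 100 times as many users.
google_vs_next_x100 = google / (runner_up * 100)

# How many Seznam-sized search engines it takes to match Google.
seznam_equivalents = google // visits["Seznam"]

# Google's share of the search visits listed in the table.
google_share = google / sum(visits.values())

print(round(google_vs_next), round(google_vs_next_x100, 1), seznam_equivalents)
print(round(google_share * 100, 1))
```

The ratios come out at roughly 350 and 3.5, and a little over 33,000 Seznam-sized engines, which is where the figures quoted in the text come from.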

This reliance by the world’s online population on one search engine isn’t necessarily healthy, though. As Whimsley describes in his excellent article, Mr Google’s Guidebook, Google has fundamentally changed the way the Web works, and in many ways it now dictates how websites are designed and marketed. The very fact that we are talking about “making sure Google likes you” and optimising your website for the Google indexer strongly suggests that Google is, in fact, “more a master than a servant”.

Philosophical arguments aside, though, you can’t afford to ignore Google if you want your online business presence to succeed and work for you. What can you do?

First off, I’d like to discuss the use of professional SEO (Search Engine Optimisation). While there are a few firms out there who do a decent job, it is a huge market for lemons. It is incredibly difficult to assess the actual value that an SEO is going to give you, in advance.

As a technical person, I’ve dealt with implementing the recommendations of a lot of dodgy SEO people over the years, and it’s not pretty. A lot of what SEO “experts” recommend are things that Google themselves have specifically debunked, like the virtual hosts vs dedicated IP address myth. Other times it’s doing things that Google specifically warns against, like buying links to boost your PageRank.

In several cases, I’ve seen a client of mine hire the services of a shyster, who has done everything that Google advises against, to provide a short-term benefit. The customer’s site shoots to the top of the Google rankings, the customer is pleased, and pays the SEO a big chunk of money. Some short time (less than a day, in one infamous case) later, the customer’s site disappears from Google’s index entirely. The SEO doesn’t care — they’ve got their money and are onto the next victim — but the customer’s website reputation is in ruins, as Google has detected all of the dodgy work, and has blacklisted the site from their indexes. Cleaning up from this mess can cost you many thousands of dollars directly, as well as lost revenue from people not being able to find you. In many cases, it can easily kill your business completely.

The simple fact is that there’s no real secret to SEO. Google is quite open in many ways about how it ranks pages, and what benefits and harms a site’s rank. It has a whole part of its main site dedicated to disseminating information to webmasters about how to do better in the site rankings, and what to avoid doing. You don’t need a professional SEO to tell you these things — there’s nothing secret about it all, and not even anything particularly difficult. However, if you do decide to hire a professional to help, here are a few tips to make sure you don’t end up doing more harm than good:

  • Avoid anyone who talks about “thousands of search engines”. While Google isn’t the only search engine out there, there isn’t more than a half dozen or so that actually matter on an individual basis. Most of the reputable work that is done to improve your ranking in these mainstream search engines will also automatically help other search engines, too.
  • If you know any other online business owners personally, ask them if they’ve had any SEO work done, and get recommendations. If their search rankings have been consistently improved for three to six months after the SEO has been paid and left, then there’s less likelihood that they’re a fly-by-night shyster, and they may be worth using for your business.
  • Never, ever let an SEO modify your site content directly. Not only might they do deeply disreputable things to your site’s content without giving you any way to easily check what they’ve done, but if their work conflicts with your site designer’s work or processes, it might cost you a lot of money to fix. Having the person who did your site layout and content work review any SEO recommendations can also act as a filter against the worst excesses of a bad SEO.
  • For every recommendation that an SEO makes, ask for a citation regarding the legitimacy of the recommendation. If the SEO can’t show you where on a search engine’s site it recommends doing a certain thing, then the chances are it’s a dodgy practice.
  • Do some research of your own on any recommendation you feel might not be above board. Don’t take the SEO’s word for it that it won’t cause you problems down the line.
  • If an SEO says they’ve got “secret” information about how a search engine works, run like hell. Nobody’s better at keeping secrets than Google (they’ve got 10,000 employees, yet nobody outside the company has any idea how many servers they use — is that good secret-keeping, or what?). The chances of a given SEO really having secret information are very slim indeed, and even if they do, the search engines can always change the way they do things to punish your site for gaming the system. It’s just not worth it.
  • If possible, have a trusted technical person (such as your website designer, or your hosting company) review the recommendations of the SEO. While it might cost you an extra couple of hundred dollars to have this checking done, what is the cost to your business if your site was blacklisted by the major search engines for doing dodgy things?
  • Try and get a longer-term contract for an SEO’s services, one that involves periodic payments over a 3-6 month period after the initial optimisation work has been done. This will tend to discourage the shysters, as their business model is one of “do some quick work, boost rankings temporarily, grab the cash, and get out before the whole thing falls apart”. A trustworthy company is far more likely to be happy with a longer-term relationship.

Whether you hire a professional or go it alone, it’s good to educate yourself a little about what sort of things the search engines recommend. Some excellent resources on this subject include:

  • Google Webmaster Central — the start page for anything related to improving your site in Google’s eyes.
  • The Google Webmaster help center — a collection of helpful articles about the how, what, and why of designing a site to be Google-friendly.
  • The Google Webmaster blog — chock full of interesting articles and tips for webmasters.
  • The Google Webmaster Dashboard — a fantastic resource that lets you peer into all the information that Google has about your site, like how many links there are from other parts of the web to the various pages on your site, whether the crawler has had problems finding some of your site info, how your sitemaps are helping, the effects of robots.txt changes, and removing pages from the index that you don’t want showing up in search results.
  • The Google Webmaster Forum — where you can ask questions of other website owners (and Google employees), and find out loads of useful information on topics that you’ve probably never even thought about.

Yes, all of those links are Google-specific, because Google makes all this info easily and clearly available, and you get the most bang for your buck by targeting Google. Most good site design ideas will help with other search engines, too, so following Google’s advice will benefit you in general.


Unicorns and Rainbows

Published February 9th, 2009 by Barney Desmond

Unicorns and rainbows are a serious matter for sysadmins and geeks in general. They represent a state of tranquility and nirvana. Microsoft aspires to let us all live in a wonderful happyland with rolling green hills and springs of crystal-clear water. ‘Tis a place where unicorns roam free and rainbows grace the horizon like technicolour halos.

Of course, we’re not holding our breath, but we can dream of the day it arrives.

http://cornify.com/
