Microsoft use Single Instance Storage on All of Their File Servers

Single Instance Storage (SIS) – what’s that? Single Instance Storage is when duplicate copies of a file on a file system are stored only once, with pointers to the location in the various directories. In this context we are talking about the file system. For anyone who has any Exchange experience – it’s the same concept, but on a file server.
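As a rough illustration of the idea (and only an illustration – this is not how SIS is actually implemented under the covers), duplicate detection can be sketched as a content-addressed store: hash each file's contents, keep one physical copy per unique hash, and record a per-path pointer to the shared copy.

```python
import hashlib

class SingleInstanceStore:
    """Toy sketch of single-instance storage: each unique file
    content is stored once, with per-path pointers to it."""

    def __init__(self):
        self.store = {}  # content hash -> the single stored copy
        self.index = {}  # file path -> content hash (the "pointer")

    def put(self, path, data: bytes):
        h = hashlib.sha256(data).hexdigest()
        if h not in self.store:      # only the first copy costs space
            self.store[h] = data
        self.index[path] = h

    def get(self, path) -> bytes:
        return self.store[self.index[path]]

    def unique_bytes(self) -> int:
        """Space actually consumed, however many paths point at it."""
        return sum(len(d) for d in self.store.values())
```

Storing the same document under two paths costs the space of one copy; `unique_bytes()` shows the saving that SIS reports at the file-system level.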

I have been aware that SIS has been available in Windows for some time now, apparently since Windows 2000, but no-one I know has used it. I always thought that the issue was support from third parties, particularly backup vendors.

At TechEd this time the Windows 2003 R2 Storage Server team stated that all of Microsoft’s file servers ran SIS and that it was saving them loads of money. Over the last few weeks I’ve been ferreting around trying to find out what the story is; and yes, Microsoft are running all of their managed file servers with SIS.

It turns out that the main reason SIS wasn’t being widely promoted (or widely adopted) was that there wasn’t really an administration interface for it. That’s been fixed in Windows 2003 R2 Storage Server and Microsoft are going to start promoting its usage. I’m not sure whether they are going to port the administration interface to other Windows versions; I’ll have to get that Windows Server 2003 R2 box built to have a look.

My experience with file servers suggests duplicate file numbers lower than the ones Microsoft reports. They see a saving of 60%; I’d put it nearer 40%, but that’s still a load of storage and associated backup being saved.

What do you do in a dull teleconference?

Like many people I spend a lot of time on teleconferences these days. The thing with teleconferences is that the level of involvement can often be very low. You need to be sitting listening and adding value when required, but the involvement can often be low enough to enable you to do another activity. There is something about teleconferences which makes them particularly poor at time keeping (perhaps it’s because we are all doing something else as well).

You can’t do something requiring a lot of thought; ideally it is something you can leave and come back to with ease. There are, therefore, a number of things that I find myself doing while on a teleconference.

One of the things that I do is to browse flickr pictures in the groups that I am interested in. I particularly like to look through sunsets and sunrises, or UK pictures.

It’s also a good time to catch up on feeds that don’t really need reading. It’s surprising how many of these there are. There are loads of feeds where it’s sufficient to know that something exists; a prime example of this is the Microsoft Download feed. I don’t get an RSS feed for news because there is too much of it, so I also spend some time looking at the BBC News site.

Every now and again someone will send me a silly game to play. My attention span for these things is not very high. The latest one pokes a bit of fun at Steve Ballmer and his (reported) throwing of a chair at the news of one of his employees leaving to join Google. I can’t do games like the Pit Stop Game because that requires my attention, and part of my brain is still listening to the teleconference.

I have, from time to time, also used the time to sort through my task list.

The other thing I do is to write blogs.

I have considered whether it would be possible to do some exercise while sat listening but concluded that it would be difficult to sound calm and convincing while riding an exercise bike.

The question I am not particularly clear on is whether this actually adds to or removes from my productivity.

Strange Search

The Internet is a strange place. Yesterday I got a few hits on my site from people searching MSN for the words “happy” and “bunny”. Having looked on MSN today I seem to be 13th in the list for this particular search. It’s a phrase I think I have only used the once, but there were eight different people in about two hours who came through. These people were all over the place – one was even in Tunisia.

Did I miss out on some major world event yesterday which involved lots of jolly white-tailed rodents, or is it just one of those things? It shows how far we have to go with search technology, because I suspect that the context in which I was using the phrase “happy bunny” was completely different to the context in which they were using it.

(Now, of course, I have just made the problem worse by using the phrase a number of times in this post.)

Solution Architecture – Being One Step Away

One of the things I find challenging as a Solution Architect who delivers solutions to tens of thousands of users is that you know that they don’t understand what it is that you are trying to achieve, the constraints that you were working under, or the things that you had to go through to get there.

Today I was sat on a plane travelling to one of my customers’ sites, and behind me were two individuals who had been given a new laptop as part of one of my projects. They were talking about their experience, which on the whole was OK; but then the issues started coming out. Their primary issue was with an application that they both used and that had errors. In client refresh projects it’s always the applications which are the major problem. It doesn’t matter how much testing you do, there is absolutely no way of testing all of the functions and combinations, so you always have problems. But then came the comment which demonstrated the lack of understanding which is my problem – “You would have thought that a professional organisation like that could deliver applications that worked”.

I wanted to jump up out of my chair and go and sit between them and explain the multi-tiered testing process that their application would have been through. I wanted to explain how their own organisation would have defined an application owner who should have thoroughly tested the functions of the application that they use. I wanted to explain that the main reason for application problems was security settings that were necessary to protect their environment and to maintain their accreditation regime.

Being a reserved and polite British person I sat where I was and said nothing. Perhaps I should have given these two gentlemen some of my time and then they could have become advocates for the project in the rest of the business. But I didn’t. Instead I sat there and pondered the whole issue of complicated projects and our inability to communicate to people in a way that they understand that IT never delivers a perfect solution and that we would do our best to assist them. I also considered the ever increasing complexity in the infrastructure caused by more and more applications being deployed. I even considered how much the Internet revolution had so far failed to reduce that complexity for even the simplest task.

But then the plane landed and I decided that I would write something down and conclude with these words “you can please some of the people some of the time; you can never please all of the people all of the time”. My personal challenge is to get to the point where I am comfortable that I did all that I could to deliver the best that I could. It’s also about time people started to understand that they are really pioneers in the IT industry and pioneers need a sense of adventure – which allows for failure.

Extreme Data

This is just a link, because I think it’s worth one.

Extreme Data: Rethinking the “I” in IT

I’ve read the report and it’s really good at tracking the change in the IT industry that is occurring because data is now available everywhere and for everything.

Exchange Disk Performance Part 2 – and Correction

The other day I posted an article on Exchange disk performance and something that was puzzling me. Stu assisted me in finding the correct answer (because the last one was a little flawed).

The flaw was in some information that I didn’t communicate, which led me to a wrong conclusion. I concluded that the number of disks required was double what the calculation for RAID 0+1 had produced.

I think what I had done was to assume, in my head, that in a mirrored pair it was only the ‘front’ disk in the mirror that responded to reads, which is of course ridiculous (for most modern hardware). On top of this I didn’t communicate where I got the RAID overhead ratios from.

Clarifying the RAID overhead ratios: for Exchange database storage the ratio of reads to writes is something like 3 to 1, according to Optimising Storage for Exchange 2003.

In a RAID 0+1 infrastructure a write requires 2 I/Os. So for every 4 logical I/O operations (3 reads and 1 write) you actually perform 5 disk I/Os (3 reads and 2 writes), giving you a RAID impact of 0.8 (you get 80% of the performance from the volume of disks that you have).

In a RAID 5 configuration a write requires 4 I/Os. So for the same 4 logical I/O operations you actually undertake 7 disk I/Os (3 reads and 4 writes), giving you a RAID impact of 0.57 (you get 57% of the performance from the volume of disks that you have).

So my calculations were actually correct, but I drew the wrong conclusion because of a false assumption – not for the first time, and unfortunately not for the last either.

So in order to support 1000 concurrent users you need:

  • RAID 1 = 7 disks (well 8 actually because you can’t have an odd number)
  • RAID 0+1 = 7 disks, NOT 14 as I first concluded (well 8 actually, because you can’t have an odd number)
  • RAID 5 = 9 disks
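As a sanity check, the overhead ratios and disk counts above can be reproduced in a few lines. The 0.5 IOPS per user and 100 IOPS per 10K disk figures come from my earlier RAID levels post; everything else is just the arithmetic above.

```python
import math

# Assumed inputs: a 3:1 read/write ratio, 0.5 IOPS per online user,
# and roughly 100 IOPS from a 10K disk.
READS, WRITES = 3, 1
USER_IOPS, DISK_IOPS = 0.5, 100

def raid_factor(write_penalty):
    """Fraction of raw disk IOPS left after the RAID write penalty."""
    logical = READS + WRITES                    # 4 logical I/Os
    physical = READS + WRITES * write_penalty   # actual disk I/Os
    return logical / physical

def disks_needed(users, write_penalty, even=False):
    """Disks to serve the user load; mirrored sets need an even count."""
    n = math.ceil(users * USER_IOPS / (DISK_IOPS * raid_factor(write_penalty)))
    if even and n % 2:
        n += 1
    return n

# RAID 1 / 0+1: 2 physical I/Os per write; RAID 5: 4 per write
print(raid_factor(2))                     # 0.8
print(round(raid_factor(4), 2))           # 0.57
print(disks_needed(1000, 2, even=True))   # 8 (7 rounded up to an even count)
print(disks_needed(1000, 4))              # 9
```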

“One pound of learning requires ten pounds of common sense to apply it”. Persian Proverb

SyncToy goes 1.0

I tried SyncToy in the beta and it worked great for the purposes I was using it for. I’ve now upgraded to 1.0 and it’s still great.

There were a few changes during the beta process, but nothing that really affects me.

Exchange and RAID Levels

I have recently been trying to put together the logic and reasoning behind the clear Microsoft recommendation that Exchange 2003 services are hosted on RAID 1+0 and not RAID 5, with RAID 5 being for those people who have money constraints. All of the stated reasoning seems to be based on the need for IO performance.

Due to the way that a write occurs each RAID level can be given a factor. If RAID 0 is a factor of 1, then RAID 1 (or 1+0) is 0.8 in terms of performance and RAID 5 is 0.57.

So the logic goes a bit like this:

Each online user does something like 0.5 IOPS and a 10K drive does about 100 IOPS, so a single drive can support 200 concurrent users. With RAID factors built in, the users per drive are 160 for RAID 1 and only 114 for RAID 5. So you get better performance from RAID 1 disks. But here is the rub: no-one would implement RAID 1; they would all implement RAID 1+0, so you actually only get 80 users per disk for RAID 1+0 – in other words you need two disks to get the 160. In a RAID 5 set the resilience is already factored into the performance numbers.

So in order to support 1000 concurrent users you need:

  • RAID 1 = 7 disks
  • RAID 1+0 = 14 disks
  • RAID 5 = 9 disks

What’s more, as every Exchange administrator knows, you need loads of spare space on a server to do all of those maintenance tasks, and it also adds a little to the performance. With RAID 1+0 you get 7 disks’ worth of storage; with RAID 5 you get 8 disks’ worth.
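A quick check of the users-per-disk and usable-capacity figures, using the 0.5 IOPS per user and 100 IOPS per 10K disk assumptions stated above together with the RAID performance factors:

```python
# Assumed inputs from the post: 0.5 IOPS per online user,
# roughly 100 IOPS per 10K disk, and the RAID factors quoted.
USER_IOPS, DISK_IOPS = 0.5, 100
factors = {"RAID 1": 0.8, "RAID 5": 0.57}

def users_per_disk(level):
    """Concurrent users a single disk can serve at the given RAID level."""
    return DISK_IOPS * factors[level] / USER_IOPS

print(round(users_per_disk("RAID 1")))      # 160
print(round(users_per_disk("RAID 5")))      # 114
print(round(users_per_disk("RAID 1") / 2))  # 80 per physical disk in RAID 1+0

# Usable capacity in "disks' worth" of storage
print(14 // 2)  # RAID 1+0: 14 disks give 7 disks' worth (mirrored)
print(9 - 1)    # RAID 5: 9 disks give 8 disks' worth (one disk of parity)
```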

Having blown the IOPS argument out of the water, there must be something more to it, and this is where I think that the real reasoning comes in. In the case of a failure a RAID 5 set will slow down significantly because all of its information is having to be calculated from parity information on the other disks. A RAID 1+0 set will not suffer from such an overhead because it isn’t having to calculate from parity information, it’s just reading the other disk that is a mirror of the one that has failed. The really worrying part is that I suspect Microsoft have focussed on the IO issue because they don’t want money strapped managers to implement RAID 5 because the disaster tolerance sell isn’t high on their priority list.

WinFS – it's a big deal and now it's a Beta 1

Channel 9 has a great video showing some of the early power of WinFS. This is a big deal and the demos look great.

Although the file system is as basic a filing system as you could imagine, I know loads of people who simply do not think in that type of structured manner. They think by tags and categories and types.

WinFS will make it possible for these people to get the visualisations they require without compromising the safety of the data, or their productivity. Whilst the Beta 1 looks great in terms of the foundation, it will be the different UI experiences that make it a truly big deal.

Multiple paths to the same end make for an absolute nightmare for support though. Imagine that someone is used to getting there in a particular way and the support person tells them a different way; what will they do?

IT Power Consumption

Jonathan Schwartz talks to the issue of IT power consumption, particularly in the data centre. In true American fashion his main reasoning is the cost of oil and the cost of real estate. The environment gets a tiny mention; even tinier than his dig at Dell and their delivery of servers that consume huge amounts of power.

Even so, an industry wide initiative to reduce power and heat would be most welcome.

Windows 95 – Ten Years Old

What was I doing 10 years ago? Was I waiting with bated breath for the release of Windows 95? I don’t think I was, actually; it was obviously such a significant event that I have completely forgotten it.

Back in those days we were far more interested in Windows NT and OS/2. I think most of us in the little office where I worked looked at it (except Vince because he was the Mac man).

I don’t think I know anyone still running Windows 95, but I still have some friends and family running Windows 98 and Me. They are the ones who normally have the problems that are difficult to mend. It is very difficult to remember how to do things from that long ago.

Microsoft Monitor wonders where the hoopla is. I kind of agree; however painful some of our experiences have been, there were and continue to be some real wow moments too.

Mary Jo Foley makes some comments – mainly about the people still using Windows 95 and why they haven’t upgraded.

And there is a growing number of articles on Technorati.

But no big hoopla from Microsoft – perhaps looking back isn’t their strong point.

When is it time to wait for Exchange 12?

As part of a recent project I have been asked the question of whether to wait for Exchange 12 or not. The choice is between architecting for a deployment now on Exchange 2003, or delaying until the architecture can be built for Exchange 12.

Here are my thoughts on this specific question and also on the generic issues with making this kind of a choice.

Dealing with the generic issues initially:

  • Risk averse, mainstream or leading edge – customers tend to fit into one of these categories especially with a mission critical solution like Exchange.
  • Level of third-party software complexity – the complexity of the architecture can be significantly influenced by the level of third-party software integration. Exchange environments always have at least two third-party applications integrated at the server infrastructure level – anti-virus and backup – but there is also a long list of other integration requirements: Fax, Blackberry, Archive, etc.
  • Complexity of the existing infrastructure – is the current infrastructure standardised and all at a specific level? In the case of Exchange: is the environment to be upgraded all at a certain level of Exchange, or is there still a mixed environment?
  • Current equipment – what you buy now won’t be what you will buy in 12 months’ time, or even 6 months’ time.

Specific to Exchange 12.

The current feature set looks something like this:

  • Edge Services – Gateway protection, incorporating current IMF technology
  • Outlook auto setup of profiles
  • Redesigned ESM UI
  • Scripting for all ESM components
  • Continuous Backup – Replicate changes to another database
  • Improved search functionality
  • Web Services API
  • OMA will be removed (probably because of the wide adoption of ActiveSync)
  • Policy compliance – verify client configuration
  • Enhanced mobile device support
  • Access SharePoint and other applications through OWA
  • Unified messaging  – voice mail and faxes in your mailbox
  • Improved Calendaring functionality
  • 64-Bit version

So the considerations from this are primarily:

  • The release dates for Exchange 12 are still not available; although likely to be some time late in 2006, it may slip into 2007. Until these dates become clearer it would seem that delaying a migration would be a little dangerous.
  • Exchange 2003 Service Pack 2 is delivering an amount of incremental change, particularly in mobility that many customers will take a good while to adopt.
  • Microsoft is increasingly linking the capabilities of the client to the capabilities of the server; Outlook and Exchange. Though they talk a good talk on backward compatibility, my experience has not been all that good.
  • Exchange 12 does not change the database technology, so the things that constrain the architecture are unlikely to go away.
  • Continuous Backup becomes available in Exchange 12, but from my perspective it will only be used to protect the ‘really important’ mailboxes in most organisations; it’s too expensive to do much more. The architecture that is required to support this will involve a lot of testing.
  • The move of Exchange back to the centre of all messaging will require others to release their control. Most large organisations that require a Unified Messaging solution, in my experience, have already done it. I really see Exchange 12 Unified Messaging capabilities fitting into the smaller organisation context.
  • I’m not sure of the background to this statement, but a quote from a TechEd session – “migration from Exchange 2003 Service Pack 2 will be the easiest migration to Exchange 12”.
  • The changes to Exchange edge services are going to be adopted slowly; people will want to be sure of the benefit before moving such a critical part of the infrastructure over.
  • The improvements in Calendaring will not be compelling to many customers. Calendaring is still something that hasn’t quite got there, and it still won’t quite get there in Exchange 12.

I have some questions though:

  • What on earth is 64-bit support giving? Is this being used to break the 3GB limit on memory usage?
  • Will Outlook auto-configuration require Outlook 12?

If someone gave me some money to invest in a messaging infrastructure I, personally, would invest it in establishing a clean Exchange 2003 Service Pack 2 environment and start to drive the adoption of SharePoint as the ultimate replacement to Public Folders (Public Folders will still be available in Exchange 12, but with little change and a statement that they won’t be in Exchange 13 (unlucky for some)). I’d also push the adoption of Office Live Communication Server. Each of these three things will encourage people to regard presence as central to their working, once they get this mind-set change all sorts of behavioural changes start to occur. In this context productivity training will become a massive need.

Some links, although most of my information came from a TechEd session that was held much more recently than most of these articles were published:

http://www.windowsitpro.com/Windows/Article/ArticleID/45880/45880.html

http://www.msexchange.org/ExchangeNews/February-2005-Exchange-12-Features-Announced.html

http://www.infoworld.com/article/05/07/07/28OPenterwin_1.html

http://www.infoworld.com/article/05/03/31/HNexchange2006_1.html