JBODs and Exchange 2010: Have You Tried Them?
Today we stand at the threshold of an enormous technological frontier. What amazes me is that although every technological change meets both strong drive and resistance, once it is proven and adopted it makes a significant impact for everyone.
We have all also seen that technological change never follows a steady line. There have been upward trends, steep curves and dramatic “reversals”.
The DAS versus SAN saga has always accompanied Exchange servers. This article is not about bashing either; it is meant to suggest that there are always better ways of doing things, and that the road less travelled is often the better one. It is all about perspective and about using technology elegantly.
Exchange 2010 is a phenomenal tool given its capabilities; let me list the key ones relevant to this article. The design of E2010 set out to offer storage enhancements for Exchange Server alongside improvements in high availability, performance and, of course, “reliability”.
>SATA DISK ENHANCEMENT: I/O patterns have been optimized to reduce bursts of disk writes.
>I/O REDUCTION: E2010 handles I/O requests more efficiently (a 50% reduction compared with E2007), leading to better utilization with fewer disks.
>JBOD SUPPORT: E2010 can be implemented with up to 16 replicated copies per mailbox database, which makes it usable in a RAID-less environment as well and drives huge cost benefits.
>STORAGE PROBLEM HANDLING: Fast database failover makes E2010 well suited to disk failures in RAID-less environments. E2010 is also highly resilient to storage problems: it can automatically detect minor disk problems and repair data corruption.
I am sure this forms a good platform for the discussion ahead…
Isn’t technology all about crossing barriers: unlocking benefits in the broadest sense, adding superior features and staying as cost-effective as possible?
Ideally, a user wants everything in their mailbox to be available at all times. But given size constraints (user quota and mailbox size), it isn’t really possible to keep every mail. So users work around it to some extent with archiving and .PST files.
But what if your PST files get corrupted? Whether it is archiving or backup, administrators rely to a great extent on storage capacity. So all I intend in this article is to share some thoughts on lesser-known, or let me say alternative, storage options for E2010.
Storage for E2010 today isn’t confined to “enterprise storage” anymore. Your SAN folks may shout that SAN is the best fit for E2010 given its RAID and performance capabilities, or keep fighting the DAS versus SAN battle. But why not think about JBODs?
JBODs are a good option too, and they work equally well with E2010. Why not take advantage of the advances in E2010 together with the changes in disk capacity and disk access mechanisms?
What is it that you, as an Exchange administrator, bank upon? I believe disk capacity and disk performance are the two metrics at the heart of any Exchange server. Let’s discuss them to get a better grip.
To make it crystal clear, I will talk briefly about disk capacity and disk performance with respect to E2010. When deciding on mailbox size, administrators and implementers usually aim for optimal performance and disk utilization for both the users and the Exchange server.
So how do we work out how many users to allocate to a disk? We take into account the IOPS of the disk and the expected IOPS per user.
Say I am using a high-performance disk of “C” gigabytes that delivers “M” IOPS. In E2007, the expected load is 0.3 IOPS per user, so the number of users works out as follows:
[Most Exchange servers use high-performance disks with C = 146 GB and M = 150 IOPS.]
No. of users for E2007 = M / 0.3 = 150 (IOPS of disk) / 0.3 (IOPS per user) = 500 USERS PER DISK
Mailbox size for each user in E2007 = C / number of users per disk = 146 GB / 500 users = approx. 300 MB mailbox size
Let’s do the same for E2010 with the same numbers.
The expected load in E2010 is 0.1 IOPS per user. Using the same figures again:
[Most Exchange servers use high-performance disks with C = 146 GB and M = 150 IOPS.]
No. of users for E2010 = M / 0.1 = 150 (IOPS of disk) / 0.1 (IOPS per user) = 1500 USERS PER DISK
Mailbox size for each user in E2010 = C / number of users per disk = 146 GB / 1500 users = approx. 100 MB mailbox size
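The two calculations above can be reproduced with a few lines of Python. This is only a sketch of the rule of thumb used in this article: the function names are made up for illustration, 1 GB is assumed to be 1024 MB, and overheads such as logs, free space and database growth are ignored.

```python
import math


def users_per_disk(disk_iops: float, iops_per_user: float) -> int:
    """Whole number of users one disk can serve at the given per-user IOPS."""
    # The tiny epsilon guards against floating-point noise in the division.
    return math.floor(disk_iops / iops_per_user + 1e-9)


def mailbox_size_mb(disk_gb: float, users: int) -> float:
    """Disk capacity shared equally between users, in MB (assumes 1 GB = 1024 MB)."""
    return disk_gb * 1024 / users


if __name__ == "__main__":
    disk_gb, disk_iops = 146, 150  # the C and M figures used in the article
    for version, iops_per_user in (("E2007", 0.3), ("E2010", 0.1)):
        users = users_per_disk(disk_iops, iops_per_user)
        size = mailbox_size_mb(disk_gb, users)
        print(f"{version}: {users} users per disk, about {size:.0f} MB per mailbox")
```

With the article's figures this gives 500 users at roughly 300 MB each for E2007 and 1500 users at roughly 100 MB each for E2010.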
Okay, great, so we have the metrics. But I did not mean to say that we should support more users with smaller mailboxes. You could obviously support 3x the users, with almost 70% fewer IOPS per user than E2007, but what about the practical problems users face with small mailboxes? I would still recommend giving users more generous mailbox sizes. Because each user needs fewer IOPS, E2010 lets you fill a larger disk with larger mailboxes and so strike a balance between disk performance and disk capacity. My point in showing these metrics was simply to give a flavor of how far E2010’s performance capabilities have been amplified. This capability has in fact opened up the option of JBODs.
Bottom line: whether you choose SAN, DAS or JBODs, Microsoft Exchange 2010 gives you the flexibility to offer larger mailboxes without any significant impact on performance.
Now I am going to discuss how JBODs add value in E2010!
Let me reiterate: prior to E2010, I too would not have advocated JBODs for Exchange servers. But E2010 has moved past those limitations, and it is now very much possible to make use of them. Microsoft strongly advocates JBODs for Exchange environments unless there is a definite need for enterprise storage. Alright, before we get going, maybe I should talk a little about what a JBOD is.
Why should you consider JBOD storage for E2010?
Undeniably, storage is one of the biggest expenses when implementing E2010. So I reckon that if you could save anywhere between 40% and 80% of your money, get the same expected performance and grant more users larger mailboxes, then why not try JBODs?
How does JBOD work well with E2010?
Let me start with JBOD (Just a Bunch Of Disks). It is nothing but a collection of disks used without RAID, presented to the OS either individually or concatenated into a single volume. It gives you full disk utilization, unlike a RAID array, where the smallest disk dictates the usable capacity. With JBOD you get no fault tolerance: if a disk fails, you lose the data on it. So how can I say it can be used with E2010, given that it has no fault tolerance?
Three Standout Features in E2010:
>IOPS performance has improved greatly (the calculation above shows this clearly).
>Algorithms: E2010 has Database Mobility (DAG); put simply, it is all about database-level failover within Exchange. The Exchange team at Microsoft has also enhanced the algorithms at the core level so that E2010 can handle multiple write requests/write bursts, specifically on SATA disks (earlier, SATA disks were not used because they could not handle such requests, and hence only enterprise disks were used!).
>Storage Engine: Many changes (larger sequential writes) have been introduced in the storage engine as well. This means that random reads and writes are significantly reduced and most of the data is stored sequentially (http://technet.microsoft.com/en-us/library/ee832793.aspx). Because of this re-architecture, it is possible to design a mailbox database that performs as well on a single disk as it would in a RAID environment.
As a result, a large, terabyte-class SATA disk can work equally well and offer the same performance as enterprise storage. To give you an estimate, a 7200 RPM 1 TB SATA disk offers 75-80 IOPS per spindle, which is well within what E2010 needs at its per-user IOPS. Additionally, when you introduce the DAG concept (replication and failover at the database level), JBODs become an even better fit.
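To put the SATA estimate into numbers, here is a rough sketch that applies the 0.1 IOPS per user figure from earlier to a single 1 TB spindle. The function name is hypothetical, 1 TB is treated as 1000 GB, and the result ignores logs, free space and other overheads, so treat it as an illustration and not a sizing tool.

```python
import math


def sata_spindle_profile(disk_gb: float, disk_iops: float, iops_per_user: float = 0.1):
    """Return (users the spindle can serve, GB of raw capacity per user)."""
    # The user count is limited by IOPS; the epsilon absorbs floating-point noise.
    users = math.floor(disk_iops / iops_per_user + 1e-9)
    return users, disk_gb / users


if __name__ == "__main__":
    for iops in (75, 80):  # the per-spindle range quoted for a 7200 RPM SATA disk
        users, gb_each = sata_spindle_profile(1000, iops)
        print(f"{iops} IOPS: {users} users, about {gb_each:.2f} GB of disk per user")
```

Under these assumptions a single inexpensive spindle serves 750 to 800 users with more than 1 GB of raw capacity each, which is the "large and low cost mailbox" argument in a nutshell.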
How do I Design my Infrastructure with JBODs?
JBODs shouldn’t be considered if you aren’t going to implement a DAG in your environment. (Microsoft doesn’t recommend JBOD without a DAG!)
The minimum configuration for JBOD with a DAG is:
- Mailbox Server Nodes -> Minimum 3
- Database Copies -> Minimum 3 copies per database
- One spindle -> per database and its associated logs: each database is placed on a dedicated JBOD disk for both data and logs. This means your .edb and log files sit on the same disk, unlike earlier.
- You cannot put the JBODs on a single physical host and virtualize the DAG members on that host (this defeats the purpose of a DAG).
Explanation: the configuration above works because if one of the mailbox server nodes fails, you have two more; if one of the database copies gets corrupted, you have two more; and finally, if one spindle fails (as it is not RAID-ed), you still have two good copies of each database.
Always remember that JBOD has no redundancy and no RAID, so you need to meet the minimum requirements above.
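The minimum requirements listed above can be written down as a simple checklist. The sketch below is purely illustrative: the class, field and function names are invented for this article and are not part of any Exchange tool or API.

```python
from dataclasses import dataclass

MIN_MAILBOX_SERVERS = 3   # minimum DAG nodes from the list above
MIN_DATABASE_COPIES = 3   # minimum copies per database from the list above


@dataclass
class JbodDagPlan:
    """Hypothetical description of a planned JBOD plus DAG deployment."""
    mailbox_servers: int
    copies_per_database: int
    dedicated_spindle_per_database: bool
    dag_members_virtualized_on_one_host: bool


def check_plan(plan: JbodDagPlan) -> list:
    """Return the list of rules the plan violates (empty means it passes)."""
    problems = []
    if plan.mailbox_servers < MIN_MAILBOX_SERVERS:
        problems.append("fewer than 3 mailbox server nodes")
    if plan.copies_per_database < MIN_DATABASE_COPIES:
        problems.append("fewer than 3 copies per database")
    if not plan.dedicated_spindle_per_database:
        problems.append("each database (and its logs) needs its own spindle")
    if plan.dag_members_virtualized_on_one_host:
        problems.append("DAG members must not share one physical host")
    return problems


if __name__ == "__main__":
    plan = JbodDagPlan(3, 3, True, False)
    print(check_plan(plan) or "plan meets the minimum JBOD/DAG requirements")
```

A plan with three nodes, three copies and one spindle per database passes; dropping to two copies is flagged, because losing one spindle would then leave only a single copy.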
Checking the JBOD Storage Config
- You could check the JBOD configuration for your particular infrastructure using the Mailbox Calculator by setting the storage option to JBOD (DO NOT FORGET: specify 3 OR MORE database copies, otherwise JBOD will be reported as not suitable).
- You could also use JetStress 2010 to stress-test the disks: http://blogs.technet.com/b/exchange/archive/2009/09/01/3408180.aspx
- You could also use LoadGen 2010 for mailbox server sizing.
- You can use the calculator at http://blogs.technet.com/b/exchange/archive/2009/11/09/exchange-2010-mailbox-server-role-requirements-calculator.aspx to get a fair idea of your storage in terms of JBOD.
To summarize, here are a few thoughts to ponder in depth regarding JBODs…
Does Exchange really work well only with high-end enterprise storage?
I don’t think so anymore with E2010. E2010 gives you the power to scale out your infrastructure with large, low-cost mailboxes on less expensive disk options that work equally well.
1. Okay, you get large mailboxes with E2010 and you use JBODs, but doesn’t that hamper the performance of my Outlook?
E2010 was designed for, and is capable of supporting, large item counts (100,000 items per folder). How does this work? With E2010 you also get the “Personal Archiving” feature. This is a killer feature!!!
2. BIGGEST THOUGHT: I shouldn’t use JBODs because they have no RAID… so “Can a RAID architecture be foolproof for my Exchange server’s performance?”
I do not agree that having RAID in my Exchange environment guarantees foolproof uptime, or that my users are not impacted while replacement disks are brought in after a failure. Let me make my stance clearer. When a disk fails in a RAID environment, what happens is a “rebuild”. If you think your users aren’t impacted during the rebuild, you are WRONG! Of course they will be impacted. While a disk is being rebuilt, there are heavy additional read cycles on the disks that are still healthy and additional write cycles on the new disk being brought in. Logically, this means that if you want minimum impact during a rebuild, you need to provision considerable extra disk capacity after carefully calculating what your environment and infrastructure can handle.
Another huge myth is that with enterprise-class disks (FC/SCSI) and RAID, my disk failure rates will be very low. Come on guys, disks are disks at the end of the day, whether they are enterprise class or not. Even with enterprise-class disks (FC/SCSI), nothing guarantees a lower impact on users.
In this respect they are all the same, irrespective of capacity, form factor and rotational speed. So I wouldn’t be convinced if someone told me that enterprise-class storage with RAID means less impact on users.
3. Another fundamental thought that keeps creeping into people’s minds: “How reliable is E2010 when using JBOD with a DAG?”
My simple answer is “Less Awareness = Bigger Misconceptions”. To address this: first, yes, you can still use RAID with E2010 if you want to. But let me point out that “Microsoft has come out with a RAID-less JBOD architecture with E2010 for the mailbox role.” If a node fails, the CAS redirects clients to a different server hosting a replica of the database. So if you ask me about redundancy or about a disk failure in a JBOD, your DAG basically takes care of that! This works when you have set up DAGs on your mailbox servers with multiple copies, so that if one mailbox server goes down, your databases are not lost and another mailbox server takes over. Additionally, multiple CAS servers need to be set up so that if one server goes down, users can still access their mail by being redirected to another CAS. In effect, the DAG provides the redundancy that RAID would otherwise take care of.
One of the key things all of us should keep in mind is to avoid complexity in our environments. Complexity only makes issues harder to resolve when there is an outage. It raises not only the bar for resolution but also the cost and, of course, the maintenance in the long run. What I mean to say, very politely, is this: agreed, you may get a lot of add-on features with superior technology, but aren’t you multiplying your costs for every single added feature the technology boasts about?
Simplicity in our messaging environment can add a lot of “unexplored” value if it delivers the same expected “High Availability” in an unassuming manner, right? Let me throw some light on why I spoke about simplicity and complexity.
Let me take a SAN-based infrastructure for our Exchange server, work out how it delivers high availability, and then contrast it with a JBOD environment. A SAN-based infrastructure usually consists of switches that connect servers over Fibre Channel through HBAs (the simplest possible explanation, lol) to multiple SAN I/O modules. With servers, switches, FC, FC adapters, HBAs and SAN modules, you can imagine the high availability it can deliver, but also the guaranteed complexity. Additionally, you need top-class SAN administrators to keep this infrastructure functioning well (added expense!!!). A JBOD infrastructure for Exchange is far less complex. All it needs is simple servers and SCSI connections; you establish redundancy as desired (maybe 2 or more copies on site and 2 or more at a remote site) and you should be all set. It does not call for specialist administration expertise; server administration skills are more than enough to handle failure situations.
I know it is always challenging to convince people out there. There is a misconception about, or a lack of faith in, the credibility and reliability of JBODs with Exchange. A few may be aware, and for those who aren’t: Gmail and O365 work on JBODs.
Yes, you heard it right: all your Gmail mail is stored on JBODs, and much of O365 (cloud services) from Microsoft is also based on JBODs. Microsoft’s internal deployment (180,000 users) is based on JBODs. Microsoft’s TerraServer also works on JBODs.
My point throughout this discussion is not to say that SAN works best, or DAS works best, or JBOD works best, but to give you a refreshing breeze of a cost-effective and simple solution, “JBOD”, that can be looked into as well.
The needs of every organization are different, and what suits one will not necessarily suit another, but let’s not take up complex and expensive solutions except as a last resort. Let’s not spend money that could be saved just because we have always worked one way; let’s open up and look beyond!!!