Storage
Azure Data Box Disk – Order, Usage, and Performance
Data Box Disk Overview
I have written in the past about the considerations of using Data Box for offline data transfers into Azure versus using online methods, a post that focused primarily on Data Box Heavy. Here I am going to walk through the process of obtaining a Data Box, specifically a Data Box Disk (see the Data Box family of offerings here). The ordering process for all Data Box devices is largely the same, so this can be used as a reference for any of them. However, the primary focus of this post will be the setup and usage of Data Box Disk.
If you’ve read my previous post, in which I questioned the merit of an offline transfer method in many cases, you may find it odd that I am now promoting the Data Box Disk, which is only suitable for transferring a few TB of data. I maintain my position that in most cases online transfer is optimal, especially for the type of data that would be in scope for a Data Box Disk. However, as I have noted, there are some cases where offline data transfer is needed.
Order and Setup
Ordering a Data Box is straightforward through the Azure Portal.
After you’ve selected the initial configuration items, you will choose the device type.
You will name the order and select the destination storage in Azure.
After confirming whether you’re using a Microsoft-Managed Key or a Customer-Managed Key (in this case I’m using a Microsoft-Managed Key), you will enter shipping information and submit the order. At each step of the process, you will receive an email with the status. For example, here is the notification that my order was created, and then again when it was delivered.
When you create the job in Azure, it creates a Data Box resource, which has all of the information about the device and order including a timeline showing where the device is in the process.
The Disk arrived with the SATA to USB cable, and I hooked it up to my Intel NUC (excuse the dust!).
Copying Data
Note in the image above that both the USB adapter and the ports on my device are denoted with “SS”, meaning they’re USB 3.0. This is important because the Data Box Disk is an SSD and is very performant, so a slow USB port can easily become the bottleneck. You will also note in the delivery email that I have a set period of time to get the device shipped back before I start incurring additional cost.
Most enterprise servers only include USB ports for peripherals and thus don’t invest in USB 3.0 or 3.1, leaving you with the 2.0 standard. The maximum theoretical throughput of USB 2.0 is 480 Mbps, or 60 MBps; the maximum theoretical throughput of USB 3.0, however, is 5 Gbps, or 625 MBps. This matters: if your servers only have USB 2.0 ports, it may actually be faster to attach the disk to a laptop that has Gigabit network connectivity to wherever the source data is held.
*Note:* I am doing this in Windows, but you can do all of the following in Linux as well.
If I look in Windows Explorer when I attach the drive I can see a volume, but it is encrypted and locked. That is intentional and a part of the security process with Azure Data Box.
The process for allowing access to each device in the Data Box family is different, but with Data Box Disk there is an unlock utility which, in combination with the passkey available under the Data Box resource in Azure, decrypts and unlocks the drive.
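On Windows the flow is: download the unlock tool, copy the passkey from the Data Box order in the portal, and run the tool against the attached disk. Treat the command below as illustrative only – the exact executable name and switch can change between tool versions:

```powershell
# Illustrative only – check the current Data Box Disk unlock tool docs for exact syntax.
# The passkey is shown under the Data Box order resource in the Azure portal.
.\DataBoxDiskUnlock.exe /PassKey:<passkey-from-the-portal>
```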
At the root of the filesystem, you will see a precreated folder for each supported destination storage type (for example “BlockBlob” for block blobs); what you copy into each folder will get copied to the respective storage type at the destination.
Performance
If you have a lot of small files, one thing to note is the impact of antivirus. If you’re pulling TBs worth of small files across the network to a laptop where the drive is attached, the files are being written locally, so your antivirus will likely do in-line scanning of each one. Depending on the data and whether your policies allow it, adding an antivirus exception for the folder you’re copying into (e.g. “F:\BlockBlob”) may speed up your copy performance.
To test performance, I devised two tests, one with large files and one with small files. For the large files, I copied a bit over 50 GB of .iso files of various Linux distributions. The copy below is simply a Ctrl+C, Ctrl+V of that folder from my machine’s SSD to the Data Box Disk using Windows Explorer. In addition to the copy operation, I took a screenshot of the disk throughput and activity in Task Manager, which gives a sense of how much of the disk’s available performance is being used based on its queue and active-time metrics.
You can see that with a single copy job I’m getting over 300 MBps for those large files. I then also wanted to try small files, which is a much more likely use case for Data Box Disk. For this I used a PowerShell script (part of another project I’m working on that will be posted to my GitHub soon) to create 10,000 x 1 MB files – again, I first copied them using Windows Explorer.
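That script isn’t published yet, but generating similar test data only takes a few lines of PowerShell. This is a minimal sketch (not the actual script), with an illustrative destination path:

```powershell
# Generate 10,000 x 1 MB files of zeroes for copy testing.
$destination = "C:\Temp\SmallFiles"
New-Item -ItemType Directory -Path $destination -Force | Out-Null

$buffer = New-Object byte[] (1MB)   # 1 MB of zeroes per file
for ($i = 1; $i -le 10000; $i++) {
    $path = Join-Path $destination ("testfile_{0:D5}.dat" -f $i)
    [System.IO.File]::WriteAllBytes($path, $buffer)
}
```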
I was able to get just over 50 MBps in write speed, which is good considering the file sizes, but given there were no constraints on my source disk, destination disk, or CPU, this led me to believe that the bottleneck was the copy operation itself. Next, I wanted to run a test with a multi-threaded copy operation, so I first set a baseline with a single-threaded robocopy job.
You can see this took about 3 and a half minutes and copied at roughly the same speed as Windows Explorer. Now that I have my baseline, here’s the real performance test using the multi-threading flag on robocopy.
With that flag I was able to push over 3x the throughput, increasing from ~50 MBps to ~190 MBps and reducing the copy time from 3 minutes and 33 seconds to just 58 seconds, fully utilizing my hardware.
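For reference, the two robocopy runs were along these lines (paths and thread count are illustrative – I’m not reproducing my exact commands here):

```powershell
# Baseline: single-threaded copy of the small-file test set.
robocopy "C:\Temp\SmallFiles" "F:\BlockBlob\SmallFiles" /E

# Multi-threaded: /MT accepts 1-128 threads and defaults to 8 if no value is given.
robocopy "C:\Temp\SmallFiles" "F:\BlockBlob\SmallFiles" /E /MT:32
```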
I also went back and tried the same multi-threaded copy operation with my large files and was able to increase the throughput from 334 MBps to 522 MBps, which fully utilized my hardware as well.
Wrap-up
I finished loading my data onto the disk and utilized the data validation utility, which comes in the same download as the tool that unlocks and decrypts the drive, to generate checksums of my data on the device which I can use later to validate data integrity when it is copied into the Storage Account. After that I unmounted the device, packaged it back up and dropped it off at my local UPS store – the box already had a return label on it.
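If you want your own independent record in addition to what the validation utility produces, a quick hash manifest is easy to generate – a sketch, with algorithm and paths as assumptions:

```powershell
# Hash everything staged for BlockBlob and save a manifest for later comparison.
Get-ChildItem -Path "F:\BlockBlob" -Recurse -File |
    Get-FileHash -Algorithm MD5 |
    Export-Csv -Path "C:\Temp\databox-manifest.csv" -NoTypeInformation
```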
Similar to when the device was being shipped to me, I got email notifications for each step of the way including when the data copy started, and when it finished. The process is then marked as complete and all of the details are available in the portal.
You can see the data is now loaded into the Storage Account, and you will see a “databoxcopylog” folder as well, which you can use to validate the copy jobs, including the final checksums of the files.
Lastly, you will see a one-time charge for the device on your invoice; here you can see the $90 fee for the Data Box Disk in Azure Cost Management.
*Note*: You will still be charged for any transactions that take place when loading the data into your storage account.
The data is now all loaded, and I get a confirmation via email (which is also shown in the portal screenshot above) that the device has been erased in accordance with NIST 800-88r1 standards. As I noted above, the process for ordering the device is largely similar for the Data Box or the Data Box Heavy.
If you have any questions, comments, or suggestions for future blog posts please feel free to comment below, or reach out on LinkedIn or Twitter. I hope I’ve made your day a little bit easier!
Shared Storage Options in Azure: Part 5 – Conclusion
Part 5, the end of the series! This has been a fun series to write, and I hope it was helpful to some of you. The impetus for this whole thing was the number of times I’ve been asked how to set up shared storage between systems (primarily VMs) in Azure. As we’ve covered, there are a handful of different strategies with pros and cons to each. I’m going to close this series with a final Pros and Cons list and a few general design pattern directions.
- Part 1: Azure Shared Disks
- Part 2: IaaS Storage Server
- Part 3: Azure Storage Services
- Part 4: Azure NetApp Files
- Part 5: Conclusion
Pros and Cons:
Azure Shared Disks:
Shared Storage Options in Azure: Part 1 – Azure Shared Disks « The Tech L33T
Pros:
- Azure Shared Disks allow for the use of what is considered “legacy clustering technology” in Azure.
- Can be leveraged by familiar tools such as Windows Failover Cluster Manager, Scale-out File Server, and Linux Pacemaker/Corosync.
- Premium and Ultra Disks are supported so performance shouldn’t be an issue in most cases.
- Supports SCSI Persistent Reservations
- Fairly simple to setup
Cons:
- Does not scale well, similar to what would be expected with a SAN mapping.
- Only certain disk types are supported.
- Read-Only host caching is not available for Premium SSDs with maxShares >1.
- When using Availability Sets and Virtual Machine Scale Sets, storage fault domain alignment with the VMs is not enforced on the shared data disk.
- Azure Backup is not yet supported.
- Azure Site Recovery is not yet supported.
Azure IaaS Storage:
Shared Storage Options in Azure: Part 2 – IaaS Storage Server « The Tech L33T
Pros:
- More control, greater flexibility of protocols and configuration.
- Ability to migrate many workloads as-is and use existing storage configurations.
- Ability to use older, or more “traditional” protocols and configurations.
- Allows for the use of Shared Disks.
- Integration with Azure Backup.
- Incredible storage performance if the data can be cached/ephemeral (up to 3.8 million IOPS on L80s_v2).
Cons:
- Significantly more management overhead as compared to PaaS.
- More complex configurations, and cost calculations compared to PaaS.
- Higher potential for operational failure with a higher number of components.
- Broader attack surface, and more security responsibilities.
- Maximum of 80,000 uncached IOPS on any VM SKU.
Azure Storage Services (Blob and File):
Shared Storage Options in Azure: Part 3 – Azure Storage Services « The Tech L33T
Pros:
- Both are PaaS and fully managed which greatly reduces operational overhead.
- Significantly higher capacity limits as compared to IaaS.
- Ability to migrate some workloads as-is and use existing storage configurations when using SMB or BlobFuse, rather than requiring native API integration.
- Ability to use Active Directory Authentication in Azure Files, and Azure AD Authentication in Blob and Files.
- Both integrate with Azure Backup.
- Much easier to geo-replicate compared to IaaS.
- Azure File Sync makes distributed File Share services and DFS a much better experience with Backup, Administration, Synchronization, and Disaster Recovery.
Cons:
- BlobFuse (by default) stores credentials in a text file.
- Does not support older access protocols like iSCSI.
- NFS is not yet Generally Available.
- Azure Files is limited to 100,000 IOPS (per share).
Azure NetApp Files:
Shared Storage Options in Azure: Part 4 – Azure NetApp Files « The Tech L33T
Pros:
- Incredibly high performance, depending on configuration (up to ~3.2 million IOPS/volume).
- SMB and NFS Shares both supported, with Kerberos and AD integration.
- More performance and capacity than is available on any single IaaS VM.
Cons:
- While it is deployed in most major regions, it may not yet be available in the region you need (submit feedback if this is the case).
- Does not yet support Availability Zones, Cross-Region Replication is in Preview.
There we have it, my final list of Pros and Cons between Azure Shared Disks, DIY IaaS Storage, Azure Blob/Files, and Azure NetApp Files. Lastly, I want to end with some notes on general patterns when considering shared storage like the ones discussed in this series.
Patterns by Workload Type:
Quorum:
- If the reason you need shared storage is for a quorum vote, look into using a Cloud Witness for Failover Clusters (including SQL AlwaysOn) – a minimal PowerShell sketch follows this list.
- If a cloud quorum isn’t an option, shared disks are an easy option, and that’s where I would go second.
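Configuring a Cloud Witness is a one-liner once you have a storage account (the account name and key below are placeholders):

```powershell
# Point the cluster quorum at an Azure storage account instead of a disk or file-share witness.
Set-ClusterQuorum -CloudWitness -AccountName "mystorageaccount" -AccessKey "<storage-account-key>"
```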
Block Storage:
- If you need shared block storage (iSCSI) for more than just quorum, chances are you need a lot of it, so I’d first recommend running IaaS storage. Start planning a migration away from this pattern though, Blob block storage on Azure is amazing and if you can port your application to use it – I would highly recommend doing so.
General File Share:
- For most generic file shares, Azure Files is going to be your best bet – with a potential use of Azure File Sync.
- Azure NetApp Files is also a strong option here since the Standard Tier is cost effective enough for it to be feasible, though ANF requires a bit more configuration than Azure Files.
- Lastly, you could always run your File Share in custom IaaS storage, but I would first look to a PaaS solution.
High-Performance File Storage:
- If your application doesn’t support the use of Blob storage (as is the case with most commercial products), Azure NetApp Files is likely going to be your best bet.
- Once NFS becomes generally available, NFS on Azure Files and Blob store are going to be strong competitors – especially on Blob and ADLS.
- Depending on what “high-performance” means, and whether or not you use a scale-out software configuration, storage on IaaS could potentially be an option. This is a much more feasible option when the bulk of the data can be cached or ephemeral.
We’ve come to the end! I hope that was a useful blog series. As technologies and features advance, I’ll go back and update these, but please feel free to comment if I miss something. Please reach out to me in the comments, on LinkedIn, or Twitter with any questions about this post, the series, or anything else!
Shared Storage Options in Azure: Part 3 – Azure Storage Services
Welcome to Part 3 of this 5-part series on Shared Storage Options in Azure. In this post I’ll be covering Azure Storage Services. You may be thinking to yourself: wait a minute, what have we been talking about this whole time then? Azure Storage Services is most easily thought of as the umbrella term for the services offered under an Azure Storage Account (Blob, File, Queue, Table). Given the context of this series, I’ll be discussing Azure Blob Storage and Azure File storage in this post. Though, I do want to add a disclaimer that technically Queue and Table Storage can be “shared” also, since multiple apps can call the same Queue or Table using the APIs. Since the focus here is on the system level, I’m not going to cover those two, but I’ll add some links to documentation where you can read more.
- Part 1: Azure Shared Disks
- Part 2: IaaS Storage Server
- Part 3: Azure Storage Services
- Part 4: Azure NetApp Files
- Part 5: Conclusion
Azure Blob Storage:
In the majority of cases, when people discuss “cloud storage” they’re talking about Blob – binary large object – storage. What this service allows us to do is store massive amounts of unstructured “objects” in Azure. There are a couple of ways we can use Blob storage as shared storage at the system level.
Shared Blob Storage:
As I mentioned in the introduction, all Azure Storage Services can be accessed over HTTP/S via API or using any of the client libraries. This means that they can all technically be “shared” storage, but what about system-level access? While I find most applications and solutions can be adapted using a client library, there is a project called “Blobfuse” which can be used for more traditional applications.
Blobfuse is an open-source project on GitHub which uses the libfuse library to pair the Linux FUSE kernel module with the Azure Blob REST APIs to create a virtual filesystem. The result of this configuration is a mount point on a Linux machine that points directly to a Blob Storage Account. There can be certain challenges in using Blobfuse though; for example, the result is NOT a POSIX-compliant filesystem, and if you mount the same Blob Storage from multiple machines you should keep those limitations in mind.
The default configuration for Blobfuse is to have your Storage Account name and Access Key in a plain-text configuration file sitting on your server, which is not ideal from a security perspective and should be noted. However, it is possible to use a Managed Service Identity with Blobfuse, which significantly improves the security posture of the deployment (if you use a System Assigned Managed Identity) and is something I would recommend over the default configuration. Lastly, Blobfuse is not available on Windows – Linux only.
As of the time of writing this blog post (January 2021), NFS is not yet Generally Available (GA) on Blob Storage, but NFS 3.0 has been in preview since July 2020. Once this goes GA I will update this post with that information, but I won’t count it as an option until that point.
Lastly, from a backup and disaster recovery perspective, Azure Blob Storage supports snapshots as well as Point-in-Time restore for block blobs.
Typical Use Cases:
The majority of the use cases I’ve seen that use Blob as shared storage at the system level want consumption-based cloud storage without the overhead or limitations of a managed disk. Specifically, for applications that don’t support SMB natively and require a local mount point, that’s where Blobfuse comes into play. I have seen this with a lot of legacy applications that are migrated into the cloud and want lower cost and higher capacity than is available from managed disks. I’ve also seen this configuration with many HPC applications, since NFS as an access protocol is not yet GA for Blob storage.
Cost, Performance, Availability and Limitations:
The cost of using Blob storage is always the same regardless of the access protocol, since as of now it all ends up going through the Azure Storage API anyways.
Blob storage is incredibly performant. There are two performance tiers of Blob storage, Standard and Premium. In most cases, Standard will be the appropriate tier. Premium is for storage that needs consistent single-digit-millisecond latency and is better suited for larger block sizes (256 KiB+). Do keep in mind though that, similar to my comparison of Managed Disk types and the cost calculation of capacity versus transaction costs, in some scenarios Premium Block Blob Storage may actually be cheaper.
If you’re using a standard Blob storage account (not configured with a Hierarchical Namespace) which is most common, you’ll enjoy the following performance (as of January, 2021).
Image Reference: https://docs.microsoft.com/en-us/azure/azure-resource-manager/management/azure-subscription-service-limits#storage-limits
When configuring containers in your Blob Storage Account you’ll notice an access tier setting with the options “Hot, Cool, Archive”. I covered Archive Storage a few years back, but it’s not really relevant to this topic. What is relevant though is the difference between Hot and Cool storage. There still seems to be a lot of confusion around the difference between the two, but at its core the main difference is transaction cost.
Similar to the difference between Premium and Standard SSD Managed Disks, Hot Blob storage has a higher capacity cost but a lower transaction cost, while Cool Blob storage has a lower capacity cost and a higher transaction cost. If you’re storing data that is infrequently accessed but still needs to be constantly available, Cool Blob storage is the way to go. If you’re storing data that has a lot of transactions, then Hot Blob storage is your best bet. Don’t get caught up in the “per GB” sticker price on each tier – it can be misleading about the resulting cost, depending on your workload characteristics.
As far as durability and availability go, Storage Accounts have a few different options depending on the storage service being used: LRS, ZRS, GRS, GZRS, and the read-access variants RA-GRS and RA-GZRS. There is a lot of information on these different redundancy levels, so take a look at the durability and availability table below, and if you want to read more, click the link below.
Image Reference: https://docs.microsoft.com/en-us/azure/storage/common/storage-redundancy
Additional Reading on Understanding Azure Storage Redundancy Offerings: https://techcommunity.microsoft.com/t5/azure-storage/understanding-azure-storage-redundancy-offerings/ba-p/1431700
Lastly, I recently re-created an outdated version of an infographic on capacity limits for Azure Storage Accounts and I thought I would share that here.
Feel free to reference this image using the following: https://urls.hansencloud.com/azure-storage-limits
Azure Files:
Azure Files is another storage service under the Azure Storage Account umbrella and shares many of the same features, but it has some very distinct capabilities of its own as well. The primary purpose of Azure Files is to provide file-level storage services like you would get from a Network Attached Storage appliance or a server exposing file shares. Azure Files provides SMB access (as well as the HTTP/S API) to provisioned shares.
Access Methods:
Like I mentioned, the primary access methods for Azure Files are through the API or by using SMB (NFS v4.1 is currently in preview, so I won’t be considering it an option in this post right now, but will update it when it goes GA). Even though SMB is most typically used with Windows machines, shares on Azure Files can be used by Windows, Linux, or even macOS.
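For reference, mounting an Azure Files share over SMB from Windows follows a well-documented pattern along these lines (account, share, and key are placeholders):

```powershell
# Port 445 outbound to Azure must be open for SMB.
Test-NetConnection -ComputerName "<account>.file.core.windows.net" -Port 445

# Persist the storage account credential, then map the share.
cmd.exe /C "cmdkey /add:`"<account>.file.core.windows.net`" /user:`"localhost\<account>`" /pass:`"<storage-account-key>`""
New-PSDrive -Name Z -PSProvider FileSystem -Root "\\<account>.file.core.windows.net\<share>" -Persist
```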
Something really interesting about Azure Files though is its Azure File Sync capability, which allows for a centralized file share in Azure Files fronted by agents deployed on Windows Servers that act as a cache for the Azure Files data. This is particularly interesting because it allows the server itself to present whichever access method it would like to the client, while using the centralized Azure Files share as the backing store.
At a high level, Azure File Sync works by creating a File Share, linking it to what is called a “sync group”, and then registering agents deployed on Windows Servers (in Azure or on-prem) with that sync group.
Azure Files also allows, in conjunction with typical access key authentication, Active Directory-based authentication options. The ability to use this type of AuthN directly on the Azure Files PaaS endpoint is really interesting and makes it a great choice for a solution where you want to leverage the identity systems you already have in place. It’s also worth noting that if you’re using Azure File Sync, the deployed agent is the only thing communicating with the File Share directly, and access to the data locally can be controlled through whichever method you prefer (SMB ACLs with ADDS, for example).
Lastly, from a backup and disaster recovery perspective, Azure Files supports snapshots in addition to native integration with Azure Backup.
Typical Use Cases:
I see a mix of uses with Azure Files, from a file-based backend for various applications and services to environments where the data is accessed directly by users. A scenario I’ve run into more frequently though is when companies want to replace traditional on-prem file servers and even things like DFS. Anywhere you want to leverage SMB in a fully managed PaaS way, Azure Files is for you.
Cost, Performance, Availability and Limitations:
Similar to Blob Storage, Azure Files has multiple Tiers to help optimize for performance and Cost.
Image Reference: https://azure.microsoft.com/en-us/pricing/details/storage/files/
Again, these tiers are priced based on capacity (provisioned or consumed) in combination with transactions and any snapshots or backups.
Performance for Azure Files is based on whether you use a standard storage account or the dedicated “Azure Files” storage account SKU, which enables “Premium” file shares. The performance specifications for a standard storage account (e.g. General Purpose v2) are the same as the limits posted for Blob storage earlier. If you’re using Premium Files though, here are the performance targets.
Image Reference: https://docs.microsoft.com/en-us/azure/storage/files/storage-files-scale-targets
Keep in mind that the 100TB limit is per share, and you can create multiple (just like you would with traditional file shares) up to the limit of the Storage Account (5 PB by default, as stated earlier in this post).
Lastly, availability for Azure Files is no different than Blob since they’re both contained by a storage account and will be subject to the availability and durability of the storage account data redundancy setting.
Pros and Cons:
Okay, here we go with the Pros and Cons for using an Azure Storage Services (Blob & File) for your shared storage configuration on Azure.
Pros:
- Both are PaaS and fully managed which greatly reduces operational overhead.
- Significantly higher capacity limits as compared to IaaS.
- Ability to migrate workloads as-is and use existing storage configurations when using SMB or Blobfuse.
- Ability to use Active Directory Authentication in Azure Files
- Both integrate with native backup solutions.
- Both integrate with Azure Defender for Storage.
Cons:
- Blobfuse stores connection information in plain-text, by default.
- Does not support older access protocols like iSCSI.
Alright, that’s it for Part 3 of this blog series – Shared Storage on Azure Storage Services. Please reach out to me in the comments, on LinkedIn, or Twitter with any questions or comments about this post, the series, or anything else!
Shared Storage Options in Azure: Part 2 – IaaS Storage Server
Recently, I posted the “Shared Storage Options in Azure: Part 1 – Azure Shared Disks” blog post, the first in this 5-part series. Today I’m posting Part 2 – IaaS Storage Server. While this post will be fairly rudimentary in terms of Azure technical complexity, this is most certainly an option to consider for shared storage in Azure, and one that is still fairly common, with a number of configuration options. In this scenario, we will be looking at using a dedicated Virtual Machine to provide shared storage through various methods. As I write subsequent posts in this series, I will update this post with the links to each of them.
- Part 1: Azure Shared Disks
- Part 2: IaaS Storage Server
- Part 3: Azure Storage Services
- Part 4: Azure NetApp Files
- Part 5: Conclusion
Virtual Machine Configuration Options:
Compute:
While it may not seem vitally important, the VM SKU you choose can impact your ability to provide storage capabilities in areas such as Disk Type, Capacity, IOPS, or Network Throughput. You can view the list of VM SKUs available on Azure at this link. As an example, I’ve clicked into the General Purpose, Dv3/Dvs3 series and you can see there are two tables that show upper limits of the SKUs in that family.
In the limits for each VM you can see there are differences between Max Cached and Temp Storage Throughput, Max Burst Cached and Temp Storage Throughput, Max uncached Disk Throughput, and Max Burst uncached Disk Throughput. All of these represent very different I/O patterns, so make sure to look carefully at the numbers.
Below are a few links to read more on disk caching and bursting:
- Disk Caching: https://docs.microsoft.com/en-us/azure/virtual-machines/premium-storage-performance#disk-caching
- Disk Bursting: Managed disk bursting – Azure Virtual Machines | Microsoft Docs
You’ll notice when you look at VM SKUs that there is an L-Series which is “storage optimized”. This may not always be the best fit for your workload, but it does have some amazing capabilities. The outstanding feature of the L-Series VMs is the locally mapped NVMe drives, which as of the time of writing this post can offer (on the L80s_v2 SKU) 19.2 TB of storage at 3.8 million IOPS / 20,000 MBps.
The benefits of these VMs are the extremely low latency and high throughput of the local storage, but the caveat to that specific NVMe storage is that it is ephemeral. Data on those disks does not persist across a reboot. This means it’s incredibly good for serving a local cache, tempdb files, etc., though it’s not storage you can use for something like a file server backend (without some fancy start-up scripts, please don’t do this…). You will note that the maximum uncached throughput is 80,000 IOPS / 2,000 MBps for the VM, which is the same as all of the other high-spec VMs. As I am writing this, no Azure VM allows for more than that for uncached throughput – this includes Ultra Disks (more on that later).
For more information on the LSv2 series, you can read more here: Lsv2-series – Azure Virtual Machines | Microsoft Docs
Additional Links on Azure VM Storage Design:
- Azure Premium Storage: Design for high performance – Azure Virtual Machines | Microsoft Docs
- Virtual machine and disk performance – Linux – Azure Virtual Machines | Microsoft Docs
Networking:
Networking capabilities of the Virtual Machine are also important design considerations for shared storage, both in total throughput and latency. You’ll notice in the VM SKU charts I posted above that there are two networking specifications: Max NICs and Expected network bandwidth (Mbps). It’s important to know that these are VM SKU limitations, which may influence your design.
Expected network bandwidth is pretty straightforward, but I want to clarify that the number of Network Interfaces you attach to a VM does not change this number. For example, if your expected network bandwidth is 3,200 Mbps and you have an SMB share running on that single NIC, adding a second NIC and using SMB Multichannel WILL NOT increase the total bandwidth for the VM. In that case you could expect each NIC to potentially run at 1,600 Mbps.
The last networking feature to take into consideration is Accelerated Networking. This feature enables SR-IOV (Single Root I/O Virtualization), which bypasses the host’s virtual switch and delivers network traffic directly to the VM’s network interface, dramatically improving performance by reducing latency, jitter, and CPU utilization.
Image Reference: Create an Azure VM with Accelerated Networking using Azure CLI | Microsoft Docs
Accelerated Networking is not available on every VM though, which makes it an important design decision. It’s available on most General Purpose VMs now, but make sure to check the list of supported instance types. If you’re running a Linux VM, you’ll also need to make sure it’s a supported distribution for Accelerated Networking.
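As a quick example, Accelerated Networking can be toggled on an existing NIC with Az PowerShell, roughly like this (the VM should be deallocated first; resource names are placeholders):

```powershell
# Enable Accelerated Networking on an existing NIC while the VM is deallocated.
$nic = Get-AzNetworkInterface -Name "myNic" -ResourceGroupName "myRG"
$nic.EnableAcceleratedNetworking = $true
$nic | Set-AzNetworkInterface
```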
Storage:
In an obvious next step, the following design decision is the storage that you attach to your VM. There are two major decisions when selecting disks for your VM – disk type and disk size.
Disk Types:
Image Reference: https://docs.microsoft.com/en-us/azure/virtual-machines/disks-types
As the table above shows, there are four types of Managed Disks (https://docs.microsoft.com/en-us/azure/virtual-machines/managed-disks-overview) in Azure. At the time of writing this, Premium SSD, Standard SSD, and Standard HDD all have a limit of 32 TB per disk. The performance characteristics are very different, but I also want to point out the difference in the pricing model, because I see folks make this mistake very often.
| Disk Type | Capacity Cost | Transaction Cost |
| --- | --- | --- |
| Standard HDD | Low | Low |
| Standard SSD | Medium | Medium |
| Premium SSD | High | None |
| Ultra SSD | Highest (capacity and throughput) | None |
Transaction costs can be important on a machine whose sole purpose is to function as a storage server. Make sure you look into this before a passing glance shows the price of a Standard SSD as lower than a Premium SSD. For example, here is the Azure Calculator output for a 1 TB disk of each of the four types, assuming an average of 10 IOPS sustained for a month: (10 × 60 × 60 × 24 × 30) / 10,000 = 2,592 transaction units.
Sample Standard Disk Pricing:
Sample Standard SSD Pricing:
Sample Premium SSD Pricing:
Sample Ultra Disk Pricing:
The above is just an example, but you get the idea. Pricing gets strange around Ultra Disks due to the ability to configure performance independently (more on that later), though there is a calculable break-even point between disks that have transaction costs and those that have a higher provisioned cost.
For example, if you run an E30 (1024 GB) Standard SSD at full throttle (500 IOPS), the monthly cost will be ~$336, compared to ~$135 for a P30 (1024 GB) Premium SSD, which gives you 10x the performance. The second design decision is disk capacity. While this seems like a no-brainer (provision the capacity needed, right?), it’s important to remember that with Managed Disks in Azure, performance scales with, and is tied to, the capacity of the disk.
Image Reference: https://docs.microsoft.com/en-us/azure/virtual-machines/disks-types#disk-size-1
You’ll note in the above image the Disk Size scales proportionally with both the Provisioned IOPS and Provisioned Throughput. This is to say that if you need more performance out of your disk, you scale it up and add capacity.
The last note on capacity is this: if you need more than 32 TB of storage on a single VM, you simply add another disk and use your mechanism of choice for combining that storage (Storage Spaces, RAID, etc.). The same method can be used to further tweak your total IOPS, but make sure you take into consideration the cost of each disk, capacity, and performance before doing this – most often it’s an insignificant cost to simply scale up to the next size disk. Last but not least, I want to briefly talk about Ultra Disks – these things are amazing!
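Before moving on to Ultra Disks, here is a minimal Storage Spaces sketch for pooling several data disks into one striped volume (pool and volume names are arbitrary placeholders):

```powershell
# Pool every data disk that is eligible for pooling.
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "DataPool" `
    -StorageSubSystemFriendlyName (Get-StorageSubSystem -FriendlyName "Windows Storage*").FriendlyName `
    -PhysicalDisks $disks

# Simple (striped) virtual disk across the pool, then initialize and format it.
New-VirtualDisk -StoragePoolFriendlyName "DataPool" -FriendlyName "Stripe01" `
    -ResiliencySettingName Simple -UseMaximumSize
Get-VirtualDisk -FriendlyName "Stripe01" | Get-Disk |
    Initialize-Disk -PassThru |
    New-Partition -AssignDriveLetter -UseMaximumSize |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel "Data"
```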
Unlike the other disk types, Ultra Disks allow you to select your disk size and performance (IOPS and throughput) independently! I recently worked on a design where the customer needed 60,000 IOPS but only a few TB of capacity – the perfect scenario for Ultra Disks. They were actually able to get more performance for less cost compared to using Premium SSDs.
To conclude this section, I want to note two design constraints when selecting disks for your VM.
- The VM SKU is still limited to a certain number of IOPS, throughput, and disk count. The combined performance of your disks cannot exceed the maximum performance of the VM. If the VM SKU supports 10,000 IOPS and you add 3x 60,000 IOPS Ultra Disks, you will be charged for all three of those Ultra Disks at their provisioned performance tiers but will only be able to get 10,000 IOPS out of the VM.
- All of the hardware performance may still be subject to the performance of the access protocol or configuration, more on this in the next section.
Additional Reading on Storage:
- Disk Types: Select a disk type for Azure IaaS VMs – managed disks – Azure Virtual Machines | Microsoft Docs
Software Configuration and Access Protocols:
As we come to the last section of this post, we get to the area that aligns with the purpose of this blog series – shared storage. In this section I’m going to cover some of the most common configurations and access types for shared storage in IaaS. This is by no means an exhaustive list, rather what I find most common.
Scale-Out File Server (SoFS):
First up is Scale-Out File Server. This is a software configuration inside Windows Server that is typically used with SMB shares. SoFS was introduced in Windows Server 2012, uses Windows Failover Clustering, and is considered a “converged” storage deployment. It’s also worth noting that this can run on S2D (Storage Spaces Direct), which is the method I recommend using with modern Windows Server operating systems. Scale-Out File Server is designed to provide scale-out file shares that are continuously available for file-based server application storage. It provides the ability to share the same folder from multiple nodes of the same cluster. It can be deployed in two configuration options: for Application Data or General Purpose. See the additional reading below for the documentation on setup guidance.
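As a rough sketch, once the failover cluster and its storage (e.g. S2D/CSV) exist, the role and a continuously available share are created along these lines (names are placeholders):

```powershell
# Assumes an existing failover cluster with cluster storage already configured.
Add-ClusterScaleOutFileServerRole -Name "SOFS01"

New-Item -ItemType Directory -Path "C:\ClusterStorage\Volume1\Shares\AppData" | Out-Null
New-SmbShare -Name "AppData" -Path "C:\ClusterStorage\Volume1\Shares\AppData" `
    -ContinuouslyAvailable $true -FullAccess "CONTOSO\AppServers"
```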
Additional reading:
- Storage Spaces Direct: Storage Spaces Direct overview | Microsoft Docs
- Scale-Out File Server: Scale-Out File Server for application data overview | Microsoft Docs
- Setup guide for 2-node SSD for RDS UPD: Deploy a two-node Storage Spaces Direct SOFS for UPD storage in Azure | Microsoft Docs
SMB v3:
Now into the access protocols – SMB has been the go-to file services protocol on Windows for quite some time now. In modern Operating Systems, SMB v3.* is an absolutely phenomenal protocol. It allows for incredible performance using things like SMB Direct (RDMA), Increasing MTU, and SMB Multichannel which can use multiple NICs simultaneously for the same file transfer to increase throughput. It also has a list of security mechanisms such as Pre-Auth Integrity, AES Encryption, Request Signing, etc. There is more information on the SMB v3 protocol below, if you’re interested, or still think of SMB in the way we did 20 years ago – check it out. The Microsoft SQL Server team even supports SQL hosting databases on remote SMB v3 shares.
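A few built-in cmdlets make it easy to confirm those SMB 3 features are actually in play on your VMs:

```powershell
# Server side: is multichannel enabled?
Get-SmbServerConfiguration | Select-Object EnableMultiChannel

# Client side: which NICs can SMB use, and do they support RSS/RDMA?
Get-SmbClientNetworkInterface

# Active multichannel connections for current SMB sessions.
Get-SmbMultichannelConnection
```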
Additional reading:
NFS:
NFS has been a similar staple as a file services protocol for a long while too, and whether you’re running Windows or Linux it can be used in your Azure IaaS VM for shared storage. For organizations that prefer an IaaS route over PaaS, I’ve seen many use this as a cornerstone configuration for their Azure deployments. Additionally, a number of HPC (High Performance Computing) workloads, such as Azure CycleCloud (HPC orchestration) or Cromwell on Azure (the popular genomics workflow management system), prefer the use of NFS.
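On a Windows IaaS VM, standing up a basic NFS export is only a few lines (a sketch; share name and path are placeholders, and per-client permissions still need to be granted afterwards):

```powershell
# Install the NFS server role and publish a share.
Install-WindowsFeature FS-NFS-Service -IncludeManagementTools
New-Item -ItemType Directory -Path "D:\nfsdata" -Force | Out-Null
New-NfsShare -Name "nfsdata" -Path "D:\nfsdata"
# Per-client access is then granted with Grant-NfsSharePermission.
```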
Additional Reading:
- Create NFS Ubuntu Linux Server volume – Azure Kubernetes Service | Microsoft Docs
- azure-quickstart-templates/nfs-ha-cluster-ubuntu at master · Azure/azure-quickstart-templates (github.com)
iSCSI:
While I would not recommend the use of custom block storage on top of a VM in Azure if you have a choice, some applications do still have this requirement in which case iSCSI is also an option for shared storage in Azure.
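If you do land here, the Windows iSCSI Target Server role is a reasonable starting point – a sketch, where the target name, LUN path, and initiator IQN are placeholders:

```powershell
# Install the iSCSI Target Server role.
Install-WindowsFeature FS-iSCSITarget-Server -IncludeManagementTools

# Create a target, a VHDX-backed LUN, and map them together.
New-IscsiServerTarget -TargetName "SharedTarget01" -InitiatorIds "IQN:iqn.1991-05.com.microsoft:app01.contoso.local"
New-IscsiVirtualDisk -Path "D:\iSCSIVirtualDisks\lun01.vhdx" -SizeBytes 512GB
Add-IscsiVirtualDiskTargetMapping -TargetName "SharedTarget01" -Path "D:\iSCSIVirtualDisks\lun01.vhdx"
```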
Additional Reading:
That’s it! We’ve reached the end of Part 2. Okay, here we go with the Pros and Cons for using an IaaS Virtual Machine for your shared storage configuration on Azure.
Pros and Cons:
Pros:
- More control, greater flexibility of protocols and configuration.
- Depending on the use case, potentially greater performance at a lower cost (becoming more and more unlikely).
- Ability to migrate workloads as-is and use existing storage configurations.
- Ability to use older, or more “traditional” protocols and configurations.
- Allows for the use of Shared Disks.
Cons:
- Significantly more management overhead as compared to PaaS.
- More complex configurations, and cost calculations compared to PaaS.
- Higher potential for operational failure with the higher number of components.
- Broader attack surface, and more security responsibilities.
Alright, that’s it for Part 2 of this blog series – Shared Storage on IaaS Virtual Machines. Please reach out to me in the comments, on LinkedIn, or Twitter with any questions about this post, the series, or anything else!
- Part 1: Azure Shared Disks
- Part 2: IaaS Storage Server
- Part 3: Azure Storage Services
- Part 4: Azure NetApp Files
- Part 5: Conclusion
Shared Storage Options in Azure: Part 1 – Azure Shared Disks
In an IaaS world, shared storage between virtual machines is a common ask. “What is the best way to configure shared storage?” and “What options do we have for sharing storage between these VMs?” are both questions I’ve answered several times, so let’s go ahead and blog some of the options! The first part in this blog series, “Shared Storage Options in Azure”, will cover Azure Shared Disks.
As I write subsequent posts in this series, I will update this post with the links to each of them.
- Part 1: Azure Shared Disks
- Part 2: IaaS Storage Server
- Part 3: Azure Storage Services
- Part 4: Azure NetApp Files
- Part 5: Conclusion
When shared disks were announced in July of 2020, there was quite a bit of excitement in the community. There are so many applications that still leverage shared storage for things like Windows Server Failover Clustering, on which many applications are built, like SQL Server Failover Cluster Instances. Also, while I highly recommend using a Cloud Witness, many customers migrating workloads to Azure still rely on a shared disk for quorum as well. Additionally, many Linux applications that were previously configured to use a shared virtual disk, or even raw LUN mappings, leverage shared storage for filesystems such as GFS2 or OCFS2.
Additional sample workloads for Azure Shared Disks can be found here: Shared Disk Sample Workloads.
There are a few limitations of shared disks, the list of which is constantly getting smaller. For now, though, let’s just go ahead and jump into it and see how to deploy them. After which, we’ll do a quick “Pros” and “Cons” list before moving on to the other shared storage options. I deployed Shared Disks in my lab using the portal first (screenshots below), but also created a Github Repository (https://github.com/matthansen0/azure-shared-storage-options) with the Azure PowerShell script and an ARM template to deploy a similar environment – feel free to use those if you’d like!
As a prerequisite (not pictured below) I created the following resources:
- A Resource Group in the West US region
- A Virtual Network with a single subnet
- 2x D2s v3, Windows Server 2016 Virtual Machines (VM001, VM002) each with a single OS disk
Now that those are created, I deployed a Managed Disk (named “sharedDisk001”) just like you would if you were deploying a typical data disk.

On the “advanced” tab you will see the ability to configure the managed disk as a “shared disk”, here is where you set the max shares which specifies the maximum number of VMs that can attach that particular disk type.
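If you’d rather script the disk itself, a minimal Az PowerShell sketch looks like this (resource names, size, and region are illustrative – the repository linked above has a fuller version of the deployment):

```powershell
# Premium shared disk that up to two VMs can attach at the same time.
$diskConfig = New-AzDiskConfig -Location "westus" -DiskSizeGB 1024 `
    -SkuName Premium_LRS -CreateOption Empty -MaxSharesCount 2
$disk = New-AzDisk -ResourceGroupName "rg-shared-storage" -DiskName "sharedDisk001" -Disk $diskConfig

# Attach it to the first VM; repeat for VM002 using a free LUN.
$vm = Get-AzVM -ResourceGroupName "rg-shared-storage" -Name "VM001"
$vm = Add-AzVMDataDisk -VM $vm -Name "sharedDisk001" -CreateOption Attach -ManagedDiskId $disk.Id -Lun 0
Update-AzVM -ResourceGroupName "rg-shared-storage" -VM $vm
```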


After the disk is finished deploying, we head over to the first VM and attach an existing disk. You’ll note that the disk shows up as a “shared disk” and shows the number of shares left available on that disk. Since this is the first time it’s being mounted it shows 0.

After attaching the disk to the first VM, we head over and do the same thing on VM002. You’ll note that the number of shares has increased by 1 since we have now mounted the disk on VM001.

Great, now the disk is attached to both VMs! Heading over to the managed disk itself you’ll notice that the overview page looks a bit different from typical managed disks, showing information like “Managed by” and “Max Shares”.

In the properties of the disk, we can see the VM owners of that specific disk, which is exactly what we wanted to see after mounting it on each of the VMs.

Although I setup this configuration using Windows machines, you’ll notice I didn’t go into the OS. This is to say that the process, from an Azure perspective, is the same with Linux as it is with Windows VMs. Of course, it will be different within the OS, but there is nothing Azure-specific from that aspect.

Okay, here we go the Pros and Cons:
Pros:
- Azure Shared Disks allow for the use of what is considered to be “legacy clustering technology” in Azure.
- Can be leveraged by familiar tools such as Windows Failover Cluster Manager, Scale-out File Server, and Linux Pacemaker/Corosync.
- Premium and Ultra Disks are supported so performance shouldn’t be an issue in most cases.
- Supports SCSI Persistent Reservations.
- Fairly simple to setup.
Cons:
- Does not scale well, similar to what would be expected with a SAN mapping.
- Only certain disk types are supported.
- ReadOnly host caching is not available for Premium SSDs with maxShares >1.
- When using Availability Sets and Virtual Machine Scale Sets, storage fault domain alignment with the VMs is not enforced on the shared data disk.
- Azure Backup not yet supported.
- Azure Site Recovery not yet supported.
Alright, that’s it for Azure Shared Disks! Go take a look at my Github Repository and give shared disks a shot!
Please reach out to me in the comments, LinkedIn, or Twitter with any questions or comments about this blog post or this series.
Delete Azure Recovery Vault Backups Immediately
If you’re like many others, over the past few months you’ve noticed that if you configure Azure Backup, you can’t delete the vault until 14 days after you stop backups. This is due to Soft Delete for Azure Backup. It doesn’t cost anything to keep those backups during that time, and it’s honestly a great safeguard against accidentally deleting backups, with the option to “undelete”. Though, in some cases (mostly in lab environments) you just want to clear it out (or, as was affectionately noted by a colleague of mine, “nuke it from orbit”). Let’s walk through how to do that real quick.
When you go and stop backups and delete the data, you’ll get the warning “The restore points for this backup item have been deleted and retained in a soft delete state”, and you have to wait 14 days to fully delete those backups. You’ll also get an email alert letting you know.


To remove these backups immediately we need to disable soft delete, which is a configuration setting in the Recovery Services vault. DO NOT DO THIS UNLESS YOU ABSOLUTELY MUST. As previously noted, this is a great safeguard to have in place, and I would also suggest using ARM resource locks in production environments in addition to soft delete. If you’re sure though, we can go turn it off.

Alright, now that we’ve disabled Soft Delete for the vault, we have to commit the delete operation again. This means first we’ll need to “undelete” the backup, then delete it again which this time won’t be subject to the soft delete policy.
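If you’d rather script the whole thing, the flow with the Az PowerShell module looks roughly like this (vault and resource group names are placeholders, and cmdlet behavior can vary a bit across module versions):

```powershell
$vault = Get-AzRecoveryServicesVault -ResourceGroupName "rg-lab" -Name "lab-vault"

# 1. Disable soft delete on the vault (only do this if you really must).
Set-AzRecoveryServicesVaultProperty -VaultId $vault.ID -SoftDeleteFeatureState Disable

# 2. Undelete the soft-deleted item, then delete it again along with its recovery points.
$item = Get-AzRecoveryServicesBackupItem -BackupManagementType AzureVM -WorkloadType AzureVM -VaultId $vault.ID |
    Where-Object { $_.DeleteState -eq "ToBeDeleted" }
Undo-AzRecoveryServicesBackupItemDeletion -Item $item -VaultId $vault.ID
Disable-AzRecoveryServicesBackupProtection -Item $item -RemoveRecoveryPoints -VaultId $vault.ID -Force
```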

Now we can go delete it again, after which we can find that there are no backup items in the vault.


Success!!! The backup is fully deleted. So long as there are no other dependencies (policies, infrastructure, etc.) you can now delete the vault.
If you have any questions or suggestions for future blog posts feel free to comment below, or reach out to me via email, twitter, or LinkedIn.
Thanks!
Configure Azure Blob Archive Storage
Azure storage is great. Good thought to open on, right? Of course! This year Azure graced us with the ability to preview the new Azure Archive Storage. Obviously this is enticing, especially at its (current) $0.0018/GB price point. For more cost information on Azure Archive Storage you can visit the link below.
https://azure.microsoft.com/en-us/pricing/details/storage/blobs/
Now this is nice, but I found myself a bit perplexed. How do I configure a storage account as an “archive” storage account? As it turns out, you don’t. Let’s walk through configuring the Archive blob tier.
First, obviously you need a storage account. The Archive access tier is currently available on either “Blob” or “General Purpose v2” accounts. General Purpose v2 will work the same way; you’ll just also have the ability to host non-blob storage (File, Queue, Table). I’m going to choose Blob for this purpose.
Account kind selected, I’ll create the storage account. You can choose whatever access tier you’d like – that’s the access tier all of your objects will inherit by default. I chose “Cool” here because you have to upload data before you can archive it, and the Cool tier saves money initially.
Alright storage account created, let’s go open it up.
If you go to the “Configuration” tab you can see the default access tier you selected during creation. Here is where I was a bit confused: why don’t I have the ability to select Archive? You’ll see in a bit.
Go ahead and create a container, and upload a file. I created a container with the very complex name of “container1”, and have uploaded my very important image file that I want to archive.
You can see above that the inherited access tier is “Cool”, which was set at the storage account level. If you go into the blob properties you can see at the bottom there is an option to select the access tier for that specific file. Ah! There it is, Archive!
I’ll go ahead and select Archive, and see the following message.
Please be cognizant of this – they aren’t kidding when they say that rehydration can take a long time. We can now refresh and see that the file is set to the “Archive” access tier.
Fantastic, we’ve archived the file! Now here is where you have to be careful: while the file is in the Archive tier, the only data you’re able to access is the file metadata. The file itself is NOT ACCESSIBLE until rehydrated. If you try to download the file while it’s archived you will see the following message.
Archive storage is designed to be very long-term storage that you don’t need to access immediately, hence the low price point. If you do need to access your file, you simply go back to that object and change its access tier to either Cool or Hot. It will then go through the “rehydration” process to move the file back into an accessible access tier.
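If you prefer to script the tier change, one pattern that has worked with older Az.Storage module versions is below – treat it as a sketch (account, container, and blob names are placeholders, and newer module versions expose the same operation through the BlobClient instead):

```powershell
$ctx  = (Get-AzStorageAccount -ResourceGroupName "rg-archive" -Name "mystorageacct").Context
$blob = Get-AzStorageBlob -Container "container1" -Blob "important-image.png" -Context $ctx

# Send the blob to the Archive tier...
$blob.ICloudBlob.SetStandardBlobTier("Archive")
# ...and later kick off rehydration back to Cool (this can take hours).
$blob.ICloudBlob.SetStandardBlobTier("Cool")
```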
I urge you to take that message seriously; in this example it took about 8 hours for my 48 KB image file to be rehydrated. They say it takes longer for larger files, and I’m going to test that next. In the meantime, assume it will take quite some time for the data to be accessible again. After which time, WHEW! I recovered my very, very important file.
There you go, how to configure Azure Blob Archive Storage.
I hope I’ve made your day at least a little bit easier.
DFSR Failure After VM Restore (DFSR Error 2104)
I have an environment that heavily leverages DFS, and recently one of the replication member servers had to be restored from a VEEAM backup. Typically VEEAM is great and doesn’t cause any issues, but in this case DFS completely broke. I got a ton of SCOM alerts, and the event log was littered with them as well.
The DFS Replication service failed to recover from an internal database error on volume D:. Replication has been stopped for all replicated folders on this volume. Additional Information: Error: 9214 (Internal database error (-1605)) Volume: D: xxxxxx Database: D:\System Volume Information\DFSR
Event 2212, DFSR
The DFS Replication service has detected an unexpected shutdown on volume D:. This can occur if the service terminated abnormally (due to a power loss, for example) or an error occurred on the volume. The service has automatically initiated a recovery process. The service will rebuild the database if it determines it cannot reliably recover. No user action is required.
Additional Information:
Volume: D:
GUID: xxxxxxxxxxxxxxx
Error 2104, DFSR
The DFS Replication service failed to recover from an internal database error on volume D:. Replication has been stopped for all replicated folders on this volume.
Additional Information:
Error: 9214 (Internal database error (-1605))
Volume: xxxxxxxxxxxxxxxxxxxxxxxx
Database: D:\System Volume Information\DFSR
The important error here is 2104, noting the database issue. There are multiple topics out there that talk about this, but they all end up linking back to this support article.
In the end, essentially the database that is used by DFS replication becomes corrupted. It is a system-generated database so all you need to do is disable the replication service, delete the database, and start the replication service back up. Easy? No. There are a myriad of issues with doing this, mostly because the database is hosted in “System Volume Information” on the volume that hosts the DFS Root folder, or wherever you’ve placed the replication targets. Luckily for you, I hit my head against a wall for hours on end and figured out the solution.
Step 1: Stop DFSR service (stop-service DFSR)
Step 2: Grant yourself visibility to the “System Volume Information” folder. This entails flipping the radio button in explorer to “view hidden files”, as well as unchecking the box for “hide all system protected folders”.
Step 3: Grant yourself proper permissions to the “System Volume Information” folder. Go to the root of the volume that holds the replication targets, e.g. D:\. You will now see a grayed-out folder with a lock on it called “System Volume Information”. Go through the normal rigmarole to grant “Administrators” full control over the folder. You should then be able to open it up; before, it would have said “Access Denied”.
Step 4: Delete or rename the “DFSR” folder inside “System Volume Information”. Unfortunately, that’s not easy. Based on what I saw, it was because the file names in the database folder exceeded the limitations of explorer ( https://hansencloud.com/2014/04/22/varying-file-name-too-long-issues ). Here the easiest thing to use is the wonderful Robocopy /MIR! Create an empty folder in the root of the drive and copy it into the DFSR folder using the /mir flag in robocopy. This will “mirror” the source folder into the destination folder.
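In my case that boiled down to something like this (the drive letter and folder names are from my environment – adjust as needed):

```powershell
# The DFSR service is already stopped at this point (Step 1).
New-Item -ItemType Directory -Path "D:\EmptyFolder" | Out-Null
# Mirror an empty folder over the DFSR database folder to empty it out.
robocopy "D:\EmptyFolder" "D:\System Volume Information\DFSR" /MIR
```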
Now the DFSR folder should be completely empty.
Step 5: Start the DFS Replication service (start-service DFSR)
Step 6: Check the event log for events confirming the recovery.
Event 4102, DFSR
The DFS Replication service initialized the replicated folder at local path D:\xxxxxx and is waiting to perform initial replication. The replicated folder will remain in this state until it has received replicated data, directly or indirectly, from the designated primary member.
Additional Information:
Replicated Folder Name: XXXXXXX
Replicated Folder ID: XXXXXXXXXXXXXXXXXXXX
Replication Group Name: XXXXX\XXXX
Replication Group ID: XXXXXXXXXX
Member ID: XXXXXXXXXXXXX
Event 4412, DFSR
The DFS Replication service detected that a file was changed on multiple servers. A conflict resolution algorithm was used to determine the winning file. The losing file was moved to the Conflict and Deleted folder.
Additional Information:
Original File Path: D:\XXXXXXX
New Name in Conflict Folder: XXXXXXXXXXX
Replicated Folder Root: D:\XXXXXXXX
File ID: XXXXXXXXXXXXXXXX
Replicated Folder Name: XXXXXXXXXXXX
Replicated Folder ID: XXXXXXXXXXXXXXX
Replication Group Name: XXXXXXXXXXXXXX
Replication Group ID: XXXXXXXXXXXXXXXXX
Member ID: XXXXXXXXXXXXXXXXXXXX
There you go! You’ve done it! Microsoft said you had to contact their support to fix it, but you crafty devil – you’ve gone and done it yourself.
I hope I’ve made your day at least a little bit easier.
Changing Azure Recovery Services Vault to LRS Storage
Back in the classic portal with Backup services it was an easy fix: simply change the settings value for the storage replication type. I’ve recently started moving my workloads to Recovery Services vaults in ARM, and noticed something peculiar. By default, the storage replication type of the vault is GRS.
If your needs require geographically redundant storage, then that’s perfectly fine. I, however, don’t have such needs, and trust in Microsoft’s ability to keep data generally available in an LRS replication topology. It should be just like it was in classic – available as an option, right? Strangely, the option to change the replication type in the storage configuration on the vault is grayed out.
Odd, right? I thought so, until I found this.
Okay, well it’s not optimal, but it looks like I need to remove the backup data from the vault to change the storage replication type, right? Well, I gave that a shot – no go. I had the same issue; the option was still grayed out.
I ultimately had to completely delete the vault and create a new Recovery Services vault. Once it’s initially created, you can change the replication type.
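For what it’s worth, on a freshly created vault the same change can be made with PowerShell before any items are registered (vault names are placeholders; older AzureRM module versions use equivalent cmdlets):

```powershell
$vault = Get-AzRecoveryServicesVault -ResourceGroupName "rg-backup" -Name "new-vault"
Set-AzRecoveryServicesBackupProperty -Vault $vault -BackupStorageRedundancy LocallyRedundant
Get-AzRecoveryServicesBackupProperty -Vault $vault   # confirm the redundancy setting
```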
Ah, finally! Then register the VM(s), run some backup jobs and voila! Confirmation that the vault is using LRS storage.
I hope this makes your day at least a little bit easier.
Thanks,
Disk Performance on Server 2012 Task Manager
Something that I’ve noticed not a lot of people know is how to get disk performance to show up in a server’s Task Manager. Yes, yes, I know – with the new Microsoft model you shouldn’t be RDP’ing to servers anyways, but I still get asked how to do this. For some reason it’s there by default on Windows 8, but not enabled on servers. So here’s the one command you need to fix that.
1) The problem, no disk portion on task manager. 🙁
2) Run the following command to enable it.
“DiskPerf -Y”
This enables the physical and logical disk performance counters.
3) Close and re-open task manager.
There ya go!
I hope I’ve made your day at least a little bit easier.