Performance Testing with Azure Firewall Basic SKU

Reading Time: 4 minutes


Azure Firewall is a cloud-native, stateful, managed firewall service with built-in high availability, impressive automated scaling and throughput capabilities, and many security features. In some cases, though, the price point can be a bit high to get started if it’s only being used for basic L3/L4 rules, routing, and NAT. For this reason, the “Basic” Azure Firewall SKU was released last year.

There are two costs associated with Azure Firewall:

  • Firewall base cost (per unit)
  • Data Processed

The base cost per firewall unit is as follows (in East US, as of September 2023):

  • Premium: $1,277/mo
  • Standard: $912/mo
  • Basic: $288/mo

As for the data processing costs, they are as follows:

  • Premium: $0.016/GB ($16/TB)
  • Standard: $0.016/GB ($16/TB)
  • Basic: $0.065/GB ($65/TB)

You will note that the lower base cost comes with a higher processing charge, so don’t get too excited thinking you can just swap production to a Basic SKU and save money. Make sure to do your homework.
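To make that homework concrete, here is a minimal sketch of the break-even calculation using the September 2023 East US prices listed above. The function names are my own, and it assumes a single firewall unit with no other charges, so treat it as a rough guide rather than a pricing tool.

```python
# Prices copied from the lists above (East US, September 2023).
BASE_COST = {"Basic": 288.0, "Standard": 912.0, "Premium": 1277.0}  # USD per month
PER_GB = {"Basic": 0.065, "Standard": 0.016, "Premium": 0.016}      # USD per GB processed


def monthly_cost(sku: str, gb_processed: float) -> float:
    """Base cost plus data processing charge for one firewall unit."""
    return BASE_COST[sku] + PER_GB[sku] * gb_processed


def break_even_gb(cheap_sku: str, other_sku: str) -> float:
    """GB processed per month at which both SKUs cost the same."""
    base_gap = BASE_COST[other_sku] - BASE_COST[cheap_sku]
    per_gb_gap = PER_GB[cheap_sku] - PER_GB[other_sku]
    return base_gap / per_gb_gap


if __name__ == "__main__":
    gb = break_even_gb("Basic", "Standard")
    print(f"Basic vs Standard break-even: {gb:,.0f} GB/month (~{gb / 1000:.1f} TB)")
```

With these prices, Basic stops being cheaper than Standard at roughly 12.7 TB processed per month.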

The features of these SKUs, and how they compare, are well documented in the Azure Firewall feature comparison between SKUs.

A more in-depth list of the features and capabilities of the Firewall Basic SKU is available in the documentation.

You will notice that the documentation says the Firewall Basic SKU is “recommended for environments with an estimated throughput of 250 Mbps”. This got me thinking, though: how much throughput can it actually handle? Understandably, this SKU is primarily designed for lower-level or lab environments, but what if I need to do a data load or something similar? Is it really limited to 250 Mbps?

Lab Testing:

To test this theory, I built a little lab environment:

  • Hub network with an Azure Firewall Basic SKU
  • Two spoke networks with a Route Table that points all traffic through the Hub Firewall
  • A Windows Client OS in the “client” vnet
  • A Linux server in the “server-vnet” network
  • Both VMs are configured as DS4_v4 so they have plenty of CPU and are capable of up to 10Gbps so we can be sure they won’t be a bottleneck in the testing.

Now that I have the lab environment built, I wanted to test the throughput of the firewall in two ways:

  • SNAT the traffic through the Firewall to the Internet
  • Route the traffic between spokes

Internet Speed Test:

For the first test, I chose to simply use everyone’s favorite at-home speed test. To verify that we’re using the firewall as the path to the internet, I also took a screenshot of the public IP associated with it to compare to the speed test.

You will note that the IP address in both the pip and on the speed test are the same, and also that it looks like we’re getting 1Gbps through the Firewall Basic SKU to the internet.

Intranet Speed Test:

The second test is between spoke networks, routed through the hub. For this I configured my Linux server with the OpenSpeedTest Docker container; you will notice the private IP in the address bar. This is a very cool project that I also run at home for quick diagnostics and load testing.

In this test, the client OS is traversing the peer to the hub network, then being routed by the Azure Firewall across the peer to the server network; with this test I got roughly 1Gbps as well, with <1 ms latency.



While payloads like these are surely not what I would consider a thorough load test, these are the types of results I was looking for in this quick test. If we were testing something for production, we would want to include things like max connections, speed per connection, total bandwidth, scaling metrics, etc.

The throughput available in Azure Firewall is impressive, with the Standard SKU capable of up to 30 Gbps and the Premium up to 100 Gbps (see the product documentation for more information). The Firewall Basic SKU still holds its own, though, showing that it’s capable of 1 Gbps throughput, which for many lower-level or lab environments is more than enough. While I didn’t create an auto-deploy ARM template for this lab environment, let me know if it would be useful and I’ll put something together on GitHub.

If you have any questions, comments, or suggestions for future blog posts please feel free to comment below or reach out on LinkedIn or Twitter. I hope I’ve made your day a little bit easier!




Azure NVIDIA VM for PyTorch and TensorFlow in an MSDN Subscription

Reading Time: 9 minutes


Unless you’ve been living under a rock, you know that Machine Learning (ML) and Artificial Intelligence (AI) are all the rage right now, and will be going forward in technology. In most cases I will first recommend that customers use offerings such as Azure Cognitive Services, Azure OpenAI, Azure OpenAI (Use Your Data), or Azure Machine Learning. In some cases, though, customers want to roll their own ML/AI, or simply want to work with some Open-Source projects which require deployment on a VM with the CUDA toolkit. If you need to do this, I recommend looking at the GPU optimized VMs on Azure (GPU Optimized Virtual Machine Sizes) and choosing a modern VM SKU with a modern GPU.

That said, many people have MSDN entitlements that they often don’t use. These are designated for personal sandboxes and learning, and come with $50, $100, or $150 per month in Azure credits. The purpose of this blog is to show you how you can deploy a GPU VM in an MSDN subscription for less than $0.50 per hour (an entire day of lab work for less than $5 of your credits). Let’s get started!

VM Deployment:

To kick things off, we will need to deploy a VM. There are many “gotchas” to deploying a GPU VM in an MSDN subscription (mainly because the SKUs are typically restricted and reserved for paying customer subscriptions).

We are going to use the following configuration:

  • Ubuntu 20.04 VM
  • East US Region
  • No infrastructure Redundancy
  • Standard Security Type
  • Configure the VM Generation for Generation 1 (you will need to change this from the Generation 2 Default)
  • Standard_NC6_Promo SKU

You can go look at the other sizes offered, but as noted, if you’re in an MSDN subscription all other sizes will likely say they are not available. This NC6_Promo size is set to be retired at the end of August 2023, so I will update this blog post after that time. You will note that the VM size doesn’t support Premium SSDs so on the “Disks” tab of deploying the VM it will select Standard SSD by default. You can change this to Standard HDD but I would not recommend doing so for this type of work.

Since this is just a lab environment, I am going to use Azure Network Security Groups (NSGs) to control my SSH access to the VM rather than something like Azure Bastion. I will configure the NSG after the VM deployment is complete so for now I’m going to say create a Public IP, but do not allow any ports.

Next, just in case I get sidetracked with other work I want to make sure this VM shuts down automatically and doesn’t keep burning credits, so I’m going to set the automated shutdown for 7PM.


After all of the validation has passed you can create the VM. You will note here that the compute cost is less than $0.40 per hour.


After the VM is done deploying, you can go in and under “Networking” add an Inbound Port rule for the SSH service using “My IP Address” as the source, which will use your current Public IP Address.


After the addition of the NSG rule finishes, you can SSH into the VM.

Note: By default the Ubuntu 20.04 image is provisioned with 30GB of OS drive space. The drivers and packages we’re downloading are substantial and you may run out of space. If you try to install both PyTorch and TensorFlow you will fill up the VM’s usable space, so at this point you may want to resize the disk. To do that, stop the VM, change the disk size, and start it again. The Ubuntu image in Azure uses a cloud-init package that will automatically resize the / partition, so you don’t need to do anything in the OS.
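If you want to keep an eye on free space before and after each large install, a small check like the following can help. This is only a sketch using Python’s standard library; the 20 GiB threshold is a hypothetical number I picked for illustration, not a measured requirement.

```python
import shutil


def free_gib(path: str = "/") -> float:
    """Return free space on the filesystem that holds `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    needed_gib = 20  # hypothetical headroom; adjust to what you plan to install
    available = free_gib("/")
    print(f"{available:.1f} GiB free on /")
    if available < needed_gib:
        print("Consider resizing the OS disk before installing the frameworks.")
```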

Prerequisite Installation:

Now that the VM is deployed and you’re SSH’d into it, you can look at the hardware of the VM to verify the GPU, and you will see there is a Tesla K80 in the NC6 VM.

While I don’t show it below, I always recommend running “sudo apt update; sudo apt upgrade” on each Linux VM to make sure everything is up-to-date before you begin. After that, go ahead and install gcc and make as shown below, which are required by the NVIDIA driver installer.

Note: both of the following are very large run files and can take a while to install, anywhere between 5-10 minutes each. 

Next, as noted in the documentation, the latest supported NVIDIA driver for the K80 card is 470.82.01. You can get the link to that download from the driver download page and wget the file to the VM.

Once you’ve downloaded the .run file, you will sudo bash NVIDIA*.run to execute the file with bash (you can execute it with sh if you want as well). The installer will go through a few screens in the terminal, and since it’s an older driver version there are a couple of warnings, but nothing that stops us from doing what we need. Feel free to take note of them for your own self documentation though.

After the installation is complete, you can run nvidia-smi, which comes with the driver installation and shows you the NVIDIA GPU information like you would expect. If it looks like it does below, the driver install completed successfully.

Now that the NVIDIA drivers are installed, we need the CUDA toolkit so that later on we can leverage CUDA for any AI/ML workloads. To do that, you can go to the CUDA Toolkit download page to get the run file for the VM architecture and OS that we’re using. In the screenshot below I will run the commands the webpage prompts you to run.

As the installer runs it will prompt you with another screen; you can leave it with the default selections or install all of the categories if you’d like.

Now both the NVIDIA drivers and the CUDA Toolkit are installed, so the prerequisites are complete. After driver installations I always like to reboot, so I do at this point, but it shouldn’t be required.

Environment Installation:

Two of the most common frameworks in the Open-Source Community for this type of work are PyTorch and TensorFlow. While there are certainly others, these are the ones that I wanted to test since most of the projects on GitHub seem to use one or the other. Let’s get them both installed and validate that they can leverage the GPU. The notes made below are a combination of information gathered from many other blogs, YouTube videos, and my own research, but this is what works for the particular setup in this environment.

Anaconda Installation:

While there are other ways, I am going to be using Anaconda as the installation vehicle for both PyTorch and TensorFlow. To install Anaconda, you will go to their website to get the link for the installer, and run it with bash, similar to how we did with the other run files.

The installer documentation on their website says to accept all of the defaults, but I recommend letting the installer do the initialization at the end of the install so you don’t have to do it yourself.

Once the install is complete, you will need to reboot, and then you can elevate your privileges to see the prefix to your shell location noting the conda environment.


PyTorch Installation:

Note: If you don’t need PyTorch you can skip this part.

With Anaconda installed, we’ll use it to install PyTorch. From this point on, these installers print out much more than I can feasibly capture in screenshots, so note that you will see additional content in your terminal.

Next, we want to verify the version of Python that’s running in our environment.

Noting Python 3.10, we can move on to creating a new Anaconda environment that uses the packages we just installed.

After we create that environment we can activate it, and see that it switches from base to pytorch.

At this point we have PyTorch at our disposal, so let’s test it out by importing it, running a quick torch.rand function to test the package, and then run the torch.cuda.is_available function to validate that the CUDA toolkit and associated GPU is available for use.
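The same check can be saved as a small script so you can re-run it after reboots or driver reinstalls. This is a sketch of the steps described above; the function name is my own, and it falls back gracefully if PyTorch is not installed in the active environment.

```python
import importlib.util


def check_torch_gpu() -> str:
    """Report whether PyTorch is importable and whether it can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    print(torch.rand(2, 3))  # quick sanity check that tensor creation works
    if torch.cuda.is_available():
        return f"cuda available: {torch.cuda.get_device_name(0)}"
    return "cuda not available"


if __name__ == "__main__":
    print(check_torch_gpu())
```

Run it from inside the activated pytorch environment; from the base environment it will most likely report that torch is not installed.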

Note: If that function returns that it’s not available, run nvidia-smi again to validate that the driver didn’t fail. If it did, you can reboot, and in some cases I found I needed to re-run the NVIDIA driver install file for some reason. Linux and GPU drivers are a tricky thing sometimes.

Wonderful! If you see the function is available, then everything we’ve done to this point was successful. Now, if the work you’re going to be doing or the project you’re running only requires PyTorch, you can skip the next part and go straight to the conclusion at the end of the blog.

TensorFlow Installation:

If you need TensorFlow, let’s get that installed here. First we will need python3-pip; after that’s installed you will need to pip install tensorflow. If all goes well, you will see that it has been successfully installed.

If you haven’t rebooted after the Anaconda installation earlier, go ahead and do that now. After rebooting, you can elevate your privileges to see the prefix to your shell location noting the conda environment. We’ll run a quick test to verify the python version.

Now that we know we’re running Python 3.10, we can use Anaconda to create an environment for TensorFlow.

Similar to how we did it with the PyTorch environment, you will now activate the environment by running conda activate tf and you will see the active environment notation switch in your terminal. After you’re in the TensorFlow environment, you can install the nvidia-cudnn-cu11 package which we will need to interact with the CUDA toolkit.

Now this gets a bit messy, because we’ll need to run all of the following commands to set up the variables and other path and environment information for TensorFlow.

CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))

mkdir -p $CONDA_PREFIX/etc/conda/activate.d

echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/

echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib' >> $CONDA_PREFIX/etc/conda/activate.d/

Now that those are all set, we’ll pip install TensorFlow into the environment.

Once that’s done there is only one more step, which is to run the command below to validate TensorFlow’s ability to communicate with the GPU and the CUDA Toolkit.

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

I like to first run nvidia-smi to verify I can still communicate with the GPU through the driver like normal. If it fails, you can reboot, and in some cases I found I needed to re-run the NVIDIA driver install file for some reason. Linux and GPU drivers are a tricky thing sometimes.

You can see above I rebooted, activated my conda environment, and ran that command. The end line result shows that TensorFlow has access to the Physical GPU device, which means that everything we’ve done up to this point has all worked!



The first thing I will say is that, as I noted, Linux and GPU drivers don’t have a stellar reputation for working the first time, or every time. It may take a bit of patience, but I’ve run through this somewhere between 5-10 times now and the instructions I’ve captured here seem to work pretty well. If you find something else notable, please leave a comment here or on any of the social posts.

In the end, if you are fast at copy & paste, you can set up this environment end-to-end in about 30 minutes. Your first time through will likely take you 1-3 hours, though, depending on your previous experience. Having a GPU VM for testing, learning, or playing with GitHub AI/ML projects at such a low cost is great to have in your back pocket.

If you have any questions, comments, or suggestions for future blog posts please feel free to comment below, or reach out on LinkedIn or Twitter. I hope I’ve made your day a little bit easier!




Migrating Data into Azure – Online vs Offline

Reading Time: 7 minutes

I frequently work with organizations who are migrating data from an on-premises datacenter into Azure. Undoubtedly the question will come up, “should we use an Azure Data Box to ship that much data?”, and most of the time the room echoes with a resounding “Yes!”.

I’ve been working in Azure for many years and have seen a lot of data migrations, and while Data Box is a wonderful service and is yet another way that Microsoft enables and empowers customers to do what’s best for them, it’s just that: an option for what might be the best fit. I write the same thing almost every month to various people, and figured it was time to post it to use as a reference.

Note: My thoughts here are in no way intended to conflict with the official product documentation; they are rather a more experience-based thought experiment to accelerate time-to-value in regard to data migration. At the bottom of this post there are links to two great official pieces of documentation that are more technically focused, so please give those a read as well.

Most of the time when we think about uploading or downloading data to and from the internet, we think in terms of gigabytes – typically single-digit gigabytes at that. Even what the home-ISPs like to reference as the bandwidth heavy services like movie streaming, will typically use less than 7GB/hr. With that in mind, when we think of the amount of data that is used in an enterprise, we’re typically talking Terabytes or even for those very large organizations – Petabytes. When we talk about migrating that amount of data to a different physical location (for example, Azure) it seems outlandish to think about moving it online – or is it?

Azure Data Box

If you haven’t taken a look at the Azure Data Box Family of offerings, I highly suggest it. There are 4 different offerings of Data Box:

  • Data Box Disk: 8 TB SSD for offline transfer
  • Data Box: 100 TB appliance for offline transfer
  • Data Box Heavy: 1 PB appliance for offline transfer
  • Data Box Gateway: A VM appliance storage gateway used for managed online data transfer.

These devices ship to your location for a nominal fee, you load up the data, then ship it back, and Microsoft loads the data into the destination you choose. The idea is that up to a 40Gbps network connection on your local network is going to be much faster than it would be to send this data over the Internet, VPN, or ExpressRoute connection and is a great option.


Offline Transfer Considerations

I challenge everyone to think through this process though when considering an offline migration. Specifically, we need to think about how long it will take to get the process approved (among other factors) to move your company’s data using a shipping carrier. I’ve worked with organizations where the policy for this type of process requires a private courier, active GPS, and someone following the truck along the entire route (I’ve even seen requirements for armed guards or police escort), among many other requirements from various departments within the organization.

Let’s look at the most common components of this process that might influence the timeline of your data migration.

  • Privacy & Legal Team Approvals: Depending on the data, privacy and legal may need to be involved to inspect the process for data device handling, determine who has visibility into the data, how it is destroyed upon completion of the ingestion, and potentially even determine insurance implications.
  • Security Approval: From a technical controls perspective they will want to make sure proper encryption is used at the data level and hardware level, determine who controls the keys for encryption, ensure device attestation, and even certify these devices to be plugged into the datacenter based on the controls in place for certain hardware vendors.
  • Ordering & Shipping: The process of receiving your Data Box takes up to 10 business days, depending on availability and other factors.
  • Loading the Data: There are two points that are important here. The first is how fast the data can be retrieved (e.g., is the data passed through a source that only has a 1 Gbps link, are there disk throughput limitations, do you need to limit the transfer rate to not impact other workloads, etc.). The second is write throughput on the Data Box itself. While there is ample network connectivity with each device, the larger devices are designed for capacity rather than performance. Throughput is good, but they are not designed for high I/O, which matters for datasets with smaller file sizes.
  • Shipping to Microsoft: Standard shipping time applies to shipping the device back to Microsoft, typically a few days.
  • Microsoft transferring the data: After the device is received it is inspected for damage, then setup to copy the data to the destination you selected when you requested the Data Box – this could be a few hours to a few days depending on availability, data size, I/O size, and both the type of Data Box itself and the target storage location.

(Time to Legal Approval) + (Time to Privacy Approval) + (Time to Security Approval) + (Ordering & Shipping Time) + (Time to Load the Data) + (Shipping Time) + (Time to Unload the Data)

When thinking about these lead times it’s important to be honest with yourself. How long after you send the email, or meeting invite, will it take to get full approval from Legal, Security, and Privacy? In most cases, this is typically a few weeks and depends on the organizational processes and sensitivity of the data, sometimes up to a few months.

For example, let’s say it takes 1 month for full approval to ship the data, which is certainly a reasonable timeframe. Let’s also assume it takes 2 days to get the Data Box hooked up in the datacenter, and that you’re copying 50 TB at 5 Gbps over the LAN. With a generalized timeline, this operation would roughly look like the following:

Example: 50 TB, 5 Gbps LAN Offline Transfer with Data Box

1 Month for approval + 8 Days for shipping + 2 Days for setup + 2 Days for data copy (~26 hr. for actual data movement) + 2 days to prep for shipping + 3 days for shipping + 1 day for receiving + 1 day for copying data (likely less)

30 days + 8 days + 2 days + 2 days + 2 days + 3 days + 1 day + 1 day = ~49 days
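The same addition can be written as a small sketch, which makes it easy to swap in your own estimates. The phase names are my own labels for the terms in the example above, and the values are the example’s generalized numbers, not measurements.

```python
# Phase durations in days, taken from the 50 TB example above.
offline_phases = {
    "approval": 30,
    "shipping to you": 8,
    "setup": 2,
    "data copy": 2,
    "prep for shipping": 2,
    "shipping to Microsoft": 3,
    "receiving": 1,
    "copy into Azure": 1,
}


def total_days(phases: dict) -> int:
    """Sum every phase of the offline timeline."""
    return sum(phases.values())


def non_copy_share(phases: dict, copy_keys=("data copy", "copy into Azure")) -> float:
    """Fraction of the timeline that is not spent actually moving data."""
    copy_days = sum(phases[key] for key in copy_keys)
    total = total_days(phases)
    return (total - copy_days) / total


if __name__ == "__main__":
    print(f"Total: {total_days(offline_phases)} days")
    print(f"Not moving data: {non_copy_share(offline_phases):.0%} of the timeline")
```

In this example only 3 of the 49 days are spent copying data; the rest is approvals, shipping, and handling.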

Now let’s assume that same data was copied “online” (Internet, ExpressRoute, VPN, etc.) at even just 100 Mbps averaged across the day. In most cases organizations would be able to leverage more bandwidth than this, but it makes for easy calculations. If you copied 50 TB online, at 100Mbps, it would take ~53.5 days. In this scenario the time to copy the data online vs offline is very close, and without any of the fuss of approvals and shipping. If you assume you can use 125 Mbps of bandwidth you’re looking at ~42.5 days which is even faster than the offline mode.
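If you don’t have a transfer calculator handy, the raw math is simple enough to sketch. Note the assumptions: this version uses decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and an `efficiency` parameter of my own invention to model protocol overhead. With no overhead it gives a lower bound, so it will come out somewhat below the calculator figures quoted above, which presumably include overhead and different unit conventions.

```python
def transfer_days(terabytes: float, mbps: float, efficiency: float = 1.0) -> float:
    """Days to move `terabytes` (decimal TB) at a sustained `mbps` (decimal megabits/s).

    `efficiency` is the fraction of the link actually carrying payload
    (1.0 = no protocol overhead); it is an assumption, not a measured value.
    """
    bits = terabytes * 1e12 * 8
    seconds = bits / (mbps * 1e6 * efficiency)
    return seconds / 86_400


if __name__ == "__main__":
    for speed in (100, 125):
        print(f"50 TB at {speed} Mbps: {transfer_days(50, speed):.1f} days at raw line rate")
```

Lowering `efficiency` (for example to 0.9) is a quick way to see how sensitive the estimate is to real-world overhead.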

At this point I’m sure there are a few people saying “yes, but what if I had a LOT of data, say 1 PB!”. I’ve done many multi-PB data migrations to Azure and have seen them go both online and offline, let’s do the calculation and see how it looks. While it may not be the case for everyone, in my experience with the increase of the dataset size comes longer approval lead times for various reasons. Additionally, these types of organizations typically also have more bandwidth capacity – again, these are generalized numbers, but in my personal experience they are realistic.

NOTE: Data Box Heavy requires a QSFP+ compatible cable, which I find is not common in most datacenters. Make sure you have one on hand prior to receiving the device.

For this calculation let’s assume 2 PB of data that can be copied on the LAN at 10 Gbps. Keep in mind that if there was actually 2 PB of data you’d need 3 Data Box Heavy devices, because you get 770 TB of usable space after overhead per device. Take note, though, that I’m not taking the multiple devices into account in the calculation, which would realistically extend the timeline.
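The device count is just a ceiling division, sketched below. The 770 TB usable figure comes from the paragraph above; treating 2 PB as 2,000 TB is my simplification.

```python
import math

USABLE_TB_PER_HEAVY = 770  # usable capacity per Data Box Heavy, as stated above


def devices_needed(dataset_tb: float, usable_tb: float = USABLE_TB_PER_HEAVY) -> int:
    """Number of devices required to hold the whole dataset."""
    return math.ceil(dataset_tb / usable_tb)


if __name__ == "__main__":
    print(devices_needed(2000))  # 2 PB expressed as 2,000 TB
```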

Example: 2 PB, 10 Gbps LAN Offline Transfer with Data Box Heavy

2.5 Months for approval + 8 Days for shipping + 2 Days for setup + 22 Days for data copy + 2 days to prep for shipping + 3 days for shipping + 1 day for receiving + 4 days for copying data

75 days + 8 days + 2 days + 22 days + 2 days + 3 days + 1 day + 4 days = ~117 days (~3.9 months)

Like I said earlier, typically if an organization has this much data they have much more bandwidth – 2 Gbps for this operation would not be unreasonable to assume as a generalization. Given 2 Gbps bandwidth, it would take ~107 days to copy this data online compared to ~117 days copying it offline.

However, I will say that I’ve been in situations where an organization had other limitations, such as the total available capacity on a firewall or edge router, and would have had to upgrade at significant expense to handle an extra 2 Gbps, so they could only do something like 250 Mbps. At that speed it would take 874 days to copy the data. With that much data at that speed it certainly does not make sense to move the data online, and using a Data Box to copy the data offline would be much more efficient.

NOTE: Data Box will not ship across international borders (except between countries within the European Union). Please see the FAQ reference link if that is a requirement for your data transfer.

Online Transfer

If you are going to copy the data online, there are various ways to accomplish this task. In general, I see AzCopy, Azure Data Factory, Azure Data Box Gateway, or depending on the target storage location any number of other tools used for online data movement.

There are some considerations when choosing your tooling such as cost (of the tool only, ingress bandwidth to Azure is free), performance, manageability and whether there is data churn that needs to continuously be uploaded after the initial import.  Keep in mind that you can also control your bandwidth with online copies and for example use less bandwidth during business hours and more at night, and some of these tools will help facilitate that for you.
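To get a feel for what a day/night bandwidth schedule does to the timeline, here is a rough sketch. The schedule values are hypothetical numbers I chose for illustration, the units are decimal, and protocol overhead is ignored, so treat the result as a lower bound.

```python
def average_mbps(schedule) -> float:
    """Average rate for a list of (hours_per_day, mbps) windows covering 24 hours."""
    total_hours = sum(hours for hours, _ in schedule)
    if abs(total_hours - 24) > 1e-9:
        raise ValueError("schedule windows must add up to 24 hours")
    return sum(hours * mbps for hours, mbps in schedule) / 24


def days_to_transfer(terabytes: float, schedule) -> float:
    """Days to move `terabytes` (decimal TB) when following the daily schedule."""
    bits = terabytes * 1e12 * 8
    bits_per_day = sum(hours * 3600 * mbps * 1e6 for hours, mbps in schedule)
    return bits / bits_per_day


if __name__ == "__main__":
    # Hypothetical: 50 Mbps for 10 business hours, 500 Mbps for the other 14 hours.
    schedule = [(10, 50), (14, 500)]
    print(f"Average rate: {average_mbps(schedule):.1f} Mbps")
    print(f"50 TB takes about {days_to_transfer(50, schedule):.1f} days")
```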

I won’t go into depth on this decision process but let me know if I should write another blog on that topic.

Additional references:

The two reference links below have wonderful information about choosing a data transfer solution, and as noted earlier I HIGHLY suggest reviewing them as well. The purpose of this blog was to talk about some of the processes and procedures that are typically not addressed when looking purely at the technology.



I hope going through these scenarios was helpful when considering methods for data transfer into Azure. My goal here was not to go in depth on anything in particular, but rather to think through the process. As takeaways, here are a few points to keep in mind about transferring large amounts of data into Azure.

  • Be honest with yourself about approval timelines for shipping your company’s (and/or customer’s) data.
  • Use a file transfer calculator to see how long it would actually take to transfer X data at Y speeds – it’s probably not as long as you think.
  • For good reason, there will likely be a lot of meetings, documentation, email threads, and other time-consuming activities for shipping data physically – and that should also count for something in terms of cost.
  • There will likely also be some of the aforementioned procedural work for online data migration, but in most cases not nearly as much.
  • Online is not always going to work out; sometimes Data Box is going to be the best fit.


If you have any questions, comments, or suggestions for future blog posts please feel free to comment below or reach out on LinkedIn or Twitter. I hope I’ve made your day a little bit easier!

Azure Mask Browser Extension

Reading Time: 2 minutes

I often find myself having to do a lot of editing of both video and screenshots when using the Azure Portal, and I wanted to write a short blog post about this very handy extension I’ve been using recently. The Azure Mask Extension is an open source tool written by Brian Clark. It works by masking sensitive data as you navigate the Azure Portal, which is helpful for presentations, screen sharing, and content creation. Take a look below at how to install and use it.

First, browse to the extension settings in your Chromium-based browser. Due to naming issues, the extension had to be renamed and has been “pending review” for over a year now (3/16/21), as noted in the GitHub repo. In light of this, the extension can’t currently be installed from the store and must be manually loaded. Once you’re on this screen, you’ll need to toggle “Developer Mode”. Then, once you have downloaded the extension from the GitHub repo, use the “Load Unpacked” button to load the extension files.

Browser Extension Settings


Once the extension is installed, you will see it show up on the extension settings screen. You will then click the Azure Mask extension button in the browser toolbar and move the slider to “Toggle All Masks”.

Extension installed and toggled


Next, head over to the Azure Portal and check out the masking features of this extension!

Masked Azure Portal


Short and sweet blog post, but this extension has saved me a lot of time when sharing my screen, presenting at conferences, recording video, and taking screenshots for blogs. If you have any questions, comments, or suggestions for future blog posts please feel free to comment below, or reach out on LinkedIn or Twitter. I hope I’ve made your day a little bit easier!

Azure App Service Private Link Integration with Azure Front Door Premium

Reading Time: 5 minutes

Last week, Azure Front Door Premium went into Public Preview. While this did bring about some other cool features and integrations, the one I’m most excited about today is the integration with Azure Private Link. This now allows Azure Front Door to make use of Private Link Services (not endpoints, which is what most people think about when they hear Private Link). Private Link Services allow for resource communication between two tenants; some of the most common use cases are software providers allowing private access to a solution running in their environment. Today I’m going to walk through how to connect Azure Front Door, through Private Link, to an App Service, without an ASE and without needing to manually set up Private Link, DNS, or anything of the sort. I believe this will become the new standard for hosting App Services.

With that, let’s get started! First, we need to create an Azure App Services Web App.

New Azure Web App


*Note* At the time of writing this post (03/01/2021) Private Link Service integration requires the App Service to be a Pv2.


Once the Web App is deployed, you’ll need the URL of the website and want to test it in a web browser. In this instance I’m not hosting anything in particular, simply hosting the sample page to show that it’s working.

Default App Service Page


At this point the web app is created, and you would expect to have to create a Private Link Endpoint now but since Azure Front Door Premium uses the Private Link Service functionality we can let Front Door do the work for us. With that said, let’s now go create the Azure Front Door Premium Service.

Azure Front Door Premium Creation


We need to make sure that the Tier is selected properly as the “Premium” SKU. After that radio button is selected, a section will populate below with different configuration options compared to the Standard Tier. The one we need to make sure to check is “Enable private link service”. After that’s selected, you will select the web app with which you want to establish Private Link connectivity from Front Door. If you would like, here you can also add a custom message. This will be what is displayed as a connection request in the Private Link Center in the next step.

Azure Front Door Creation

Azure Front Door Premium Creation


On the review page, we can see that the endpoint created is a URL for Azure Front Door and this will be the public endpoint. The “Origin” is the web app to which Front Door will be establishing private connectivity.

Final Configuration for Azure Front Door


Once Azure Front Door is done deploying, you will need to open up the Private Link Center. From there, navigate to “pending connections”, which is where you will see the connection request from Azure Front Door with the message you may or may not have customized. Remember that Azure Front Door uses Azure Private Link Service to connect its own managed Private Link Service to your Web App. You will need to “Authorize” the connection request in order for the connection to be created and allow Front Door to privately communicate with your Web App.

Private Link Setup

Private Link Center

Private Link Setup


After the connection is approved you will notice that the “pending connection” is removed, and has been moved to “active connections”. At this point, you will also notice that access to the Web App through a browser will return an error message the same way it would if you were to have added firewall rules on the Web App. This is because it’s being configured to only allow inbound connections from Azure Front Door.

Private Link Setup

No Public Access


If you want to modify any of the configuration settings, you will go to the “Endpoint Manager” section of Azure Front Door, where you get the familiar interface used by both Azure Front Door and App Gateway.

Front Door Configuration


In my testing, the time between clicking “Approve” in Private Link Center and the Web App being available through the Azure Front Door endpoint is anywhere from 15 to 30 minutes. I’m not quite sure why this is the case, though it is likely due to the service only being in preview. If you get an error message in the web browser using the Front Door URL, just grab a cup of coffee and give it some time to do its thing.
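If you would rather not keep refreshing the browser, the waiting can be scripted. Below is a minimal sketch, assuming the endpoint simply starts returning HTTP 200 once the private connection is live. The function names, the 45-minute timeout, and the example URL in the comment are all hypothetical and not part of any Azure tooling.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str) -> int:
    """Return the HTTP status code for url, or 0 if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403 or 503.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar.
        return 0


def wait_until_ready(
    probe: Callable[[], int],
    timeout_s: float = 45 * 60,
    interval_s: float = 60,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() until it returns 200 or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if probe() == 200:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Example call (hypothetical endpoint name, not executed here):
# wait_until_ready(lambda: http_status("https://my-endpoint.z01.azurefd.net/"))
```

The probe, sleep, and clock are passed in as parameters so the loop can be exercised without touching the network or actually waiting.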

Once it’s all done though, you can use the Front Door URL in the web browser and see that it routes you to the App Service!

Azure Front Door Endpoint


There we go, all set! This is really a dream configuration, and something a lot of us have been looking forward to for some time. In the past we’ve done something similar with App Gateways, and Private Link Endpoints. The beauty of the solution with Front Door Premium, is that there is no messing around with DNS or infrastructure whatsoever – you can deploy this entire solution in PaaS while taking advantage of Azure Front Door’s global presence!

Click here to get started with Azure Front Door Premium.

If you have any questions, comments, or suggestions for future blog posts please feel free to comment below, or reach out on LinkedIn or Twitter. I hope I’ve made your day a little bit easier!

Shared Storage Options in Azure: Part 5 – Conclusion


Reading Time: 4 minutes

Part 5, the end of the series! This has been a fun series to write, and I hope it was helpful to some of you. The impetus for this whole thing was the number of times I’ve been asked how to set up shared storage between systems (primarily VMs) in Azure. As we’ve covered, there are a handful of different strategies with pros and cons to each. I’m going to close this series with a final Pros and Cons list and a few general design pattern directions.


Pros and Cons:


Azure Shared Disks:

Shared Storage Options in Azure: Part 1 – Azure Shared Disks « The Tech L33T


Pros:

  • Azure Shared Disks allow for the use of what is considered “legacy clustering technology” in Azure.
  • Can be leveraged by familiar tools such as Windows Failover Cluster Manager, Scale-out File Server, and Linux Pacemaker/Corosync.
  • Premium and Ultra Disks are supported so performance shouldn’t be an issue in most cases.
  • Supports SCSI Persistent Reservations
  • Fairly simple to setup


Cons:

  • Does not scale well, similar to what would be expected with a SAN mapping.
  • Only certain disk types are supported.
  • Read-Only host caching is not available for Premium SSDs with maxShares >1.
  • When using Availability Sets and Virtual Machine Scale Sets, storage fault domain alignment with the VMs is not enforced on the shared data disk.
  • Azure Backup is not yet supported.
  • Azure Site Recovery is not yet supported.


Azure IaaS Storage:

Shared Storage Options in Azure: Part 2 – IaaS Storage Server « The Tech L33T


Pros:

  • More control, greater flexibility of protocols and configuration.
  • Ability to migrate many workloads as-is and use existing storage configurations.
  • Ability to use older, or more “traditional” protocols and configurations.
  • Allows for the use of Shared Disks.
  • Integration with Azure Backup.
  • Incredible storage performance if the data can be cached/ephemeral (up to 3.8 million IOPS on L80s_v2).


Cons:

  • Significantly more management overhead as compared to PaaS.
  • More complex configurations, and cost calculations compared to PaaS.
  • Higher potential for operational failure with a higher number of components.
  • Broader attack surface, and more security responsibilities.
  • Maximum of 80,000 uncached IOPS on any VM SKU.


Azure Storage Services (Blob and File):

Shared Storage Options in Azure: Part 3 – Azure Storage Services « The Tech L33T


Pros:

  • Both are PaaS and fully managed which greatly reduces operational overhead.
  • Significantly higher capacity limits as compared to IaaS.
  • Ability to migrate some workloads as-is and use existing storage configurations when using SMB or BlobFuse compared to using native API connections.
  • Ability to use Active Directory Authentication in Azure Files, and Azure AD Authentication in Blob and Files.
  • Both integrate with Azure Backup.
  • Much easier to geo-replicate compared to IaaS.
  • Azure File Sync makes distributed File Share services and DFS a much better experience with Backup, Administration, Synchronization, and Disaster Recovery.


Cons:

  • BlobFuse (by default) stores credentials in a text file.
  • Does not support older access protocols like iSCSI.
  • NFS is not yet Generally Available.
  • Azure Files is limited to 100,000 IOPS (per share).


Azure NetApp Files:

Shared Storage Options in Azure: Part 4 – Azure NetApp Files « The Tech L33T


Pros:

  • Incredibly high performance, depending on configuration (up to ~3.2 million IOPS/volume).
  • SMB and NFS Shares both supported, with Kerberos and AD integration.
  • More performance and capacity than is available on any single IaaS VM.


Cons:

  • While it is deployed in most major regions, it may not yet be available where you need it (submit feedback if this is the case).
  • Does not yet support Availability Zones; Cross-Region Replication is in Preview.


There we have it, my final list of Pros and Cons between Azure Shared Disks, DIY IaaS Storage, Azure Blob/Files, and Azure NetApp Files. Lastly, I want to end with some notes on general patterns when considering shared storage like the ones discussed in this series.


Patterns by Workload Type:



Quorum:

  • If the reason you need shared storage is for a quorum vote, look into using a Cloud Witness for Failover Clusters (including SQL AlwaysOn).
  • If the Cloud Witness isn’t an option, Shared Disks are going to be an easy option, and I would go there second.


Block Storage:

  • If you need shared block storage (iSCSI) for more than just quorum, chances are you need a lot of it, so I’d first recommend running IaaS storage. Start planning a migration away from this pattern, though: Blob block storage on Azure is amazing, and if you can port your application to use it, I would highly recommend doing so.


General File Share:

  • For most generic file shares, Azure Files is going to be your best bet – with a potential use of Azure File Sync.
  • Azure NetApp Files is also a strong option here since the Standard Tier is cost effective enough for it to be feasible, though ANF requires a bit more configuration than Azure Files.
  • Lastly, you could always run your File Share in custom IaaS storage, but I would first look to a PaaS solution.


High-Performance File Storage:

  • If your application doesn’t support the use of Blob storage, like most commercial products, Azure NetApp Files is likely going to be your best bet.
  • Once NFS becomes generally available, NFS on Azure Files and Blob store are going to be strong competitors – especially on Blob and ADLS.
  • Depending on what “high-performance” means, and whether or not you use a scale-out software configuration, storage on IaaS could potentially be an option. This is a much more feasible option when the bulk of the data can be cached or ephemeral.


We’ve come to the end! I hope that was a useful blog series. As technologies and features advance, I’ll go back and update these, but please feel free to comment if I miss something. Please reach out to me in the comments, on LinkedIn, or Twitter with any questions about this post, the series, or anything else!


Shared Storage Options in Azure: Part 4 – Azure NetApp Files


Reading Time: 10 minutes

Welcome to Part 4 of this 5-part Series on Shared Storage Options in Azure. In this post I’ll be covering Azure NetApp Files. We have talked about other file-based shared storage in Azure already with SMB and NFS on IaaS VMs in Part 2, and again with Azure Files in Part 3. Today, I want to cover the last technology in this series – let’s get into it!

Azure NetApp Files:

Azure NetApp Files (ANF) is an interesting Azure service, unlike many others. ANF is actually first-party NetApp hardware, running in Azure. This allows customers to use the enterprise-class, high-performance capabilities of NetApp directly integrated with their Azure workloads. I will note that you can also use NetApp’s appliance called the NetApp ONTAP Cloud Volume, a Virtual Machine that sits in front of blob storage and can also be used for shared storage, but I won’t be covering that here as the ONTAP volumes aren’t first-party Azure. Along with ONTAP, however, there are a number of great partner products that run in Azure for these types of storage solutions. Check with your preferred storage vendor; they likely have an offering.

Before we jump into it, I’ll note that there are different configurations or operations you can do to tune the performance of your ANF setup. I won’t be going into those here, but I will be writing another post at a later time on performance benchmarking and tuning on ANF.

Initial Configuration:

Azure NetApp Files is a bit different from what you would expect with Azure Files, so I’m going to walk through a basic setup here. First of all, ANF currently requires your subscription to be whitelisted for ANF use. To submit your subscription, you’ll need to use this form.

After you’ve been whitelisted, head into the portal and create an Azure NetApp Files Account.

Create Azure Netapp Files Account

After it’s created, the first thing you will need to do is create a capacity pool. This is the storage from which you will create volumes later in the configuration. Note: 4TB is the smallest capacity pool that can be configured.

Azure NetApp Files Storage Hierarchy


I’m using an automatic QoS type for this capacity pool, but you can read more about how to set up manual QoS. What is important to choose here is your service level, because it cannot be changed after creating the capacity pool. I will talk more about the service levels later in this post.

Azure NetApp Files Capacity Pool


Later on I’m going to be using both an NFS and an SMB share. To use an SMB share with Kerberos authentication you will need a virtual network with which to integrate ANF and your source of authentication. I’m going to create a virtual network with two subnets, one for my compute and one for ANF. The ANF subnet needs to be delegated to the Azure NetApp Files service so it can leverage that connection, so I’ll configure that here as well.

Create Azure Virtual Network

Delegate Azure Subnet to Azure Netapp Files


Now that I’ve set up the network, I’m going to create the compute resources for my testing environment. It will comprise the following:

  • Domain Controller
  • Windows Client
  • Linux Client
  • Azure Bastion (used for connecting to those VMs)

I’ll use the Windows client to test the SMB share, and will test the NFS share with the Linux Client.

Domain Controller

Create domain controller VM

Azure Bastion

Create Azure Bastion

Windows Client

Create Windows Client

Linux Client

Create Linux Client


Now that those compute hosts are all being created, I’m going to go create my NFS volume. I initially created a 4TB capacity pool, so I’ll assign 2TB to this NFS volume for now. I’m going to use NFS 4.1 but won’t be using Kerberos in the lab. My export policy is also set to allow anything within the virtual network to access it; this can be modified at any time.


Create NFS Volume

Create NFS Volume part 2

Create NFS Volume part 2


Alright, the NFS volume is all set up now and we’ll come back to that later to test on the Linux Client. Now I want to set up an SMB share, which first requires that I create a connection to Active Directory. I built mine manually in my lab, but you can also use this quickstart template to auto-deploy an Active Directory Domain for you. It’s also good to know that this source can either be traditional Active Directory Domain Services or Azure AD Domain Services.

You will want to follow the instructions in the ANF documentation to make sure you have things set up correctly. I have my domain controller set to use a static IP, named the domain “anf.lcl”, and set up a user named “anf”. Now that this is complete, I can create the Active Directory Connection.

Azure NetApp Files Active Directory Join


Great! Now that we have that configured, we can use the connection in setting up the SMB share. I’ll use the rest of the 4TB capacity pool here and use the Active Directory connection we just finished to create the SMB share.


Create SMB Share

Create SMB Share - Part 2

Create SMB Share - Part 3


After this completes, you can jump into Active Directory and see that it creates a computer account in AD. This will be the “host” of the SMB share, and ANF will use this to verify credentials attempting to connect to the share.

Computer Account in Active Directory


Fantastic, now we have ANF created, with a 4TB capacity pool, a 2TB NFS share, a connection to Active Directory, and a 2TB SMB share. On the Volumes tab we can now see both of those shares are ready to go.

Azure NetApp Files Volumes


Each of the shares has a tab called “Mounting instructions”. I’m going to test the SMB share first, so I’ll go grab this information. You can see the UNC path looks like an SMB share hosted by the computer “anf-bdd8.anf.local”; this is how other machines will reference the share to map it. Permissions on this share can be controlled much as you would control them on any other Windows share. Take a look at the docs to read more on how to do this.

SMB Mounting Instructions

With this information we can go use the Azure Bastion connection to jump into our Windows Client and map the network drive.


Mount Network drive in Windows


Mounted network drive in Windows


Voila! The Azure NetApp Files SMB share is mounted on our Windows Client. Now let’s go do the same thing with the NFS share: grab the mounting instructions, use the Azure Bastion Connection to connect to the Linux Client, and mount the NFS share.


Mount NFS volume

Mount NFS volume - part 2



Cost, Performance, Availability, and Limitations:


As noted earlier, there are three service level tiers in Azure NetApp Files: Ultra, Premium, and Standard.

  • Ultra provides up to 128 MiB/s of throughput per 1 TiB of provisioned storage
  • Premium: 64 MiB/s per 1 TiB
  • Standard: 16 MiB/s per 1TiB
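To make the per-TiB scaling concrete, here is a minimal sketch that multiplies a tier's rate by the provisioned size. It assumes the automatic QoS behavior used in this lab, where the limit is simply tier rate times volume quota, and it takes Ultra as 128 MiB/s per TiB (the figure in the ANF service-level documentation). The function and dictionary names are my own.

```python
# MiB/s of throughput granted per provisioned TiB, by service level.
# Ultra is taken as 128 here; treat all three values as inputs you should
# confirm against the current ANF documentation.
THROUGHPUT_PER_TIB = {"Standard": 16, "Premium": 64, "Ultra": 128}


def volume_throughput_mibps(tier: str, quota_tib: float) -> float:
    """Throughput ceiling for a volume, assuming limit = tier rate x quota."""
    return THROUGHPUT_PER_TIB[tier] * quota_tib


for size_tib in (2, 4):
    limit = volume_throughput_mibps("Standard", size_tib)
    print(f"Standard, {size_tib} TiB: {limit} MiB/s")
```

Under those assumptions a 2 TiB Standard volume tops out at 32 MiB/s and a 4 TiB one at 64 MiB/s, which is why doubling the volume size should roughly double the measured throughput.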


Remember that earlier I selected Standard (the lowest performance tier) for my capacity pool. This tier is designed more for capacity than for performance and is much more cost effective. With that said though, let’s do a quick performance test.

  • 2TB SMB share on the “Standard” tier
  • D2s_v3 Windows Client
  • IOMeter tool running 4 workers, with a 50% read 4 KB test

IOMeter Test


The performance capabilities of ANF are a combination of 3 main things:

  • Performance Tier
  • Volume Capacity
  • Client Network Throughput

As I’ve mentioned in part 2 of this blog series, similar to managed disks, the performance of an ANF volume increases with its provisioned capacity. Also remember that Azure VM SKUs have an expected network throughput and this is important here because the storage in question is over the network. If the VM is only capable of 1,000 Mbps then depending on your I/O size, regardless of the ANF configuration, your tests will only ever perform at up to 1,000 Mbps.
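The network ceiling is easy to sketch as well. The snippet below is a rough illustration only: it converts a VM's expected bandwidth from Mbps to MiB/s and takes the lower of that and the volume's own limit. It ignores protocol overhead, and both function names are hypothetical.

```python
def mbps_to_mibps(mbps: float) -> float:
    """Convert megabits per second (10^6 bits) to mebibytes per second."""
    return mbps * 1_000_000 / 8 / (1024 * 1024)


def effective_throughput_mibps(volume_limit_mibps: float, vm_network_mbps: float) -> float:
    """The client sees whichever is lower: the volume limit or the VM's network."""
    return min(volume_limit_mibps, mbps_to_mibps(vm_network_mbps))


print(round(mbps_to_mibps(1000), 1))            # a 1,000 Mbps VM is about 119.2 MiB/s
print(effective_throughput_mibps(32, 1000))     # small volume: the volume is the limit
print(round(effective_throughput_mibps(512, 1000), 1))  # large volume: the VM is the limit
```

So on a 1,000 Mbps VM, growing a volume past roughly 119 MiB/s of provisioned throughput would not show up in a single-client test.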

Just to verify that the performance is tied to capacity, I’m going to increase the capacity pool and then double the size of the SMB volume from 2TB to 4TB and run the test again.


resize azure netapp files pool


Resize SMB Share


IOMeter test


We can see that the performance roughly doubled, with no change inside the VM (since we’re not yet hitting the Network Bandwidth limitations of that VM SKU).

Now let’s run the same test using the FIO tool on our Linux Client against the NFS share.

FIO on Linux


Again we’ll go ahead and increase the capacity pool then double the size of the NFS share and run the test again.


Resize NFS volume

FIO Test on Linux


Similar to the SMB testing, after doubling the size of the NFS share it also doubled its performance. Increases in capacity on the pool or volume can happen live, while the systems are running, with no impact.

As I mentioned earlier, I will be writing another blog post at a later time on performance benchmarking and tuning on ANF. In the meantime I recommend reading the ANF documentation on performance, for example this one on Linux Performance Benchmarking.



Similar to what you would expect with a traditional NetApp appliance, ANF does support the use of snapshots. Keep in mind that your snapshots will consume additional storage on your ANF volume.

Azure NetApp Files Snapshot


Availability:

As noted earlier, Azure NetApp Files is a true NetApp appliance running in an Azure Datacenter and is therefore subject to the same appliance-level availability. In addition, there is a 99.99% financially backed SLA on Azure NetApp Files.

Note: Cross-Region replication is currently in Public Preview so I won’t note it as an option yet, but will edit this post once it becomes generally available.



Cost:

Pricing for Azure NetApp Files is incredibly straightforward – you pay per GB x hours provisioned.

Currently, pricing ranges from $0.14746/GB to $0.39274/GB based on performance tier. Please see the pricing page for the most up-to-date information.
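As a rough illustration of the "GB x hours provisioned" model, the sketch below estimates a bill for a capacity pool. It assumes the rates quoted above are per GiB for a full 730-hour month and that shorter periods are prorated linearly. Both are my assumptions, not something the pricing page is quoted as saying here, so check the current pricing before relying on the numbers.

```python
HOURS_PER_MONTH = 730  # assumed billing month length


def pool_cost(pool_gib: float, rate_per_gib_month: float, hours: float = HOURS_PER_MONTH) -> float:
    """Estimated cost of a provisioned capacity pool for the given number of hours."""
    return pool_gib * rate_per_gib_month * hours / HOURS_PER_MONTH


four_tib = 4 * 1024  # the 4TB minimum pool, expressed in GiB
print(round(pool_cost(four_tib, 0.14746), 2))  # lowest quoted rate, full month
print(round(pool_cost(four_tib, 0.39274), 2))  # highest quoted rate, full month
```

With these assumptions the minimum 4 TiB pool lands at roughly $604 per month at the low end of the range and roughly $1,609 at the high end.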

You can also see this documentation on Cost Modeling for Azure NetApp Files for a deeper dive into modeling costs on ANF.



Limitations:

  • While ANF is rolling out to more and more regions, since it is discrete physical hardware it doesn’t exist everywhere (yet) and may impact your deployment considerations.
  • ANF does not (yet) support availability zones.
  • Additional resource limitations can be found here: Resource Limits for Azure NetApp Files.


Typical Use Cases:

The most common use case for Azure NetApp Files is simple: you need more than 80k IOPS. Now, keep in mind that IOPS isn’t always straightforward. IOPS (Input/Output Operations Per Second) can vary greatly based on the workload, specifically its I/O size and access patterns. For example, at the same throughput and with all else constant, a machine will show 16 times more IOPS if the I/O size is 4 KB rather than 64 KB. Conversely, at a fixed IOPS figure, throughput (e.g. MBps/GBps) is higher with larger I/O sizes. With that said, if a workload requires incredibly high performance with an application that isn’t designed to run on cloud-native platforms (e.g. Blob Storage APIs, etc.), ANF is likely the place it will land. Remember that (as of the time of writing this, January 2021) the most uncached IOPS a machine can have in Azure is 80,000 (see Part 2 of this blog series).
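The relationship between throughput, I/O size, and IOPS is plain division, sketched below. The 32 MiB/s figure is just an example input, and the function name is hypothetical.

```python
def iops_at(throughput_mibps: float, io_size_kib: float) -> float:
    """IOPS achievable at a given throughput when every I/O is io_size_kib in size."""
    return throughput_mibps * 1024 / io_size_kib


small = iops_at(32, 4)    # 4 KiB I/Os
large = iops_at(32, 64)   # 64 KiB I/Os
print(small, large, small / large)
```

At the same 32 MiB/s, 4 KiB I/Os work out to 8,192 IOPS and 64 KiB I/Os to 512 IOPS, a factor of 16, so an IOPS number only means something when the I/O size is stated alongside it.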

This comes into play often with very large database systems such as Oracle.

Another typical use case is SAP HANA workloads.


The third most common workload for Azure NetApp Files that I’ve found is in large Windows Virtual Desktop deployments, using ANF for storing user profile data.


Pros and Cons:

Okay, here we go with the Pros and Cons for using Azure NetApp Files for your shared storage configuration on Azure.


Pros:

  • Incredibly high performance.
  • SMB and NFS Shares both supported, with Kerberos and AD integration.
  • More performance and capacity than is available on any single IaaS VM.
  • ANF is a PaaS solution with no appliance maintenance overhead.


Cons:

  • While it is deployed in most major regions, it may not be available where you need it yet (submit feedback if this is the case).
  • Does not yet support Availability Zones; Cross-Region Replication is in Preview.


Alright, that’s it for Part 4 of this blog series – Azure NetApp Files. Please reach out to me in the comments, on LinkedIn, or Twitter with any questions about this post, the series, or anything else!


Shared Storage Options in Azure: Part 2 – IaaS Storage Server


Reading Time: 9 minutes

Recently, I posted the “Shared Storage Options in Azure: Part 1 – Azure Shared Disks” blog post, the first in the 5-part series. Today I’m posting Part 2 – IaaS Storage Server. While this post will be fairly rudimentary insofar as Azure technical complexity, this is most certainly an option when considering shared storage options in Azure and one that is still fairly common with a number of configuration options. In this scenario, we will be looking at using a dedicated Virtual Machine to provide shared storage through various methods. As I write subsequent posts in this series, I will update this post with the links to each of them.


Virtual Machine Configuration Options:


While it may not seem vitally important, the VM SKU you choose can impact your ability to provide storage capabilities in areas such as Disk Type, Capacity, IOPS, or Network Throughput. You can view the list of VM SKUs available on Azure at this link. As an example, I’ve clicked into the General Purpose, Dv3/Dsv3 series and you can see there are two tables that show upper limits of the SKUs in that family.

VM SKU Sizes in Azure

VM Performance Metrics in Azure

In the limits for each VM you can see there are differences between Max Cached and Temp Storage Throughput, Max Burst Cached and Temp Storage Throughput, Max uncached Disk Throughput, and Max Burst uncached Disk Throughput. All of these represent very different I/O patterns, so make sure to look carefully at the numbers.

Below are a few links to read more on disk caching and bursting:


You’ll notice when you look at VM SKUs that there is an L-Series which is “storage optimized”. This may not always be the best fit for your workload, but it does have some amazing capabilities. The outstanding feature of the L-Series VMs is the locally mapped NVMe drives, which as of the time of writing this post on the L80s_v2 SKU can offer 19.2TB of storage at 3.8 million IOPS / 20,000 MBps.

Azure LS Series VM Specs

The benefits of these VMs are the extremely low latency and high throughput of the local storage, but the caveat to that specific NVMe storage is that it is ephemeral. Data on those disks does not persist across a reboot. This means it’s incredibly good at serving from a local cache, tempdb files, etc., though it’s not storage that you can use for things like a File Server backend (without some fancy start-up scripts, please don’t do this…). You will note that the maximum uncached throughput is 80,000 IOPS / 2,000 MBps for the VM, which is the same as all of the other high spec VMs. As I am writing this, no Azure VM allows for more than that for uncached throughput, and this includes Ultra Disks (more on that later).

For more information on the LSv2 series, you can read more here: Lsv2-series – Azure Virtual Machines | Microsoft Docs

Additional Links on Azure VM Storage Design:


Networking capabilities of the Virtual Machine are also important design decisions when considering shared storage, both in total throughput and latency. You’ll notice in the VM SKU charts I posted above when talking about performance there are two sections for networking, Max NICs and Expected network bandwidth Mbps. It’s important to know that these are VM SKU limitations, which may influence your design.

Expected network bandwidth is pretty straightforward, but I want to clarify that the number of Network Interfaces you attach to a VM does not change this number. For example, if your expected network bandwidth is 3,200 Mbps and you have an SMB share running on that single NIC, adding a second NIC and using SMB Multichannel WILL NOT increase the total bandwidth for the VM. In that case you could expect each NIC to potentially run at 1,600 Mbps.
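Put as arithmetic, the VM-level figure is a fixed budget that the NICs share. The sketch below assumes traffic is spread evenly across NICs, which is a simplification for illustration and not a documented guarantee. The function name is hypothetical.

```python
def per_nic_bandwidth_mbps(vm_expected_mbps: float, nic_count: int) -> float:
    """Bandwidth each NIC can expect if the VM-level budget is split evenly."""
    if nic_count < 1:
        raise ValueError("a VM needs at least one NIC")
    return vm_expected_mbps / nic_count


for nics in (1, 2, 4):
    share = per_nic_bandwidth_mbps(3200, nics)
    print(f"{nics} NIC(s): {share} Mbps each, {share * nics} Mbps total")
```

However many NICs you add, the total column stays at 3,200 Mbps.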

The last networking feature to take into consideration is Accelerated Networking.  This feature allows for SR-IOV (Single Root I/O Virtualization), which by bypassing the host CPU and offloading the network traffic directly to the Network Interface can dramatically increase performance by reducing latency, jitter, and CPU utilization.  

Accelerated Networking Comparison in Azure

Image Reference: Create an Azure VM with Accelerated Networking using Azure CLI | Microsoft Docs  

Accelerated Networking is not available on every VM though, which makes it an important design decision. It’s available on most General Purpose VMs now, but make sure to check the list of supported instance types. If you’re running a Linux VM, you’ll also need to make sure it’s a supported distribution for Accelerated Networking.  


In an obvious step, the next design decision is the storage that you attach to your VM. There are two major decisions when selecting disks for your VM: disk type and disk size.

Disk Types:

Azure VM Disk Types

Image Reference:  

As the table above shows, there are four types of Managed Disks in Azure. At the time of writing this, Premium SSD, Standard SSD, and Standard HDD all have a limit of 32TB per disk. The performance characteristics are very different, but I also want to point out the difference in the pricing model because I see folks make this mistake very often.

Disk Type    | Capacity Cost                 | Transaction Cost
Standard HDD | Low                           | Low
Standard SSD | Medium                        | Medium
Premium SSD  | High                          | None
Ultra SSD    | Highest (Capacity/Throughput) | None

Transaction costs can be important on a machine whose sole purpose is to function as a storage server. Make sure you look into this before a passing glance shows the price of a Standard SSD lower than a Premium SSD. For example, here is the Azure Calculator output of a 1 TB disk across all four types that averages 10 IOPS over a 30-day month: (10 x 60 x 60 x 24 x 30) / 10,000 = 2,592 transaction units.
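The transaction-unit arithmetic generalizes to any average IOPS figure. Here is a minimal sketch, assuming a 30-day month and billing in units of 10,000 operations as in the example above. The names are my own.

```python
SECONDS_PER_30_DAY_MONTH = 60 * 60 * 24 * 30  # 2,592,000 seconds


def transaction_units(avg_iops: float, ops_per_unit: int = 10_000) -> float:
    """Billable transaction units for a disk that sustains avg_iops all month."""
    return avg_iops * SECONDS_PER_30_DAY_MONTH / ops_per_unit


print(transaction_units(10))   # the example above: 2592.0
print(transaction_units(500))  # a Standard SSD held at 500 IOPS: 129600.0
```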

Sample Standard Disk Pricing:

Azure Calculator Disk Pricing  


Sample Standard SSD Pricing:

Azure Calculator Disk Pricing  

Sample Premium SSD Pricing:

Azure Calculator Disk Pricing

Sample Ultra Disk Pricing:

Azure Calculator Disk Pricing


The above is just an example, but you get the idea. Pricing gets strange around Ultra Disk due to the ability to configure performance (more on that later). There is, though, a calculable break-even point between disks that have transaction costs and those that have a higher provisioned cost.
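That break-even point can be sketched in a few lines. The prices below are assumptions I picked because they reproduce the ~$336 and ~$135 figures quoted in the E30/P30 example in this section: an E30 base price of $76.80 per month, $0.002 per 10,000 transactions, and a flat $135 per month for a P30. Substitute current prices for your region before drawing conclusions.

```python
SECONDS_PER_MONTH = 60 * 60 * 24 * 30   # 30-day month
OPS_PER_UNIT = 10_000                   # transactions per billing unit


def standard_ssd_monthly_cost(avg_iops: float, base: float = 76.80,
                              price_per_unit: float = 0.002) -> float:
    """Assumed E30 cost: fixed base price plus per-transaction charges."""
    units = avg_iops * SECONDS_PER_MONTH / OPS_PER_UNIT
    return base + units * price_per_unit


def break_even_iops(premium_monthly: float = 135.0, base: float = 76.80,
                    price_per_unit: float = 0.002) -> float:
    """Average IOPS above which the flat-priced Premium disk becomes cheaper."""
    return (premium_monthly - base) * OPS_PER_UNIT / (price_per_unit * SECONDS_PER_MONTH)


print(round(standard_ssd_monthly_cost(500), 2))  # about 336.0 at full throttle
print(round(break_even_iops(), 1))               # about 112.3 IOPS
```

With these assumed prices, a Standard SSD that averages more than about 112 IOPS around the clock already costs more than the Premium SSD.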

For example, if you run an E30 (1024 GB) Standard SSD at full throttle (500 IOPS) the monthly cost will be ~$336, compared to ~$135 for a P30 (1024 GB) Premium SSD, with which you get 10 times the performance.

The second design decision is disk capacity. While this seems like a no-brainer (provision the capacity needed, right?) it’s important to remember that with Managed Disks in Azure, the performance scales with, and is tied to, the capacity of the disk.

Image Reference:

You’ll note in the above image the Disk Size scales proportionally with both the Provisioned IOPS and Provisioned Throughput. This is to say that if you need more performance out of your disk, you scale it up and add capacity.

The last note on capacity is this: if you need more than 32TB of storage on a single VM, you simply add another disk and use your mechanism for combining that storage (Storage Spaces, RAID, etc.). This same method can be used to further tweak your total IOPS, but make sure you take into consideration the cost of each disk, capacity, and performance before doing this. Most often the cost of simply scaling up to the next disk size is insignificant.

Last but not least, I want to briefly talk about Ultra Disks. These things are amazing!

Ultra Disk Configuration in Azure

Unlike with the other disk types, this configuration allows you to select your disk size and performance (IOPS AND Throughput) independently! I recently worked on a design where the customer needed 60,000 IOPS, but only needed a few TB of capacity, this is the perfect scenario for Ultra Disks. They were actually able to get more performance, for less cost compared to using Premium SSDs.

To conclude this section, I want to note two design constraints when selecting disks for your VM.

  1. The VM SKU is still limited to a certain number of IOPS, Throughput and Disk Count. The combined performance of your disks cannot exceed the maximum performance of the VM. If the VM SKU supports 10,000 IOPS and you add 3x 60,000 IOPS Ultra Disks, you will be charged for all three of those Ultra Disks at their provisioned performance tiers but will only be able to get 10,000 IOPS out of the VM.
  2. All of the hardware performance may still be subject to the performance of the access protocol or configuration, more on this in the next section.
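The first constraint is worth checking before you provision anything. Below is a minimal sketch of that check using the numbers from the example above. The function names are hypothetical.

```python
from typing import Iterable


def usable_iops(vm_limit_iops: int, disk_iops: Iterable[int]) -> int:
    """IOPS the VM can actually deliver: the lower of the VM cap and the disk total."""
    return min(vm_limit_iops, sum(disk_iops))


def stranded_iops(vm_limit_iops: int, disk_iops: Iterable[int]) -> int:
    """Provisioned (and billed) disk IOPS that the VM cap makes unreachable."""
    return max(0, sum(disk_iops) - vm_limit_iops)


disks = [60_000, 60_000, 60_000]   # three Ultra Disks at 60,000 IOPS each
print(usable_iops(10_000, disks))    # 10000
print(stranded_iops(10_000, disks))  # 170000
```

In that example 170,000 of the 180,000 provisioned IOPS are paid for but can never be reached through a 10,000 IOPS VM.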

Additional Reading on Storage:


Software Configuration and Access Protocols:

As we come to the last section of this post, we get to the area that aligns with the purpose of this blog series – shared storage. In this section I’m going to cover some of the most common configurations and access types for shared storage in IaaS. This is by no means an exhaustive list, rather what I find most common.

Scale-Out File Server (SoFS):

First up is Scale-Out File Server. This is a software configuration inside Windows Server that is typically used with SMB shares. SoFS was introduced in Windows Server 2012, uses Windows Failover Clustering, and is considered a “converged” storage deployment. It’s also worth noting that this can run on S2D (Storage Spaces Direct), which is the method I recommend using with modern Windows Server Operating Systems. Scale-Out File Server is designed to provide scale-out file shares that are continuously available for file-based server application storage. It provides the ability to share the same folder from multiple nodes of the same cluster. It can be deployed in two configuration options, for Application Data or General Purpose. See the additional reading below for the documentation on setup guidance.

Additional reading:

SMB v3:

Now into the access protocols – SMB has been the go-to file services protocol on Windows for quite some time now. In modern Operating Systems, SMB v3.* is an absolutely phenomenal protocol. It allows for incredible performance using things like SMB Direct (RDMA), Increasing MTU, and SMB Multichannel which can use multiple NICs simultaneously for the same file transfer to increase throughput. It also has a list of security mechanisms such as Pre-Auth Integrity, AES Encryption, Request Signing, etc. There is more information on the SMB v3 protocol below, if you’re interested, or still think of SMB in the way we did 20 years ago – check it out. The Microsoft SQL Server team even supports SQL hosting databases on remote SMB v3 shares.

Additional reading:


NFS:

NFS has similarly been a staple file server protocol for a long while, and whether you’re running Windows or Linux, it can be used on your Azure IaaS VM for shared storage. For organizations that prefer an IaaS route compared to PaaS, I’ve seen many use this as a cornerstone configuration for their Azure Deployments. Additionally, a number of HPC (High Performance Compute) workloads, such as Azure CycleCloud (HPC orchestration) or the popular Genomics Workflow Management System, Cromwell on Azure, prefer the use of NFS.

Additional Reading:


iSCSI:

While I would not recommend the use of custom block storage on top of a VM in Azure if you have a choice, some applications do still have this requirement, in which case iSCSI is also an option for shared storage in Azure.

Additional Reading:

That’s it! We’ve reached the end of Part 2. Okay, here we go with the Pros and Cons for using an IaaS Virtual Machine for your shared storage configuration on Azure.

Pros and Cons:



Pros:

  • More control, greater flexibility of protocols and configuration.
  • Depending on the use case, potentially greater performance at a lower cost (becoming more and more unlikely).
  • Ability to migrate workloads as-is and use existing storage configurations.
  • Ability to use older, or more “traditional” protocols and configurations.
  • Allows for the use of Shared Disks.


Cons:

  • Significantly more management overhead as compared to PaaS.
  • More complex configurations, and cost calculations compared to PaaS.
  • Higher potential for operational failure with the higher number of components.
  • Broader attack surface, and more security responsibilities.

  Alright, that’s it for Part 2 of this blog series – Shared Storage on IaaS Virtual Machines. Please reach out to me in the comments, on LinkedIn, or Twitter with any questions about this post, the series, or anything else!