Securing your private parts in Azure

Real world private endpoints for Cosmos DB, Storage and Redis

David Carboni
14 min read · Oct 22, 2020

If you’re anything like me, you were shocked to discover the default configuration of backing services in Azure is public.

Photo by Dayne Topkin

When you create resources like Cosmos DB, Redis and Storage, the most intimate and sensitive parts of your infrastructure, these resources come into the world nakedly exposed to the internet with nothing more than a password, the security equivalent of a bikini, standing between your crown jewels and spectacular headlines airing the dirty laundry of your very own data breach.

Don’t get me wrong, it looks like a pretty good password, but it is just a password. You’re up against the computing might and mischief of the internet. Do you really want to run out into that firefight, carrying your customers’ precious data, wearing a skimpy swimsuit? I thought not. Me neither. Bikinis are for beaches. A place where you want to be able to relax. So let’s fix it. How do we improve that default security?

Private endpoints

We want to protect our delicate internals with a more cloud-appropriate level of access. If you create similar resources in AWS or GCP, you’ll have to do some work to expose them. Azure takes the opposite approach: start open and then you do work to reduce access. The Azure service you’re looking for is Private Endpoints.

If you Google the term you’ll likely get a barrage of excitable, well-meaning articles proclaiming a simple and happy Nirvana, just a few clicks of a GUI away. Most of it is vapid hello world or “hey mom I wrote a blog” stuff. If instead you read the Microsoft docs, you’ll have to chow down on an upsetting amount of high-calorie marketing bloatware before you get to usable detail.

Even when you get that hard information, there’s an impish melee of gotchas and pitfalls hiding out between you and success. It can leave you baffled and scratching your head, with little recourse but to spit curse words into the void at that pale blue tormentor.

A workable solution is going to require digging in the dirt, blood, sweat, bruises and Googling. I’ve been there, I’ve suffered and come through the other side. I’d like to offer a helping hand to anyone struggling down that same road trying to “do the right thing”.

Getting real

Here’s an overview of what you’ll need, in addition to your resources, to get private endpoints working in practice:

  • A vnet and appropriately configured subnet in which to host your private endpoints
  • A private DNS zone, associated with your vnet, containing DNS records that point to the private IP addresses of your backing service resources
  • If you’re using App Service, a delegated subnet in the same vnet to route egress traffic to your backing services, plus a “magic number” DNS server IP address configuration for App Service so that it’ll find your private DNS zone

It’s more “Hello, Llanfair­pwllgwyngyll­gogery­chwyrn­drobwll­llan­tysilio­gogo­goch” than “Hello, world” but it is possible once you know how.

Setting it up

It’s taken late nights and lashings of frustration, but I’ve come to a working design and implementation that feels appropriate, so I’d like to share the recipe. At the time of writing some of this stuff is still in preview, so it may well change, but I hope this gives you a solid head-start.

We’ll go for a triumvirate configuration: Cosmos, Storage and Redis. This gives you a pretty comprehensive collection of serverless (managed) backing services covering NoSQL, files and caching. It also illustrates the differences in setup for different resource types so you can see what’s the same and what’s different between them and generalise to other services.

I’ve illustrated these with az cli examples to keep things concise and explicit (and because I’m building this using GitHub Actions). It saves on screenshots and translates pretty cleanly into Azure Portal options if you prefer the UI. I’ve highlighted the important parts of each command so you can get to the point.

Let’s dive in to the gritty detail.

Defaults: The steps below all assume you’ve set a default location and resource group. I use the following:

az configure --defaults location=uksouth
az configure --defaults group=app-rg

Vnet: this one’s straightforward, just provide a name:

az network vnet create \
--name app-network

Subnet for private endpoints: you’ll need a subnet to host your private endpoints. It’s worth knowing you can’t use subnet delegation (e.g. for App Service) if you use a subnet for private endpoints. It’s not well explained and I’ve not researched why. It may just be how Azure works. It’ll save you some frustration if you know not to go there.

The CIDR range you choose doesn’t affect the private endpoint, so you can change it if you need a different configuration:

az network vnet subnet create \
--name private-endpoints \
--vnet-name app-network \
--address-prefixes 10.0.0.0/28

Now we need to configure our subnet so that it works for hosting private endpoints:

az network vnet subnet update \
--name private-endpoints \
--vnet-name app-network \
--disable-private-endpoint-network-policies true

Subnet for App Service (optional): if you’re using App Service, you’ll need a separate subnet for egress traffic from App Service so it can route traffic to your private endpoints. It’s also worth knowing that only one egress subnet can be associated with an App Service Plan, so all apps running on the plan must use the same egress subnet. To reiterate: you can’t both delegate a subnet and host private endpoints in it (at the time of writing), so this needs to be a separate resource. Again, the CIDR range is up to you:

az network vnet subnet create \
--name appservice-plan-egress \
--vnet-name app-network \
--delegations Microsoft.Web/serverFarms \
--address-prefixes 10.0.1.0/28

Private DNS zones: if you want to route traffic to your private endpoints, you will need to explicitly set up DNS zones (and later records) that will point to the IP addresses of your private endpoints. We’ll create the zones to start with.

App Service:

az network private-dns zone create --name "privatelink.azurewebsites.net"

Cosmos DB (in this case, Mongodb):

az network private-dns zone create --name "privatelink.mongo.cosmos.azure.com"

Storage:

az network private-dns zone create --name "privatelink.file.core.windows.net"

Redis:

az network private-dns zone create --name "privatelink.redis.cache.windows.net"

We now need to link these zones to our vnet:

az network private-dns link vnet create \
--name app-service-dns \
--zone-name "privatelink.azurewebsites.net" \
--registration-enabled false \
--virtual-network app-network
az network private-dns link vnet create \
--name mongo-dns \
--zone-name "privatelink.mongo.cosmos.azure.com" \
--registration-enabled false \
--virtual-network app-network
az network private-dns link vnet create \
--name storage-dns \
--zone-name "privatelink.file.core.windows.net" \
--registration-enabled false \
--virtual-network app-network
az network private-dns link vnet create \
--name redis-dns \
--zone-name "privatelink.redis.cache.windows.net" \
--registration-enabled false \
--virtual-network app-network

Backing Services

We’ll start by creating the backing services we’ll want to protect and then we’ll apply secure access to them.

Storage Account: we’ll start with a Storage Account. This one happens to be a file share, but feel free to use a different type (e.g. blob) and choose the other parameters to suit your needs. NB when it comes to creating the Private Endpoint, you’ll need to match the group-id (aka “Target sub-resource”) to the storage type you’re using (file, blob, etc.):

az storage account create \
--name appstorage \
--sku Standard_ZRS \
--https-only true \
--encryption-services file
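
If you also want to close the public front door on the account itself, you can set the storage firewall’s default action to deny (an optional extra, not part of the original recipe, so adapt it to your own access needs):

az storage account update \
--name appstorage \
--default-action Deny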

Storage Private Endpoint: OK, time to create our Private Endpoint for storage. As you may have gathered, this is a relatively small part of the setup. It’s central though, and worth understanding what the different parameters mean.

We’ll start with Storage and generalise to the other types. The first step is querying for the ID of our Storage Account. If you’re using the portal, these are a little tricky to get hold of, but they’re of the form /subscriptions/{subscription_id}/resourceGroups/{resource group}/providers/Microsoft.Storage/storageAccounts/appstorage

Here’s the query you’ll need if using the az cli:

storage_id=$(az storage account show --name appstorage --query "id" -o tsv)

Now we’re ready to create a private endpoint for our storage account:

az network private-endpoint create \
--name appstorage \
--vnet-name app-network \
--subnet private-endpoints \
--private-connection-resource-id "$storage_id" \
--group-id file \
--connection-name appstorage

It’s worth pausing here to check in on these parameters. The name is just that, a name for this endpoint. The vnet-name and subnet are pretty self-explanatory. The $storage_id is the resource ID value we queried earlier. You can replace that by copy-pasting the ID if that works better for you. The group-id parameter turns out to be what’s referred to in the Azure Portal as “Target sub-resource”, in this case one of file, blob, queue, etc.

I’ve been unable to get a clear definition of connection-name, so if you know what it is, I’d be interested to hear from you. The command line documentation of this parameter offers an example value of tttt and a description of “Name of the private link service connection”. I’m none the wiser.

All being well, you now have a private endpoint. Unfortunately, at this stage, if, say, an App Service instance looks up the address of the storage account, the DNS query will still return the public IP address and the result of a request to that address will be 403 Forbidden. So far we’ve denied ourselves.

Private DNS records: This is where we’ll need to create private DNS records in the zones we created earlier. This requires us to do a couple of queries. First we’ll need the Azure resource ID of the private endpoint we’ve just created, and then we’ll get the private IP address from the network configuration of that endpoint.

Stay with me. We’re looking for the private IP address of the Private Endpoint that now routes traffic to our resource, because requests to the public IP address are now denied (which is actually what we’ve wanted all along).

If you’re in the Azure Portal, head to the Private Link Center and click on the Private Link (NB not the name of the resource the link is pointing to — a UX gotcha there) and you’ll see the private IP address for your resource listed in the Overview screen.

Here’s how we query for the IP address:

interface_id=$(az network private-endpoint show --name appstorage --query 'networkInterfaces[0].id' -o tsv)
interface_ip=$(az resource show --ids "$interface_id" --query "properties.ipConfigurations[0].properties.privateIPAddress" -o tsv)

And here’s how we create the DNS records. This will create a record for appstorage.privatelink.file.core.windows.net and point it to the IP address of our Private Endpoint:

az network private-dns record-set a create \
--name appstorage \
--zone-name "privatelink.file.core.windows.net"
az network private-dns record-set a add-record \
--record-set-name appstorage \
--zone-name "privatelink.file.core.windows.net" \
--ipv4-address $interface_ip

If you now do a dig or nslookup on appstorage.file.core.windows.net (that’ll need to be done from a virtual machine deployed within your vnet) you should get back the private IP address. It’s a check worth doing to validate your configuration steps so far.
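
For example, from a VM inside app-network (a sketch; the IP shown is illustrative and the exact output format will vary):

nslookup appstorage.file.core.windows.net
# expect a CNAME to appstorage.privatelink.file.core.windows.net
# resolving to an address in the private-endpoints subnet, e.g. 10.0.0.4
dig +short appstorage.file.core.windows.net
# appstorage.privatelink.file.core.windows.net.
# 10.0.0.4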

Theoretically you’ll now be able to access this resource privately, but if you’re using App Service, you’ll find out this isn’t the end of the road. App Service will still resolve the public IP address of the storage account. Mystery.

We’ll uncover the mystery below, but first, let’s set up the rest of our to-be private resources.

Redis with Private Endpoint: a recent addition, it’s now possible to access Redis via a private endpoint. If you followed the section above on storage, here’s a condensed version for setting it up:

az redis create \
--name app-redis \
--sku Basic \
--vm-size c0
redis_id=$(az redis show --name app-redis --query "id" -o tsv)
az network private-endpoint create \
--name app-redis \
--vnet-name app-network \
--subnet private-endpoints \
--private-connection-resource-id "$redis_id" \
--group-id redisCache \
--connection-name app-redis
redis_interface_id=$(az network private-endpoint show --name app-redis --query 'networkInterfaces[0].id' -o tsv)
redis_interface_ip=$(az resource show --ids "$redis_interface_id" --query "properties.ipConfigurations[0].properties.privateIPAddress" -o tsv)
az network private-dns record-set a create \
--name app-redis \
--zone-name "privatelink.redis.cache.windows.net"
az network private-dns record-set a add-record \
--record-set-name app-redis \
--zone-name "privatelink.redis.cache.windows.net" \
--ipv4-address $redis_interface_ip

Cosmos DB with Private Link: I’ve saved this for last because it’s a touch more complicated than the last two. Cosmos tends to have more than one IP address, so you’ll need to do a bit more work here. In my case I’m using uksouth as the region and I get a second IP specific to the region:

az cosmosdb create \
--name app-mongo \
--kind MongoDB \
--server-version 3.6
mongo_id=$(az cosmosdb show --name app-mongo --query "id" -o tsv)
az network private-endpoint create \
--name app-mongo \
--vnet-name app-network \
--subnet private-endpoints \
--private-connection-resource-id "$mongo_id" \
--group-id MongoDB \
--connection-name app-mongo
mongo_interface_id=$(az network private-endpoint show --name app-mongo --query 'networkInterfaces[0].id' -o tsv)
mongo_interface_ip=$(az resource show --ids "$mongo_interface_id" --query "properties.ipConfigurations[0].properties.privateIPAddress" -o tsv)
az network private-dns record-set a create \
--name app-mongo \
--zone-name "privatelink.mongo.cosmos.azure.com"
az network private-dns record-set a add-record \
--record-set-name app-mongo \
--zone-name "privatelink.mongo.cosmos.azure.com" \
--ipv4-address $mongo_interface_ip
mongo_uksouth_interface_ip=$(az resource show --ids "$mongo_interface_id" --query "properties.ipConfigurations[1].properties.privateIPAddress" -o tsv)
az network private-dns record-set a create \
--name app-mongo-uksouth \
--zone-name "privatelink.mongo.cosmos.azure.com"
az network private-dns record-set a add-record \
--record-set-name app-mongo-uksouth \
--zone-name "privatelink.mongo.cosmos.azure.com" \
--ipv4-address $mongo_uksouth_interface_ip

Phew. It’s a mouthful, but we’re getting there. Hopefully you’re still with me.

App services: for this architecture, I’m using App Service as a stand-in for a container PaaS by creating one instance per microservice and running a container on each instance. We’ll create two instances for this example, a front-end UX and a back-end API. We’ll also now configure the instances to be able to look up addresses in the Private DNS zone.

We start with an App Service Plan. We’re using sku P1v2 here as it gives us deployment slots, which I’m using for rolling deployment of microservice updates from CI/CD. Cheaper plans are available that don’t offer this capability:

az appservice plan create \
--name app-plan \
--is-linux \
--sku P1V2

Next we need a container registry. We’ll go with ACR (other registries are available). Note that the registry name needs to be globally unique as it’ll be part of the public DNS name of your registry. The name can only contain alphanumeric characters, no dashes allowed here. If you want to use the continuous deployment feature in App Service, you’ll need to enable the admin user; if not, you can leave that flag out:

az acr create \
--name appacr \
--sku Standard \
--admin-enabled true

Ok, now we’re ready to create our “microservice runners”, aka App Service instances. Again, these names need to be globally unique as they’ll be part of the public DNS name of the instance. Here the names can contain dashes as well as alphanumeric characters. Standardisation is beautiful. You’ll need to decide on container naming and tagging conventions, but it is possible to create an app service before a container of that name exists in the registry, so we can go ahead:

az webapp create \
--name app-ux \
--plan app-plan \
--deployment-container-image-name appacr.azurecr.io/app-ux:latest
az webapp create \
--name app-api \
--plan app-plan \
--deployment-container-image-name appacr.azurecr.io/app-api:latest

Now we have two microservice container runners, we need to route outbound traffic via our egress subnet so that we’ll be able to connect to our backing services. Remember that, because these are both on the same App Service plan, Azure requires that they both specify the same egress subnet:

az webapp vnet-integration add \
--name app-ux \
--vnet app-network \
--subnet appservice-plan-egress
az webapp vnet-integration add \
--name app-api \
--vnet app-network \
--subnet appservice-plan-egress
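
A quick sanity check (optional, and purely illustrative) is to list the integrations and confirm the egress subnet shows up for each app:

az webapp vnet-integration list --name app-ux -o table
az webapp vnet-integration list --name app-api -o table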

If you’re planning on putting your App Service instances behind Private Endpoints (see below on how to configure ingress for those, because they’ll no longer be visible to the internet), here’s a condensed version. It’s similar to the process for Cosmos, in that there are two DNS records, but in this case there’s only one private IP. Rinse and repeat for each App Service instance:

app_service_id=$(az webapp show --name app-ux --query "id" -o tsv)
az network private-endpoint create \
--name app-ux \
--vnet-name app-network \
--subnet private-endpoints \
--private-connection-resource-id "$app_service_id" \
--group-id sites \
--connection-name app-ux
interface_id=$(az network private-endpoint show --name app-ux --query 'networkInterfaces[0].id' -o tsv)
interface_ip=$(az resource show --ids "$interface_id" --query "properties.ipConfigurations[0].properties.privateIPAddress" -o tsv)
az network private-dns record-set a create \
--name app-ux \
--zone-name "privatelink.azurewebsites.net"
az network private-dns record-set a add-record \
--record-set-name app-ux \
--zone-name "privatelink.azurewebsites.net" \
--ipv4-address $interface_ip
az network private-dns record-set a create \
--name app-ux.scm \
--zone-name "privatelink.azurewebsites.net"
az network private-dns record-set a add-record \
--record-set-name app-ux.scm \
--zone-name "privatelink.azurewebsites.net" \
--ipv4-address $interface_ip

App Service DNS server: Now for the “magic” step. Currently App Service will still resolve the public IP addresses of your backing services (as well as any App Service instances you’ve just put behind Private Endpoints). To change this behaviour, we have to add two configuration values to our app service instances:

az webapp config appsettings set \
--name app-ux \
--settings \
WEBSITE_DNS_SERVER="168.63.129.16" \
WEBSITE_VNET_ROUTE_ALL=1
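
The same pair of settings needs to go on every instance that talks to a private endpoint, so here’s the equivalent for the API app:

az webapp config appsettings set \
--name app-api \
--settings \
WEBSITE_DNS_SERVER="168.63.129.16" \
WEBSITE_VNET_ROUTE_ALL=1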

With those now in place, App Service should be able to resolve the IP addresses of your private endpoints — mystery solved.

A note on ingress

The question you may be asking is, “if I make my App Services private, how can I route traffic to them?” — excellent question. The answer to ingress is not entirely straightforward. There are two main candidates if you want to put additional protection in front of your apps: Application Gateway and Azure Front Door.

Application Gateway has the advantage of being able to deploy into a dedicated subnet in your vnet. That means it can see your private DNS zone and route traffic to a privatised App Service. The downside is it won’t manage its own SSL certificate. It doesn’t take many unexpected production outages caused by an expired cert before you swear you never want that to be a manual process ever again.

Enter Azure Front Door. Here certificates can be automatically managed. Great. So what’s the catch? Well, you can only forward traffic from Front Door to a public IP address. What’s a cloud architect to do if the IP you need to forward to is private?

The least worst solution I’ve found is to use both services. You route Front Door traffic to an Application Gateway that has a public IP address. You then route traffic from the gateway to your Private Endpoint. You’re effectively “double-fronting” App Service (and there are cost implications) but it does mean you can combine automated cert management with routing to Private Endpoints, at a trade-off of some duplicate capability.

You’ll probably want to associate a Network Security Group with the dedicated subnet you deploy the gateway into so that you can restrict inbound traffic to the gateway. Without this, traffic could simply circumvent your front door and hit the gateway directly, making Front Door little more than an expensive way to rotate an SSL certificate. There’s a handy AzureFrontDoor.Backend service tag you can use here, which you can’t use if you try to restrict inbound traffic on App Service, meaning you’d need to hard-code the current Front Door back-end IP range.
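
Here’s a rough sketch of what that NSG could look like (the NSG and subnet names are hypothetical, and note that Application Gateway v2 also needs the GatewayManager ports left open or the gateway itself will break):

az network nsg create --name appgw-nsg
az network nsg rule create \
--nsg-name appgw-nsg \
--name allow-frontdoor \
--priority 100 \
--direction Inbound \
--access Allow \
--protocol Tcp \
--source-address-prefixes AzureFrontDoor.Backend \
--destination-port-ranges 443
az network nsg rule create \
--nsg-name appgw-nsg \
--name allow-gateway-manager \
--priority 110 \
--direction Inbound \
--access Allow \
--protocol Tcp \
--source-address-prefixes GatewayManager \
--destination-port-ranges 65200-65535
# "appgateway" is a hypothetical dedicated subnet for the gateway in app-network
az network vnet subnet update \
--name appgateway \
--vnet-name app-network \
--network-security-group appgw-nsg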

You may want to go further, as mentioned in the docs above on restricting traffic to App Service, and figure out how to only allow traffic from your specific Front Door instance. In my case it was a trade-off between ease of infrastructure-as-code automation and how much security was enough for my use-case.

Easy right?

Easy when you know how, and I’m sure you’ll find some gotchas I haven’t spotted (feel free to let me know). This does seem remarkably complicated for now, but no doubt it will change in time.

There’s a whole blog-post’s worth of steps that could be written on how to configure ingress via Azure Front Door and Application Gateway, and another about CI/CD and rolling deployment of containers to App Service for a fully-serverless microservice architecture. Give me a shout (or some claps) if you’d like me to publish those.

If you’d like to hire my team to help build a secure serverless infrastructure and CI/CD microservice delivery pipeline with you, we’d be delighted to get involved.

For now, I hope this helps fellow engineers along the road to building more cloud-appropriate infrastructures in Azure.


David Carboni

Hands-on culture and technology. Work hard, be kind. CTO at Policy in Practice (https://policyinpractice.co.uk)