Is production VPC access via VPN an anti-pattern?

Given the recent Uber hack, I’ve been wondering about this. I’ve only worked at small shops so I’d like people with experience at medium to large companies to chime in as well.

At every place I’ve been, we could VPN into the production VPC, and while that’s handy for testing hot fixes on production bugs, I’m beginning to wonder if it’s really such a good idea.

If it isn’t, what’s the alternative? Just wait to find out that things don’t work after you’ve merged? Or a hybrid where you have VPN access in lower environments (dev, staging), but not in prod?

At the very least, it might be a good time to double-check that the VPN uses a login method that is resistant to phishing.

Having more direct access to prod (and not via several layers of CDNs and load balancers) can help troubleshooting quite a bit, but it doesn’t have to be two-way…

Treat it like another entry point to the production environment, not like a “datacenter next door” that you can easily connect to and from. Make it strictly one-way, like the normal access path would be, and keep your build and development systems separate.
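To make the "strictly one-way" idea concrete, here is a minimal sketch of a check that flags any rule letting production open connections back toward the VPN client range. The `Rule` model, the CIDR ranges, and the ports are all made up for illustration; a real setup would read these from your security groups or firewall config.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class Rule:
    """Hypothetical firewall rule, seen from the production network."""
    direction: str  # "ingress" (into prod) or "egress" (initiated by prod)
    cidr: str       # remote network the rule applies to
    port: int


def reverse_paths(rules, vpn_cidr):
    """Return egress rules that let prod initiate connections into the VPN range."""
    vpn = ip_network(vpn_cidr)
    return [
        r for r in rules
        if r.direction == "egress" and ip_network(r.cidr).overlaps(vpn)
    ]


# Example rule set (all values are assumptions for the sake of the sketch).
RULES = [
    Rule("ingress", "10.8.0.0/24", 22),   # VPN clients may SSH into prod
    Rule("egress", "192.0.2.0/24", 443),  # prod may call an external API
    Rule("egress", "10.8.0.0/16", 445),   # prod can reach desktops: not one-way
]

for rule in reverse_paths(RULES, "10.8.0.0/24"):
    print("two-way path:", rule)
```

This only looks at who may initiate a connection. Stateful firewalls still allow the replies to an inbound session, which is fine for the one-way model.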

So, no, having a VPN to your production system isn’t an anti-pattern per se. Having a totally open network connection from your production systems to your desktop system is, and has been since at least the 1990s :smiley:

The orgs where I’ve worked generally consider this a bad idea. It’s not an anti-pattern per se, but it generally encourages bad security hygiene, so it’s better to restrict VPN access to lower environments. From a DevOps standpoint, I feel it makes me think of automation less if I know I can just log in and fix things. For prod, at the very least consider using a jump/bastion server instead of logging in from your desktop directly.

I’ve set up Teleport at my last two jobs and it’s worked well. I feel it’s more secure because it forces you to be intentional about the connections you create. In my opinion, it’s better than a VPN.

In our environment, we have VPN access to dev, but not to staging or prod. Merge and test in dev, then promote.

We just use a bastion host and SSH. We don’t need it very frequently since Datadog gives us so much info. Terraform puts the private keys in Parameter Store for us. We log into AWS with SSO through SAML.

Love to plug my own product, but check out netfoundry.io. It’s a Zero Trust VPN replacement.

I am starting to suggest that we maybe should have separate computers for access to prod at work (also through a VPN which we already do).

A developer machine typically has to run so much untrusted stuff due to all the package managers and tools that it’s not really a good system to also have any kind of direct access into the most important technical infrastructure.

I am not really looking forward to potentially lug around 2 laptops when I’m on call so maybe something has to be worked out there.

Having your VPN connect to your production VPC with no access control in between is a bad idea. You should at least have a bastion host in between: the VPN connects to your bastion, and then the bastion (sometimes called a jumpbox) lets you “jump” into your production VPC. It’s far more secure than just a VPN connection constantly open to a VPC.
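As a rough illustration of the "jump", OpenSSH can hop through a bastion with its `-J` (ProxyJump) option. The sketch below only builds the command line; the user name, bastion hostname, and target address are placeholders, not anything from a real environment.

```python
import shlex


def ssh_via_bastion(user, bastion, target):
    """Build an OpenSSH command that reaches `target` through `bastion` (-J is ProxyJump)."""
    return ["ssh", "-J", f"{user}@{bastion}", f"{user}@{target}"]


# Placeholder values for illustration only.
cmd = ssh_via_bastion("alice", "bastion.example.com", "10.0.12.34")
print(shlex.join(cmd))
```

The production host then only needs to accept SSH from the bastion's address, which is where the access control in between comes from.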

However, I would say that even that is a bare minimum and in the past few years has started to go out of fashion.

A better way to do it is a tool like HashiCorp Boundary or Teleport. These offer fine-grained control over hosts, and all credentials are short-lived and auto-rotated. It’s also much easier to “kill switch” an employee, and you get much better auditing of, and control over, what users can do. It’s also really nice if a developer needs access to a server to make one change: you can grant them one hour of access to a single server.
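The "one hour on a single server" idea boils down to a grant that names one user, one host, and an expiry. This is a toy model to show the logic, not the Boundary or Teleport API; every name in it is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """Hypothetical time-boxed access grant for one user on one host."""
    user: str
    host: str
    expires_at: datetime


def issue_grant(user, host, now, ttl=timedelta(hours=1)):
    """Create a grant that expires `ttl` after `now`."""
    return Grant(user, host, now + ttl)


def is_allowed(grant, user, host, now):
    """Access is allowed only for the named user and host, and only before expiry."""
    return grant.user == user and grant.host == host and now < grant.expires_at


start = datetime(2022, 9, 20, 12, 0, tzinfo=timezone.utc)
grant = issue_grant("alice", "web-01", start)

print(is_allowed(grant, "alice", "web-01", start + timedelta(minutes=30)))
print(is_allowed(grant, "alice", "web-01", start + timedelta(hours=2)))
print(is_allowed(grant, "alice", "db-01", start + timedelta(minutes=30)))
```

Because the grant expires on its own, there is nothing to clean up afterwards, and revoking someone early just means dropping their outstanding grants.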

The absolute best method is to go to 100% immutable servers with no VPC access at all. You want a very well-tuned deployment and imaging process (with something like Packer) for this to work well. But basically you never SSH into a production server. If a server breaks, you pull down an image of it into a dev VPC, fix the image (change settings or whatever), and then re-image the machine from the dev version and redeploy it into production. You never actually edit a live production server, so production VPC access is unnecessary and you can lock down your production VPC to everyone. Now your main attack surface is your pipelines and the access control you have there.

Teleport is an interesting option in this space. Teleport sets up the connection based on user role using short-lived certs. It has its own CLI, and the web portal can also proxy web pages.

Why is the term anti-pattern even being used here? To me, this is a ridiculous question far too broad for anyone to reasonably answer. It depends on what data you have in production and who you grant VPN access to as well as technical questions like what VPN you’re using, are your systems up-to-date etc.

Implement a bastion host or VPN service on your private cloud. Provide access-level control for every user, and also restrict remote root-level access on production servers.

Can’t you use a bastion host to access it? Possibly using SSM, so you don’t need to expose any port externally and your access is tied only to your AWS account permissions.
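For anyone who hasn't used it, an SSM session is started from the AWS CLI with `aws ssm start-session`, which also needs the Session Manager plugin installed locally. Here is a small sketch that builds the commands for a shell session and for a port forward. The instance ID and port numbers are placeholders.

```python
import json
import shlex


def ssm_shell(instance_id):
    """Command for an interactive shell on the instance via Session Manager."""
    return ["aws", "ssm", "start-session", "--target", instance_id]


def ssm_port_forward(instance_id, remote_port, local_port):
    """Command that forwards a local port to a port on the instance, with no inbound port open."""
    params = {"portNumber": [str(remote_port)], "localPortNumber": [str(local_port)]}
    return [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", "AWS-StartPortForwardingSession",
        "--parameters", json.dumps(params),
    ]


# Placeholder instance ID and ports for illustration only.
print(shlex.join(ssm_shell("i-0123456789abcdef0")))
print(shlex.join(ssm_port_forward("i-0123456789abcdef0", 5432, 15432)))
```

Who may run these is then decided by IAM permissions on the SSM actions rather than by network reachability.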

Honestly, the takeaway here is more “don’t leave creds lying around in scripts” and “better quarantining process”, plus probably quite a bit of “automated and efficient cred roll mechanisms” too. Compromises are a given because people with access will always make mistakes; how you minimize those mistakes is key.
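On the "creds lying around in scripts" point, even a crude scan catches the obvious cases. The sketch below is a heuristic of my own, not a real scanner: it flags lines where a name containing a word like "password" or "token" is assigned a quoted literal. The sample script is invented.

```python
import re

# Heuristic: a name containing a secret-like word, assigned a quoted literal.
SUSPECT = re.compile(
    r"(\w*(?:password|passwd|secret|api_?key|token)\w*)"  # variable or key name
    r"\s*[:=]\s*"                                          # assignment
    r"[\"']([^\"']+)[\"']",                                # quoted literal value
    re.IGNORECASE,
)


def find_hardcoded_secrets(text):
    """Return (line number, name) for each line that looks like a hardcoded secret."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = SUSPECT.search(line)
        if match:
            hits.append((lineno, match.group(1)))
    return hits


# Invented PowerShell-style sample for illustration.
SAMPLE = '''$server = "db01"
$adminPassword = "hunter2"
$token = $env:API_TOKEN
'''

print(find_hardcoded_secrets(SAMPLE))
```

Line 3 is not flagged because the value comes from the environment instead of a literal, which is the pattern you want. Dedicated secret scanners go much further (entropy checks, known key formats, git history), so treat this as a starting point only.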

Many folks have a ‘bastion’ for just this purpose.

oauth2-proxy for anything that is a web service and can run behind it. A VPN for everything else: SSH, databases, and web services that are part of AWS. It doesn’t matter what environment. Just divide production and daily development environments into different cloud accounts.

AWS Session Manager or similar, to avoid a bastion and multi-host hops.

I think you are looking at the wrong lesson (as I understand it, the issue was a password in a PowerShell script). Must do:

  • MFA on every account that can access your cloud environment from outside.
  • MFA on every admin account, wherever it accesses your services from.
  • Have secrets management for automated access.
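For the MFA items on AWS specifically, one common way to enforce this is an IAM policy that denies requests made without MFA, using the `aws:MultiFactorAuthPresent` condition key. The sketch below just builds that policy document as JSON. It is deliberately blunt: with `"Action": "*"` a user cannot enroll their own MFA device, so in practice you would carve out the MFA-setup actions.

```python
import json


def deny_without_mfa_policy():
    """Build an IAM policy document that denies every action when MFA is not present."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAllWithoutMFA",
                "Effect": "Deny",
                "Action": "*",  # assumption: no carve-out for MFA self-enrollment
                "Resource": "*",
                # BoolIfExists also denies when the key is missing from the
                # request, e.g. calls made with long-term access keys.
                "Condition": {
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            }
        ],
    }


print(json.dumps(deny_without_mfa_policy(), indent=2))
```

Automated access should not be forced through this path. That is what the third item is for: give pipelines their own roles and pull credentials from a secrets manager at run time.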

Social engineering and network access are pretty much inevitable. Limiting the blast radius is the most important job for sec/infra/DevOps/SRE people.

I’ve only worked two jobs. The first was a mess of VPNs and “jumpboxes”, with people using shared passwords for some of them.

At the one I’m at today there are three levels of VPNs (using some kind of SSO, because we just use our company account). By default you get access to the dev environment VPN. Then you can request access to the live VPNs: one is for most systems and another is for systems with user accounts and credit card data.