Get the STS tokens / access keys and then use them from anywhere (Tor/proxy/privacy vpn/non-extradition country) because 99.9% of cloud customers are not applying any SourceIP controls in their IAM policies.
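For anyone wondering what a SourceIP control looks like in practice, here is a minimal sketch that builds a deny-by-default IAM policy in Python. The CIDR ranges and the function name are made up for illustration; `aws:SourceIp`, `NotIpAddress` and `aws:ViaAWSService` are standard IAM condition elements. As far as I know, `aws:SourceIp` is not populated for requests that arrive through a VPC endpoint, so treat this as a starting point rather than a complete control.

```python
import json

# Hypothetical office/VPN egress ranges (documentation-only addresses).
TRUSTED_CIDRS = ["203.0.113.0/24", "198.51.100.7/32"]


def deny_outside_trusted_ips(cidrs):
    """Build an IAM policy that denies every action from untrusted source IPs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideTrustedIPs",
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    # Deny when the caller's IP is NOT in the trusted list...
                    "NotIpAddress": {"aws:SourceIp": list(cidrs)},
                    # ...unless an AWS service is calling on the principal's behalf.
                    "Bool": {"aws:ViaAWSService": "false"},
                },
            }
        ],
    }


policy = deny_outside_trusted_ips(TRUSTED_CIDRS)
print(json.dumps(policy, indent=2))
```

With something like this attached, stolen keys replayed from Tor or a random VPN exit would hit the explicit deny, assuming the attacker is not already inside one of the trusted ranges.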
For me the goal should be to not access prod directly. If you found something broken, it should be fixed by making another PR and having the pipeline deploy the fix into prod.
But I understand that not all systems work that way. Maybe you do need to access those systems; in that case I would go with something like SSM, where you use your IAM role to access the instance, maybe even with a pipeline and an approval step to enable the access on demand if you are really into security.
I just think that you have more control that way.
You can’t fix human stupidity. If you had a bastion box or some other means of access, the same Uber shit would have gone down.
The real way to mostly fix this issue is having 2 accounts and 2 computers. 1 computer is for internet access and your day to day. The other machine is where all the admin things happen and is not internet connected. And no, a VDI box doesn’t suffice. It needs to be a dedicated physical machine that is gapped from everything else. If you RDP to a jump box or connect to some other VDI system from the one internet-connected computer, you’re still giving that social engineer a way to fuck you. That is why you need that separate machine.
What about using zero-trust solutions like Teleport or HashiCorp Boundary?
Partially depends on:
- High level, is valuable or sensitive data potentially exposed via your prod VPC? If the VPC is taken out of service, are your customers or internal stakeholders f’d (or is there more resiliency)?
- What else has access to your prod VPC - if you look at the inbound firewall rules, are there admins, CI/CD systems, management/visibility systems, user access, inter-VPC workloads, etc?
In your opinion are 2FA apps good enough or is a security key the only good way?
I’m unfamiliar with the ‘one-way’ term, so thanks for mentioning it. Got some googling to do…
Agreed, I’ve definitely seen the convenience make people do one-off fixes and forget to enshrine said fix in code.
if code worked in development and staging, it works in prod.
I can’t imagine a reality where this is true… The only way to confirm code works in prod is to observe it working in prod. Even that only guarantees it worked in prod when you observed it.
if code worked in development and staging, it works in prod.
LOL
Ah, would be really cool to work somewhere like this, lucky you!
Same here.
If we really need SSH access to troubleshoot something that badly, we have an “in case of fire, break glass” procedure where we whitelist an engineer’s IP for SSH into a Bastion host, and whitelist the Bastion host for SSH into everything else.
Any engineer who would need this level of access for troubleshooting would also have the ability and access to change the security groups.
We considered SSM, but it opens up another avenue of attack for something we only need once or twice a year. I think we only needed production SSH once in the last 1.5 years.
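A rough sketch of the "whitelist one engineer's IP for SSH" step from the break-glass procedure above, in case it helps someone. The function name and the example address are hypothetical. The dict follows the shape that boto3's EC2 `authorize_security_group_ingress` accepts in its `IpPermissions` list, and the same dict can be passed to `revoke_security_group_ingress` to close the hole afterwards. The actual AWS call is left out so this runs without credentials.

```python
import ipaddress


def ssh_ingress_rule(source_ip, reason):
    """Return one IpPermissions entry allowing SSH from a single IPv4 host."""
    host = ipaddress.ip_address(source_ip)  # raises ValueError on bad input
    if host.version != 4:
        raise ValueError("this sketch only handles IPv4")
    return {
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        # /32 so only that one address is allowed, never a whole range.
        "IpRanges": [{"CidrIp": f"{host}/32", "Description": reason}],
    }


# Hypothetical engineer IP (documentation-only address) and reason.
rule = ssh_ingress_rule("203.0.113.45", "break-glass: hypothetical incident")
print(rule)
```

Putting the reason in the rule description gives you something to audit later, which is arguably the point of having a break-glass step at all.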
Can you give us some examples of the extensive observability systems? I am trying to build something similar and would love to know what you did.
I tried Teleport. The setup is clunky, buggy, and not very flexible. We are using a combination of oauth2-proxy and a VPN.
It is probably better if you pay for Teleport. I only tried the open source version.
Open source Teleport just isn’t very usable, and is buggy. I wouldn’t recommend it.
100% immutable is a recipe for lack of life/work balance. I can’t take the server down without disrupting people during business hours, but I can’t make any change without taking it down. I consider it a very short-sighted model.
No SSH or equivalent is just not practical 100% of the time. Sometimes you need to debug production. Just how easy Kubernetes makes getting into the containers is one of the things I love about it.
I like this idea. I’ll bring it up with the team, and yes, SSM makes this so much easier these days.
VPNs are much more versatile than bastions. They are also better for less technical employees.
The 2FA apps still fall a bit short, in my mind, as an attacker can still ask for the current code on the phishing site and use it immediately.
I think it’s not fair to expect a human to make the correct choices 100% of the time. Even experts make mistakes. The security key is the only way, that I’m aware of, that can prevent mistakes.
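To make the relay problem concrete, here is a small TOTP sketch (RFC 6238 with HMAC-SHA1, standard library only). The secret is the RFC's published test value and the `server_accepts` helper is made up for illustration. The thing to notice is that the server only ever sees six digits; nothing in the code says which site the user typed them into.

```python
import hashlib
import hmac
import struct


def totp(secret, unix_time, step=30, digits=6):
    """RFC 6238 TOTP using HMAC-SHA1."""
    counter = int(unix_time) // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def server_accepts(secret, submitted_code, unix_time):
    # Hypothetical server check: compares the code for the current window only.
    return hmac.compare_digest(totp(secret, unix_time), submitted_code)


SECRET = b"12345678901234567890"  # RFC 6238 test secret, not a real one

# The victim types the current code into a phishing page...
code_typed_on_phishing_site = totp(SECRET, 1_700_000_000)
# ...and the attacker replays it to the real site 5 seconds later,
# still inside the same 30-second window.
print(server_accepts(SECRET, code_typed_on_phishing_site, 1_700_000_005))  # True
```

A security key avoids this because the browser includes the site's origin in what gets signed, so a response produced for the phishing domain does not verify on the real one.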
I’m probably a bit weird, but I wouldn’t choose to have my 2FA solution dependent on the same device that also has my password manager installed.
Just like a one-way street.
YOU should be able to access your services, but nobody who’s broken into one of your servers should be able to get back into the office…
Without additional configuration, a VPN is open in both directions. Adding NAT will implicitly limit traffic, but what you really want is an explicit (and because of that, auditable) set of filter rules that allows certain services in prod to be reached from the office and lets nothing but established connections return to the office…
When they say “one-way” they’re talking about the ability to establish new connections.
Production environments have no business establishing connections to systems on campus/VPN networks, but engineers might need to connect to production environments to perform a number of operations tasks. Not defending prod access, but in my 20 years I’ve never seen a totally isolated prod environment.
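If it helps to see the "one-way" idea as logic rather than words, here is a toy stateful filter in Python. The network ranges, the port list and the class name are all hypothetical, and it tracks flows by address and port only, which is far simpler than what a real firewall does with TCP state. It is only meant to show the rule shape: new connections office to prod on listed services, replies back, nothing initiated from prod.

```python
import ipaddress

OFFICE = ipaddress.ip_network("10.1.0.0/16")  # hypothetical office/VPN range
PROD = ipaddress.ip_network("10.2.0.0/16")    # hypothetical prod VPC range
ALLOWED_PROD_PORTS = {22, 443}                # hypothetical allowed services


class OneWayFilter:
    def __init__(self):
        self.established = set()  # flows opened from the office side

    def allow(self, src, sport, dst, dport):
        s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
        # New connections: office -> prod, explicitly listed services only.
        if s in OFFICE and d in PROD and dport in ALLOWED_PROD_PORTS:
            self.established.add((src, sport, dst, dport))
            return True
        # Return traffic: prod -> office only if it mirrors a tracked flow.
        if s in PROD and d in OFFICE:
            return (dst, dport, src, sport) in self.established
        return False


fw = OneWayFilter()
print(fw.allow("10.1.0.5", 50000, "10.2.0.9", 22))    # True: office opens SSH
print(fw.allow("10.2.0.9", 22, "10.1.0.5", 50000))    # True: reply on that flow
print(fw.allow("10.2.0.9", 40000, "10.1.0.5", 3389))  # False: prod-initiated
```

The explicit allow list is what makes it auditable: anything not in `ALLOWED_PROD_PORTS` is dropped by default instead of being implicitly blocked by NAT.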