How the “probably biggest data leak in the history of Germany” could have been prevented

(Find the articles from DIE ZEIT and c’t)

Simply entering the correct IP address into the Windows File Explorer was enough to download the data.
The “probably biggest data leak in the history of Germany” includes the personal information of 3 million people. The car rental company did not react to the first two emails pointing out the issue; only after the two media companies got involved did it disable public access to the data.
This massive data leak happened on an on-premises architecture. But regardless of the platform you use, you must take every step necessary to prevent such a catastrophic failure. Had it been on AWS, however, several readily available security tools could have helped to keep us from becoming the victim of the next big leak:

Issue #1: open port on public IP address

In professional setups, systems which don’t provide public services are usually hosted on private RFC 1918 IP addresses. They are only reachable via secure VPN networks. If services need to be publicly available we use firewalls, but in any big company or environment we frequently run into the situation that the firewall configuration (Security Groups on AWS) grows rapidly in complexity. Good standards and practices should be implemented to minimize this problem, but since we’re only human in the end, we want to make sure that we did not miss anything. Two possible solutions address this problem from different directions.

The first is automation and templating in the form of AWS CloudFormation (or Terraform), with which we can aim to prevent any malformed Security Groups from being created in the first place. In particular, CloudFormation stacks let you bundle predefined sets of resources for use across your organization. These can capture your best knowledge and configuration, hardened by your security specialists, and make it easily available to everyone in your company. In an ideal world, this would prevent an “open port situation” from happening in the first place, but enforcing it can lead to a quite restrictive environment that limits your agility, so it must be carefully balanced.

The second possibility is to continuously scan your environment for malformed resources, without limiting your developers to a small set of predefined resources. This is the prime use case of AWS Config. An ever-growing set of predefined rules, as well as your own rules, is used to ensure that your environment complies with your very own standards. One example would be to never allow the SMB port 445 to be publicly available. AWS Config then reports any violations, so you can be sure you are still secure even if you refactor your whole firewall configuration.
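To illustrate the kind of check such a rule performs, here is a minimal sketch in Python. It is not the AWS Config implementation. It assumes the security groups are already available as dictionaries shaped like the EC2 "describe security groups" API response, and the function names and group IDs are made up for this example.

```python
SMB_PORT = 445
WORLD_CIDRS = {"0.0.0.0/0", "::/0"}  # "anyone on the internet" for IPv4 and IPv6


def rule_exposes_port(permission, port=SMB_PORT):
    """Return True if one inbound rule opens `port` to the whole internet."""
    protocol = permission.get("IpProtocol")
    if protocol == "-1":
        # "-1" means all protocols and all ports.
        covers_port = True
    elif protocol == "tcp":
        covers_port = permission.get("FromPort", 0) <= port <= permission.get("ToPort", 65535)
    else:
        covers_port = False
    if not covers_port:
        return False

    cidrs = [r.get("CidrIp") for r in permission.get("IpRanges", [])]
    cidrs += [r.get("CidrIpv6") for r in permission.get("Ipv6Ranges", [])]
    return any(cidr in WORLD_CIDRS for cidr in cidrs)


def find_exposed_groups(security_groups, port=SMB_PORT):
    """Return the IDs of all groups with at least one world-open rule for `port`."""
    return [
        group["GroupId"]
        for group in security_groups
        if any(rule_exposes_port(p, port) for p in group.get("IpPermissions", []))
    ]
```

Run on a schedule or on every configuration change, a check like this turns "we think port 445 is closed" into something that is verified continuously.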

Issue #2: no authentication

It is possible to configure an SMB file share without any authentication at all, which was the case in this leak. It goes without saying that storing unencrypted backups without any authentication is a very bad idea. In the AWS ecosystem, it would be even easier to just use the secure-by-default automatic backups from Amazon RDS. We could also save the backups in a private S3 bucket, authenticated for example via IAM roles attached to the EC2 instances. Assigning roles to your application even lets you do without static security credentials, a well-known attack vector that should be avoided where possible. IAM roles provide the service with temporary credentials, which are only available to your application instance and are rotated frequently.
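Whether a backup bucket really is private can be verified rather than assumed. The following sketch checks the four S3 "Block Public Access" settings. It assumes the configuration has already been fetched and is passed in as a plain dictionary using the setting names from the S3 API; the function name is hypothetical.

```python
# The four S3 Block Public Access settings; a private backup bucket
# would normally have all of them enabled.
REQUIRED_SETTINGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_public_access_blocks(config):
    """Return the names of all settings that are absent or disabled."""
    return [name for name in REQUIRED_SETTINGS if not config.get(name, False)]
```

An empty result means every setting is enabled; anything else names exactly what still has to be switched on.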

Issue #3: not using encryption

AWS offers encryption in nearly all of its services. In a basic setting, this is often as easy as checking a box. There is hardly any excuse not to use encryption, both at rest and in transit, in a cloud-native environment. In reality, the encryption itself is often not the actual problem. Regardless of cloud or on-premises setups, the biggest challenge is often how to secure and manage the keys themselves. In a cloud environment in particular, we have to account for a highly automated and dynamic environment, which only makes this issue harder to solve. Luckily, AWS offers the “Key Management Service” (KMS) to ease the use and control of encryption keys. With KMS it is very simple to govern a large set of dedicated encryption keys and to integrate them seamlessly into your applications and workflows. One interesting aspect is that it never allows the extraction of the key itself. This might sound counterintuitive, but it has a very interesting consequence: since no one is able to extract the encryption key, all encryption and decryption is guarded by the KMS service. This makes it possible to define a very fine-grained set of permissions. In the case of a dedicated backup, it would for example allow a user or system to encrypt the backup without being able to decrypt it after it has been stored.
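The encrypt-only idea can be sketched as an IAM policy document built in Python. The key ARN is a placeholder and the function names are hypothetical. The small `is_allowed` helper is a deliberately simplified stand-in for IAM policy evaluation: it only handles exact action names and the rule that an explicit deny always wins.

```python
def backup_writer_policy(key_arn):
    """Build an IAM policy that lets a backup job encrypt but never decrypt."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowEncryptOnly",
                "Effect": "Allow",
                # GenerateDataKey is included because encrypting large
                # files usually relies on a data key (envelope encryption).
                "Action": ["kms:Encrypt", "kms:GenerateDataKey"],
                "Resource": key_arn,
            },
            {
                "Sid": "DenyDecrypt",
                "Effect": "Deny",
                "Action": ["kms:Decrypt"],
                "Resource": key_arn,
            },
        ],
    }


def is_allowed(policy, action):
    """Simplified evaluation: explicit deny wins, otherwise an allow is required."""
    allowed = False
    for statement in policy["Statement"]:
        if action in statement["Action"]:
            if statement["Effect"] == "Deny":
                return False
            allowed = True
    return allowed
```

A separate, tightly controlled restore role would then be the only principal that is granted `kms:Decrypt` on the same key.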

Issue #4: automatic classification/identification of personal data

Since May 2018, the European Union has regulated the handling of personal data through the “General Data Protection Regulation” (GDPR, in German DSGVO). Every business has to go to great lengths to protect the personal data of its clients. With Amazon Macie, AWS offers a service that uses machine learning to automatically discover, classify and protect sensitive data in AWS. It recognizes sensitive data such as personally identifiable information and provides dashboards and alerts that give visibility into how this data is being accessed or moved. The service continuously monitors data access activity for anomalies, and generates detailed alerts when it detects a risk of unauthorized access or inadvertent data leaks.

Issue #5: missing IPS/IDS

In addition to all the security measures mentioned above, one further aspect of security is to actually notice when something has gone wrong. Even if you have failed to close a particular attack vector, you must be notified as fast as possible when any unauthorized access pattern occurs. Waiting for a forensics team to identify your vulnerability might be too late. By getting notified, you can at least shut down the systems being breached to prevent data exfiltration. AWS offers GuardDuty, which combines static analysis with machine learning algorithms to continuously scan your systems for suspicious behaviour and anomalies, as well as for attacks and reconnaissance against your systems. It combines various sources like network flow logs, DNS queries and audit logs to immediately alert and trigger countermeasures.
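How an alerting step might pick out the findings that need immediate attention is sketched below. It assumes the findings have already been retrieved as dictionaries carrying the numeric `Severity` field that GuardDuty attaches to each finding. The threshold of 7.0 and the function name are assumptions for this example, not a recommendation from AWS.

```python
HIGH_SEVERITY = 7.0  # assumed escalation threshold for this sketch


def findings_to_escalate(findings, threshold=HIGH_SEVERITY):
    """Return findings at or above the threshold, most severe first."""
    urgent = [f for f in findings if f.get("Severity", 0) >= threshold]
    return sorted(urgent, key=lambda f: f["Severity"], reverse=True)
```

The result could feed a pager, a chat notification, or an automated response that isolates the affected instance.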

If you want to avoid becoming the next data breach victim in the news, talk to us – we’ll help you to secure your data and the personal information of your clients!