Insights: Capital One data breach


It was recently revealed that Capital One had been breached and that a hacker stole over 100 million credit applications. All information currently available indicates that the root cause of the breach was a misconfiguration, and no platform is immune to misconfiguration.

Therefore, as a Senior GCP Architect, I was interested in how I would respond to customers who are concerned that their GCP deployment is, or could become, vulnerable to a similar breach. The summary and suggestions below focus on security best practices and Google Cloud Platform controls that enterprise customers can use to help protect themselves and their customers.

First, we should review how the breach occurred to better understand which controls might have limited its impact or prevented it. This is our understanding of events based on the court filing. The details are high level, so until more information is revealed, certain assumptions have to be made.

Step 1: The hacker gained access to a virtual machine to which she should not have had access. She gained this access because of a misconfigured firewall rule.

Step 2: The hacker extracted the credentials of a service account, referenced in the report as: *****-WAF-Role account.

Step 3: The hacker used the credentials of this service account to request more information about the environment through API calls such as List Buckets.

Step 4: The hacker used the credentials of this service account to download (steal) the data.

These steps are a very high-level summary of the actions that occurred. A few interesting points from the court filings:

  • “A firewall misconfiguration permitted commands to reach and be executed by that server, which enabled access to folders or buckets of data in Capital One’s storage”
  • “These commands were executed from IP addresses that I believe to be TOR exit nodes. According to Capital One, the ******-WAF-Role account does not, in the ordinary course of business, invoke the List Buckets Command.”
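
The second excerpt names two signals a detection rule could key on: a command the account does not normally invoke, and a source address on a TOR exit list. Below is a minimal Python sketch of that idea, not a description of any real detection product. The record fields (`principal`, `method`, `caller_ip`), the account name, and the example addresses are hypothetical, and the baseline and TOR exit list are assumed to be supplied by the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiCall:
    """Simplified, hypothetical log record; real audit logs carry many more fields."""
    principal: str
    method: str
    caller_ip: str


def flag_suspicious(calls, baseline, tor_exit_ips):
    """Return (call, reasons) pairs for calls that deviate from normal behavior.

    baseline maps a principal to the set of methods it normally invokes.
    tor_exit_ips is a set of IP strings taken from a TOR exit-node list.
    """
    findings = []
    for call in calls:
        reasons = []
        if call.method not in baseline.get(call.principal, set()):
            reasons.append("method not in baseline")
        if call.caller_ip in tor_exit_ips:
            reasons.append("caller is a TOR exit node")
        if reasons:
            findings.append((call, reasons))
    return findings


if __name__ == "__main__":
    # Hypothetical data: the account normally only reads objects.
    baseline = {"waf-role": {"GetObject"}}
    calls = [
        ApiCall("waf-role", "GetObject", "10.0.0.5"),
        ApiCall("waf-role", "ListBuckets", "203.0.113.7"),
    ]
    for call, reasons in flag_suspicious(calls, baseline, {"203.0.113.7"}):
        print(call.principal, call.method, reasons)
```

A rule like this only helps if the baseline is built from a period of known-good activity and the findings are routed to someone who will act on them.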

An interesting takeaway from the report is that this breach does not seem to have been a highly complex hack such as a zero-day exploit. This leaves us asking what actions, if any, could have been taken to reduce the impact of this attack or possibly block it altogether. We first considered a few core security principles:

The principle of least privilege

This principle states that a user or service account should have only the permissions necessary to perform its job or function. The report indicates the compromised service account executed commands which it does not normally execute in the course of daily business. While this does not necessarily mean the account did not require these permissions, it does raise questions. Applying least privilege can be difficult if the necessary permissions are unknown. Sometimes separation of duties will also help reduce the number of permissions required by a single account.
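
One way to make unknown permission needs visible is to compare what an account has been granted with what it actually used over an observation window. The sketch below illustrates only that comparison; it assumes the granted permissions and the observed usage have already been exported (from IAM and from logs, respectively), and the function names and account names are hypothetical.

```python
def unused_permissions(granted, used):
    """Permissions that were granted but never observed in use."""
    return set(granted) - set(used)


def least_privilege_report(grants, usage):
    """Map each account to a sorted list of its unused permissions.

    grants: account -> iterable of granted permissions.
    usage:  account -> iterable of permissions seen in logs during the window.
    Accounts with no unused permissions are left out of the report.
    """
    report = {}
    for account, granted in grants.items():
        extra = unused_permissions(granted, usage.get(account, ()))
        if extra:
            report[account] = sorted(extra)
    return report


if __name__ == "__main__":
    # Hypothetical accounts; the permission strings are examples only.
    grants = {
        "waf-sa": ["storage.objects.get", "storage.buckets.list"],
        "batch-sa": ["storage.objects.get"],
    }
    usage = {
        "waf-sa": ["storage.objects.get"],
        "batch-sa": ["storage.objects.get"],
    }
    print(least_privilege_report(grants, usage))
```

An unused permission is a candidate for removal rather than proof that it is unnecessary: a short observation window can miss monthly or emergency-only operations.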

Logging, monitoring, and alerting

Much of the report indicates logs were being collected and had the necessary information to help with the investigation. This is useful when investigating a breach such as the one which occurred. Monitoring and alerting can help raise the necessary red flags for security to investigate suspicious or anomalous behavior quickly.
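
As an illustration of alerting on anomalous behavior, the sketch below counts events inside a sliding time window and signals when the count exceeds a threshold. It is a simplified assumption of how such a rule could work, not the behavior of any particular monitoring product; the class name, window size, and threshold are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class ReadRateMonitor:
    """Signal when reads inside a sliding time window exceed a threshold."""

    def __init__(self, window_seconds, max_reads):
        self.window_seconds = window_seconds
        self.max_reads = max_reads
        self._timestamps = deque()

    def record(self, timestamp):
        """Record one read at `timestamp` (in seconds).

        Returns True if the number of reads in the current window
        is now above max_reads, meaning an alert should fire.
        """
        self._timestamps.append(timestamp)
        # Drop reads that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_reads


if __name__ == "__main__":
    monitor = ReadRateMonitor(window_seconds=60, max_reads=3)
    for t in [0, 10, 20, 30, 200]:
        print(t, monitor.record(t))
```

In practice the threshold would be derived from the account's normal read rate, so that a sudden bulk download stands out without routine traffic paging the security team.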

Automation

The report indicates the initial breach occurred due to a misconfigured firewall rule. Without more information we don’t know how or why this occurred. However, most misconfigurations occur when an environment is deployed and managed manually. Automation helps to reduce misconfigurations prior to deployment (Infrastructure as Code) as well as provide quick remediation post deployment (Configuration Validation). Automated scanning of an environment (Penetration Testing, Vulnerability Management, Security Scanning, etc.) for security risks should also be used to help improve security posture.
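
A pre-deployment configuration check can be as simple as scanning proposed firewall rules for ingress that is open to the whole internet. The following is a minimal sketch under stated assumptions: each rule is a plain dictionary loosely modeled on a firewall rule definition (`name`, `direction`, `sourceRanges`, `allowed`), the rule names are hypothetical, and the policy that only port 443 may be public is an example, not a recommendation from the report.

```python
def open_ingress_findings(rules, allowed_public_ports=frozenset({"443"})):
    """Return findings for enabled ingress rules that are open to 0.0.0.0/0.

    Ports listed in allowed_public_ports are accepted; anything else
    (including port ranges and rules with no port restriction) is reported.
    """
    findings = []
    for rule in rules:
        if rule.get("direction", "INGRESS") != "INGRESS" or rule.get("disabled", False):
            continue
        if "0.0.0.0/0" not in rule.get("sourceRanges", []):
            continue
        for entry in rule.get("allowed", []):
            ports = entry.get("ports")
            if not ports:
                protocol = entry.get("IPProtocol", "unknown")
                findings.append(f"{rule['name']}: all {protocol} ports open to 0.0.0.0/0")
                continue
            exposed = [p for p in ports if p not in allowed_public_ports]
            if exposed:
                findings.append(f"{rule['name']}: ports {', '.join(exposed)} open to 0.0.0.0/0")
    return findings


if __name__ == "__main__":
    # Hypothetical rule set, e.g. parsed from Infrastructure as Code before deployment.
    proposed = [
        {"name": "allow-https", "direction": "INGRESS", "sourceRanges": ["0.0.0.0/0"],
         "allowed": [{"IPProtocol": "tcp", "ports": ["443"]}]},
        {"name": "allow-ssh", "direction": "INGRESS", "sourceRanges": ["0.0.0.0/0"],
         "allowed": [{"IPProtocol": "tcp", "ports": ["22"]}]},
    ]
    for finding in open_ingress_findings(proposed):
        print(finding)
```

Run as a gate in a deployment pipeline, a check like this fails the change before the rule ever reaches the environment; run on a schedule against live configuration, it catches manual drift.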

Automation, along with the previous two security principles, can help wherever an environment is deployed. Below are some Google Cloud Platform tools that can help reduce the risk of similar breaches within GCP environments. Please note that the following tools do not represent a comprehensive list of GCP security features. Please also note that not all of these features are generally available, so each should be considered in the context of its product launch stage.

| Tool | Overview | Breach relevance |
| --- | --- | --- |
| Organization policies | “The Organization Policy Service gives you centralized and programmatic control over your Organization’s cloud resources. As the Organization policy administrator, you will be able to configure restrictions across your entire resource hierarchy.” | Disable Service Account Key Creation: this constraint blocks the creation of service account keys (long-lived credentials), which would have constrained the actions taken by the hacker. |
| VPC service controls | Provides context-based perimeter security for GCP resources that sit behind Google APIs (e.g. a Google Cloud Storage bucket behind storage.googleapis.com). Note that Cloud IAM provides identity-based security. | The hacker started accessing the storage service from IP addresses owned by a VPN provider. While access was limited to an identity (the service account), VPC Service Controls could additionally limit access based on context (e.g. IP address). |
| Cloud security command center | “Cloud Security Command Center (Cloud SCC) is the canonical security and risk database for Google Cloud Platform (GCP). Cloud SCC is an intuitive, intelligent risk dashboard and analytics system for surfacing, understanding, and remediating GCP security and data risks across an Organization.” | Cloud SCC security scanners may have detected a possible breach and alerted the enterprise. |
| Policy intelligence – recommender | Recommends reductions in permissions for users and service accounts based on actual account activity, which makes the Principle of Least Privilege easier to apply. | Following the Principle of Least Privilege can be difficult, and changes in application behavior or activity can leave old permissions in place that are no longer needed. |
| Admin activity audit logs | “Admin Activity audit logs contain log entries for API calls or other administrative actions that modify the configuration or metadata of resources” | An alert can be set to trigger on an undesired configuration change to the environment, such as a firewall rule change, service account key creation, or a security control configuration change. |
| Data access logs | “Data Access audit logs contain API calls that read the configuration or metadata of resources, as well as user-driven API calls that create, modify, or read user-provided resource data.” | An alert could have been configured to monitor data reads, perhaps firing on a rapid change in read rate. This could have prompted an investigation into why a service account started accessing data at a much higher rate than normal. |
| Identity-Aware Proxy | While IAP provides several capabilities, its latest feature has garnered the attention of many admins: IAP TCP forwarding. It is useful for those who want to provide access such as SSH to VMs but do not want to create and manage a bastion host. | Access to VMs can be guarded through this service so that they are not exposed publicly (no public/external IP address), and thus a misconfigured firewall rule might be less damaging. |
| Google application default credentials | A strategy for obtaining application credentials. Rather than creating a service account key (a long-lived credential), the application uses access tokens that live for one hour and must therefore be refreshed each hour. | The attacker downloaded long-lived credentials for a service account referenced as ******-WAF-Role. If Application Default Credentials are used in conjunction with the Organization Policy “Disable Service Account Key Creation”, an attacker would only have access to the short-lived access token. This would force the attacker either to act very quickly or to keep returning to the compromised device to obtain another token. |
| Forseti | “Forseti Security is a collection of community-driven, open-source tools to help you improve the security of your Google Cloud Platform (GCP) environments. Forseti consists of core modules that you can enable, configure, and execute independently of each other. Community contributors are also developing add-on modules to offer unique capabilities. Forseti’s core modules work together, and provide a foundation that others can build upon.” | Configuration Validator can check code prior to deployment to catch misconfigurations. |
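
The difference between identity-based and context-based access, noted in the VPC service controls entry above, can be sketched in a few lines of Python. This is only a conceptual illustration and not how VPC Service Controls is implemented or configured; the function name, the service account name, and the network ranges are hypothetical.

```python
import ipaddress


def is_request_allowed(identity, caller_ip, allowed_identities, allowed_networks):
    """Allow a request only if both the identity and the network context match.

    An identity-only check would stop after the first test, so stolen
    credentials would work from anywhere. The second test adds context.
    """
    if identity not in allowed_identities:
        return False
    address = ipaddress.ip_address(caller_ip)
    return any(address in ipaddress.ip_network(net) for net in allowed_networks)


if __name__ == "__main__":
    identities = {"waf-sa@example-project"}  # hypothetical service account
    networks = ["10.0.0.0/8"]                # hypothetical corporate range
    # Same credentials, two different network contexts.
    print(is_request_allowed("waf-sa@example-project", "10.1.2.3", identities, networks))
    print(is_request_allowed("waf-sa@example-project", "203.0.113.7", identities, networks))
```

With the context check in place, valid credentials presented from an address outside the approved ranges, such as a VPN or TOR exit node, are refused even though the identity itself is authorized.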

Conclusion

At the heart of the cloud is flexibility and speed. While this can enable rapid growth and change for an Organization, it can also increase complexity and risk. Constantly reinforcing security principles such as least privilege, defense in depth, security automation, and monitoring and alerting should help create the cultural mindset that security is applicable everywhere. Knowing and properly using multiple tools to secure all aspects of a deployment will help an Organization reduce its overall risk. Unfortunately, the risks will never go away, but through applied principles, advanced tools, and shared knowledge they can be greatly reduced.

Terms

WAF: Web Application Firewall
TOR: The Onion Router
GCP: Google Cloud Platform