Despite the media coverage afforded to the SolarWinds and Kaseya breaches, Palo Alto Networks Unit 42 threat research indicates that supply chain attacks in the cloud continue to grow as an emerging threat. Much remains misunderstood about both the nature of these attacks and the most effective means of defending against them. To better understand how supply chain attacks occur in the cloud, Unit 42 researchers analyzed data from a variety of public data sources around the world and, at the request of a large SaaS provider, executed a red team exercise against its software development environment. As you'll hear in the podcast, the findings indicate that many organizations may still be lulled into a false sense of supply chain security in the cloud. Case in point: even with limited access to the customer’s development environment, it took a single Unit 42 researcher only three days to discover several critical software development flaws that could have exposed the customer to an attack similar to those on SolarWinds and Kaseya.
In the podcast, Unit 42 researchers Nathaniel "Q" Quist and Dr. Jay Chen draw on Unit 42’s analysis of past supply chain attacks. The Cloud Threat Report explains the full scope of supply chain attacks, discusses poorly understood details about how they occur, and recommends actionable best practices that organizations can adopt today to help protect their supply chains in the cloud.
**NOTE: Generated via ML. Expect crazy stuff to be translated that may have never actually been said by the host or guest :-)**
[00:00:18] MC: Oh, I absolutely love Christmas music. So, had to start the show with some Christmas music and hopefully you're enjoying some of the tracks that we pick for each of these episodes.
Today, we're going to be chatting about a topic that I think many of you, if not most, will find extremely interesting, and that has to do with the SolarWinds and Kaseya breaches that happened over the last number of months here in 2021. What's interesting, though, is we're not going to go deep into those breaches. Rather, we're going to hear directly from the researchers at Palo Alto Networks' Unit 42 group who did their Cloud Threat Report on cloud native supply chain risks. This is something that I don't think is talked about enough, which is why I think you're going to find it quite interesting.
But first, let's hear a word from our sponsor.
[00:01:14] MC: Prisma Cloud secures infrastructure, applications, data, and entitlements across the world's largest clouds, all from a single unified solution. With a combination of cloud service provider APIs and a unified agent framework, users gain unmatched visibility and protection. Prisma Cloud also integrates with any continuous integration and continuous delivery workflow to secure cloud infrastructure and applications early in development. You can scan infrastructure as code templates, container images, serverless functions, and more, while gaining powerful full stack runtime protection. This is unified security for DevOps and security teams. To find out more, go to prismacloud.io.
[00:02:03] MC: Today, we feature Unit 42's latest Cloud Threat Report, for the second half of 2021. And today we have Nathaniel "Q" Quist and Dr. Jay Chen from the Unit 42 cloud threat team joining us. Jay and Q, thanks for joining us.
[00:02:20] NQ: Awesome. Thank you for having us.
[00:02:22] MC: So, in this report, you guys focused on supply chain security in the cloud, right? Obviously, this is an area that has been talked about a lot with SolarWinds and Kaseya. So, I guess my question for you is why did you focus on cloud supply chain security in this edition of the Cloud Threat Report? What was the reason behind that? So maybe, Jay, I'll start with you. And Q, I would love to hear you chat about that as well. What drove you guys down this path?
[00:02:51] JC: Sure. So, the supply chain attack is nothing new, right? I heard about it at least 10 years ago. However, supply chain attack has been a buzzword in the last 10 months, most likely due to very high impact attacks such as SolarWinds and Kaseya. So, the natural question I have is how a supply chain attack can happen in the cloud. And if it actually happened, what would be the target? And what would be the weak links in the environment that the attacker is most likely to target? So, we also try to answer what strategies and actions we can take to protect against or prevent this type of attack from happening in a cloud environment.
[00:03:46] MC: Q, how about your take on that?
[00:03:48] NQ: Yeah, very well said by Jay. I mean, I can echo pretty much everything that he just said. Supply chain attacks have been common and frequent for a very long time. Unit 42 has reports on attacks that we've been tracking since 2015. Some of the notable ones go back to, like, CCleaner just a couple years ago. Then Kaseya and, most notably, SolarWinds just kind of build on top of that, and show, in an ever increasingly complex world, how a supply chain attack can really affect not just one organization. There's that trickle-down effect where it can affect thousands of organizations just by one SaaS provider being compromised.
[00:04:28] MC: So, tell us, what did you find most intriguing about these findings? I know each one of you kind of has a different specialty within cloud security and just threat research in general. So, I want to hear what really stood out to you in the findings. Jay, why don't you go first and then, Q, I want to hear from you on this as well.
[00:04:51] JC: Sure. So, the part of the research I did focused on analyzing a large number of Terraform templates, Kubernetes Helm chart templates, and container images, trying to understand how vulnerabilities and misconfigurations in this commonly used cloud native code may be introduced. So, here are some numbers from my findings. 60% of the public Terraform templates that I analyzed contain at least one misconfiguration. And for the Kubernetes Helm charts that I analyzed, 99%, which was an unbelievably high number, 99% of these Helm charts contain at least one misconfiguration.
For the container images that I saw in those Helm charts, 96% of them contain at least one known vulnerability. So, these numbers are not something really new. This type of research has been done many, many times before. But in this report, we try to link them together. When a cloud native application is deployed, it first deploys infrastructure with this number of misconfigurations, then it deploys a Kubernetes infrastructure, which has another set of vulnerabilities, and eventually it comes with the containerized application, which again comes with some sort of vulnerabilities. This is the first time that we look at all these misconfigurations and vulnerabilities together.
[00:06:39] MC: Q, how do you see this?
[00:06:42] NQ: Cloud is ever complex, right? And anytime we bring complexity into question, there's going to be some failings, either human failings or just sheer oversight, where someone didn't pay attention to something, and that's going to be magnified or amplified within code, infrastructure as code templates, however that is being created. And that's carried over even into this environment.
If we look at the red team event that we did, the exercise, we found that the failings still persist. I mean, there were simple failings. This is a mature environment that we looked at, and we still found that it was relatively easy to compromise, not by any crazy vulnerability, or because we're expert zero-day code writers or anything like that. We found hard-coded credentials in code repositories. There were a lot of simple mistakes that, just due to the complexity, I believe, of that environment, were allowed to be overlooked.
So, there's a lot of work that we, as an industry, can do to really push for automated scanning, push for that shifting left of security, to make computers actually do what computers are supposed to do, especially within cloud, with a dynamically scaling environment that moves faster than a human can push a button. Having a security system actually scanning those environments as fast as they're being deployed really can make these environments a whole lot more secure.
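As a rough illustration of the kind of automated check Q describes, here is a minimal Python sketch that flags lines that look like hard-coded credentials. The rule names, the two patterns, and the `find_hardcoded_secrets` helper are assumptions made for this example; they are not taken from the report or from any particular scanner, and real tools ship far larger rule sets.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a real rule set).
SECRET_PATTERNS = {
    # AWS access key IDs commonly start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # A quoted literal assigned to a suspicious-looking name.
    "generic_secret_assignment": re.compile(
        r"(?i)\b(password|passwd|secret|api_key|token)\b\s*[:=]\s*[\"'][^\"']{4,}[\"']"
    ),
}


def find_hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line matching a pattern."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


if __name__ == "__main__":
    # Made-up sample content; the key below is a placeholder, not a real credential.
    sample = (
        'region = "us-east-1"\n'
        'aws_key = "AKIAABCDEFGHIJKLMNOP"\n'
        'password = "hunter22"\n'
    )
    for line_number, rule_name in find_hardcoded_secrets(sample):
        print(f"line {line_number}: {rule_name}")
```

A check like this could run on every commit or build, so findings surface as fast as code is pushed rather than waiting on a manual review.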
[00:08:10] MC: So, just to give our audience some context, since maybe they haven't read the report yet: I read this in the report, and I'll look up the page number here in a couple minutes. But I know that you guys actually did, and this is what you're referencing, a hands-on red team exercise at the request of a customer. So, I guess, let's kind of double click on the red team exercise, Q. I want to hear from you, basically, what was the context of it? What did they ask you to attack? What were maybe some of the confines? And then tell me what you found in a little bit more detail.
[00:08:44] NQ: Yeah, awesome. So, for the red team exercise, we were going to assume the role of an internal developer. So, we were just a generic role that a contractor might have within an environment, to build, you know, an X, Y, Z widget or something within that environment. So, we were given very standard role-based access that says, “You're a developer, you can get into this environment, you can modify code within your environment.” So, we were going to take that and, as masquerading insiders, see if we could find a way to go from that developer role, move into an administrator role, and actually own the entire cloud environment. So, that was the kind of scope of our agenda.
[00:09:30] MC: I want to interrupt you real quick. So, this was a real environment, right? This wasn't like a sandbox, this was a real development environment. Is that right?
[00:09:38] NQ: Yeah, that's correct. This was an actual developers’ environment. So, within a CI/CD pipeline, not to go too far down the hole, CI stands for continuous integration, and that is primarily what we were targeting. That is like the initial phase of a supply chain. It's how you integrate systems, how you build those processes with those components, within that particular supply chain. And then they have a trickle-out effect into the rest of the supply chain, where they actually go into production after that.
And so, with SolarWinds, we really used the attack framework that APT29 used in order to attack SolarWinds, the Orion product. We wanted to model our attack very similarly to how they performed that attack. So, we didn't want to attack the source code itself directly. We wanted to go to, you know, kind of a side-chain supporting infrastructure for that supply chain. So, we really used the SolarWinds attack as a framework and methodology to initialize and conceptualize how we were going to perform the red team exercise itself.
[00:10:38] MC: So, it sounds like there were some similarities. Again, I know you guys have a lot of detail on this in the report, so we won't go into all the details here on the podcast. And by the way, if the audience wants to download the threat report, all they need to do is go to cloudthreat.report. Just put that in your browser, cloudthreat.report, and it'll take you right to a place where you can download the report. But I guess one of the questions that I had when I read the report was, you guys didn't attempt to replicate the whole thing that happened with SolarWinds. That was outside in. Is it fair to categorize this as an inside-out type of scenario that you played out?
[00:11:16] NQ: Yeah, the actual attack itself is inside out. When we did get a privilege escalation, we were able to grab administrator rights to that cloud environment. So, we did a lateral movement, and then privilege escalated up to own the environment. In essence, we would own the supply chain systems and applications themselves. We could just walk right into them and then modify the code that we wanted to. APT29 actually did a phishing attack. That's how they initially gained access to SolarWinds itself, and they went on from there. So, we just kind of skipped the initial access portion. We jumped straight to, say, we've already owned a box inside the system, and now we’re laterally moving to own the rest of the environment.
[00:11:56] MC: Okay, that makes sense. That makes sense. And that's really, really intriguing. So, one last point I want to key in on and then, Jay, I’ll jump to you. But Q, you mentioned that this client had what most would consider good, or mature, cloud security. Why did you say that? And what were they missing?
[00:12:15] NQ: Certainly. I mean, they had all the tools there. They did have role-based access control. They actually did have user granularity. In the AWS environment they were using, they actually had specific IAM controls for specific services, for specific users. They had their environment very well locked down. And they also had security tools like, say, Amazon’s GuardDuty deployed and configured, at least for some of their accounts. And they also had, obviously, Prisma Cloud in there as well, which was, you know, aggregating and collecting a lot of the log data and looking for misconfigurations, and things of that nature.
Those were all very critical, because during the red team exercise, a portion of our exercise was actually caught, was actually detected by the organization's security team, and that's good. That's what we want to have. However, by the same token, not all of it was configured 100% correctly, and we were able to kind of squeak through the side. Even though one aspect was caught, we were still able to maintain access in the environment. So, they had tools, they had security tools that were great. They were able to perform security actions to mitigate the risk that they had in their particular cloud environment. So, we wanted to highlight that. We were caught, this is how we got caught, but we need to keep going. We've gone through steps one, two, and three, now we need to go for five, six, seven, and go to the next little step.
[00:13:40] MC: That's great. Well, I appreciate that. So, for the audience, if you want to jump right to the red team exercise in the report, download the report at cloudthreat.report and go right to page five. That's where we cover the red team exercise in depth. So, that was one part of the report. The other part really went very deep into infrastructure as code, and just how that is a key to supply chain protection for organizations that are building cloud native infrastructure.
So, Jay, I know that you spent a lot of time really kind of doing a teardown of a lot of the popular components, open source components: Terraform modules, Kubernetes Helm charts, container images. Looking at some of the data that you presented, when analyzing popular open source repos, the report shows that when it came to Terraform modules, you found that 64% of those downloaded resulted in at least one high or critical insecure misconfiguration. Tell us, what does that mean? And what's the impact for those building cloud native applications?
[00:14:49] JC: We analyzed more than 1,500 different Terraform open source projects. And I want to say that not all modules, not all open source code, are created equal. Some modules are used more often, are more popular, than others. So, when we see that one module has a critical vulnerability, but this module is only used by a handful of users, that may not be a big deal. However, if we see another module with one moderate misconfiguration, but it has been downloaded millions of times by many thousands or more different organizations and users, that misconfiguration is very concerning.
So, one thing we look at is the number of downloads for the modules that we analyze. The number of downloads sort of indicates how popular a module is and how many users and organizations may be impacted by the misconfigurations in this module.
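Jay's point about weighing severity against popularity can be sketched in a few lines of Python. The severity weights, the `ModuleFinding` type, and the scoring formula below are hypothetical choices made for illustration; the report does not say how, or whether, Unit 42 combined these two signals into a single score.

```python
from dataclasses import dataclass

# Hypothetical weights: how much worse each severity level is than "low".
SEVERITY_WEIGHT = {"low": 1, "moderate": 2, "high": 3, "critical": 4}


@dataclass(frozen=True)
class ModuleFinding:
    name: str
    worst_severity: str  # one of the SEVERITY_WEIGHT keys
    downloads: int       # public download count for the module


def exposure_score(finding: ModuleFinding) -> int:
    """Rough exposure estimate: severity weight multiplied by download count."""
    return SEVERITY_WEIGHT[finding.worst_severity] * finding.downloads


def rank_by_exposure(findings: list[ModuleFinding]) -> list[ModuleFinding]:
    """Order modules so the widest potential impact comes first."""
    return sorted(findings, key=exposure_score, reverse=True)


if __name__ == "__main__":
    # Made-up modules and numbers, purely for illustration.
    findings = [
        ModuleFinding("niche-module", "critical", 50),
        ModuleFinding("popular-module", "moderate", 2_000_000),
    ]
    for finding in rank_by_exposure(findings):
        print(finding.name, exposure_score(finding))
```

Under these assumed weights, the moderate issue in the heavily downloaded module ranks far above the critical issue in the niche one, which mirrors the reasoning in the answer above.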
[00:16:10] MC: So, the way I read this was that some of the most popular modules, and the reason that they're most popular is because they're highly used, right, in cloud native infrastructure, 64%, the vast majority, mean that anybody who's using those is starting from an insecure state. Is that a correct assumption? Is that right?
[00:16:33] JC: So, those modules are created for usability, not really for security. Anybody who downloads and imports those modules can start immediately. However, the users still need to lock down that infrastructure, depending on their use case and their environment. So, those modules are not really written to be secure by default.
[00:17:02] MC: That makes sense. So, I noticed on page 13, and this is another thing, again, there's so much really good data in this report. I'd love to go over all of it, but we'd be here probably for three or four hours. But let's jump to page 13 of the report. You have a chart that talks about the relationship between the number of misconfigurations and dependencies. Walk us through what you found, and help us understand, at as high a level as possible, what the impact is of that relationship between misconfigurations and dependencies.
[00:17:38] JC: I think a quick summary is that we find a positive correlation between the number of misconfigurations and the number of dependencies. For an application, the more dependent code or dependent packages you have, the more misconfigurations in your application. That is what we found in both Kubernetes Helm charts and container images.
So, let me explain a little bit how modern cloud native applications are deployed in the cloud environment. Usually, people use a Kubernetes template. Helm is a tool that allows automated deployment and management of Kubernetes applications. So, most people will just download a Helm chart and deploy this Helm chart for the application. However, when packaging your application as a Helm chart, each Helm chart can also depend on other Helm charts.
So, what we found is that most of the Kubernetes Helm charts that we see in the public space now have at least one dependent Helm chart, and some Helm charts actually depend on 10, or even 20, different Helm charts. When we analyzed the applications and considered the dependent Helm charts they import, we found that, “Hey, there is a clear positive correlation between the number of misconfigurations that will be deployed in your environment and the number of dependent Helm charts that you import.”
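To make the dependency effect concrete, here is a small Python sketch that walks a chart's dependency graph and totals the misconfigurations that would be deployed along with it. The chart names, the counts, and both helper functions are hypothetical; this illustrates the idea only and is not the analysis method used in the report.

```python
def transitive_dependencies(chart: str, dependencies: dict[str, list[str]]) -> set[str]:
    """Collect every chart reachable from `chart` through its dependencies."""
    seen: set[str] = set()
    stack = list(dependencies.get(chart, []))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue  # already visited (shared or circular dependency)
        seen.add(dep)
        stack.extend(dependencies.get(dep, []))
    return seen


def total_misconfigurations(
    chart: str,
    dependencies: dict[str, list[str]],
    own_counts: dict[str, int],
) -> int:
    """Sum the misconfigurations of a chart and everything it pulls in."""
    charts = {chart} | transitive_dependencies(chart, dependencies)
    return sum(own_counts.get(name, 0) for name in charts)


if __name__ == "__main__":
    # Made-up dependency graph and per-chart misconfiguration counts.
    dependencies = {
        "webapp": ["database", "cache"],
        "database": ["common"],
        "cache": ["common"],
    }
    own_counts = {"webapp": 2, "database": 3, "cache": 1, "common": 1}
    print(total_misconfigurations("webapp", dependencies, own_counts))
```

In this made-up graph the application chart carries two misconfigurations of its own but deploys seven once its dependencies are counted, with the shared `common` chart counted only once.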
[00:19:29] MC: So, one of the things I know you focused on was also containers, right? We've talked about Kubernetes Helm charts, and obviously Kubernetes and containers are very strongly interlinked. And Kubernetes is used in pretty much every cloud native environment. Was the news any better with what you found with respect to containers and dependencies?
[00:19:51] JC: We saw an identical pattern, I'll say. When analyzing the Kubernetes Helm charts, we focused on misconfigurations. But when analyzing the container images, we focused on the vulnerabilities. The concept is pretty much the same. When building a container image, most developers also rely on many third-party libraries that are needed for the application, or needed for the platform and the system. So, when peeling off the layers and digging down to the platform and the operating system under the application, we realized that most modern containerized applications depend on many, many third-party libraries or packages. Again, we found a positive correlation between the number of vulnerabilities in a container image and the number of dependent packages that this container image uses. More dependent packages in a container image usually indicate more vulnerabilities.
[00:21:14] MC: I know that when I read the report, I was somewhat shocked, just because I knew that the cloud itself has long suffered from misconfigurations. We see the same thing in infrastructure as a service, platform as a service, SaaS. They're all plagued by the same thing. But what stood out was the detail that you guys were able to tease out in this report, and how specifically it fits what so many organizations are doing right now. So, I think somewhere in the report it may say this, that the first wave of migrations to the cloud was really lift and shift. It was pretty much just taking existing applications and moving them to the cloud.
But now the second wave, which has been going on for probably three plus, you know, almost four years, is really about organizations going cloud native, where they are using all of these different components that you guys talk about in the report. CI/CD pipelines, they're automating the builds with infrastructure as code, they're using containers. I guess what this leaves me thinking, and probably most of our listeners hearing this are thinking, is, “Okay, A, how does this impact my environment? And then B, what can I do about it? Or how do I prevent and mitigate some of these things?”
Q, I'd love to hear your perspective on this. Maybe just, we don't have time to obviously do an exhaustive discussion on it, but what would be a good first step?
[00:22:39] NQ: I like exhaustive conversations. Let’s talk about this forever. So, a good first step. I mean, I think you're right. I think most organizations, I mean, are doing really well in building, using infrastructure as code, building dynamically. For one, because that's where their money is, right? So, that's where they want to – they want to build and maintain, keep that 99.9999% uptime, as well as they possibly can.
So obviously, getting that infrastructure as code operating, making sure that their Kubernetes clusters can scale dynamically and all that, is very important. I think the next step is, again, how do we build security into the beginning of it? I think the really cool thing about infrastructure as code is that you can start putting security in at the very beginning, and it's already containerized, right? It's already modular in nature. So, you can just, like, drop a security tool in at the very beginning of it, and it's not going to break the rest of your build. It's going to just augment and enhance the rest of your build.
So, build in your security scanning tools. Put in your Checkov, if you're going to go open source, or if you have Bridgecrew, build that right into the very beginning, as your containers are building, as they're going up, as they're going through Jenkins or whatever CI tool you're using, CircleCI or JFrog, what have you. And really start building in those checks. You know, are these containers misconfigured? Does this particular application have a vulnerability? Do I need to grab a newer version of this particular containerized application, et cetera? At the exact same time, use tools like Bridgecrew and Checkov to actually say, “Do I have hard-coded credentials in this build somewhere?” Try to knock out some of those early pieces.
Again, because we have this infrastructure as code functionality, or this architectural framework, we can build security in even after it's already been created and is off the runway. We can just add security as we're flying down the road, or up through the sky, or whatever. It makes it really nice. So, yeah, it's great that we can scale dynamically and move fast, but we need to get that security in there. And we really need to get developers used to those tools now, and not have security teams being like, “No, you can't do this,” but instead say, “Hey, you can move as fast as you possibly want to move. We want you to move as quickly as you want to move and build as fast as you want to build, but make sure that it's at least satisfying some of those basic security checks, like hard-coded credentials, getting those things taken care of, knocked out from the very beginning.”
[00:25:11] MC: I appreciate that response. I think there's a lot of wisdom in that. Jay, how do you see this? And I guess the other thing I'd throw into that is, what would be a good first step? And I think Q mentioned Bridgecrew and Checkov as possible tools that can be given to developers. Jay, how do you see this? What are some tools, maybe, that you recommend as well?
[00:25:33] JC: So, of course, there are multiple steps. But I guess I only have time for one step. My first step would be to gain visibility. The deeper I work on this research, the more I realize how visibility helps enable, helps accelerate the [inaudible 00:25:55], and how much information is hidden from the user. The applications we deploy and use probably use less than 20% of the code in the application, and a large amount of the dependencies and complexity is hidden from us.
And speaking of visibility, we can talk about different layers. To protect your cloud environment, you need to gain visibility into your infrastructure, into your applications, into your resources, and into your IAM. So, I think gaining visibility is the most important first step, because you need it before you can start to create, or start to manage, these different objects, resources, and components in your environment. In a cloud environment, gaining visibility is actually much easier because everything is API based. You can call an API to see all the virtual machines in your EC2 environment, for example. You don't need to worry, as you did in the past with a physical environment, that someone could easily drop a Raspberry Pi into your data center without you having any idea. In a cloud environment, APIs really make it much easier to gain visibility.
[00:27:25] MC: Thank you guys for giving us this quick, high level overview of some of the findings in the threat report. So, one last question for each of you. What are you working on next? I know that you're constantly and continuously doing big projects. Q, let's start with you. What are you working on? What are you most excited about right now in cloud security?
[00:27:43] NQ: Yeah, I mean, just if I could go back, I love working with this team. It's awesome. Because what Jay just said right there about gaining visibility in cloud is just so important, so critical. How can we gain that insight into, like, what's running at runtime in an environment? So very cool. I love it.
The thing that I'm most excited about right now is that it still seems like we're still so new. So, there's still so much stuff to look at within cloud infrastructure and the cloud space. I love thinking about how things are being attacked and how things are being compromised. And I like to know who's doing it, how they're doing it, and why they may be doing it. So, that person behind the keyboard, behind the cloud, is something that I'm very interested in. That being said, gaining visibility into some of those more integrated systems, like service meshes, network meshes, things of that nature, that's the next piece that I'm looking at and diving down into for my research. So, look forward to something coming down the line on that in the future.
[00:28:37] MC: That's awesome. That will be exciting. I haven't seen a lot around service meshes. So, I'll be curious to see what you come up with here over the next few months. Jay, what's next for you? What are you looking at?
[00:28:47] JC: I'm interested in looking into multi-cloud infrastructure. More and more organizations are moving toward multi-cloud infrastructure because they need certain features or services from a specific service provider. They prefer the AI, ML, or BigQuery services from Google. But they need to store the data in something like Glacier, Snowmobile, or a Snowflake-type environment on AWS, and they prefer to manage their identities using Active Directory on Azure.
So, this is a pattern that we commonly see, that organizations adopt technologies from different cloud service providers. And the next question is, when integrating multiple cloud service providers' technologies together, how can we integrate these services in a secure way? Most services within one service provider integrate very well together, and there are also cloud native services to help you oversee or watch your services. However, when you combine a service from one provider with a service from another one, you need to find a way to integrate not only the services, but also the security monitoring systems, so you can monitor all things together.
[00:30:20] MC: Both of those sound like really exciting research projects. So, thanks to each of you for the contributions that you've made to the community. Obviously, most of the world is moving to public cloud, so the findings that you guys have had, both here in this most recent edition of the Cloud Threat Report and in your ongoing research, I personally think have had a really big impact on raising awareness, from a global perspective, of some of the challenges that organizations are having when it comes to securing public clouds like AWS, Azure, and Google Cloud. So, thanks to both of you for your work. Thanks for keeping us apprised of the latest threats. And thanks so much for joining today. It's great having each of you on the program.
[00:31:01] NQ: Thank you so much for having us, appreciate it.
[END OF INTERVIEW]
[00:31:11] ANNOUNCER: Thank you for joining us for today's episode. To find out more, please visit us at cloudsecuritytoday.com.