Alcide Logs and Coralogix
With all the power that Kubernetes provides through its layer of abstraction comes rising complexity and a steep learning curve. Kubernetes (K8s) has many internal components, and the applications typically deployed on it are themselves complex and evolve frequently.
For SecOps teams, enforcing security policies on deployments and users, identifying security gaps and violations, and making sense of K8s in this new environment requires a deep understanding of K8s’ internal operations as well as how users and automated services should and should not interact with applications and K8s infrastructure.
K8s audit logs are the natural source of information for these tasks, but analyzing these logs is both time-consuming and tedious. This task calls for an automated system that can immediately identify violations of security policies configured by SecOps. More importantly, such a system should automatically identify unexpected suspicious behavior in the cluster related to potential attacks so that security specialists can then investigate.
As you deploy your mission-critical applications on K8s, you should consider how to manage the security risks that such complex environments present. Alcide kAudit can identify rule violations based on K8s audit logs.
Furthermore, it employs sophisticated machine learning algorithms to identify anomalous behaviors and suspicious patterns in the operational activity initiated by users and automated services.
It proactively investigates and forensically analyzes Kubernetes cluster deployments for breaches, anomalous behavior, and misuse in real time. Using kAudit enables you to identify patterns associated with misconfigured RBAC, credential misuse, vulnerability exploitation, security policies that conflict with compliance requirements, best practices, and more.
Alcide kAudit is bundled with a GUI that can be used for presentation, analysis, and investigation of its findings. However, kAudit can also send these logs to your log management solution of choice.
This is where Coralogix, a centralized log management solution, comes into play. It complements kAudit by consolidating Alcide cluster anomalies, incidents, and policy violation notifications across clusters. Coralogix can also enrich the logs with security and geo-location data.
K8s is a central infrastructure component, but it is only one component in a larger environment with many moving parts, such as applications, databases, and other infrastructure software and hardware elements. All of them interact with each other and add to the management complexity.
Coralogix complements Alcide and kAudit by creating one pane of glass through which DevSecOps and other teams, such as engineering and CS, can view logs from different parts of the infrastructure in a consolidated way. Beyond visualizing the data, Coralogix gives all these teams powerful analysis tools and helps identify correlations between application and component events using ML techniques.
Coralogix can also put logs in the context of the application’s lifecycle in the CI/CD process – allowing you to assess the impact of every change to your infrastructure.
The end result is faster problem identification and time to resolution.
kAudit can send two log categories to Coralogix:
- Policy Findings: K8s audit log entries that violate the policies configured by the security and DevOps teams.
- Detections: Suspicious behavior, automatically detected by kAudit, that poses a potential security risk to the K8s cluster.
How to Integrate Alcide Logs Into Coralogix
Now let’s put everything we discussed together. The following use cases demonstrate the value and power of Alcide and Coralogix combined.
Abnormal Access Patterns
One of kAudit’s strengths is identifying anomalous behavior. For example, it can learn normal user behavior, such as when a user connects to a cluster and when and how K8s APIs are called. kAudit automatically generates an anomaly log when a deviation from the norm occurs. Here is an excerpt from an anomaly generated when a usage pattern was out of the norm for principal JohnD:
"doc": "change in count of unusual URIs in write access attempts",
"short-doc": "Change in usage pattern",
"unusual-uri": ["uri/CRM/25", "uri/CRM/24", "uri/CRM/24"],
"doc": "unusual change in count of unusual URIs in access attempts",
kAudit performs this analysis per cluster. Coralogix takes these logs and lets engineers and analysts view them across clusters and in the context of the wider environment.
Let’s look at a user analysis example. Using Coralogix, we can search for a user named ‘JohnD’ with abnormal activity in the form of ‘Change in usage pattern’.
We can go to the ‘Log’ screen and enter the following query:
eid:JohnD AND short-doc:"Change in usage pattern"
The query result shows all anomaly logs associated with the user ‘JohnD’ whose short-doc field indicates a change in usage pattern. This lets the analyst check whether the anomaly occurs across clusters or is specific to one cluster.
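To make the filter's semantics concrete, here is a minimal Python sketch that applies the same two conditions to already-parsed log records and counts the matches per cluster. It is an illustration only, not how Coralogix evaluates queries: the `cluster` field name, the sample records, and the helper `cluster_distribution` are assumptions, and only `eid` and `short-doc` come from the query itself.

```python
from collections import Counter

# Hypothetical, already-parsed kAudit anomaly logs. Only "eid" and "short-doc"
# appear in the article; "cluster" and all sample values are assumed.
logs = [
    {"eid": "JohnD", "short-doc": "Change in usage pattern", "cluster": "crm-east"},
    {"eid": "JohnD", "short-doc": "Change in usage pattern", "cluster": "crm-west"},
    {"eid": "JohnD", "short-doc": "Some other anomaly", "cluster": "crm-east"},
    {"eid": "MaryS", "short-doc": "Change in usage pattern", "cluster": "crm-east"},
]


def cluster_distribution(records, short_doc, eid=None):
    """Count matching records per cluster; eid=None matches every principal."""
    return Counter(
        r["cluster"]
        for r in records
        if r["short-doc"] == short_doc and (eid is None or r["eid"] == eid)
    )


# Equivalent of: eid:JohnD AND short-doc:"Change in usage pattern"
john = cluster_distribution(logs, "Change in usage pattern", eid="JohnD")
# Equivalent of dropping the eid condition to include all users.
everyone = cluster_distribution(logs, "Change in usage pattern")

print(dict(john))
print(dict(everyone))
```

Leaving out the `eid` condition widens the same count to every principal, which is the comparison the analysis relies on when deciding whether an anomaly is specific to one user.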
Clicking on the parameter will bring up the cluster distribution for the anomaly:
This indeed happens across clusters for ‘JohnD’. The analysis can continue by checking whether it occurs with other users, by changing the query to:
short-doc:"Change in usage pattern"
This time we do not specify a user, and we can look at the cluster distribution again.
This is a good indication that it happens across clusters for other users as well: the cluster distribution graph shows many more results.
In the same way, the analyst or engineer can look at the eid field to see the distribution across users.
Using Coralogix’s Alcide dashboards, they can look at trends and examine if there is an increase in the occurrence of this anomaly or related ones and how this specific user scores compared to others in different profile comparisons.
Using filters on Principal, short-doc and cluster, they can then zoom in and examine JohnD anomalies over time:
It is clear that this began recently and by using the cluster filters, we quickly discover the time sequence pattern across clusters.
Going even further, Coralogix users can easily configure predefined views which can empower less technical users to use the valuable data in kAudit logs. This data in conjunction with the views can help make the organization more efficient and productive.
In this saved view, all analysis can be done using the custom filters on the left, following the common ‘faceted search’ pattern that most online users are familiar with.
The value of the integration and workflow described above is not limited to anomalies. In addition to anomalies and incidents that flag deviations from the norm, Alcide sends logs that indicate rule violations to Coralogix.
Customers can define the rules and Alcide will generate the log based on cluster API activities. The same analysis steps mentioned in the first use case would apply to most rule violations, so we won’t repeat them here. We’ll just use this example to highlight another Coralogix capability that enables users to take a proactive management approach at the enterprise level.
Coralogix alerts allow users to take the Alcide rule concept and apply it across clusters and application layers. In our example use case, our user defined a rule that generated a kAudit log when certain admin actions had an authentication failure.
We’ll name our rule ‘admin authentication failure’. Some failures are to be expected during the normal course of operation, so a sophisticated attacker might decide to spread failed malicious operations across dozens of nodes. Each of these failures would generate a kAudit log but wouldn’t necessarily be recognized as an anomaly.
Our Alcide users are able to create an alert that triggers when failures across the enterprise pass a threshold. This is very easy to do:
The alert query would be:
rule:"admin authentication failure"
The alert condition would be:
more than n times in ‘time-period’.
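The condition "more than n times in a time period" can be sketched as a sliding-window count. The Python below is an illustration under stated assumptions, not Coralogix's alert engine: the function name `threshold_alert`, the sample timestamps, and the choice to treat events exactly one period apart as inside the same window are all hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta


def threshold_alert(timestamps, n, period):
    """Return the first timestamp at which more than n events fall
    within `period` of each other, or None if that never happens."""
    window = deque()
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop events older than the window that ends at ts.
        while ts - window[0] > period:
            window.popleft()
        if len(window) > n:
            return ts
    return None


# Hypothetical 'admin authentication failure' log times, pooled from all clusters.
start = datetime(2020, 1, 1, 12, 0)
failures = [start + timedelta(minutes=m) for m in (0, 2, 3, 4, 30)]

fired = threshold_alert(failures, n=3, period=timedelta(minutes=10))
print(fired)
```

Because the timestamps are pooled before counting, failures that look unremarkable on any single node still add up toward the enterprise-wide threshold.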
In addition, Coralogix takes alert conditions to the next level by providing ratio and dynamic alerts.
- Ratio Alerts – Users can set a threshold for the ratio of failed admin authentications vs. successful ones. You can also add a rule that generates an Alcide log for successful authentications.
- Dynamic Alerts – Users can choose a condition called ‘more than usual’. Much as Alcide does at the cluster level, Coralogix learns the normal range of failures per time of day across the enterprise and alerts when that range is exceeded.
Such alerts can be applied to other use cases as well, for example when there are too many failed permission info requests, which can indicate a scan that precedes an attack, or an unusual number of direct connections to pods and components. Using the Coralogix alert engine gives users the flexibility to tune the alert to fit their needs.
Immediate alerts (that are not dependent on a threshold and time period) can be configured for highly sensitive kAudit data that indicates critical rule violations or admin operations, such as certain changes to a database’s configuration or to access rights.
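The two conditions can be illustrated with a short Python sketch. The ratio check is plain arithmetic. The ‘more than usual’ check below uses a simple mean-plus-k-standard-deviations baseline as a stand-in, since the article does not describe how Coralogix actually learns the normal range. The function names, the parameter `k`, and the sample counts are assumptions.

```python
from statistics import mean, stdev


def ratio_alert(failed, succeeded, max_ratio):
    """True when failed/succeeded exceeds max_ratio.
    With zero successes, any failure at all fires the alert."""
    if succeeded == 0:
        return failed > 0
    return failed / succeeded > max_ratio


def more_than_usual(history, current, k=3.0):
    """True when `current` exceeds the mean of past counts for the same
    time of day by more than k sample standard deviations.
    `history` needs at least two values for stdev to be defined."""
    return current > mean(history) + k * stdev(history)


# Hypothetical counts of failed admin authentications for the same hour
# on five previous days, across all clusters.
past_counts = [4, 6, 5, 7, 5]

print(ratio_alert(failed=30, succeeded=100, max_ratio=0.1))
print(more_than_usual(past_counts, current=8))
print(more_than_usual(past_counts, current=20))
```

A fixed ratio suits cases where the acceptable failure rate is known in advance, while a learned baseline suits cases where normal volume varies with the time of day.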
Application and infrastructure upgrades are always risky periods in an application’s lifecycle. The same can be said about business events that bring peaks in transaction load, such as user registrations. Coralogix can tag these events and put the collected logs in the context of each event. The event tags can be added to the log flow automatically or manually.
Tags are a very powerful tool for analysts and engineers, who can look at kAudit events and other logs that represent application events in the context of each change and quickly understand the impact on the infrastructure. The outcome could be a minor adjustment, or it could be a rollback.
This is a tag we created for a CRM cluster upgrade and we can see that we got 96 alerts and that the ratio of critical and error logs went up by about 20%.
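For readers who want to see the arithmetic behind such a comparison, the Python sketch below computes the share of critical and error logs before and after a tag and the relative change between the two. The severity labels, the sample counts, and the reading of "went up by about 20%" as a relative increase are assumptions, since the article does not say how Coralogix derives the figure.

```python
def error_share(severities):
    """Fraction of logs whose severity is critical or error."""
    if not severities:
        return 0.0
    bad = sum(1 for s in severities if s in ("critical", "error"))
    return bad / len(severities)


# Hypothetical severities of logs collected before and after the upgrade tag.
before = ["info"] * 90 + ["error"] * 8 + ["critical"] * 2
after = ["info"] * 88 + ["error"] * 9 + ["critical"] * 3

# Relative change in the share of critical and error logs.
relative_change = (error_share(after) - error_share(before)) / error_share(before)
print(f"{relative_change:.0%}")
```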
As an aside, the anomalies mentioned here are Coralogix enterprise-wide anomalies and not Alcide’s. Clicking on the alerts opens a list and allows deeper log examination:
Users can also look at Alcide kAudit logs in the context of this tag. The tag appears as an icon on the log graph and the users can use the mouse to zoom into this period of time:
Once zoomed in, users can of course use all analysis capabilities provided by Coralogix.
This post demonstrated the powerful combination of Alcide and Coralogix and how it can help you further protect your organization. It also highlighted a few use cases to illustrate the potential of using them together.
Don’t hesitate to contact us at email@example.com or firstname.lastname@example.org with any questions you might have.