Here’s a recent scenario: an organization’s security team receives an alert from the monitoring system on their Slack channel with the content below:
AWS Account : SomeCompany_Development
IAM User : firstname.lastname@example.org
AWS API : AuthorizeSecurityGroupIngress
Source IP Address : xxx.xxx.xxx.xxx
Security Group ID : sg-4fxxx4dx
Security Group Region : us-east-2
IP Protocol : tcp
From : 22
To : 22
IPRange : 0.0.0.0/0
The security team learns that one of their team members, Joel, the IAM user named in the alert, has opened port 22 in the company’s development AWS account to the whole world. A member of the security team immediately pings Joel over Slack, and Joel confirms that opening the port was a mistake on his part: he had been trying to whitelist only his own IP address. The security team members ask Joel to undo the change, validate it, update the security ticket with the conversation history, and close the case. Incident handling time drops from hours to minutes. Bingo.
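The check behind an alert like this one can be sketched in a few lines. This is a minimal illustration, assuming the monitoring system has already reduced the audit event to a plain dictionary; the field names (`ip_protocol`, `from_port`, `to_port`, `cidr`) and both function names are hypothetical and do not reflect the format of any particular monitoring product.

```python
import ipaddress


def is_world_open(rule: dict, port: int = 22) -> bool:
    """Return True if the rule exposes `port` to every address."""
    network = ipaddress.ip_network(rule["cidr"], strict=False)
    covers_port = rule["from_port"] <= port <= rule["to_port"]
    # A prefix length of 0 (0.0.0.0/0 or ::/0) matches every address.
    return covers_port and network.prefixlen == 0


def format_alert(account: str, user: str, rule: dict) -> str:
    """Render the rule as a chat message in a 'key : value' layout."""
    lines = [
        f"AWS Account : {account}",
        f"IAM User : {user}",
        f"IP Protocol : {rule['ip_protocol']}",
        f"From : {rule['from_port']}",
        f"To : {rule['to_port']}",
        f"IPRange : {rule['cidr']}",
    ]
    return "\n".join(lines)


# Hypothetical rule mirroring the alert above.
rule = {"ip_protocol": "tcp", "from_port": 22, "to_port": 22, "cidr": "0.0.0.0/0"}
if is_world_open(rule):
    print(format_alert("SomeCompany_Development", "joel", rule))
```

A rule scoped to a single address, such as `203.0.113.7/32`, would not trigger the message.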
In today’s world, innovation is happening at an exponential pace. People want real-time analyses of everything from elections to baseball games. Our desire for immediate information has led to a demand for real-time development, operations, and work management. To meet this demand, people, processes, and tools need to move from traditional communication modes to real-time collaboration and conversation. That’s where ChatOps comes into play. ChatOps is often described as conversation-driven collaboration or development between multiple teams that reduces overhead and brings more transparency into a system. The process generally starts with people adopting a collaboration tool, such as Slack or Stride, which interacts with a bot responsible for processing requests and executing them on behalf of the user. ChatOps has turned chat messages into command-line tools, allowing teams to auto-remediate incidents, respond to customers, perform production upgrades, and log tickets.
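To make the "chat message as command line" idea concrete, here is a minimal sketch of the dispatch step a bot might perform. It is not tied to any chat platform's SDK: the `!` prefix, the command names, and the handler functions are all assumptions made up for illustration, and a real bot would receive the text from its platform's event API.

```python
import shlex
from typing import Callable, Dict, List, Optional

# Registry mapping a command name to the function that handles it.
HANDLERS: Dict[str, Callable[[List[str]], str]] = {}


def command(name: str):
    """Decorator that registers a handler under a command name."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register


@command("ticket")
def open_ticket(args: List[str]) -> str:
    # Hypothetical handler: a real one would call an issue tracker.
    return "Logged ticket: " + " ".join(args)


@command("status")
def service_status(args: List[str]) -> str:
    service = args[0] if args else "all services"
    return f"Checking status of {service}"


def handle_message(text: str, prefix: str = "!") -> Optional[str]:
    """Return the bot's reply, or None for ordinary conversation."""
    if not text.startswith(prefix):
        return None
    try:
        parts = shlex.split(text[len(prefix):])
    except ValueError:
        return "Could not parse command."
    if not parts:
        return "No command given."
    handler = HANDLERS.get(parts[0])
    if handler is None:
        return f"Unknown command: {parts[0]}"
    return handler(parts[1:])


print(handle_message("!ticket disk full on web-1"))
```

Keeping handlers in a registry means new commands can be added without touching the parsing logic, and every invocation stays visible in the channel.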
ChatOps and Transformation
Before we dive into ChatOps and the transformation it has caused around the world, it is essential to understand the critical challenges faced by modern organizations and their customers. They include:
- High mean time to resolve incidents or service requests
- More dependency on the organization
- Lack of self-service mechanisms for customers’ problems
- More emails and meeting time but a lack of organizational transparency
- Repetitive manual work
- Higher operating costs which turn into higher customer costs
These are just a few of the issues organizations face. ChatOps was developed to remediate them and is now an integral part of many DevOps organizations. Using ChatOps not only reduces the time and effort required for the user and the team to achieve their objectives and resolve incidents, it also brings transparency, predictability, and visibility to the entire system. These benefits extend to the end customer, since no one in this fast-moving world wants to call a support line or help desk to resolve their issues. A simpler option is to interact with chat support that can automatically understand the request and provide a resolution to the problem.
Over the last few years, Slack, Atlassian, and VictorOps have conducted surveys which focused on the evolution of ChatOps and its transformation of organizations. A quick summary of those surveys reports that, since the adoption of ChatOps:
- Team productivity has increased by 32%
- The number of internal emails has decreased by close to 50%
- Meeting hours have decreased by 35%
- 85% of employees use a chat tool to communicate internally
- 76% of employees found it easier to reach subject matter experts
- Downtime hours dropped, contributing to less revenue loss, better branding, happier employees, and a positive workplace morale
- Alert fatigue has decreased, as alerts have become more contextual, actionable, and less redundant
It is important to note that these surveys are two to three years old; however, the trends they describe still hold, and the figures have most likely grown since then, bringing new positive outcomes. Slack’s own numbers offer some evidence of this. The company currently reports an estimated $200 million in revenue. Forty-three percent of Fortune 100 companies use Slack, and an average user is plugged into it for 10 hours each weekday.
Another transformation caused by ChatOps results from its integrations. The popularity of ChatOps started with better incident healing and reduced manual effort, but the system and product integrations that followed took its adoption to a different level. An example of this can be seen in JIRA’s integration with Slack, which lets the user create a JIRA bug ticket or story directly from Slack. Another example is smoother onboarding: a new team member can place a query in Slack and be directed to the proper knowledge base.
ChatOps and Cloud Integration
In the world of technology, it is often said that “ChatOps is the backbone of DevOps.” For a DevOps organization, there is only a thin line separating the DevOps engineers from the operations team responsible for maintaining the application. While the operations team is responsible for ensuring the uptime of the infrastructure, the DevOps team has to ensure the responsiveness and performance of the application. So, when an issue happens, ChatOps provides a space for both teams to work collaboratively, using the integrations with the monitoring systems in place. This collaboration helps them eliminate the “your issue, not my issue” finger-pointing by bringing more transparency and openness to the system.
ChatOps integrations are not limited to third-party tools. They also integrate with services offered by cloud providers such as AWS, GCP, and Azure. Not long ago, AWS published a blog post that described using critical DevOps services like AWS CodeCommit and AWS CodePipeline to deploy a new application build into the production environment. It explained how to send an approval message to the approver over a Slack channel, which eliminated the need for a manual trigger to push the build. Another example can be found in QA results published to a Slack channel. There, teams can discuss, review, and post their feedback, which can then be fed to the issue trackers and handed over to the development team for the bug fix. The process eliminates the need to compile a QA report, hand it over to the development team, and then discuss and file the tickets.
For a security team in a cloud environment, ChatOps is indispensable. It provides a real-time view of both the situation and the alerts triggered by the monitoring systems across the environment. It helps the team to normalize incidents as well as act on, remediate, and resolve them in real time. The security team only needs to integrate the environment’s audit trail with a chat collaboration tool to get real-time visibility into all of the changes happening in the environment. This comes in handy for preventing security lapses and for running postmortems to backtrack and understand why incidents occurred in the first place. Rather than sending an email to the development team to describe a security lapse, such as opening up port 22 to the world, the security team can interact directly with the personnel involved over chat to better understand the reasoning behind their actions and update the security ticket as needed. The overall process makes for faster security incident handling and less overhead for the team. Another benefit of using ChatOps is that the security team’s focus moves away from solving simple security lapses, which can be auto-healed, to addressing complex issues that require more analysis, eventually leading to more security-focused teams with stronger skill sets.
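One common way to wire an audit trail into chat is a Slack incoming webhook, which accepts a JSON body with a `text` field. The sketch below uses only the Python standard library and assumes the audit event has already been flattened into a dictionary; the event keys, the function names, and the placeholder URL are illustrative assumptions, not part of any specific product.

```python
import json
import urllib.request


def build_payload(event: dict) -> dict:
    """Turn a flat audit event into a webhook payload with a `text` field."""
    lines = [f"{key} : {value}" for key, value in event.items()]
    return {"text": "\n".join(lines)}


def build_request(webhook_url: str, event: dict) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP POST for the webhook."""
    body = json.dumps(build_payload(event)).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_event(webhook_url: str, event: dict, timeout: float = 5.0) -> int:
    """Send the event to the channel and return the HTTP status code."""
    request = build_request(webhook_url, event)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical event; the URL is a placeholder, so nothing is sent here.
event = {"AWS API": "AuthorizeSecurityGroupIngress", "IPRange": "0.0.0.0/0"}
request = build_request("https://hooks.example.invalid/placeholder", event)
print(request.data.decode("utf-8"))
```

Separating payload construction from the network call keeps the formatting logic testable without posting to a real channel.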
ChatOps, Slack, and Alcide
Recently, Alcide announced a new release of its cloud-native security platform for modern data center and cloud environments. As a key part of this release, every alert detected by the Alcide platform can be sent to a dedicated Slack channel in real time.
The Slack notification contains relevant data about the alert and a link to the Alcide console which provides a filtered view of the specific alert.
Here is an alert example:
New Alert detection:
Detection Time: 2018-09-06 14:30:49.027 GMT
Alert Type: Behavior Anomaly – Hostname Reputation
Alert Info: Traffic to a hostname in cryptomining domains reputation feed.
Entity UID: 5a2bd61a-8831-11e8-9aa0-06d0678add18
Entity Type: Node
Detection Reasons: Traffic to hostname in cryptomining domains reputation feed was observed over the last 60 seconds.
The integration provides better visibility and observability to the security team, who can get notifications, review them, and further investigate all the security threats for their environment in real time and in one place.
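Because the alert above arrives as "Key: Value" lines, a team could parse it for further triage inside the channel. The following is a rough sketch based only on the layout of the example alert; it is not Alcide's API or message schema, and the function names and the keyword check are assumptions for illustration.

```python
def parse_alert(message: str) -> dict:
    """Split 'Key: Value' lines into a dictionary, skipping bare headers."""
    fields = {}
    for line in message.splitlines():
        # Split on the first colon only, so timestamps stay intact.
        key, separator, value = line.partition(":")
        if separator and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def mentions_cryptomining(fields: dict) -> bool:
    """Hypothetical triage rule: flag alerts that reference cryptomining."""
    return "cryptomining" in fields.get("Alert Info", "").lower()


sample = """New Alert detection:
Detection Time: 2018-09-06 14:30:49.027 GMT
Alert Info: Traffic to a hostname in cryptomining domains reputation feed.
Entity Type: Node"""

alert = parse_alert(sample)
print(alert["Entity Type"], mentions_cryptomining(alert))
```

A header line such as "New Alert detection:" has no value after the colon, so it is dropped rather than stored as an empty field.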
ChatOps is building strong cultures across organizations by inspiring better behaviour. This form of communication is public and fosters knowledge sharing. It also encourages accessibility, as it allows users to query information for various systems. Better visibility enables the broader group to see all the changes happening across an environment. These benefits shape the goals of an organization and facilitate their achievement. The more people can collaborate, the more creative they get. In that process, organizations can rise to new heights.