7 Metrics DevSecOps Teams Need to Be Monitoring

Some of the basic questions that an effective metric measurement plan can answer include:

  • Is the software operating properly?
  • Is the organization achieving its business goals?
  • Is the application secure?
  • Is the infrastructure able to support the demands of the application?
  • Can it be effectively scaled up to meet future requirements?

However, it can also give you insight that can be applied to future projects. For example, positive, consistent metrics can indicate that a process is reliable and repeatable - or not, if your metrics show unexpected variations - or identify the best points in a product's lifecycle to make changes.


7 key DevSecOps metrics to put at the heart of your monitoring

You won't be able to gain any of this insight unless you're monitoring and measuring the right DevSecOps KPIs. With this in mind, here are seven you shouldn’t ignore.

  1. Availability: This measures the uptime or downtime of the application, either as a percentage or as a time value. It is the most critical indicator of the solution's reliability, taking into account both planned and unplanned maintenance. Ideally, it should be as close to 100% (or zero minutes of downtime) as possible. As well as flagging any issues with the application's resilience, it relates directly to customer service level agreements, which usually specify a minimum level of availability.

  2. Issue resolution time: If you do have downtime to fix a problem or make an update, the time taken to resolve the issue and get back online is vital. This shows how effective your team is at identifying a problem, devising a solution and deploying it. Long resolution times could indicate there are too many complexities or variables to consider, while fast fixes are essential for remedying security vulnerabilities.

  3. Application deployment frequency: This looks at how often you make deployments into production within a given time frame. It isn't a metric you can take in isolation, though - what counts as success depends on context. For example, if you already have a proven and tested product, a low frequency may be desirable, as it indicates a stable platform that needs less downtime. For new applications, however, you may want a high frequency so you can make changes or additions on a more regular basis.

  4. Application change time: Used in conjunction with deployment frequency, change time monitors how long it takes between code being committed and the changes appearing in production. This gives you clear insight into the efficiency of the software development pipeline - including the time taken to build, test and release an update. Generally speaking, shorter time frames indicate a more efficient environment, but you'll need to look at this alongside other metrics to ensure you aren't sacrificing quality for speed.

  5. Change failure rate: One way to tell whether your DevSecOps processes are functioning well is to look at the overall failure rate for changes. How often do deployments to production have to be recalled or changed because they don't function as intended, or because they introduce new vulnerabilities into an application? A higher than normal failure rate could be a sign that something isn't working within your team, whether that is communication, confusion over responsibilities or unclear operational goals.

  6. Time to patch: This metric covers the time between identifying a vulnerability and deploying a fix to production. While similar to issue resolution time, it is especially relevant to DevSecOps teams because it offers more granular insight into your team's ability to troubleshoot security problems, as opposed to other bugs and defects. If you're making security a priority, it can tell you whether your team has the skills and resources needed to succeed.

  7. Time to value: Finally, time to value reflects the actual business outcome of the DevSecOps process - namely, how long does it take for an application feature to go from the planning stage to actually delivering a benefit to end users in production? This can be hard to define, which is why it's important to set out clear, relevant business goals for every project, such as generating additional revenue, adding new functionality or making the business more competitive.
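As a concrete illustration, two of the metrics above - availability and change failure rate - can be computed directly from downtime and deployment counts. This is a minimal sketch; the record shapes and numbers are invented for the example, not taken from any particular tool.

```python
from datetime import timedelta

def availability_pct(period: timedelta, downtime: list[timedelta]) -> float:
    """Uptime as a percentage of the whole reporting period (metric 1)."""
    down = sum(downtime, timedelta())
    return 100.0 * (period - down) / period

def change_failure_rate(deployments: int, failed: int) -> float:
    """Share of production deployments recalled or hot-fixed (metric 5)."""
    return 100.0 * failed / deployments if deployments else 0.0

# Hypothetical month: two outages totalling 90 minutes,
# and 2 failed deployments out of 40.
month = timedelta(days=30)
outages = [timedelta(minutes=60), timedelta(minutes=30)]
print(round(availability_pct(month, outages), 3))  # 99.792
print(change_failure_rate(40, 2))                  # 5.0
```

Reporting both the percentage and the raw downtime value is useful, since SLAs are often written in terms of one or the other.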


What Are Metrics, and Why Do We Need Them?

Software metrics enable stakeholders in the development of software—developers, security personnel, operations personnel, development teams, and executives—to learn the key things they need to know about software projects, answering such questions as the following:

  • Is the service delivering value to the users?
  • Is the service operating properly?
  • Is the organization achieving its business goals?
  • Is the service secure?
  • Is the infrastructure able to support throughput, memory constraints, and other requirements?
  • Is the service being attacked?
  • Can future needs be supported?
  • What will be the cost and risk of adding new features?

Data that is generated by the DevSecOps methodology can help provide answers to these and similar questions.

Metrics are measurements of system properties or performance that inform decisions. They can be used to understand what happened or what might happen in the future. They help to determine such things as

  • if the process is stable
  • if the process is capable
  • if goals are being met
  • how alternative processes, tools, or products compare
  • how to manage change
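The first of these determinations - whether a process is stable - is commonly made with control limits. As an illustrative sketch (the 3-sigma rule and the sample values are assumptions, not something the text prescribes), new lead-time observations can be checked against limits derived from a known-good baseline window:

```python
import statistics

def control_limits(baseline: list[float], sigmas: float = 3.0) -> tuple[float, float]:
    """Lower/upper control limits computed from a window of in-control samples."""
    center = statistics.mean(baseline)
    spread = statistics.pstdev(baseline)
    return center - sigmas * spread, center + sigmas * spread

def out_of_control(sample: float, limits: tuple[float, float]) -> bool:
    """True if a new observation falls outside the control limits."""
    low, high = limits
    return not (low <= sample <= high)

# Hypothetical lead times (hours) from a period when the process was stable
baseline = [4.1, 3.9, 4.3, 4.0, 4.2]
limits = control_limits(baseline)
print(out_of_control(4.4, limits))  # False: within normal variation
print(out_of_control(9.5, limits))  # True: worth investigating
```

A real control chart would also apply run rules (e.g., trends and shifts), but the idea is the same: stable means observations stay within expected variation.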

Limitations of Existing DevSecOps Metrics

Studies have identified four key metrics that support software development and delivery performance. Two relate to tempo and two to stability.

Tempo

  • deployment frequency
  • lead time from commit to deploy

Stability

  • mean time to recover from downtime (mean time to restore [MTTR]) or mean time between failures (MTBF)
  • change failure rate or percentage
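Given timestamped deployment records, these four metrics can be derived mechanically. The tuple layout below is an assumption made for the sketch: commit time, deploy time, whether the deployment failed, and the restoration time for failures.

```python
from datetime import datetime
from statistics import mean

# Hypothetical records: (commit time, deploy time, failed?, restore time or None)
deploys = [
    (datetime(2024, 5, 1, 9),  datetime(2024, 5, 1, 15), False, None),
    (datetime(2024, 5, 3, 10), datetime(2024, 5, 3, 20), True,
     datetime(2024, 5, 3, 21)),
    (datetime(2024, 5, 7, 8),  datetime(2024, 5, 7, 12), False, None),
]

# Tempo
deployment_frequency = len(deploys)  # per reporting window
lead_time = mean((dep - com).total_seconds() / 3600 for com, dep, _, _ in deploys)

# Stability
change_failure_rate = sum(failed for _, _, failed, _ in deploys) / len(deploys)
mttr_hours = mean(
    (fix - dep).total_seconds() / 3600
    for _, dep, failed, fix in deploys if failed
)

print(deployment_frequency)           # 3
print(round(lead_time, 2))            # 6.67 hours, commit to deploy
print(round(change_failure_rate, 2))  # 0.33
print(mttr_hours)                     # 1.0 hour
```

MTBF, the alternative stability measure, would instead be computed from the intervals between successive failure timestamps.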

In our work with customers on DevSecOps, we have observed that focusing only on the tempo and stability metrics can result in an accumulation of technical debt and insufficient security and operations practices.

The U.S. General Services Administration (GSA) provides a larger set of metrics to measure success at implementing DevSecOps. The table below lists the GSA’s set of high-value DevSecOps metrics. These include the above four key metrics.

| Metric | Description | Associated Domain |
| --- | --- | --- |
| Deployment frequency | Number of deployments to production in a given time frame | Application deployment; authority to operate (ATO) processes |
| Change lead time (for applications) | Time between a code commit and production | Overarching; ATO processes; patch management |
| Change volume (for applications) | Number of user stories deployed in a given time frame | Overarching |
| Change failure rate | Percentage of production deployments that failed | Application development |
| MTTR (for applications) | Time between a failed production deployment and full restoration of production operations | Application deployment; backup and data lifecycle management; patch management |
| Availability | Amount of uptime/downtime in a given time period, in accordance with the service-level agreement | Availability and performance management; network management |
| Customer issue volume | Number of issues reported by customers in a given time period | Overarching |
| Customer issue resolution time | Mean time to resolve a customer-reported issue | Overarching |
| Time to value | Time between a feature request (user story creation) and realization of business value from that feature | Overarching; ATO processes |
| Time to ATO | Time from the beginning of Sprint 0 to achieving an ATO | Overarching; ATO processes |
| Time to patch vulnerabilities | Time between identification of a vulnerability in the platform or application and successful production deployment of a patch | ATO processes |

The DevSecOps pipeline tracks key transitions and events, such as bug reports submitted, change requests submitted, code commitments made, builds, tests, deployment, and operating failures and recoveries. In DevSecOps, the system is continuously monitored for assurance by tracking application usage and latency, and the volume and sources of network traffic. Sources for data include

  • change-request systems
  • bug-tracking systems
  • documentation of peer reviews
  • source revision control and configuration management
  • build, test, and deployment platforms (i.e., change or release management)
  • static and dynamic testing tools
  • outputs of the project-planning system
  • application-monitoring tools

Table 2 below demonstrates what such DevSecOps instrumentation for measurement might entail.

| When | Tool | Measures (always include timestamp) |
| --- | --- | --- |
| Change request | Change-control system | Request ID |
| Bug report | Issue/defect repository | Issue ID, description |
| Code check-in | Source revision control | Change size, changes addressed |
| Static analysis | Static-analysis tool and scripts | Volume of code, findings |
| Build | Scripts, continuous integration | Success/failure |
| Test | Scripts, test environments | Number of tests, number of failures |
| Deployment | Scripts, continuous delivery/deployment | Success/failure |
| System failure | Application performance monitoring | System downtime, logs |

Table 2: Automated Measurement-Data Sources
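The rows of Table 2 share one shape: an event kind, the tool that produced it, its measures, and a timestamp. A minimal sketch of such an instrumentation record follows; all names and field choices here are illustrative, not a real tool's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class MeasurementEvent:
    """One row of instrumentation data: every event carries a timestamp."""
    kind: str        # e.g. "build", "test", "deployment"
    source: str      # tool that emitted the event, e.g. "ci"
    measures: dict   # tool-specific values, e.g. {"success": True}
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

log: list[MeasurementEvent] = []

def record(kind: str, source: str, **measures) -> None:
    """Append a timestamped measurement event to the shared log."""
    log.append(MeasurementEvent(kind, source, measures))

# Events as they might arrive from the pipeline stages in Table 2
record("build", "ci", success=True)
record("test", "ci", tests=128, failures=3)
record("deployment", "cd", success=True)

# Derived metric: test failure rate for this pipeline run
tests = next(e for e in log if e.kind == "test")
print(tests.measures["failures"] / tests.measures["tests"])  # 0.0234375
```

Keeping every event in one uniform, timestamped form is what makes the cross-cutting metrics (lead time, MTTR, and so on) cheap to derive later.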

For the development (Dev) component of DevSecOps, instrumentation could provide a wealth of useful information. Examples include the following:

  • The time at which tools are instantiated indicates when work actually began, providing empirical evidence of how faithfully plans and schedules were followed.
  • The time of code check-ins indicates when reviews took place, when code was completed, and when bugs were fixed.
  • When builds and integrations are executed indicates the number of times that builds were necessary to fix bugs.
  • When automated tests are triggered indicates how many issues were found and resolved.
  • When deployment was triggered indicates that the product is complete and ready for use.

Security (Sec) in operations could be measured by, for example, the types of evaluations performed and the weaknesses identified.

For operations (Ops), we would look at events and events over time. Use indicates value to the user. Resource limitations and failures indicate lost revenue or opportunity. Other events, such as network activity, can indicate adversarial probing for weakness or denial-of-service attacks that can be fed back into attack-surface analysis.

Key measures for DevSecOps focus on continuously monitoring the overall health of the system.

  • Lead time—indicates responsiveness and will be affected by quality, productivity, and resource utilization.
  • Deployment frequency—how often a new release is deployed into production.
  • Availability and time to recovery—indicate reliability.
  • Production failure, operational errors, and rework—indicate quality issues.
  • Exploits and attacks—lagging indicator of the security of a system.

Process indicators such as these can be leading indicators that predict outcomes. When indicators suggest a problem, other metrics can be used to identify the root cause.


info: devsecopsguides




