
Manage flaky tests

Azure DevOps Services

Developer productivity depends on tests that find real problems in the code being developed or updated, quickly and reliably. Flaky tests get in the way of finding real problems, because their failures often don't relate to the changes being tested. A flaky test is a test that produces different outcomes, such as pass or fail, even when there are no changes in the source code or execution environment. Flaky tests also affect the quality of shipped code.

Note

This feature is only available on Azure DevOps Services. Typically, new features are introduced in the cloud service first, and then made available on-premises in the next major version or update of Azure DevOps Server. For more information, see Azure DevOps Feature Timeline.

The goal of bringing flaky test management in-product is to reduce developer pain caused by flaky tests and to cater to the whole workflow. Flaky test management provides the following benefits.

  • Detection - Automatic detection of flaky tests through reruns, or extensibility to plug in your own custom detection method

  • Management of flakiness - Once a test is marked as flaky, the data is available for all pipelines for that branch

  • Report on flaky tests - Ability to choose if you want to prevent build failures caused by flaky tests, or use the flaky tag only for troubleshooting

  • Resolution - Manual bug creation, or manual marking and unmarking of tests as flaky based on your analysis

  • Close the loop - Reset a flaky test as a result of bug resolution or manual input

Flaky lifecycle

Enable flaky test management

To configure flaky test management, choose Project settings, and select Test management in the Pipelines section.

Slide the On/Off button to On.

Screenshot of Test Management, Flaky test detection enabled, System detection.

The default setting for all projects is to use flaky tests for troubleshooting.

Note

Switching between detection types (system and custom) is inherently disruptive, because all flakiness history stored in Azure DevOps is erased during the transition.

Flaky test detection

Flaky test management supports system and custom detection.

  • System detection: Azure DevOps has a built-in mechanism for detecting flaky tests. This involves rerunning failed tests within the same pipeline execution. If a test case fails initially but passes on a rerun, it is marked as flaky. This detection is tightly coupled with the VSTest task, which reruns failed tests within the same task execution. Another method involves rerunning failed jobs in the pipeline (manually by clicking on "rerun failed jobs" in any pipeline run). If a test passes in the rerun, it is marked as flaky.

    Note

    Once a test is marked as flaky, the data is available for all pipelines for that branch to assist with troubleshooting in every pipeline.

  • Custom detection: This approach allows external systems to integrate their own logic for detecting flaky tests and rely on Azure DevOps for consistent tracking and management. Communication with Azure DevOps is enabled using the Result Meta Data - Update API. The API requires a Test Case Reference ID, a flag indicating whether the test is considered flaky, and the repository branch where the flakiness was observed. You can get the Test Case Reference ID from the Get Test Result By Id API. Once this information is sent to Azure DevOps, the system stores and propagates the flakiness status for that test case in subsequent pipeline runs. After a test is marked as flaky, Azure DevOps continues to treat it as flaky until it is manually unmarked.

Screenshot of Test Management, Flaky test detection enabled, Custom detection.
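As a rough illustration of custom detection, the following Python sketch assembles a request body that carries the three inputs named above: the Test Case Reference ID, the flaky flag, and the branch. The function name `build_flaky_update` is hypothetical, and the JSON key names (`testCaseReferenceId`, `flakyIdentifiers`, `branchName`, `isFlaky`) are assumptions that should be checked against the Result Meta Data - Update REST reference before use.

```python
import json
from typing import Any


def build_flaky_update(
    test_case_reference_id: int, branch_name: str, is_flaky: bool
) -> list[dict[str, Any]]:
    """Build a request body with the three inputs the API is described as needing.

    The key names below are assumptions, not confirmed by this article.
    """
    if test_case_reference_id <= 0:
        raise ValueError("test_case_reference_id must be a positive integer")
    if not branch_name:
        raise ValueError("branch_name must not be empty")
    return [
        {
            "testCaseReferenceId": test_case_reference_id,
            "flakyIdentifiers": [
                {"branchName": branch_name, "isFlaky": is_flaky},
            ],
        }
    ]


# Example values only: 1234 and refs/heads/main are placeholders.
body = build_flaky_update(1234, "refs/heads/main", True)
print(json.dumps(body, indent=2))
```

An external detector would send a body like this with an authenticated HTTP client. The endpoint URL and API version are not given in this article, so take them from the REST reference rather than from this sketch. Passing `False` for the flag is how the same call would unmark a test.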

Flaky test options

The Flaky test options control how flaky tests appear in test reporting and which resolution capabilities are available, as described in the following sections.

Flaky test management and reporting

On the Test management page under Flaky test options, you can set options for how flaky tests are included in the Test Summary report. Flaky test data for both passed and failed tests is available in Test results. The Flaky tag helps you identify flaky tests. By default, flaky tests are included in the Test Summary. However, if you want to ensure flaky test failures don't fail your pipeline, you can choose not to include them in your test summary and to suppress the test failure. This option ensures flaky tests (both passed and failed) are removed from the pass percentage and shown in Tests not reported, as shown in the following screenshot.

Flaky Reporting
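To make the effect on the pass percentage concrete, here is a small Python sketch of the arithmetic described above. It is an illustration only, not how Azure DevOps computes the report: `ResultRecord` and `summarize` are hypothetical names, and the sample results are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResultRecord:
    name: str
    passed: bool
    flaky: bool


def summarize(results: list[ResultRecord], exclude_flaky: bool) -> dict[str, float]:
    """Count reported tests and compute the pass percentage.

    With exclude_flaky=True, flaky tests (passed or failed) are left out of
    the percentage and counted as not reported instead.
    """
    reported = [r for r in results if not (exclude_flaky and r.flaky)]
    passed = sum(1 for r in reported if r.passed)
    pass_percentage = 100.0 * passed / len(reported) if reported else 0.0
    return {
        "reported": len(reported),
        "not_reported": len(results) - len(reported),
        "pass_percentage": pass_percentage,
    }


results = [
    ResultRecord("test_login", passed=True, flaky=False),
    ResultRecord("test_logout", passed=True, flaky=False),
    ResultRecord("test_profile", passed=True, flaky=False),
    ResultRecord("test_cache_eviction", passed=False, flaky=True),
    ResultRecord("test_retry_backoff", passed=True, flaky=True),
]

included = summarize(results, exclude_flaky=False)  # pass_percentage 80.0
excluded = summarize(results, exclude_flaky=True)   # pass_percentage 100.0, 2 not reported
print(included)
print(excluded)
```

In this sample, the single flaky failure pulls the pass percentage down to 80 when flaky tests are included. When they are excluded, both flaky tests move to the not-reported count and the remaining three tests give 100.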

Note

The Test summary report is updated only for the Visual Studio Test and Publish Test Results tasks. You may need to add a custom script to suppress flaky test failures for other scenarios.
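One possible shape for such a custom script is sketched below in Python. It assumes the test runner writes a JUnit-style XML report and that the team keeps its own list of known flaky tests. Both are assumptions: `KNOWN_FLAKY`, the test names, and the sample report are hypothetical, and nothing here reads flakiness data from Azure DevOps.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-maintained list of tests the team treats as flaky.
KNOWN_FLAKY = {"tests.test_cache::test_eviction"}

# Invented sample report in a JUnit-style layout.
SAMPLE_REPORT = """<testsuite name="demo" tests="3">
  <testcase classname="tests.test_cache" name="test_eviction">
    <failure message="timeout"/>
  </testcase>
  <testcase classname="tests.test_api" name="test_login"/>
  <testcase classname="tests.test_api" name="test_logout"/>
</testsuite>"""


def failed_tests(junit_xml: str) -> set[str]:
    """Return 'classname::name' for every test case with a failure or error."""
    root = ET.fromstring(junit_xml)
    failed = set()
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            failed.add(f"{case.get('classname')}::{case.get('name')}")
    return failed


def exit_code(junit_xml: str, known_flaky: set[str]) -> int:
    """Return 1 only if a test outside the known-flaky list failed."""
    real_failures = failed_tests(junit_xml) - known_flaky
    return 1 if real_failures else 0


print(exit_code(SAMPLE_REPORT, KNOWN_FLAKY))  # flaky failure is suppressed
print(exit_code(SAMPLE_REPORT, set()))        # same failure now fails the step
```

A pipeline step could pass the result to `sys.exit` so that only failures outside the flaky list fail the job. Suppressing failures this way hides them from the job outcome, so the flaky list should stay short and be reviewed regularly.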

Tests marked as flaky

You can mark or unmark a test as flaky based on analysis or context by choosing Flaky (or UnFlaky, depending on whether the test is already marked as flaky).

Mark flaky Test

When a test is marked flaky or unflaky in a pipeline, no changes are made in the current pipeline run. The changed flaky setting is evaluated only on future executions of that test. Tests marked as flaky have the Marked flaky tag in the user interface.

Confirm flaky Test
