Over recent months, many of our customers have asked for our guidance on building their test suites properly. Performance, ease of management, and scalability were key priorities for them. We decided to share our suggestions with a broader audience.
We came up with this list while building our own test suite. Since we strongly believe in dogfooding, Testim is tested using Testim. We also learned a lot from working with a variety of companies, from small startups to large enterprises.
Rule 1: Prioritize
Most apps include thousands of scenarios. Testing functionality and correctness is merely one step on the way to shipping high-quality software and meeting users' expectations. There's performance testing (speed and responsiveness in a normal state or under load/stress), visual testing (checking that 1+1 = 2 is one thing, but visual bugs can still make for a very bad experience), and so on.
Start by listing the most important user flows, for example login, add to cart, checkout, etc. Order the list so that the flows most vital for user conversion come first.
Try to sort by first adding those that must succeed, e.g. create account, checkout, and payment. You can use your analytics tools (e.g. Google Analytics, Mixpanel, Kissmetrics, Woopra) to find the most common scenarios. We also found Segment.io to be useful.
Rule 2: Reduce, Recycle, Reuse
The next step is to break the user scenarios down into simple, single-purpose, almost atomic flows, e.g. “Create user,” “Login,” “Send email,” etc. Create a test for each of these flows. When completed, compare this to the user stories.
Naming convention – Defining a good naming convention is an important part of keeping the suite manageable. For example, including the component name in the flow name gives the suite a clear structure (e.g. “Account.login” and “Account.logout”).
Reuse components; don’t copy/paste – to inexperienced developers/testers, copy/paste looks like reuse. The challenge, however, is updating those steps when the flow changes: even a single step that repeats in hundreds of tests is a huge hassle to maintain.
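For readers using a code-based framework, here is a minimal sketch (Selenium WebDriver in JavaScript, with hypothetical URLs and locators) of what "reuse instead of copy/paste" looks like: the login flow lives in one shared function that every test imports.

```javascript
// A sketch of a shared, single-purpose "login" step, reused by any test that
// needs a logged-in user. The URL and locators are hypothetical placeholders.
const { Builder, By, until } = require('selenium-webdriver');

async function login(driver, username, password) {
  await driver.get('https://example.com/login');               // assumed URL
  await driver.findElement(By.id('username')).sendKeys(username);
  await driver.findElement(By.id('password')).sendKeys(password);
  await driver.findElement(By.css('button[type="submit"]')).click();
  // Wait until the app signals that the login completed.
  await driver.wait(until.elementLocated(By.css('.dashboard')), 10000);
}

module.exports = { login };
```

When the login flow changes, you update this one function instead of hundreds of copied step sequences.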
Rule 3: Create Structured, Single-Purpose Tests
Single purpose tests verify one thing only!
Single-purpose tests are easier to create and maintain, and they also run faster than longer, more complex ones. Ideally, a test should fail only if the feature it tests is broken, or one of its dependencies (which we’ll try to minimize). If you test several features in one test and the first one fails, you learn nothing about the other features, since you never reached them.
A test should consist of three parts:
- Setup – e.g. login
These steps get the AUT (application under test) to the state required for testing. This varies by test: sometimes you only require a login; other times you require a complex structure to pre-exist.
Note: Premature optimization is not a good practice. Implementing the login flow (user/pass fields) is an easy way to start, and if you reuse the same login in all your tests, it won’t be difficult to change the implementation later on to a quicker one (e.g. by using cookies). There’s no need to start with the speed optimization if you know it’s easily fixed.
- Actions – the flow you want to test.
Continuing with our “login” example, unlike the setup steps, which only require the login to complete successfully (no matter how), testing the “login” itself is a different task and should not be reused.
- Validations – validate UI state, validate the DB – the result you expect.
There are different kinds of validations. Which you use depends greatly on whether you’re testing data that is updated from production on a weekly basis, or whether it’s the same test data every run.
In UI testing, the basic validations are done on elements in the UI (a short code sketch follows this list):
- Validate an element is visible.
- Validate an element contains some text.
- Validate an element is in a specific state.
- Visual validation – validate the element contains the expected text and image and has the correct styling (e.g. font, layout).
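Here is a minimal sketch of the first three validations in code (Selenium WebDriver in JavaScript with Node's assert module; locators are hypothetical). Visual validation usually needs a dedicated screenshot-comparison tool, so it is not shown here.

```javascript
// Basic UI validations on a single element: visibility, text content, and state.
const assert = require('assert');
const { By, until } = require('selenium-webdriver');

async function validateWelcomeBanner(driver, expectedName) {
  const banner = await driver.wait(until.elementLocated(By.css('.welcome-banner')), 5000);

  // Validate the element is visible.
  assert.ok(await banner.isDisplayed(), 'welcome banner should be visible');

  // Validate the element contains some text.
  const text = await banner.getText();
  assert.ok(text.includes(expectedName), `banner should greet ${expectedName}`);

  // Validate the element is in a specific state (e.g. enabled).
  assert.ok(await banner.isEnabled(), 'banner should be enabled');
}
```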
Using a predefined dataset has the advantage of making the validations easier: when you know the exact setup state, you also know the precise expected state after the action steps have completed.
If the dataset is updated, you get different scenarios you didn’t expect (as different users use your application differently), making failures harder to debug (the scenario might not be reproducible).
Example: A test for creating an account and logging in could look like this (a code sketch follows the list):
- Set initial state:
- Open AUT
- Navigate to the signup page
- Actions:
- Choose a random username and save it to a variable named username, e.g. “user” + Date.now() + “@mailinator.com”
- Sign up for a new account using username
- Set username to username
- Set random password
- Click submit
- Validations:
- Open mailinator.com
- Navigate to “inbox”
- Look for email to username
- Check that there is one email with the expected link.
- Click on the link
- Log in with username and password
- Validate login was successful
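A minimal sketch of this test in code (Selenium WebDriver in JavaScript). The URLs and locators are hypothetical, and the Mailinator inbox check is shown only as a comment because its exact steps depend on the mail provider's UI.

```javascript
// Setup, Actions, Validations for a signup-then-login test.
const assert = require('assert');
const { Builder, By, until } = require('selenium-webdriver');

(async function signupTest() {
  const driver = await new Builder().forBrowser('chrome').build();
  try {
    // Setup: open the AUT and navigate to the signup page.
    await driver.get('https://example.com');                  // assumed AUT URL
    await driver.findElement(By.linkText('Sign up')).click();

    // Actions: sign up with a random, unique username.
    const username = 'user' + Date.now() + '@mailinator.com';
    const password = 'Pa55-' + Date.now();
    await driver.findElement(By.id('email')).sendKeys(username);
    await driver.findElement(By.id('password')).sendKeys(password);
    await driver.findElement(By.css('button[type="submit"]')).click();

    // Validations: here you would open the Mailinator inbox for `username`,
    // assert that exactly one confirmation email arrived, and follow its link.
    // Then log in and validate the login succeeded.
    await driver.get('https://example.com/login');
    await driver.findElement(By.id('email')).sendKeys(username);
    await driver.findElement(By.id('password')).sendKeys(password);
    await driver.findElement(By.css('button[type="submit"]')).click();
    const dashboard = await driver.wait(until.elementLocated(By.css('.dashboard')), 10000);
    assert.ok(await dashboard.isDisplayed(), 'user should land on the dashboard after login');
  } finally {
    await driver.quit();
  }
})();
```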
Note: One of the biggest differences between unit tests and end-to-end tests is the overhead. Opening a browser and logging in to the AUT used to be extremely slow, pushing testers to cram many checks into a single test after reaching the setup state in order to save valuable time. Today, with Docker, booting up a browser is much faster, and faster browsers and quick-loading applications minimize the overhead of writing end-to-end tests.
Rule 4: Tests’ Initial State Should Always be Consistent
Since automated tests always repeat the same set of steps, they should always start from the same initial state. One of the most common maintenance issues with automated tests is ensuring the integrity of that initial state: since automated tests depend on it, if it is not consistent the test results won’t be, either. The initial state is usually derived from the user’s previous actions.
For example:
- Adding an item to a list that already contains it (from a previous test run).
- Views that behave differently for new users and returning users.
- The application redirects automatically if the user is already logged in.
Possible solutions:
- Create a new user for each test run (see the sketch after this list).
- Use a dedicated app environment for test automation, as opposed to using your production one.
- Seed your application with initial data before each suite run, also known as fixtures.
- Use conditions on steps/groups to handle dual cases in tests whose state is hard to set up.
- Eliminate dependencies between tests. The AUT’s state after “Test_1” shouldn’t affect the flow of “Test_2”.
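A minimal sketch of two of these solutions in JavaScript: creating a unique user per test run, and seeding the application with fixture data before the suite. The seed endpoint and its payload are hypothetical; many teams expose such an endpoint only in the test environment.

```javascript
const assert = require('assert');

// 1. A fresh identity for every run avoids leftovers from previous runs.
function freshUser() {
  const stamp = Date.now();
  return { email: `user${stamp}@mailinator.com`, password: `Pa55-${stamp}` };
}

// 2. Seed fixtures before the suite so every test starts from the same state.
//    Uses Node 18+'s built-in fetch; the endpoint is an assumed test-only API.
async function seedFixtures(baseUrl) {
  const res = await fetch(`${baseUrl}/test-api/seed`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ products: ['shirt', 'hat'], emptyCarts: true }),
  });
  assert.ok(res.ok, 'seeding the test environment should succeed');
}

module.exports = { freshUser, seedFixtures };
```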
Rule 5: Compose Complex Tests from Simple Steps
Complex tests should emulate real user scenarios. Prefer composing those tests from simple, already-tested parts (shared steps) instead of recording the entire scenario directly. Since you have already covered all the simple actions in the simple tests, the complex tests should only fail on integration issues (this is where Rule No. 2 comes in handy).
Make sure to order your suite so the simple tests run first. When you run the whole suite, that gives you an early indication of what is wrong when a complex test fails.
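A minimal sketch of composing a complex scenario from shared steps rather than re-recording it. The `login`, `addToCart`, and `checkout` modules are hypothetical shared flows like the one shown under Rule 2.

```javascript
// A complex purchase scenario built from simple, already-tested shared steps.
const { login } = require('./shared/login');        // assumed shared-step modules
const { addToCart } = require('./shared/cart');
const { checkout } = require('./shared/checkout');

async function purchaseFlowTest(driver) {
  await login(driver, 'user@example.com', 'secret'); // covered by its own simple test
  await addToCart(driver, 'blue-shirt');             // covered by its own simple test
  await checkout(driver);                            // covered by its own simple test
  // If the simple tests pass and this test fails, the failure is an integration issue.
}
```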
Rule 6: Add Validations at Turnover Points
Validations are usually used at the end of the test to decide whether it passed or failed. It is also best to add them at points of major change, as checkpoints that stop the test if an action failed. Since a validation, like all steps, runs until it passes or its timeout is reached, it also serves as a mechanism to wait for long async operations in your app (e.g. fetching data from the server, continuous view rendering) before running the next steps.
Examples:
- On action completion – navigation, form submission, modal display, etc.
- After major DOM changes add validations to ensure you are in the new state before trying to play next steps.
- Following client-server communication that changes app state – e.g. a “data submitted successfully” message (see the sketch after this list).
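A minimal sketch of a validation used as a checkpoint after a turnover point (a form submission), using Selenium WebDriver in JavaScript. The locator and message text are hypothetical. Note how the wait doubles as a synchronization point: the test won't move on until the async server round-trip completes (or the timeout is reached).

```javascript
const { By, until } = require('selenium-webdriver');

async function submitProfileForm(driver) {
  await driver.findElement(By.css('form#profile button[type="submit"]')).click();

  // Checkpoint: wait for the success message before playing the next steps.
  const toast = await driver.wait(until.elementLocated(By.css('.toast-success')), 10000);
  await driver.wait(until.elementTextContains(toast, 'saved'), 10000);
}
```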
Rule 7: No Sleep to Improve Stability
Sleep with an arbitrary duration (what I call magic numbers) is one of the main sources of flaky tests. Although all apps are asynchronous in nature, only in rare cases does a human not know when the page is ready for the next action. The same should be true for automated tests.
The reason to avoid static sleeps is that you rarely know the load on the machine you run the test on. Only in performance testing do you have the machine to yourself (knowing for sure that all CPU and memory are yours). Since you probably use virtual machines or Docker, you will get timeouts if you set the sleep too short, or slow tests if you set it too long.
In most cases you add a wait-for-condition step, which waits just like a human would, or sometimes waits for some internal state (e.g. a JavaScript value).
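A minimal sketch contrasting a fixed sleep with a wait-for-condition, using Selenium WebDriver in JavaScript. The locator is hypothetical.

```javascript
const { By, until } = require('selenium-webdriver');

// Flaky: a fixed sleep is either too long (slow suite) or too short (timeouts).
// await driver.sleep(5000);

// Stable: wait for the condition a human would look for, up to a timeout.
async function waitForSearchResults(driver) {
  const results = await driver.wait(until.elementLocated(By.css('.search-results')), 10000);
  await driver.wait(until.elementIsVisible(results), 10000);
  return results;
}
```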
At Testim we provide all kinds of built-in wait-fors, ranging from waiting for an element to be visible, to waiting for specific text (you can also use variables and regex), to waiting for a visual change (e.g. a color change) or code-based conditions.
Rule 8: Use a Minimum of Two Levels of Abstraction
If your test is composed mostly of raw user interactions, such as clicks and set-text actions, you’re probably doing something wrong. If you have used low-level frameworks (e.g. Selenium), you may have heard of the PageObject design pattern, which recommends separating the business logic (e.g. login) from the low-level implementation (e.g. set username, set password, and click the login button). This is true for Testim as well. Having layers contributes to reuse, reduces maintenance, and allows other members of the organization to quickly understand what happens. Sometimes the testers write the tests but the developers run them (before merging new code), and the developers need to understand quickly what the test does and where the bug is.
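A minimal sketch of the two abstraction levels, in the spirit of the PageObject pattern (Selenium WebDriver in JavaScript; locators and the dashboard URL are hypothetical).

```javascript
const { By, until } = require('selenium-webdriver');

class LoginPage {
  constructor(driver) { this.driver = driver; }

  // Low level: how each field is located and filled.
  async setUsername(value) { await this.driver.findElement(By.id('username')).sendKeys(value); }
  async setPassword(value) { await this.driver.findElement(By.id('password')).sendKeys(value); }
  async submit() { await this.driver.findElement(By.css('button[type="submit"]')).click(); }

  // Business level: what the test actually means.
  async login(username, password) {
    await this.setUsername(username);
    await this.setPassword(password);
    await this.submit();
    await this.driver.wait(until.urlContains('/dashboard'), 10000);
  }
}

// A test then reads at the business level:
// await new LoginPage(driver).login('me@example.com', 'secret');
```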
Rule 9: Reduce the Occurrences of Conditions
A test should have as few conditions (if-statements) as possible. Tests with many conditions are usually unpredictable (you don’t know the exact state you’re in) or complex (you might even see loops, too). Try to simplify your tests by the following (a short sketch follows the list):
- Start your test at a predefined state;
- Disable random popups; and
- Disable random A/B testing, and choose a specific flow.
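A minimal sketch of removing one source of branching by pinning the app to a known state before the test starts. The query parameters and cookie name are hypothetical; the point is to choose one flow up front instead of branching on it inside the test.

```javascript
// Force a specific A/B variant and suppress promotional popups, if the app supports it.
async function openInPredictableState(driver, baseUrl) {
  await driver.get(`${baseUrl}/?abVariant=control&disablePopups=1`);           // assumed flags
  await driver.manage().addCookie({ name: 'ab_variant', value: 'control' });   // assumed cookie
}
```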
Rule 10: Write Independent and Isolated Tests
An important methodology of test authoring is creating self-contained, independent flows. This allows you to run tests with high parallelism, which is crucial for scaling the test suite. If, for example, you have 1,000 tests that run for a minute each, running them one by one will take more than 16 hours; at full concurrency, that time can be cut to about one minute.
Having a fast build reduces the feedback time between the moment developers commit their code and the moment they know their feature is production-ready, and it can lead to a true agile development cycle.
In Summary
Proper planning of your tests can make a big difference in your development process. It gives you the ability to get fast feedback, adapt tests to new user flows quickly, analyze the results, identify bugs, and get enough information to fix them quickly. Proper planning can also save a lot of time and expense. Suites that are not properly planned require high maintenance and keep the R&D team from building new functionality and increasing coverage. Finally, proper planning can help expand the responsibility for quality beyond the QA organization and let others contribute.
About the Author / Oren Rubin
Oren Rubin is CEO of testim.io. He has over 20 years of experience in the software industry, building mostly test-related products for developers at IBM, Wix, Cadence, Applitools, and Testim.io. In addition to being a busy entrepreneur, Oren is a community activist and the co-organizer of the Selenium-Israel meetup and the Israeli Google Developer Group meetup. He has taught at the Technion and mentored at the Google Launchpad Accelerator. Connect with him on LinkedIn and Twitter.