Testing is always about comparison: actual versus expected results. In Selenium we often have to compare two strings. It is a simple thing to do, but the real question is how to make our tests more reliable and get rid of flakiness.
That is why this article covers a couple of approaches to comparing strings so that your tests run reliably.
Simple string comparison
When you have very short strings you can apply the simplest approach.
Assuming we get the text from the page (actual results):
And here we have our expected results from the Cucumber examples table or from the API:
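A minimal sketch of this direct comparison could look like the following. The helper name `same_text?` and the sample values are made up for illustration; in a real test the actual value would typically come from an element's `text` and the expected value from the Cucumber table or the API response.

```ruby
# Hypothetical helper: strict, character-for-character comparison.
def same_text?(actual, expected)
  actual == expected
end

# Made-up values standing in for the page text and the expected data.
# In a real test `actual` might come from something like
# driver.find_element(:css, ".listing-title").text (selector is hypothetical).
actual   = "House for sale"
expected = "House for sale"

puts same_text?(actual, expected) # prints true
```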
All good so far: this is the simplest way of comparing two strings. It becomes unreliable, however, when you have longer strings, or when one source returns capitalised strings and the other does not.
In this case your tests will fail because:
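To illustrate the failure, here is a small sketch with made-up values (the helper name `strict_match?` is hypothetical). Ruby's `==` on strings is case-sensitive and whitespace-sensitive, so any difference in letter case or a stray trailing space makes the comparison return false.

```ruby
# Hypothetical helper wrapping a strict comparison.
def strict_match?(actual, expected)
  actual == expected
end

api_value = "house for sale"   # made-up API value
ui_value  = "House For Sale"   # made-up UI value, capitalised by the front end

puts strict_match?(ui_value, api_value)                   # prints false (case differs)
puts strict_match?("house for sale ", "house for sale")   # prints false (trailing space)
```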
So what do we do if the API returns the data set in one letter case and the UI formats it with capital letters?
We need to format the data before we start comparing it: we apply the same transformation to the API data set and to the UI data set.
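One possible way to apply the same transformation to both sides is sketched below. The helper names `normalize` and `same_after_normalizing?` are assumptions for illustration, and the transformation chosen here (trim plus lowercase) is only one option; use whatever normalisation matches your data.

```ruby
# Hypothetical normaliser: trims surrounding whitespace and lowercases the text.
def normalize(text)
  text.to_s.strip.downcase
end

# Both values go through the same transformation before being compared.
def same_after_normalizing?(actual, expected)
  normalize(actual) == normalize(expected)
end

puts same_after_normalizing?("House For Sale ", "house for sale") # prints true
```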
Simple as that. But we might still have to compare big data sets, such as paragraphs. We all know that comparing big paragraphs is not good practice in Selenium, but there are options if you still want to do it.
Usually the API returns the text embedded in HTML, while the text you get from the UI contains no HTML at all, so your tests will fail. Whitespace differences will make your tests fail as well.
Comparing paragraphs in Selenium
Option 1: using Nokogiri, we can parse the HTML and turn it into plain text, and then remove all whitespace from the string, so a string like “blah blah blah” becomes “blahblahblah”. Some will object that the text could have been missing its spaces from the very beginning, in which case you would never find the actual bug. That is true to a degree, which is why we will also discuss option 2.
Option 2 is the Levenshtein distance. Levenshtein distance (LD) is a measure of the similarity between two strings, which we will refer to as the source string (s) and the target string (t). The distance is the minimum number of deletions, insertions, or substitutions required to transform s into t.
From the distance we can derive a similarity percentage and decide how much difference we are willing to tolerate.
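The idea can be sketched in plain Ruby as follows. This is an illustrative implementation rather than the code from any particular gem; the names `levenshtein`, `similarity_percent`, and `similar_enough?`, the way the percentage is defined (relative to the longer string), and the 90% default tolerance are all assumptions.

```ruby
# Dynamic-programming Levenshtein distance, keeping only one previous row.
def levenshtein(source, target)
  s = source.chars
  t = target.chars
  return t.length if s.empty?
  return s.length if t.empty?

  previous = (0..t.length).to_a
  s.each_with_index do |s_char, i|
    current = [i + 1]
    t.each_with_index do |t_char, j|
      cost = s_char == t_char ? 0 : 1
      current << [
        previous[j + 1] + 1, # deletion
        current[j] + 1,      # insertion
        previous[j] + cost   # substitution (or match)
      ].min
    end
    previous = current
  end
  previous.last
end

# Similarity as a percentage of the longer string's length (an assumed definition).
def similarity_percent(source, target)
  longest = [source.length, target.length].max
  return 100.0 if longest.zero?

  (1.0 - levenshtein(source, target).to_f / longest) * 100
end

# The tolerance is yours to choose; 90.0 is only a placeholder default.
def similar_enough?(actual, expected, tolerance_percent = 90.0)
  similarity_percent(actual, expected) >= tolerance_percent
end

puts levenshtein("kitten", "sitting") # prints 3
```

A stricter tolerance catches more real differences but also brings back some of the flakiness, so the threshold is a trade-off to tune per test.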
Option 3 is for cases where two strings mean the same thing but are not equal when compared directly.
Here is such a situation: imagine the API returns standard data like “house for sale – 4 bedrooms”, but the UI developers apply some logic around it and swap the words in certain cases, so your string becomes “4 bedrooms house – for sale”. In meaning they are the same, but a direct comparison returns false.
So what do we do? This is where option 3 comes in.
We can build a histogram: a hash that maps each item found to the number of times it appears. A histogram exposes the relative popularity of the items in a data set, which is very useful for comparing strings like “house for sale – 4 bedrooms” and “4 bedrooms house – for sale”.
If the two histograms match we return true, otherwise false.
Here is how you can do it in code:
For this you need a new gem that helps you get the frequencies:
gem 'facets', '~> 3.1'
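As a rough sketch of the histogram comparison, the version below builds the word counts with core Ruby only, so it runs without extra gems; the facets gem is one way to get the same kind of frequency hash. The helper names `word_histogram` and `same_words?` are made up, and splitting into lowercase alphanumeric words is an assumption about what counts as an item.

```ruby
# Builds a histogram: each word mapped to the number of times it appears.
# Punctuation such as the dash is ignored; words are lowercased first.
def word_histogram(text)
  words = text.downcase.scan(/[[:alnum:]]+/)
  words.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }
end

# Two strings "match" when they contain the same words the same number of times,
# regardless of word order.
def same_words?(actual, expected)
  word_histogram(actual) == word_histogram(expected)
end

puts same_words?("house for sale - 4 bedrooms", "4 bedrooms house - for sale") # prints true
puts same_words?("house for sale - 4 bedrooms", "house for rent - 4 bedrooms") # prints false
```

Note that this check ignores word order entirely, so it should only be used where reordering is acceptable.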
Hope you find this useful. Happy testing!