Selenium use at PicScout

At PicScout, we believe that every step in our development workflow that can be automated should be. Testing is one of those steps. Automated testing allows us to implement new features quickly, because we can always verify that the product still works as we expect.
 
To automate our testing process, we chose to work with Selenium, among the other tools we use. Selenium is an open-source set of tools for automating browser-based applications across many platforms. It is mainly used for testing web applications, but is certainly not limited to that.
 
Our approach is that most of the automation should run locally, to keep the continuous integration environment as clean as possible. This is an important step on the way to continuous deployment. We therefore built a mechanism that enables running the automation, including Selenium, locally. Using this mechanism, our software engineers can ensure quality up to a certain level.
 
Each developer must first run the automation, including Selenium, on his/her local machine before pushing changes to the source control repository. To achieve this, each developer works in his/her own environment and doesn’t interfere with other developers; that way the Selenium testing environment stays “clean”. To make the developers’ lives easier, we developed a set of tools responsible for updating the DB and running the tests.
 
The tests use a dedicated DB that contains the relevant data; hence, before running them, the DB needs to be restored from a backup stored in the source control repository (Git). If a new Selenium test requires data changes, the tool also allows publishing the local DB back to the repository.
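As a rough sketch of what such a restore step might look like (the connection string, database name and backup path below are hypothetical, and our tool's actual implementation may differ):

using System.Data.SqlClient;

public static class TestDbRestorer
{
    // Hypothetical names - adjust to the actual environment.
    private const string ConnectionString = "Server=.;Database=master;Integrated Security=true";
    private const string DatabaseName = "SeleniumTestsDb";
    private const string BackupPath = @"C:\repo\db\SeleniumTestsDb.bak";

    public static void RestoreFromBackup()
    {
        using (var connection = new SqlConnection(ConnectionString))
        {
            connection.Open();
            // Kick out existing connections, restore the DB from the backup
            // that is versioned in Git, then reopen it for everyone.
            string sql = string.Format(
                "ALTER DATABASE [{0}] SET SINGLE_USER WITH ROLLBACK IMMEDIATE; " +
                "RESTORE DATABASE [{0}] FROM DISK = N'{1}' WITH REPLACE; " +
                "ALTER DATABASE [{0}] SET MULTI_USER;",
                DatabaseName, BackupPath);
            using (var command = new SqlCommand(sql, connection))
            {
                command.CommandTimeout = 600; // restores can take a while
                command.ExecuteNonQuery();
            }
        }
    }
}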
 
Furthermore, the Selenium tests are written on top of the NUnit framework. NUnit runs the tests in sequence, one after the other. This can be time consuming, since some tests could run in parallel as long as they don’t affect each other. Running our tests in sequence takes about an hour, while in parallel it takes only 10 minutes!
 
Therefore, we developed a tool that supports both serial and parallel modes for running tests. To run tests in parallel, we had to isolate each test from the others. That meant making sure the tests don’t share DB resources or information (including, but not limited to, stored procedures, tables, etc.). That way we know for certain that at any given time no test will disturb any other test, while still pushing performance to the limit and finishing all of the tests in minimal time, as opposed to running them sequentially.
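A minimal sketch of such a runner's core, assuming each fixture is fully isolated (the fixture names, the assembly name and the NUnit console invocation are illustrative, not our actual tool):

using System.Diagnostics;
using System.Threading.Tasks;

public static class AutomationRunner
{
    // Illustrative fixture list - in practice this would be discovered
    // from the test assemblies.
    private static readonly string[] Fixtures =
    {
        "Tests.LoginFixture",
        "Tests.SearchFixture",
        "Tests.UploadFixture"
    };

    public static void Run(bool parallel)
    {
        if (parallel)
        {
            // The fixtures share no DB state, so they can safely run side by side.
            Parallel.ForEach(Fixtures, RunFixture);
        }
        else
        {
            foreach (string fixture in Fixtures)
            {
                RunFixture(fixture);
            }
        }
    }

    private static void RunFixture(string fixture)
    {
        // A separate runner process per fixture gives each test its own
        // browser session and DB connection.
        using (var process = Process.Start("nunit-console.exe",
            string.Format("Tests.dll /fixture={0}", fixture)))
        {
            process.WaitForExit();
        }
    }
}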
 
That’s about it on how we use Selenium at PicScout.

PicScout’s development process at a glance

The role of software engineers

At PicScout, each software engineer (SE) takes full responsibility for a task. This is achieved by following these guidelines:
  • User stories (also referred to as tasks) are written and described in advance, either by the product owner or the R&D team, and inserted into a queue.
  • When an SE is available, they pull a user story (US) from the queue and start working on it.
  • The SE should read the US and verify that they understand it completely. For that, they may turn to the person who wrote the US, other group members, the group leader, the operations team, or anyone else who can help them understand it.
  • If the US is not well enough defined for the SE to start working on it, they should raise a flag and talk to the person who wrote it. In this case, the US may return to the queue to be reviewed again.
  • A US should not take more than 5 days. If we think a US will take longer, it should be divided into smaller user stories. It is up to the SE to decide how to divide it.
  • Once they start working on a US, they should set a due date for it (no more than 5 days away, as discussed above). This due date helps plan the team schedule and set milestones.
  • An SE should handle each US end to end: understanding, designing, implementing, testing and deploying are all part of the task.
  • When a US is finished, it should be flagged as delivered or done. If the US involved code changes, it should also be flagged as ready for ‘Code Review’.

Code quality

Maintaining high code quality is one of our main goals. To achieve it, we use a variety of processes and tools:
Coding skills and analysis
We use code analysis tools such as FxCop and Sonar to gain better insight into code quality at both the developer and project levels. At the same time, we put a big emphasis on the human factor:
  • Code reviews are done on a daily basis.
  • Code reads are done once a week: a team member shows code that he/she has written to a group from R&D, and the group discusses the code (design, architecture, implementation considerations, etc.).
  • Clean code lessons and educational meetings are held every 2-3 weeks.
  • Special events such as hackathons and code retreats are held every 6 weeks.
Testing
We do not have a QA team. We don’t believe we need one. Why?
Because we believe our software engineers should write code that works – that’s as simple as that.
That seems like a very naïve idea of how software development works in the real world. So how can we achieve this goal in practice?

When we develop new code we also develop some layers of testing for it:

  • Unit tests (Using NUnit)
  • Integration tests (Using NUnit)
  • Acceptance tests (using Selenium, SpecFlow and NUnit)
We do have a small team of automation verification engineers who help us manage and thicken the acceptance testing layer. Some manual testing is sometimes needed, of course, and our software engineers do manual testing whenever required.
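To give a feel for that acceptance testing layer, here is a minimal sketch of a SpecFlow step definition driving Selenium under NUnit (the URL and element identifiers are made up for illustration):

using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;
using TechTalk.SpecFlow;

[Binding]
public class SearchSteps
{
    private IWebDriver _driver;

    [BeforeScenario]
    public void StartBrowser()
    {
        _driver = new FirefoxDriver();
    }

    [Given(@"I am on the search page")]
    public void GivenIAmOnTheSearchPage()
    {
        _driver.Navigate().GoToUrl("http://localhost/search"); // hypothetical URL
    }

    [When(@"I search for ""(.*)""")]
    public void WhenISearchFor(string term)
    {
        _driver.FindElement(By.Id("query")).SendKeys(term); // hypothetical element IDs
        _driver.FindElement(By.Id("go")).Click();
    }

    [Then(@"I should see results")]
    public void ThenIShouldSeeResults()
    {
        Assert.IsTrue(_driver.FindElements(By.ClassName("result")).Count > 0);
    }

    [AfterScenario]
    public void StopBrowser()
    {
        _driver.Quit();
    }
}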
To support the development and maintenance of our products we also require a very efficient CI process – this was discussed in our previous post.

The CI process at PicScout

We’ve been using Jenkins as our build server for some time now, but a recent switch from TFS to Git allowed us, among other things, to implement a more sophisticated approach to the continuous integration process.
 
We already had all the CI principles covered before – a single branch that everyone worked on and committed to on an (almost) daily basis, automated self-testing builds, etc. But if you look at the definition of CI on Wikipedia, it says that in CI “no errors can arise without developers noticing them and correcting them immediately”. Unfortunately, this was not the case for us.
 
But first of all, why are errors resulting in a broken build such an issue? Because they impact the entire team: anyone “getting latest” is likely to encounter issues at some stage because of errors introduced by someone else, and it may take a lot of time to realize that it’s not your fault after all. A check-in on top of a broken build makes things even worse. In addition, you are not guaranteed to have a stable version for deployment.
 
Why did this happen to us?
 
A little bit of background to start with – we have one job (a build in Jenkins) per solution, and we have dependencies set up between jobs to reflect the dependencies between projects in separate solutions. Whenever a job runs successfully, it triggers its dependent jobs, so a single check-in may actually result in multiple Jenkins jobs running in a chain, one after another. The whole process used to take more than 30 minutes, with developers waiting all this time for possible error notifications while the source had been dirty since the check-in. Moreover, any other check-in during this time frame, somewhere along the chain, resulted in jobs near the end of the chain aggregating additional changes from TFS. Consequently, failure notification emails were sent to a group of developers, and they were in no hurry to take responsibility for something that was not necessarily their fault.
 
What do we have now?
 
First of all, we put a lot of effort into minimizing job execution times, and right now the longest chain of jobs completes in well under 10 minutes. But the major change after migrating to Git was our new strategy for tackling “broken builds”. Each developer now works on a local branch, pulls from the central repository’s master branch, but pushes back to his/her personal branch. The Git plugin allows Jenkins to merge the master branch into this personal branch, run all the necessary jobs on it, and merge it back to master on success. In case of a failure, the broken branch is ignored and the master branch remains untouched. Any pushes made by other developers at the same time run separately on different branches and don’t affect each other. Feedback on the success or failure of the build is sent only to the developer who triggered it, so no more lame excuses.
 
This concept is not new to CI servers – there is “Gated Check-in” in TFS and “Delayed Commit” in TeamCity. What makes our approach a bit different is that the process is automated – there is no need to specify which build definition you want to use for your changes to be built, tested and pushed back to master. We have incorporated logic into a Git post-receive hook that inspects the changes made by the developer, then identifies and triggers the corresponding job in Jenkins. Another advantage compared to the two methods mentioned above is that the branch with the broken build can easily be accessed by other developers for review or assistance. In fact, with this approach it can even be more productive to allow a broken build than to try to prevent one all the time.
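As a rough illustration of the hook’s logic only (our actual hook differs; the Jenkins host, token, job names and path-to-job mapping below are all hypothetical), the idea is to map the pushed changes to Jenkins jobs and trigger them through Jenkins’ remote build URL:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net;

public static class PostReceiveHook
{
    // Hypothetical mapping from solution folders to Jenkins job names.
    private static readonly Dictionary<string, string> PathToJob =
        new Dictionary<string, string>
        {
            { "Frontend/", "frontend-build" },
            { "Backend/", "backend-build" }
        };

    public static void Main()
    {
        var triggered = new HashSet<string>();
        string line;
        // A post-receive hook gets "<old-rev> <new-rev> <ref-name>" lines on stdin.
        while ((line = Console.ReadLine()) != null)
        {
            string[] parts = line.Split(' ');
            foreach (string file in ChangedFiles(parts[0], parts[1]))
            {
                foreach (var mapping in PathToJob)
                {
                    if (file.StartsWith(mapping.Key) && triggered.Add(mapping.Value))
                    {
                        TriggerJenkinsJob(mapping.Value, parts[2]);
                    }
                }
            }
        }
    }

    private static IEnumerable<string> ChangedFiles(string oldRev, string newRev)
    {
        // Ask git which files changed between the two revisions.
        var info = new ProcessStartInfo("git",
            string.Format("diff --name-only {0} {1}", oldRev, newRev))
        {
            RedirectStandardOutput = true,
            UseShellExecute = false
        };
        using (var git = Process.Start(info))
        {
            string file;
            while ((file = git.StandardOutput.ReadLine()) != null)
            {
                yield return file;
            }
        }
    }

    private static void TriggerJenkinsJob(string job, string branch)
    {
        // Jenkins' remote trigger URL (host and token are hypothetical).
        string url = string.Format(
            "http://jenkins.local/job/{0}/buildWithParameters?token=secret&branch={1}",
            job, Uri.EscapeDataString(branch));
        using (var client = new WebClient())
        {
            client.DownloadString(url);
        }
    }
}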
 
That’s about it on how we do CI these days.

How to split an array in C#?

Every now and then we need to perform mundane operations that are very simple but have no built-in function in the language. So we write some ad-hoc code, maybe even copy something from StackOverflow, and are done with it. What we sometimes fail to notice, however, is the effect this has on performance.
This entry will focus on splitting an array, but the point is relevant to other operations as well.
 
The time it takes to perform an operation only matters if the code is in a time-critical section and/or runs a large number of times. If that is not the case, whatever implementation you choose will probably be fine.
Having said that, let’s look at some of the ways to split an array. We’ll use a medium-sized byte array (256 bytes) for this purpose, splitting it into a short array (16 bytes) and the rest.
Assuming you are using .NET 3.5+, the most natural way to go about this is to use LINQ:

private static void SplitArrayUsingLinq(byte[] data)
{
    // Requires a "using System.Linq;" directive.
    byte[] first = data.Take(16).ToArray();
    byte[] second = data.Skip(16).ToArray();
}
As you can see, the method is very elegant – only one line of code is required to create each part. In addition, it seems like using Take and Skip a small number of times can’t cause a performance issue.
However, running the above code 1,000,000 times takes around 10 seconds, which is a considerable amount of time.
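A simple way to reproduce such measurements is a Stopwatch harness along these lines (illustrative; it assumes the split methods shown in this post are defined in the same class):

using System;
using System.Diagnostics;

public static class SplitBenchmark
{
    public static void Main()
    {
        var data = new byte[256];
        new Random(42).NextBytes(data);

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < 1000000; i++)
        {
            SplitArrayUsingLinq(data); // swap in each variant to compare
        }
        stopwatch.Stop();
        Console.WriteLine("Elapsed: {0} ms", stopwatch.ElapsedMilliseconds);
    }
}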
Let’s compare the LINQ version to a good old for loop:

private static void SplitUsingForLoop(byte[] data)
{
    byte[] first = new byte[16];
    for (int i = 0; i < first.Length; i++)
    {
        first[i] = data[i];
    }
    byte[] second = new byte[data.Length - first.Length];
    for (int i = 0; i < second.Length; i++)
    {
        second[i] = data[i + first.Length];
    }
}
 
This yields a run time of less than 2 seconds – more than a 5x improvement! The looping method seems to be much more efficient. Let’s try to improve it some more.
If we do our Googling right, we find that copying arrays is actually covered by a library function – Array.Copy. Let’s test it:

private static void SplitArrayUsingArrayCopy(byte[] data)
{
    byte[] first = new byte[16];
    Array.Copy(data, first, first.Length);
    byte[] second = new byte[data.Length - first.Length];
    Array.Copy(data, first.Length, second, 0, second.Length);
}
We get a result of 250 ms – another 8x improvement, roughly 40x in total compared to the LINQ version!
Digging even deeper, and only in the case of byte[], we can use another method called Buffer.BlockCopy, which performs a low-level byte copy:

private static void SplitArrayUsingBlockCopy(byte[] data)
{
    byte[] first = new byte[16];
    Buffer.BlockCopy(data, 0, first, 0, first.Length);
    byte[] second = new byte[data.Length - first.Length];
    Buffer.BlockCopy(data, first.Length, second, 0, second.Length);
}
 
Now the result is 180 ms, which is yet another improvement, albeit not as dramatic as the previous ones.
In conclusion:
Method             Time for 1,000,000 splits (s)   Improvement factor
LINQ               10.2                            x1 (baseline)
for loop           1.93                            x5
Array.Copy         0.25                            x40
Buffer.BlockCopy   0.18                            x56
Kids, don’t trust LINQ blindly for what really matters :)