Solving Unintended Consequences

PushToTest did a load test for a customer. The customer was running a loyalty program in which teenagers played a Flash-based game. Every 15 minutes the service awarded gifts to people playing the game, and the Flash component checked the backend system every 15 minutes for the new gifts. The customer had implemented no time-out mechanism.

Know what happens next?

The teenagers leave the Web page open, even when they are sleeping. The Flash components turn into a distributed denial-of-service attack engine. The site operations managers notice that every 15 minutes the server load increases, and it never drops back.

It was really interesting and fun to surface this problem and offer a mitigation.

That is the kind of challenge PushToTest Global Services loves!



Frank Cohen

Software Test Management and Metrics

Software testing has become more and more complex. Many reasons are given for this complexity; one of the main ones is that today's software applications span a variety of technologies and user interfaces. Applications are all-pervasive: users now reach application functionality from many points, such as mobile devices and voice-activated kiosks.

Unlike in the past, when the CRT, keyboard and mouse were the only interfaces that programmers needed to tackle, application development teams today face an ever-increasing number of touch points or interfaces to an application. Consequently, software testing has become significantly more complex, while users expect testing cycles to shrink tremendously. Complexity rises as the time available falls, and so the challenges faced by the testing team have gone up by an order of magnitude.

Metrics therefore help both software testers and application development teams manage expectations. Historically, application development teams have used one tool or another (function point analysis, FPA, for example) to estimate the size of a given system. The resulting effort estimates are then moderated or increased using complexity factors or overloading parameters.

Based on these, an estimate is arrived at and a plan is produced. Using this plan, the project teams get budgets, establish timelines and begin the execution. However, historically, 80% of such projects have failed to meet the expectations and been delivered over budget and beyond the timelines predicted earlier.

Metrics were then introduced to bring accuracy to such estimates and project schedules. Metrics used in this manner are nothing but a collection of past experience, categorized and tabulated in an orderly manner. The senior members of a development or testing team analyze the raw collected data and group it in a pre-determined order for future use.

When estimating the testing effort for applications, such metrics are still not completely usable because the lay of the land keeps changing. That is, the characteristics of the application at hand differ from those of the applications from which the data was collected. How do teams overcome such challenges? The same way that teams have in the past: by using overload or weighted parameters. What happens then?

Subjectivity creeps in. The estimates are no longer the product of an objective process. Therefore, the real numbers recorded when the project is executed will begin to differ from the estimated model. Is there a solution to this problem? Until now, nothing has been found that entirely addresses this anomaly.

Look at what happened when software systems were used in a controlled environment. The estimates in those days were used as a guideline or a base number. The only variable in those days was the user who was not able to clearly articulate what he/she expected out of a software application/system.

Today, this problem has diminished to a large extent; however, many new dimensions have been added to complicate any software development or testing – one such example is the variety of user interfaces that exist today.

In the past, the primary point of interface to many applications was a human being. Today, it does not have to be. With Web 2.0, many transactions happen behind the scenes with no human being involved. It should then be easy to predict the behavior, because after all it is only two programs connecting with each other. However, recent experience does not bear this out.

With so many uncontrolled variables, we still have to live in the real world, where people want hard numbers for any software development or testing effort. Therefore, we should continue to work on perfecting the metrics database. This will hopefully take us to the level of engineering in software development and testing.




Pallavi Nara


What is Code Coverage?

Code coverage measurement determines which statements in a body of code have been executed through a test run and which have not. In general, a code coverage system collects information about the running program and then combines that with source information to generate a report on the test suite’s code coverage.

Code coverage is part of a feedback loop in the development process. As tests are developed, code coverage highlights aspects of the code which may not be adequately tested and which require additional testing. This loop will continue until coverage meets some specified target.

Why Measure Code Coverage?

It is well understood that unit testing improves the quality and predictability of your software releases. However, how well do your unit tests actually test your code? How many tests are enough? Do you need more tests? These are the questions code coverage measurement seeks to answer.

Coverage measurement also helps to avoid test entropy. As your code goes through multiple release cycles, there can be a tendency for unit tests to atrophy. As new code is added, it may not meet the same testing standards you put in place when the project was first released. Measuring code coverage can keep your testing up to the standards you require. You can be confident that when you go into production there will be minimal problems because you know the code not only passes its tests but that it is well tested.

In summary, we measure code coverage for the following reasons:

* To know how well our tests actually test our code

* To know whether we have enough testing in place

* To maintain the test quality over the lifecycle of a project

Code coverage is not a panacea. Coverage generally follows an 80-20 rule: raising coverage becomes progressively harder, with each new test delivering a smaller increment. If you follow defensive programming principles, where failure conditions are often checked at many levels in your software, some code can be very difficult to reach with practical levels of testing. Coverage measurement is not a replacement for good code review and good programming practices.

In general you should adopt a sensible coverage target and aim for even coverage across all of the modules that make up your code. Relying on a single overall coverage figure can hide large gaps in coverage.
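To illustrate how a single overall figure can hide a gap, here is a minimal sketch in C. The structure, the function names and the sample numbers in the usage note are hypothetical and do not come from any particular coverage tool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-module coverage record; names are illustrative only. */
struct module_coverage {
    const char *name;
    int covered;   /* statements executed by the test suite */
    int total;     /* executable statements in the module */
};

/* Overall coverage, as a percentage, across all modules. */
double overall_percent(const struct module_coverage *m, size_t n) {
    long covered = 0, total = 0;
    for (size_t i = 0; i < n; i++) {
        covered += m[i].covered;
        total += m[i].total;
    }
    assert(total > 0);
    return 100.0 * (double)covered / (double)total;
}

/* Lowest per-module percentage; *worst receives the index of that module. */
double worst_percent(const struct module_coverage *m, size_t n, size_t *worst) {
    assert(n > 0);
    double lowest = 101.0;   /* above any real percentage */
    for (size_t i = 0; i < n; i++) {
        assert(m[i].total > 0);
        double pct = 100.0 * (double)m[i].covered / (double)m[i].total;
        if (pct < lowest) {
            lowest = pct;
            *worst = i;
        }
    }
    return lowest;
}
```

With a large module at 900 of 1000 statements and a small one at 20 of 100, the overall figure is roughly 84% while the small module sits at 20%, which is the kind of gap a per-module target is meant to expose.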

How Code Coverage Works

There are many approaches to code coverage measurement. Broadly there are three approaches, which may be used in combination:

Source Code Instrumentation: This approach adds instrumentation statements to the source code and compiles the code with the normal compile tool chain to produce an instrumented assembly.

Intermediate Code Instrumentation: Here the compiled class files are instrumented by adding new bytecodes, and a new instrumented class is generated.

Runtime Information Collection: This approach collects information from the runtime environment as the code executes to determine coverage information.

The Total Coverage Percentage (TPC) allows entities to be ranked in reports. It is calculated as follows:

TPC = (BT + BF + SC + MC)/(2*B + S + M) * 100%

BT – branches that evaluated to “true” at least once

BF – branches that evaluated to “false” at least once

SC – statements covered

MC – methods entered

B – total number of branches

S – total number of statements

M – total number of methods
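As a worked example, the formula can be written as a small C function. This is only a sketch: the struct and function names are made up for illustration, and the result is scaled by 100 so that it reads as a percentage.

```c
#include <assert.h>

/* Raw counts from a coverage run, named after the symbols defined above. */
struct coverage_counts {
    int bt, bf;    /* branches that evaluated true / false at least once */
    int sc, mc;    /* statements covered, methods entered */
    int b, s, m;   /* totals: branches, statements, methods */
};

/* TPC as a percentage: (BT + BF + SC + MC) / (2*B + S + M) * 100. */
double tpc_percent(struct coverage_counts c) {
    int denominator = 2 * c.b + c.s + c.m;
    assert(denominator > 0);
    return 100.0 * (double)(c.bt + c.bf + c.sc + c.mc) / (double)denominator;
}
```

For instance, with 10 branches (8 seen true, 6 seen false), 90 of 100 statements covered and 9 of 10 methods entered, the numerator is 113 and the denominator is 130, giving a TPC of about 86.9%. Each branch counts twice in the denominator because it has two outcomes to cover.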

Code Coverage Criteria

To measure how well the program is exercised by a test suite, one or more coverage criteria are used. There are a number of coverage criteria, the main ones being:

* Function coverage – Has each function in the program been executed?
* Statement coverage – Has each line of the source code been executed?
* Condition coverage (also known as Branch coverage) – Has each evaluation point (such as a true/false decision) been executed?
* Path coverage – Has every possible route through a given part of the code been executed?
* Entry/exit coverage – Has every possible call and return of the function been executed?

Safety-critical applications are often required to demonstrate that testing achieves 100% of some form of code coverage.

Some of the coverage criteria above are connected. For instance, path coverage implies condition, statement and entry/exit coverage. Condition coverage implies statement coverage, because every statement is part of a branch.

Full path coverage, of the type described above, is usually impractical or impossible. Any module with a succession of n decisions in it can have up to 2^n paths within it; loop constructs can result in an infinite number of paths. Many paths may also be infeasible, in that there is no input to the program under test that can cause that particular path to be executed. However, a general-purpose algorithm for identifying infeasible paths has been proven to be impossible (such an algorithm could be used to solve the halting problem). Techniques for practical path coverage testing instead attempt to identify classes of code paths that differ only in the number of loop executions, and to achieve “basis path” coverage the tester must cover all the path classes.

Statement Coverage

This metric reports whether each executable statement is encountered. The chief advantage of this metric is that it can be applied directly to object code and does not require processing source code. Performance profilers commonly implement this metric. The chief disadvantage of statement coverage is that it is insensitive to some control structures. For example, consider the following C/C++ code fragment:

    int* p = NULL;
    if (condition)
        p = &variable;
    *p = 123;

Without a test case that causes condition to evaluate false, statement coverage rates this code fully covered. In fact, if condition ever evaluates false, this code fails. This is the most serious shortcoming of statement coverage. If-statements are very common.

Statement coverage does not report whether loops reach their termination condition – only whether the loop body was executed. With C, C++, and Java, this limitation affects loops that contain break statements.

Since do-while loops always execute at least once, statement coverage considers them the same rank as non-branching statements.

Statement coverage is completely insensitive to the logical operators (|| and &&).

Statement coverage cannot distinguish consecutive switch labels.

Test cases generally correlate more to decisions than to statements. You probably would not have 10 separate test cases for a sequence of 10 non-branching statements; you would have only one test case. For example, consider an if-else statement containing one statement in the then-clause and 99 statements in the else-clause. After exercising one of the two possible paths, statement coverage gives extreme results: either 1% or 99% coverage. Basic block coverage eliminates this problem.

One argument in favor of statement coverage over other metrics is that bugs are evenly distributed through code; therefore the percentage of executable statements covered reflects the percentage of faults discovered. However, one of our fundamental assumptions is that faults are related to control flow, not computations. Additionally, we could reasonably expect that programmers strive for a relatively constant ratio of branches to statements.

In summary, this metric is affected more by computational statements than by decisions.
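The gap between statement coverage and decision coverage in the NULL-pointer fragment can be made concrete with a hand-instrumented sketch. The flags and helper functions below are hypothetical stand-ins for what a coverage tool would record; they are not part of any real tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { STATEMENTS = 4 };
static bool stmt_hit[STATEMENTS];   /* one flag per executable statement */
static bool saw_true, saw_false;    /* outcomes seen for the one decision */
static int variable;

/* The fragment from the text with manual instrumentation added.
   Calling it with condition == false would dereference NULL, so the
   demonstration only ever passes true. */
void fragment(bool condition) {
    int *p = NULL;
    stmt_hit[0] = true;
    stmt_hit[1] = true;             /* the if-statement itself */
    if (condition) {
        saw_true = true;
        p = &variable;
        stmt_hit[2] = true;
    } else {
        saw_false = true;
    }
    *p = 123;                       /* undefined behaviour when p is NULL */
    stmt_hit[3] = true;
}

/* Number of statements executed at least once. */
int statements_covered(void) {
    int count = 0;
    for (int i = 0; i < STATEMENTS; i++) {
        if (stmt_hit[i]) count++;
    }
    return count;
}

/* Number of decision outcomes (true, false) seen at least once: 0 to 2. */
int decision_outcomes_covered(void) {
    return (saw_true ? 1 : 0) + (saw_false ? 1 : 0);
}
```

After a single call to fragment(true), all 4 statements are marked as executed, yet only 1 of the 2 decision outcomes has been seen. The untested false outcome is exactly the one that crashes.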

Decision Coverage

This metric reports whether boolean expressions tested in control structures (such as the if-statement and while-statement) evaluated to both true and false. The entire boolean expression is considered one true-or-false predicate regardless of whether it contains logical-and or logical-or operators. Additionally, this metric includes coverage of switch-statement cases, exception handlers, and interrupt handlers.

Also known as: branch coverage, all-edges coverage, basis path coverage, decision-decision-path testing. “Basis path” testing selects paths that achieve decision coverage.

This metric has the advantage of simplicity without the problems of statement coverage.

A disadvantage is that this metric ignores branches within boolean expressions which occur due to short-circuit operators. For example, consider the following C/C++/Java code fragment:

    if (condition1 && (condition2 || function1()))
        statement1;
    else
        statement2;

This metric could consider the control structure completely exercised without a call to function1. The test expression is true when condition1 is true and condition2 is true, and the test expression is false when condition1 is false. In this instance, the short-circuit operators preclude a call to function1.
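The short-circuit effect described above can be checked with a call counter. In this sketch, function1 is given a trivial body and a counter purely for illustration; the decision itself is the one from the text.

```c
#include <assert.h>
#include <stdbool.h>

static int function1_calls;   /* how many times function1 actually ran */

/* Stand-in for function1; the return value is arbitrary. */
bool function1(void) {
    function1_calls++;
    return true;
}

/* Outcome of the decision from the text. */
bool decision(bool condition1, bool condition2) {
    return condition1 && (condition2 || function1());
}
```

Calling decision(true, true) and decision(false, false) drives the decision to both true and false, which satisfies decision coverage, while function1_calls remains 0. Only a case such as decision(true, false) forces the call.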

Path Coverage

This metric reports whether each of the possible paths in each function has been followed. A path is a unique sequence of branches from the function entry to the exit.

Also known as predicate coverage. Predicate coverage views paths as possible combinations of logical conditions.

Since loops introduce an unbounded number of paths, this metric considers only a limited number of looping possibilities. A large number of variations of this metric exist to cope with loops. Boundary-interior path testing considers two possibilities for loops: zero repetitions and more than zero repetitions. For do-while loops, the two possibilities are one iteration and more than one iteration.

Path coverage has the advantage of requiring very thorough testing. Path coverage has two severe disadvantages. The first is that the number of paths is exponential in the number of branches. For example, a function containing 10 if-statements has 1024 paths to test. Adding just one more if-statement doubles the count to 2048. The second disadvantage is that many paths are impossible to exercise due to relationships of data. For example, consider the following C/C++ code fragment:

    if (success)
        statement1;
    statement2;
    if (success)
        statement3;

Path coverage considers this fragment to contain 4 paths. In fact, only two are feasible: success=false and success=true.
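Both figures can be reproduced with a short sketch: the upper bound of 2^n paths for n sequential if-statements, and the number of paths that real inputs can actually reach when two decisions test the same variable. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Upper bound on paths through n sequential two-way decisions: 2^n. */
unsigned long max_paths(unsigned n) {
    assert(n < 8 * sizeof(unsigned long));
    return 1UL << n;
}

/* Counts the distinct branch sequences reachable in a fragment whose
   two if-statements both test the same variable, by trying every
   possible input value. */
int feasible_paths_same_flag(void) {
    bool seen[4] = { false, false, false, false };
    for (int success = 0; success <= 1; success++) {
        int first = success ? 1 : 0;    /* outcome of the first if */
        int second = success ? 1 : 0;   /* outcome of the second if */
        seen[first * 2 + second] = true;
    }
    int count = 0;
    for (int i = 0; i < 4; i++) {
        if (seen[i]) count++;
    }
    return count;
}
```

Here max_paths(10) gives 1024 and max_paths(11) gives 2048, matching the figures in the text, while feasible_paths_same_flag() returns 2 out of the 4 structural paths.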

Researchers have invented many variations of path coverage to deal with the large number of paths. For example, n-length sub-path coverage reports whether you exercised each path of length n branches.




Defects / Bugs related definitions

Software Defect – The difference between the functional specification (including user documentation) and the actual program text (source code and data). Often reported as a problem and stored in a defect-tracking and problem-management system.

Software Defect – Also called a fault or a bug, a defect is an incorrect part of code that is caused by an error. An error of commission causes a defect of wrong or extra code. An error of omission results in a defect of missing code. A defect may cause one or more failures.

Software Defect – A flaw in the software with potential to cause a failure.

Software Defect Age – A measurement that describes the period of time from the introduction of a defect until its discovery.

Software Defect Density – A metric that compares the number of defects to a measure of size (e.g., defects per KLOC). Often used as a measure of software quality.

Software Defect Discovery Rate – A metric describing the number of defects discovered over a specified period of time, usually displayed in graphical form.

Software Defect Removal Efficiency (DRE) – A measure of the number of defects discovered in an activity versus the number that could have been found. Often used as a measure of test effectiveness.

Software Defect Seeding – The process of intentionally adding known defects to those already in a computer program for the purpose of monitoring the rate of detection and removal, and estimating the number of defects still remaining. Also called Error Seeding.

Software Defect Masked – An existing defect that hasn’t yet caused a failure because another defect has prevented that part of the code from being executed.
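Several of these definitions reduce to simple arithmetic. The sketch below shows one way to compute them in C. The function names are hypothetical, and two interpretations are assumptions rather than part of the definitions above: DRE treats "the number that could have been found" as defects found plus defects that escaped to later phases, and the seeding estimate assumes real defects are detected at the same rate as seeded ones.

```c
#include <assert.h>

/* Defect density in defects per KLOC (thousand lines of code). */
double defect_density(int defects, int lines_of_code) {
    assert(defects >= 0 && lines_of_code > 0);
    return (double)defects / ((double)lines_of_code / 1000.0);
}

/* DRE as a percentage. Assumption: the defects an activity could have
   found are those it found plus those that escaped to later phases. */
double defect_removal_efficiency(int found, int escaped) {
    assert(found >= 0 && escaped >= 0 && found + escaped > 0);
    return 100.0 * (double)found / (double)(found + escaped);
}

/* Defect seeding estimate of the total number of real defects.
   Assumption: real defects are detected at the same rate as seeded ones. */
double estimated_total_real_defects(int seeded_total, int seeded_found,
                                    int real_found) {
    assert(seeded_total > 0 && seeded_found > 0);
    assert(seeded_found <= seeded_total && real_found >= 0);
    return (double)real_found * (double)seeded_total / (double)seeded_found;
}
```

For example, 30 defects in 15,000 lines is a density of 2 defects per KLOC. A test phase that finds 90 defects while 10 escape has a DRE of 90%. If 10 of 20 seeded defects are found alongside 40 real ones, the estimate is about 80 real defects in total, so roughly 40 remain.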



Pallavi Nara

Developing a Test Specification

I’ve seen the terms “Test Plan” and “Test Specification” mean slightly different things over the years. In a formal sense (at this given point in time for me), we can define the terms as follows:

1. Test Specification – a detailed summary of what scenarios will be tested, how they will be tested, how often they will be tested, and so on, for a given feature. Examples of a given feature include “Intellisense, Code Snippets, Tool Window Docking, IDE Navigator.” Trying to include all Editor Features or all Window Management Features in one Test Specification would make it too large to read effectively.

2. Test Plan – a collection of all test specifications for a given area. The Test Plan contains a high-level overview of what is tested (and what is tested by others) for the given feature area. For example, I might want to see how Tool Window Docking is being tested. I can glance at the Window Management Test Plan for an overview of how Tool Window Docking is tested, and if I want more info, I can view that particular test specification.

If you ask a tester on another team what the difference between the two is, you might receive different answers. In addition, I use the terms interchangeably all the time at work, so if you see me using the term “Test Plan”, think “Test Specification.”

A Test Specification should consist of the following parts:

History / Revision – Who created the test spec? Who were the developers and Program Managers (Usability Engineers, Documentation Writers, etc.) at the time when the test spec was created? When was it created? When was the last time it was updated? What were the major changes at the time of the last update?

Feature Description – a brief description of what area is being tested.

What is tested? – a quick overview of what scenarios are tested, so people looking through this specification know that they are at the correct place.

What is not tested? – are there any areas being covered by different people or different test specs? If so, include a pointer to these test specs.

Nightly Test Cases – a list of the test cases and high-level description of what is tested each night (or whenever a new build becomes available). This bullet merits its own blog entry. I’ll link to it here once it is written.

Breakout of Major Test Areas – This section is the most interesting part of the test spec where testers arrange test cases according to what they are testing. Note: in no way do I claim this to be a complete list of all possible Major Test Areas. These areas are examples to get you going.

Specific Functionality Tests – Tests to verify the feature is working according to the design specification. This area also includes verifying error conditions.

Security tests – any tests that are related to security. An excellent source for populating this area comes from the Writing Secure Code book.

Accessibility Tests – This section shouldn’t be a surprise to any of my blog readers. See The Fundamentals of Accessibility for more info.

Stress Tests – This section talks about what tests you would apply to stress the feature.

Performance Tests – this section includes verifying any performance requirements for your feature.

Edge cases – This is something I do specifically for my feature areas. I like walking through books like How to Break Software, looking for ideas to better test my features. I jot those ideas down under this section.

Localization / Globalization – tests to ensure you’re meeting your product’s International requirements.

Setting Test Case Priority

A Test Specification may have a couple of hundred test cases, depending on how the test cases were defined, how large the feature area is, and so forth. It is important to be able to query for the most important test cases (nightly), the next most important test cases (weekly), the next most important test cases (full test pass), and so forth. A sample prioritization for test cases may look like:

1. Highest priority (Nightly) – Must run whenever a new build is available
2. Second highest priority (Weekly) – Other major functionality tests run once every three or four builds
3. Lower priority – Run once every major coding milestone
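The sample prioritization above can be expressed as a small selection rule. The following C sketch is one possible reading, not a prescribed scheme: the enum names are hypothetical, "once every three or four builds" is fixed at every third build, and a milestone build is assumed to run the weekly tests as well.

```c
#include <assert.h>
#include <stdbool.h>

enum priority {
    PRIORITY_NIGHTLY = 1,    /* highest: every new build */
    PRIORITY_WEEKLY = 2,     /* second highest: every few builds */
    PRIORITY_MILESTONE = 3   /* lower: major coding milestones only */
};

/* Assumption: "every three or four builds" is taken as every third build. */
enum { WEEKLY_BUILD_INTERVAL = 3 };

/* Decides whether a test case of priority p runs on the given build. */
bool should_run(enum priority p, int build_number, bool is_milestone_build) {
    assert(build_number >= 1);
    switch (p) {
    case PRIORITY_NIGHTLY:
        return true;
    case PRIORITY_WEEKLY:
        /* Assumption: a milestone build is a full pass and includes these. */
        return is_milestone_build ||
               build_number % WEEKLY_BUILD_INTERVAL == 0;
    case PRIORITY_MILESTONE:
        return is_milestone_build;
    }
    return false;   /* unknown priority value */
}
```

Storing the priority on each test case and filtering with a rule like this is what makes it possible to query for the nightly, weekly and full-pass sets separately.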



Ruchi Sharma 

7 Tips to be More Innovative in the Age of Agile Testing to Survive an Economic Crisis

What is Agile Testing?

“Agile testing involves testing from the customer perspective as early as possible, testing early and often as code becomes available and stable enough from module/unit level testing.” – A Wikipedia definition.

Why Are Innovations Needed in the Age of Agile Testing?

Global Recession/Economic Downturn Effect: Current Events Are Not Current Trends

When global downturns hit, there is a certain inevitability to their impact on the information technology and finance sectors. Customers become more reluctant to award software business. Some customers withdraw their long-term projects, and some use the opportunity to quote low prices. Many projects have dragged on much longer than expected and cost more than planned. So companies have started to explore how “Agile with different flavors” can help their enterprises deliver software quickly, iteratively and more reliably. The roles and responsibilities of Test Managers and Test Architects become more important in implementing Agile projects. Innovations are increasingly being fueled by the needs of the testing community at large.

The Challenges in Agile Testing

Agile testers face a lot of challenges when working with an Agile development team. A tester should be able to apply root-cause analysis when finding severe bugs so that they are unlikely to recur. While Agile has different flavors, Scrum is one process for implementing Agile. Some of the challenging Scrum rules to be followed by every individual are:

– Obtain Number of Hours Commitment Up Front
– Gather Requirements / Estimates Up Front
– Enter the Actual Hours and Estimated Hours Daily
– Daily Builds
– Keep the Daily Scrum meetings short
– Code Inspections are Paramount

So, to meet the above challenges, an agile tester needs to be innovative with the tools at hand. A great idea happens when what you have (tangible and intangible) meets the world’s deepest hunger.

How Can Testers Be More Innovative in the Age of Agile Testing?

Here are Important Keys to Innovation:

1. Creative

A good Agile Tester needs to be extremely creative when trying to keep up with the speed of development and release. For a tester, being creative is more important than being critical.

2. Talented

He must be highly talented and strive to keep learning and to innovate with new ideas. Talented testers are never satisfied with what they have achieved and always strive to find unimaginable bugs of high value and priority.

3. Fearless

An Agile Tester should not be afraid to look at a developer’s code and if need be, hopefully in extreme cases, go in and correct it.

4. Visionary

He must have a comprehensive vision, which includes the client’s expectations and the delivery of a good product.

5. Empowered

He must be empowered to work in pairs. He will be involved in pair programming, which brings shorter scripts, better designs and more bugs found.

6. Passionate

Passionate testers always have something unique to contribute, whether in their innovative ideas, the way they carry out day-to-day work, their outputs, or how they tirelessly improve things around them.

7. Multiple Disciplines

An Agile Tester must have multiple skills, such as manual, functional and performance testing skills, and soft skills such as leadership, communication and emotional intelligence (EI), so that agile testing becomes a cakewalk.



Anil Kumar 

JIRA – Defect Tracking Tool

Jira is a very powerful tool and can be used as a defect tracking system as well as a planning tool for Agile projects. In this article, I will describe some interesting ways in which Jira can be configured to improve your productivity with respect to defect tracking. Like many tools, Jira provides you with capabilities; how you use them to increase your productivity is up to you.

Let's start with project categories. When you log in to Jira, in the top left corner there are two links, Project and Project Categories. Using project categories you can define how projects should be categorized. For example, you might want to categorize projects by whether they are handled by Team A or Team B, whether they are new development or ongoing maintenance, and so on.

Creating new categories and changing them is very easy and probably self-explanatory. Categories can be changed from the project view, i.e., click on Administration and select the project you want to change. This shows the various options that can be changed for this project, including the project category.

One thing to keep in mind is that a project can have only one category. So you cannot have categories along the lines of Team A / Live project or Team A / New project. But you can always give categories descriptive names such as Team A – Maintenance project or Team A – Live project.

After defining appropriate project categories, you can start exploring and creating various roles using the role browser. For smaller teams this might not be very useful, but for larger teams roles can be used very effectively for triaging defects, creating notification schemes and so on.

The third important configuration setting in Jira is Events. Events are very powerful and act like triggers. With events, you can specify interesting things such as how Jira screens will look when a specific event is triggered, what workflow operations will be available after a specific event, and who will be notified of the event. For example, changes to the notification scheme (say, do not send emails for comments) or to workflows (say, it should not be possible to close defects directly; even an invalid defect should be resolved, marked as invalid, and then closed by someone else) can be configured here. To make changes here, you need to create or modify notification and workflow schemes and associate them with the events.

That brings us to workflows. But what is a workflow? A workflow is a very important feature that lets you configure what happens at every step, how defects and issues transition from one state to another, and what options should be available in every transition. All these transitions work as triggers, and you can specify conditions, validators or post-transition functions for every transition.
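To make the idea of transitions with conditions and validators concrete, here is a minimal state-machine sketch in C of the "no direct close" rule mentioned earlier. It is purely illustrative: it does not use or represent Jira's actual API or configuration format, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum status { STATUS_OPEN, STATUS_RESOLVED, STATUS_CLOSED };

struct issue {
    enum status status;
    const char *resolved_by;   /* user who resolved the issue, or NULL */
};

/* Transition Open -> Resolved. The condition is that the issue is open;
   the post-transition step records who resolved it. */
bool resolve_issue(struct issue *it, const char *user) {
    assert(it != NULL && user != NULL);
    if (it->status != STATUS_OPEN) return false;
    it->status = STATUS_RESOLVED;
    it->resolved_by = user;
    return true;
}

/* Transition Resolved -> Closed. There is no Open -> Closed transition,
   and the validator rejects the user who resolved the issue. */
bool close_issue(struct issue *it, const char *user) {
    assert(it != NULL && user != NULL);
    if (it->status != STATUS_RESOLVED) return false;
    assert(it->resolved_by != NULL);
    if (strcmp(it->resolved_by, user) == 0) return false;
    it->status = STATUS_CLOSED;
    return true;
}
```

An open issue cannot be closed directly. Once one user has resolved it, a close attempt by that same user is refused and a different user's attempt succeeds. In a tool like Jira the same rule would be expressed through workflow configuration rather than code.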

Most operations in Jira are configured as schemes. Jira lets you create various schemes for workflows, notifications and permissions. You need separate schemes because different projects might need different schemes. For example, if you have resources from a vendor working on a project, you might need a separate permission scheme for them. Schemes are even used to control the look and feel of Jira, to decide which fields will be visible on every transition, and so on. These can be achieved using screen schemes.

One of the most interesting features of Jira is the configurable dashboard. In Jira, you can have multiple dashboards, and on every dashboard you can publish reports that are useful to you, such as the status of defects, defects by component or project, and so on. This lets you get up-to-date information on the Jira front page. To configure your dashboard, you build your query using the Find Issue option and build a chart from the result set. These charts can be published on the home page or dashboard, and whenever you visit the Jira dashboard, they will be updated with the latest information.

So, in a nutshell, you start by categorizing your projects and defining appropriate roles and users. You then configure various issue types (defects, stories, sub-tasks) and fields (priority, severity, blocking issues, releases and so on) and define events and what should happen when those events are triggered. You also create various schemes for notifications, workflows, screens and so on, and apply them to projects as needed. After a project is configured properly, you configure the dashboard to display up-to-date information based on the various queries.




Joanna Fernandes 

A Tester’s Dream – 5 steps to revive a Rejected Bug!

Testing by itself does not improve software quality. Test results are an indicator of quality, but in and of themselves, they don’t improve it. Trying to improve software quality by increasing the amount of testing is like trying to lose weight by weighing yourself more often. What you eat before you step onto the scale determines how much you will weigh, and the software development techniques you use determine how many errors testing will find. If you want to lose weight, don’t buy a new scale; change your diet. If you want to improve your software, don’t test more; develop better. – (Steve McConnell: “Code Complete”)

A tester reports a software bug/defect in the application he is testing. He feels that this is a genuine defect that needs attention. But to his shock and astonishment, he finds that his bug has been rejected by the development team with the excuse “the application works as designed”! This can happen to any tester at some point in his testing career. And it can be quite frustrating too, especially if the tester feels that the defect is a serious one with the potential to cause severe damage to the client or end user if the software is shipped with it.

Having said that, not every rejected defect is worth fighting for. So, as a first step of self-assessment, a tester might go back to the defect report he submitted to the programmers and verify whether it was well defined! A few things worth verifying in the submitted defect report are:

a) If the defect report had a well-written summary line and the steps mentioned were readable (clear and precise). Use of words might play a vital role in deciding the fate of a defect report. A single ambiguous word might suppress the seriousness of the defect and your defect report could look like a bunch of garbage, wasting the bandwidth of the defect tracking system!

b) If the defect report contained any unnecessary step(s), adding to the confusion.

c) If the report clearly mentioned what actually happened and what you expected to happen.

d) If you had mentioned the consequences of the defect, in case it is allowed to slip through the Release Phase.

e) If your tone of voice sounded sarcastic, threatening or careless in the defect report. Was it in a way, capable of making the programmer confused, unreceptive, or angry?

A well-written defect report can differentiate a best-selling bug from a flop show! If you failed to report the defect properly, you should not blame the programmer for marking it “rejected”! Maybe you should spend some time on your defect reporting skills and try once again. But suppose you reported the defect quite neatly and it was still rejected. Decide whether you will go along with the decision or appeal against it because you still strongly feel that this is a serious defect that needs immediate attention. If you plan to appeal against the rejection, these are a few things you might consider doing to increase your chance of success:

1) Patch the holes – Look for loopholes in your original defect report that could be supported with further investigative data to strengthen your case. When you are going for an appeal, you should anticipate attacks on the weaker areas of your original report. You should accept that your report was weak and unpersuasive in the first place. So try to gather as much information as you can to make it appealing this time around.

2) Follow it up – Do some additional follow-up testing around the original defect in an attempt to find more serious consequences. If you are able to find more affected areas and more severe consequences, it should add to your confidence level. A defect that affects a wider range of functionality and has severe consequences has a better chance of getting attention.

3) Follow the Money – There is a popular doctrine in criminal investigation: “in most criminal cases, if you follow the money you should soon be able to reach the criminal”! The same can be applied in testing when appealing against the rejection of a defect. Talk to the major stakeholders such as the Managers, the Client, Sales department staff, the Technical Support team, and even the Technical Writers. Try to find out who will be most affected if the Product is shipped along with the defect. Try to get an idea of the financial loss that could result if this defect is left unfixed. As James Bach defines it, “A bug/defect is something that bugs someone who matters”! Try to identify the “someone” for whom your defect really matters and find out how costly it is for them.

4) Build a Scenario to support your Testing – It’s time for storytelling. This is where a tester’s storytelling ability comes in handy. Use your imagination and your creativity to weave a realistic story that sounds appealing and at the same time conveys the seriousness of the rejected defect. Build some scenarios that show how a real user might come across the defect and how severely it might affect that user.

5) Look out for similar defects in Competitors – Take advantage of the immense knowledge base of the Internet to find a case where one of your competitors released their Product with a defect similar to yours and had to face terrible consequences. Check recent press releases, forum discussions, news reports and courtroom cases for an instance where a similar defect caused serious loss (loss of revenue, loss of credibility, loss of loyal customers, etc.) to a competitor. If you already take notes of important events related to testing, also look in your Moleskine notebook for any similar incident that you might have recorded there! If you are lucky enough to find such a case, your appeal should sound a lot better in the review meeting!



Ruchi Sharma 

Testing in the Real World!

Quite often I receive mails from friends asking for some testing exercises. In my view, if you are alert enough, you can find lots of testing exercises in your day-to-day life. Don’t agree with me? Read on…

Today I got an SMS (a forward, of course) from a friend. The content of that message was as follows:

“If you are forced to draw money by a robber in an ATM, then just enter your PIN number in reverse order. By doing so, you will be allowed to withdraw money from your a/c and at the same time the cop will be informed! So the cop will reach the ATM in a short while and rescue you.”

At first sight, this might seem a very useful message. But a tester is always trained and taught to be skeptical about everything. And I am no exception. So how could I take this piece of information as true, without making further observations/investigations?

So I put on my tester’s shoes and tried to analyze it. And here are my observations:

1. If this were true, I should have known it already. The bank should have told me about it when I opened my a/c and was given my ATM card. How could they fail to pass on such an important instruction?

2. There are hundreds of banks the world over, but the SMS never said which bank provides this facility. That meant the information was surely incomplete (if not incorrect).

3. Now to the weak link in the message. The SMS says that entering your PIN in reverse activates some security system. At first sight this sounds like a brilliant method, doesn’t it? But think for a while and you will see that it can’t be right. If it were true, what about PINs like 1001, 2002, 1221, 2332, 1111, 2222 and so on (palindromic numbers; these are my test data)? If my PIN is one of those palindromes, how would I activate the security mechanism? One workaround, I thought, would be to disallow palindromic numbers as PINs. But the idea itself sounded stupid, simply because there are lots of palindromic numbers up to 9999 (the maximum possible four-digit PIN), and I have never seen a message at an ATM stopping me from choosing a palindromic number as my PIN. Still, I did not want to believe my own argument without actually seeing it (executing my test case with my pre-set test data). So I rushed to my nearest ATM and tested it, and found that there is no such restriction on these numbers (test case passed!). Then I repeated the test at the ATMs of two other banks where I hold accounts (regression testing!). As expected, my test cases passed there too. This test made me almost certain that the SMS was inaccurate.

4. As an additional argument to strengthen my point, I looked at the SMS once more. And there it was. Even if we accept the message as true, do you really think that “the cop will reach the ATM in a short while and rescue you”, keeping in mind that this is India?
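The palindrome objection in point 3 can also be checked by brute force. Here is a minimal Python sketch, assuming four-digit PINs with leading zeros allowed (0000 to 9999). The function name is only illustrative and has nothing to do with any real bank's system.

```python
def palindromic_pins(length=4):
    """Return every PIN of the given length that reads the same reversed."""
    # Assumption: PINs are fixed-length digit strings, leading zeros allowed.
    pins = (str(n).zfill(length) for n in range(10 ** length))
    return [pin for pin in pins if pin == pin[::-1]]


if __name__ == "__main__":
    pins = palindromic_pins()
    # For these PINs, "entering the PIN in reverse" is the same as entering it normally.
    print(len(pins), "of", 10 ** 4, "four-digit PINs are palindromes")
    print(pins[:5])
```

Under that assumption, 100 of the 10,000 possible PINs could never trigger the supposed alarm, which is exactly the gap that the palindromic test data exposes.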

There is still plenty of information left in the message which proves that it is a hoax. I would like to leave that for my readers and see how they use their testing skills to find it.

Hint: Always use the 3 basic weapons of a tester, i.e. observe, analyze and be skeptical.

There are lots of testing exercises lying around loosely in your own life too. Try to identify them and try to test them using your very own testing skills.



Kamali Mukharjee

The A-Z of Usability

A is for Accessibility

Accessibility — designing products for disabled people — reminds us of two fundamental principles in usability. The first is the importance of “Knowing thy user” (and this is rarely the same as knowing thyself). The second is that management are more likely to take action on usability issues when they are backed up by legislation and standards.

B is for Blooper

Each user interface element (or “widget”) is designed for a particular purpose. For example, if you want users to select just one item from a short list, you use radio buttons; if they can select multiple items, checkboxes are the appropriate choice. Some developers continue to use basic HTML controls inappropriately and these user interface bloopers prevent people from building a mental model of how these controls behave.

C is for Content is (still) king

As Jakob Nielsen has said, “Ultimately, all users visit your Web site for its content. Everything else is just the backdrop.” Extending this principle to all interfaces, we could say that it is critical that your product allows people to achieve their key goals.

D is for Design patterns

Design patterns provide “best of breed” examples, showing how interfaces should be designed to carry out frequent and common tasks, like checking out at an e-commerce site. Following design patterns leads to a familiar consistency in user interaction and ensures your users won’t leave your site through surprise or confusion.

E is for Early prototyping

Usability techniques are really effective at detecting usability problems early in the development cycle, when they are easiest and least costly to fix. For example, early, low-fidelity prototypes (like paper prototypes) can be mocked up and tested with users before a line of code is written.

F is for Fitts’ Law

Fitts’ Law teaches us two things. First, it teaches us that the time to acquire a target is a function of the distance to and size of the target, which helps us design more usable interfaces. Second, it teaches us that we can derive a lot of practical design guidance from psychological research.
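To make the relationship concrete, here is a small Python sketch of one common way of writing Fitts' Law, the Shannon formulation often used in HCI work. The constants a and b are placeholder assumptions; in practice they are fitted from measurements for a particular device and user group.

```python
import math


def index_of_difficulty(distance, width):
    """Index of difficulty in bits (Shannon formulation)."""
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return math.log2(distance / width + 1)


def movement_time(distance, width, a=0.1, b=0.15):
    """Predicted time in seconds to acquire a target.

    a and b are illustrative placeholders, not measured values.
    """
    return a + b * index_of_difficulty(distance, width)


if __name__ == "__main__":
    # Same distance, bigger target: the predicted time goes down.
    print(movement_time(distance=400, width=20))
    print(movement_time(distance=400, width=80))
```

The design guidance falls straight out of the formula: make frequently used targets bigger, or move them closer to where the pointer already is.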

G is for Guidelines

Guidelines and standards have a long history in usability and HCI. By capturing best practice, standards help ensure consistency and hence usability for a wide range of users. The first national ergonomics standard was DIN 66-234 (published by the German Standards body), a multi-part ergonomics standard with a specific set of requirements for human-computer interaction. This landmark usability standard was followed by the hugely influential international usability standard, ISO 9241.

H is for Heuristic Evaluation

Heuristic evaluation is a key component of the “discount usability” movement introduced by Jakob Nielsen. The idea is that by assessing a product against a set of usability principles (Nielsen has 10), usability problems can be spotted cheaply and eradicated quickly. Several other sets of principles exist, including those in the standard ISO 9241-110.

I is for Iterative design

Rather than a “waterfall” approach to design, where a development team move inexorably from design concept through to implementation, usability professionals recommend an iterative design approach. With this technique, design concepts are developed, tested, re-designed and re-tested until usability objectives are met.

J is for Jakob Nielsen

Recently promoted from “the king of usability” (Internet Magazine) to “the usability Pope” (Wirtschaftswoche Magazine, Germany), Jakob Nielsen has done more than any other person to popularise the field of usability and get it on the agenda of boardrooms across the world. As well as writing the best usability column on the internet, he’s also a very nice chap: he recently bought my lapsed domain name and when I pointed out my mistake to him he kindly repointed it to the E-Commerce Usability book web site.

K is for Keywords

In our web usability tests we find that the old adage, “A picture paints a thousand words”, just doesn’t apply to the way people use web sites. No amount of snazzy graphics or icons can beat a few well chosen trigger words as a call to action. Similarly, poor labelling sounds the death knell of a web site’s usability as reliably as any other measure.

L is for Layout

That’s not to say that good visual design doesn’t have a role to play in usability. A well designed visual layout helps people understand where they are meant to focus on a user interface, where they should look for navigation choices and how they should read the information.

M is for Metrics

Lots of people usability test but not many people set metrics prior to the test to determine success or failure. Products in usability tests should be measured against expected levels of task completion, the expected length of time on tasks and acceptable satisfaction ratings. You can then distinguish usability success from usability failure (it is a test after all).
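As an illustration of setting metrics before the test, here is a short Python sketch that records targets up front and then checks observed results against them. The class, field names and numbers are hypothetical; each team would choose its own measures and rating scale.

```python
from dataclasses import dataclass


@dataclass
class UsabilityTargets:
    min_completion_rate: float  # fraction of participants who finish the task
    max_time_on_task_s: float   # mean seconds allowed for the task
    min_satisfaction: float     # mean rating on whatever scale the team uses


def evaluate(results, targets):
    """Compare observed results (a dict) with targets agreed before the test."""
    checks = {
        "completion": results["completion_rate"] >= targets.min_completion_rate,
        "time_on_task": results["time_on_task_s"] <= targets.max_time_on_task_s,
        "satisfaction": results["satisfaction"] >= targets.min_satisfaction,
    }
    # The test counts as a success only if every individual target is met.
    checks["passed"] = all(checks.values())
    return checks


if __name__ == "__main__":
    targets = UsabilityTargets(0.8, 120, 4.0)  # illustrative numbers only
    observed = {"completion_rate": 0.9, "time_on_task_s": 95, "satisfaction": 4.2}
    print(evaluate(observed, targets))
```

Writing the targets down before anyone sees the results is the point: it stops the pass mark from drifting to fit whatever the test happens to show.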

N is for Navigation

The great challenge in user interface design is teaching people how your “stuff” is organised and how they can find it. This means you need to understand the mental models of your users (through activities like card sorting), build the information architecture for the site and use appropriate signposts and labels.

O is for Observation

Jerome K. Jerome once wrote, “I like work: it fascinates me. I can sit and look at it for hours.” To really understand how your users work you need to observe them in context using tools like contextual inquiry and ethnography. Direct observation allows you to see how your product is used in real life (our clients are continually astonished at how this differs from the way they thought their products would be used).

P is for Personas

A persona is a short description of a user group that you use to help guide decisions about product features, navigation, interactions, and visual design. Personas help you design for customer archetypes — neither an “average” nor a real customer, but a stereotypical one.

Q is for Questionnaires

Questionnaires and surveys allow you to collect data from large samples of users and so provide a statistically robust background to the small-sample data collected from activities like contextual inquiry and ethnography. Since people aren’t very good at introspecting into their behaviour, questionnaires are best used to ask “what”, “when” and “where” type questions, rather than “why” type questions.

R is for Red Route

Red Routes are the critical user journeys that your product or web site aims to support. Most products have a small number of red routes and they are directly linked to the customer’s key goal. For example, for a ticket machine at a railway station a red route would be, “buy a ticket”. For a digital camera, a red route would be “take a photo”.

S is for Screener

The results of user research are valid only if suitable participants are involved. This means deciding ahead of time the key characteristics of those users and developing a recruitment screener to ensure the right people are selected for the research. The screener should be included as an appendix in the usability test plan and circulated to stakeholders for approval. For more detailed guidance, read our article, “Writing the perfect participant screener”.

T is for Task scenarios

Task scenarios are narrative descriptions of what the user wants to do with your product or web site, phrased in the language of the user. For example, rather than “Create a personal signature” (a potential task for an e-mail package) we might write: “You want your name and address to appear on the bottom of all the messages you send. Use your e-mail program to achieve this.” Task scenarios are critical in the design phase because they help the design team focus on the customers and prospects that matter most and generate actionable results.

U is for Usability testing

A usability test is the acid test for a product or web site. Real users are asked to carry out real tasks and the test team measure usability metrics, like success rate. Unlike other consumer research methods, like focus groups, usability tests almost always focus on a single user at a time. Because a usability test uses a small number of participants (6-8 are typically enough to uncover 85% of usability problems) it is not suited to answering market research questions (such as how much participants would pay for a product or service), which typically need larger test samples.
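The small-sample argument is usually backed by a simple probability model, which the article itself does not spell out. The Python sketch below assumes that every participant independently runs into any given problem with the same probability p. That value is an assumption and varies from product to product, so the numbers are illustrative rather than a rule.

```python
import math


def proportion_found(p, n):
    """Expected share of problems seen at least once by n participants,
    assuming each participant hits a given problem with probability p."""
    if not 0 < p < 1 or n < 0:
        raise ValueError("p must be in (0, 1) and n must be >= 0")
    return 1 - (1 - p) ** n


def participants_needed(p, target=0.85):
    """Smallest n for which proportion_found(p, n) reaches the target."""
    if not 0 < p < 1 or not 0 < target < 1:
        raise ValueError("p and target must be in (0, 1)")
    return math.ceil(math.log(1 - target) / math.log(1 - p))


if __name__ == "__main__":
    for p in (0.31, 0.25):  # illustrative detection probabilities
        print(p, participants_needed(p), round(proportion_found(p, 6), 3))
```

The lower the chance that a single participant stumbles on a problem, the more participants are needed to reach the same coverage, so the right sample size depends on the product and its users.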

V is for Verbal protocol

A verbal protocol is simply the words spoken by a participant in a “thinking aloud” usability test. Usability test administrators need to ensure that participants focus on so-called level 1 and level 2 verbalisations (a “stream of consciousness” with minor explication of the thought content) and avoid level 3 verbalisations (where participants try to explain the reasons behind their behaviour). In other words, usability tests should focus on what the participant attends to and in what order, not participant introspection, inference or opinion.

W is for Writing for the web

Writing for the web is fundamentally different to writing for print. Web content needs to be succinct (aim for half the word count of conventional writing), scannable (inverted pyramid writing style with meaningful sub-headings and bulleted lists) and objective (written in the active voice with no “marketeese”).

X is for Xenodochial

Xenodochial means friendly to strangers and this is a good way of capturing the notion that public user interfaces (like kiosk-based interfaces or indeed many web sites) may be used infrequently and so should immediately convey the key tasks that can be completed with the system.

Y is for Yardstick

Most people carry out usability tests to find usability problems but they can also be used to benchmark one product against another using statistics as a yardstick. The maths isn’t that complicated and there are calculators available. The biggest obstacle is convincing management that these measures need to be taken.

Z is for Zealots

With the advent of fundamentalism, zealots get a bad press these days. But to institutionalise usability, you need usability zealots within your team who will carry the torch for usability and demonstrate its importance and relevance to management and the design team.



 Anupama Verma 
