Friday, September 28, 2007

12.WebSpiderReview

The goal of reviewing a classmate's WebSpider assignment is to improve the ability to read, write, and test Java systems.

1. Installation Review
This process should be fairly smooth since we all used the same IDE and the same tools to write and test our code. I downloaded my classmate's software and extracted it into my Eclipse workspace. The packages contains two source code files (WebSpiderExample and TestWebSpider), a lib directory with checkstyle, emma, findbugs, and pmd sub directories, and various build and documentation files. The package had all the necessary files. Therefore, I was able to import his project into Eclipse without any effort.

The command ant -f verify.build.xml returned no errors and completed successfully. Because, I was a little bit suspicious, I decided to execute each build file individually but no errors were found. However, emma block coverage is only 83% and the line coverage is only at 85%.

I was able to build a jar file and execute his code with java -jar webspider-ivanw.jar -totallinks http://www.hackystat.org 100. As read from the console, it seemed that some Websites refused the connection. However, this did not bother the code execution and there were 1585 links found. At this time I can not be certain if this number is true since my code returns a slightly lower number of links (Read on).

2. Code format and conventions review
After reviewing the code, I recommend the following formatting corrections:

FILE
LINESVIOLATION
COMMENTS
WebSpiderExample.java46
EJS 6
Spelling ("of" twice)

WebSpiderExample.java58, 161
EJS 33
Keep comments and code in sync

WebSpiderExample.java71, 174, 183, *
EJS 7
use space between code blocks
WebSpiderExample.java150
EJS 49
Omit the subject in summary descriptions of actions or services
TestWebSpiderExample.java3
ICS-SE-Java-2
Do not use wild card "*" in import statements
TestWebSpiderExample.java22
EJS 33
Keep comments and code in sync


3. Test case review

Black box

Ivan's TestWebSpiderExample.java tests all methods within WebSpiderExample through calling the main method. Since all methods are in the same class, it seem like the test cases cover all methods written. Form a black box perspective this might be enough. But I would suggest covering all if-then-else statements as well.

In TestWebSpiderExample.java there are 4 calls to the main method each testing with different number of parameters but the same values. In addition there is one call to testMainMethod.

A nice extension to the tests that he already has would be testing for -totallinks and -mostpopular with 0 pages. Then testing it with a high number of pages and later with some illegal value such as, providing string input instead of number. He also could test with an empty parameter. For example, the URL String, or any other required parameter, could be empty/null. In addition a self made website could be set up to see if the code returns the correct values.

Since only the one call to testMainMethod has a assert statement I would suggest putting more assert statements and maybe calling testMainMethod on all tests since this method returns a value that can be compared for correctness.

White box

The emma coverage report returned 83% of block and 85% of line coverage, which does not seem satisfying from a white box perspective.

Analyzing emma's coverage report, I would suggest to include testing for the following:

1 - The length of linksVisited array is set to 65535. Since all test cases test only 5 pages, the linksVisited will never reach over 65535 links. Therefore, a test case must be set up or the length should not be preset.

2 - Various catch boxes are not covered. My suggestion is to set up a Website that would satisfy covering all the exceptions.

3 - Set up a test case were the user enters something else than the supposed -totallinks or -mostpopular option, and one of the else statement would have sufficient coverage under emma.

Break da buggah

I have set up my own very basic test website. The expected test results for -totallinks should be 3 and -mostpopular should return http://www2.hawaii.edu/~michalz/WebSpiderTestWebSite.html with 2 links when the maximum number of pages to traverse is set to 3 or higher.

Running Ivan's code on it returned incorrect results.

command: java -jar webspider-ivanw.jar -totallinks http://www2.hawaii.edu/~michalz/WebSpiderTestWebSite.html 3

output: 5

Changing the number of pages to traverse from 3 to 5 returned 9.

command : java -jar webspider-ivanw.jar -mostpopular http://www2.hawaii.edu/~michalz/WebSpiderTestWebSite.html 3

output: The most popular page is: http://www2.hawaii.edu/~michalz/WebSpiderTestWebSite.html with 0 pages that point to it.

comment: Right website, wrong number of pages.

Setting the maximum pages to traverse to a different number, gives all kinds of results with one of them being the right one.

command: java -jar webspider-ivanw.jar -totallinks http://www.myspace.com/shakadownstreet 10

output: crash - Sys.Application.notifyScriptLoader(); 'failed: SyntaxError: illegal character

4. Summary and lessons learned
I have learned to think and design better test cases for programs. Most people, including me, tried to get emma coverages as high as possible without even thinking if the tests we write are actually helping to test the actual program execution. I have also learned that it is important to set up an own testing environment before testing the code for correct output. For example, if I would set up a test website for WebSpider right from the beginning, I could be more certain that my code does what it supposed to be doing. I have also learned that using someone's test website isn't very good since I found one link on Randy's website that was down, and therefore my code returned a different answer than expected. If you know what your code does, than it is easy to set up a testing environment and write more sufficient test cases.

No comments: