Functional Specification And Module Design
Animating Algorithms

Team members
Nikhath Akhtar
Edward Clayton
Yuan Chang
Michael Hogg
Mark Howard
Stephen Morley
Tim Steele
Viktor Vafeiadis
Contents
- Introduction
- Design Brief
- Algorithms
- Environment
- Inputs, Functions And Outputs
- Overall Architecture
- Specification Of Components And Test Requirements
- Final Acceptance Criteria
- Management Strategy
- Miscellaneous
1. Introduction
The past two decades have seen innovative development in the teaching of algorithms in computer science. This has been made possible by the exponential growth of computer technology. Algorithms are conventionally taught through textual overhead transparencies and static text material. These mediums try to convey the internal data structures of the algorithms and narrate why such steps are taken, however they do not provide interactivity with users and lack flexibility.
Algorithm animation is concerned with illustrating the dynamic behaviour of an algorithm by visualizing the fundamental changes in its data structure during execution. Such displays have proven to be useful for education and research into the design and analysis of algorithms. Educational Research has shown that student control and interaction with animations is very important. Better control and interaction can be achieved by letting students create their own data sets for the algorithm rather than observing prepared data sets. It makes an algorithm less intimidating by making it more accessible. This project focuses on the development of an animated algorithms program, to aid students in their learning process and to enable lecturers to have extensibility access for their own teaching convenience.
2. Design Brief
Much of practical computer science can be usefully explained by drawing lots of boxes with arrows between them to represent the data structures involved. The contents of the boxes and the connectivity of the lines then changes as an algorithm proceeds. This is difficult to illustrate on a blackboard or overhead projector. What is needed is a program that produces slow-motion animated films of data structures and algorithms at work.
The system should provide a general mechanism for writing an animation script and connecting to an algorithm. The user should be able to set up the initial data for the algorithm and then step through it slowly to watch its operation. Auxiliary displays showing the time and space requirements of the algorithm might also be shown.
The project brief gives a general description of the end product required, but also leaves much open to interpretation by the team. The program must be capable of visually displaying the behaviour of algorithms as they act on data structures, and the user must be able to control the animation playback so that it proceeds at a rate with which they are comfortable.
The brief also makes it clear that the algorithms should be separate from the animator, so that the two can be "connected". To fulfil this requirement, our design stores each algorithm as a separate Java class file. The main program when necessary loads these, and as the algorithm executes, the animator updates the screen accordingly. This design also allows users to write their own algorithms, which can then be "plugged into" the system for animation. Instructions for writing these algorithms can be found in the User Manual.
The initial data supplied to an algorithm must be specifiable by the user if they wish, so our design features a dialogue box in which the user can enter this data. There is also a facility for generating random data where appropriate (for example, data for a sorting algorithm).
Our animator design provides two different playback methods. The first allows step-by-step execution of an algorithm (since this is required by the brief), while the second allows an algorithm to execute automatically and continuously until it has completed. This is useful if an algorithm requires a large number of execution steps, and the user would prefer not to advance each step manually.
To summarise the operation of the program:
- The user selects an algorithm file for animation
- The program loads the file, and connects it to the animator
- The user is asked to supply initial data (or may choose to generate random data if appropriate)
- The algorithm is configured with the chosen data
- The user can now step through the algorithm's execution, or start automated playback
The program must be able to animate "at least five different algorithms for two different problems".
To fulfil this requirement, our design will provide animation facilities for sorting and string-matching algorithms, and will come with at least five different algorithms "built-in" (with the option of adding further algorithms later).
After careful analysis of the project requirements, we decided to implement the project in the following separate sections:
- Algorithm animator
- This is responsible for producing the visual representation of the algorithm's behaviour.
- Graphical User Interface
- This handles the window layout, and provides the user with buttons/text boxes/etc with which to interact with the algorithm and its animation.
- Loading and execution of algorithms
- This handles the algorithm's theoretical behaviour. It loads the relevant class files, and allows the algorithms to be executed when required.
Definitions that will be used throughout this documentation:
- Animation Script
- The script for the visual representation of the animation on the GUI.
- Algorithm
- An algorithm is a set of instructions carried out in a certain order to perform a specified task.
3. Algorithms
Sorting
Sorting is a fundamental device for information processing. The difference between a good and a mediocre sorting algorithm is hard to see when dealing with small amounts of data; but with large amounts, the difference is so significant that it can totally destroy the effectiveness of a program. This makes sorting algorithms an integral part of all computer science courses.
The following sorting algorithms will be implemented:
- Bubble sort
- Insertion sort
- Quicksort
- Shellsort
Bubble Sort
Bubble Sort works by repeatedly moving the largest element to the highest index position of the array. Rather than search the entire array to find the largest element, Bubble Sort focuses on successive adjacent pairs of elements in the array. It compares the two elements, and then either swaps them (if the element with the smaller index number is larger than the other element) or leaves them alone. In either case, after this step, the larger of the two elements will be in the higher index position.
The focus then moves to the next highest position, and the process is repeated. When the focus reaches the end of the array, the largest element will have 'bubbled' from its original position to the highest index position in the array. Bubble sort has time complexity O(n2).
Insertion Sort
The insertion sort works by taking the values one by one and inserting each one into a new list that it constructs, constantly maintaining the condition that the elements of the new list are in the desired order with respect to one another. That is, consider the elements one at a time, inserting each in its proper place among those already considered (keeping them sorted). The element being considered is inserted by moving larger elements one position to the right and then inserting the element into the vacated position. Insertion sort has time complexity O(n2).
Quicksort
In Quicksort, the array of items to be sorted is divided into two partitions and then the Quicksort procedure is recursively called to sort the two partitions. One element in the array is chosen to be the pivot. A pass through the array, called a partition step, is where the entries are re-arranged so that:
- Entries smaller than the pivot are moved to the left of the pivot
- Entries larger than or equal to the pivot are moved to its right
This method is then recursively applied to the array, first to the left then to the right side of the pivot. Quicksort has time complexity O(n lg n).
Shellsort
The basic idea of Shellsort is the following:
- Arrange the data into a 2-dimensional array
- Sort the columns of the array
The effect is that the data sequence is partially sorted. The process above is repeated, but each time with a narrower array, that is with a smaller number of columns. In the last step, the array consists of only one column. With each step, the length of a sequence increases, until the last step where the array is completely sorted. Shellsort has time complexity O(n2)
Searching
The second group of algorithms to be animated are the string matching algorithms. Finding all occurrences of a pattern in a text is a problem that arises frequently in text editing programs. Typically, the text is a document being edited, and the pattern searched for is a particular word supplied by the user. Efficient algorithms for this problem can greatly aid the responsiveness of the text editing program. String matching algorithms are also used, for example, to search for particular patterns in DNA sequences.
The general problem is that given a pattern p, and a string of text t, determine if p appears as a substring of t or not. Also, an auxiliary problem is to identify where each occurrence of p is located in t.
The following methods will be animated:
- Brute Force Method
- Knuth-Morris-Pratt
- Boyer-Moore
Brute Force Method
This involves trying to match the pattern with the text at each possible location. When attempting a single match, compare letters from the beginning, stopping when a discrepancy is found.
The algorithm lines up the first character of the text with the first character in the pattern. If they match, the second character in the pattern is then compared to the second character in the text. This process continues until either the pattern is exhausted, in which case we have found a complete occurrence of the pattern within the text. If, however at any point the text character and pattern character do not match, the pattern is "slid" one position to the right over the text, and the process begins again. This entire process is repeated until a match is found, or all of the characters are exhausted in the text. This method can require about nm character comparisons.
Knuth-Morris-Pratt
This technique uses the structure of the pattern to skip unnecessary comparisons. Pre-processing is done by creating a next[j] table. The key aspect of the Knuth-Morris-Pratt (Kmp) algorithm is that a failed attempt to find a match yields useful information to be used on the next attempt.
Kmp algorithm consists of two stages: preprocessing stage and searching stage.
The preprocessing phase involves pre-computing a table that is used to decide how far to skip after a mismatch. If x characters in the text are successfully matched with x characters in the pattern, and then a mismatch happens at the next character, then how far should we skip to start the next match? The aim here is to skip as far as possible without losing any potential matches. The matching part of a pattern is called the prefix of the pattern. The length that needs to be skipped is the length of the longest border of characters of size less than x in the pattern that appears at both the beginning and ending of the matching prefix. If there is no such border, the length of x characters can be skipped.
The search phase is similar to the searching technique in brute force method except that we make use of the pre-computed table to decide how many positions should be shifted after a successful or failed match. When comparing one position of pattern and one position of text, if there are the same, the next positions are compared. If the pattern is matched in the string, then it is reported. Then the pattern is shifted as far as the widest border allows (shifting without missing any opportunity to compare the text and pattern). In the case of a failure while comparing, the pre-computer table is looked at to determine at which position the next comparison should begin. This string searching algorithm never uses more than m+n character comparisons.
Boyer-Moore
The key to the Boyer-Moore algorithm is the observation that some comparisons are unnecessary, because they can't possibly yield a match between the two strings. Where a brute force comparison left-aligns the strings to be compared and begins with the first character in each string, Boyer-Moore left-aligns the strings and compares the last character of the search string to the corresponding character in the string to be searched. If the compare does not yield a match and furthermore, the character in the string being searched is not in the search string, then the entire search string is shifted n places to the right where n equals the number of characters in the search string. Brute force would simply move one position to the right in this instance, thus taking many more steps to complete the compare. Boyer-Moore string searching never uses more than m+n character comparisons, and uses about n/m steps if the alphabet is not small and the pattern is not too long.
Assumptions made to clarify the specification
It will be assumed that the end users of this product will be lecturers of Data Structures and Algorithms courses. Therefore, it will be assumed that the lecturer will have prior knowledge of Java and is confident in writing java programs and that they are familiar with the algorithms.
4. Environment
4.1 Choice Of Language
The main application will be written in Java. It was developed by Sun Microsystems Inc. and is an easy-to-use, object-oriented language that includes a set of libraries that facilitate cross-platform development. There are several advantages to using java, as due to its object-oriented nature, it encourages code that is modular and re-usable which results in more maintainable code. Also as it is platform-independent, java has the property of 'write once, run anywhere'. As well as this, Java has built in exception (error) handling procedures that helps to produce more robust applications and components.
4.2 Choice Of Tools
4.2.1 CVS
CVS or Concurrent Version System, controls concurrent editing of sources by several users working on releases built from a hierarchical set of directories. It allows users to keep track of their change of code.
The code resides in a central repository, which users are never meant to edit directly. Files in this repository hold the program listing itself as well as the history of changes to the file. Histories include who made the change, a timestamp and a message describing the change, which the person who made the change entered when he or she was finished. This way, the integrity of code and documentation will be maintained.
4.2.2 Javadocs
Javadoc is an API documentation tool that comes as part of a standard Java Development Kit (JDK) distribution. It allows the user to use special syntax within the java classes and packages to denote comments. Within those comments, regular text can be used, or HTML, and Javadoc tags add a lot of standard documentation based on actual java code.
For example, Javadoc automatically generates the details regarding an object's hierarchy, scope, the interfaces it implements, and known subclasses based on the syntax and content of the code.
5. Inputs, Functions and Outputs
Inputs
Files: There will be two sources of user input. The first will be the file that will be loaded by the user containing the algorithm and the second will be the input from the user directly into the graphical user interface.
Graphical User Interface: The user will enter integers to be sorted, or the strings to be matched into the data entry boxes. There will be control boxes that allow the user to step through the algorithm, refresh etc.
Randomly Generated: The user will also have the choice of using internal randomly selected data.
Functions
The application will initially load the selected algorithm script. It will then produce the animation specified and output this to the display. When the user enters data, or selects the random data option, the data will be entered into the algorithm and the algorithm performed.
Outputs
The output of the project will be the animation of the algorithms on the graphical user interface. As well as this, there will a pseudo-code representation of the algorithm with the current line that is running highlighted. There will also be a section to display the time and space complexity of the algorithm, along with the number of swaps performed.
The user will also have the option of a 'race' mode. This will enable the user to load up to three algorithms onto the screen and watch all three simultaneously. The advantage of offering this facility is that students can then understand the speeds of the algorithms in relation to each other.
Overall Architecture

7. Specification Of Components And Test Requirements
7.1 Graphical User Interface
The GUI is an important part of the project since it determines the ease of use of the entire application. As the whole focus of the project is animating algorithms, how the algorithm will be displayed on the screen is a fundamental part of the design process. The screen must be simple to understand and interact with, yet comprehensive enough to provide the user with access to the full functionality of the program. After much debate, the following design was chosen:
- The majority of the program will run in a single main window, which the user may resize by dragging the bottom-right drag handle.
- A menu-bar will run along the top of the window, containing the following menus:
File
Open single mode - this option allows the user to select a single algorithm for animation. Highlighting this option will open a sub-menu, containing the following sub-options:
- A list of all the "built-in" algorithms that are included with the program.
- Load from file - this will open a standard file-selection dialogue box, allowing the user to select a custom-made algorithm class file.
Open race mode - this will open a special dialogue box, in which the user can specify multiple algorithms to be raced against each other. Only algorithms in the same family may be raced, and the GUI reflects this property by providing two (mutually-exclusive) radio buttons that allow the user to select an algorithm family (sorting or string-matching).
The user selects the desired algorithms using pop-up menus, containing lists of the relevant "built-in" algorithms, as well as a "Load from file" option to select custom-made algorithms. Standard "OK" and "Cancel7quot; buttons allow the user to confirm or abort race mode.
Once the user has chosen the algorithm(s), a small pop-up window will appear, allowing the user to specify the initial data. A modifiable text box allows entry of lists of integers (for arrays) and characters (for strings). There is also a button to generate random data.
Help
User manual - opens a separate window containing the user manual for the program.
About - displays a small pop-up window with information about the program and the team.
Website - provides access to the team's website.
Below the menu bar, the central horizontal portion of the window is principally responsible for displaying the animation of the algorithms(s).
- In single mode, a small portion on the right of the screen is reserved for display of the algorithm's pseudo-code. This is a non-modifiable text area (which allows scrolling), and the currently-active instruction will be highlighted. A non-modifiable text field also appears underneath the animation section, displaying comments and explanations of the algorithm's behaviour when it is running.
- In race mode, the space is divided horizontally into multiple sections, each of which contains the animation of a single algorithm. No pseudo-code or comments are displayed.
The bottom section of the window is vertically divided into two sections:
- The left section contains the playback controls for the animation. Buttons allow the user to play, pause, step through or stop the animation. A scrolling slider controls the rate at which the animation plays back.
- The right section contains a small non-modifiable text field, displaying additional information about the algorithm(s). For example, in race mode, the display contains the number of comparisons made so far by each algorithm, along with other relevant details.

Animation
There will be two ways of representing the sorting algorithms: by having boxes with numbers in them, and then horizontal graphs. The two boxes being swapped will be highlighted then move with a smooth arc to their new positions.
For the string matching, they will be displayed by two arrays on top of each other, and as the algorithm runs, the current letter being tested will be highlighted.
7.2 AA Package Hierarchy

7.3 Animation Package Hierarchy

7.4 Testing procedure
A class's testing will begin with the programmer responsible for coding it. During it's development, the coder will perform initial, simple tests as this will mean that many bugs can be eliminated near the beginning of the testing procedure. Once the coder is sufficiently happy, he will mark the class as ready for further testing by the group's testers.
Each class will be tested separately where possible. Some classes will require the services provided by other classes in order to be sensibly tested - in these cases, the independent classes will be tested first. Once the classes that comprise a particular module have been written and tested, the module itself can then be tested. Similarly, once all modules have been completed, testing of the complete application can begin. This hierarchical testing procedure will ensure that bugs are pinpointed and eliminated at the earliest possible stages.
Every class file will contain its specification in the form of comments - this will allow the tester to design a thorough test procedure. For each class, suitable test data will be generated and passed to the relevant methods. This will usually involve writing a separate tester class for each class in the application. Tester classes will pass data to the methods of the class being tested, and then log the returned data. This can then be checked against the class's specification, to verify that its behaviour is correct. In certain cases (for example, testing animation methods), a suitable procedure for logging the class's output may not exist. The tester will then be required to monitor the class's behaviour, and record his observations appropriately (for example, checking that an animation method produces the desired effect on the screen).
Some methods to be tested will have small sets of possible input data - for such modules, it is feasible to test every single possible input. Where the set of input data is too large for this, a representative subset of this data will be used for testing. Also, extreme values will be entered.
After a class is tested, the relevant documentation will be updated, recording the following information:
- Name of file and version tested
- Date tested
- Name of tester
- Result of test (and description of errors, if any)
There are two possible courses of action when a bug is encountered. If the error is sufficiently small, such that the tester may correct the code, then this will be done. In doing so, a new version of the code will be created, and the testing stage must be repeated. In the event of a more complicated error, the original coder will be informed in order to correct the error, and the above testing cycle will be started again.
It is well documented within the computing industry that different errors/bugs are often found by different testers. Hence the more testers that test a section of code, the more likely bugs are to be found. We will employ this method by making sure at least two testers test each piece of final code. "Final code" is the last version of code created before the overall program is compiled.
CVS is to be used to help maintain the integrity of our project. This allows testing to be carried out in a systematic and efficient way. For instance, this will help to prevent redundant sections of code from being tested, hence making the most efficient use of the time available.
8. Final Acceptance Criteria
The project will be deemed 'complete' when each of the following criteria have been met:
- The system must be able to animate at least five different algorithms in total, for two different problems
- The animation must illustrate clearly the behaviour of an algorithm
- The animations must be able to run in one of at least two different modes: step-by-step display (pausing after each step), and slow motion (continuous non-looping playback)
- The user must be able to input the initial data for an algorithm, and only a sensible range of data must be accepted by the system
- The system must be able to generate random initial data for an algorithm, if the user does not wish to enter specific data
- A pseudo-code version of the current algorithm must be displayed for the user
- The currently active line of the pseudo-code must be clearly highlighted in a suitable manner
9. Management Strategy
Given the time allocation and the size of the task, it is clear that working together efficiently as a team will become paramount. Working to meet the deadlines is an important part of the project and will require coordination and communication between all members. For this reason, each group member has been allocated a specific set of tasks to accomplish to ensure that all aspects of the project are dealt with.
- Group Manager
- The group manager is responsible for overseeing the project. This includes organising meetings, coordinating tasks and roles, and resolving any conflicts that may arise. The group manager will also write part of the documentation.
- Coders
- The programmers design the details of the final implementation of the brief, and write the specification of components. They also write the actual code and comment this.
- Testers
- Write the module and system integration testing documents, write a test harness and thoroughly test all parts of written code.
- Webmaster
- Is responsible for maintaining the website. They will also be responsible for generating all the Java documents for the code, and will write the User manual at the end.
- Librarian
- Is responsible for looking after the group file space, and dealing with CVS.
9.1 Company Hierarchy

9.2 Detailed Plan Of Action
Good time management is fundamental to a successful project to ensure that deadlines are met and that each member of the team is making a positive contribution. A Gantt chart is the standard format for displaying a schedule graphically. It consists of a horizontal bar chart with time as the horizontal axis and tasks as the vertical axis. Individual operations are displayed as horizontal bars in the chart, indicating the time at which the job begins and ends.

10. Miscellaneous
10.1 Website
The group website can be found at: http://groups.pwf.cam.ac.uk/clteach/grpproj/alpha/.
It will contain all documentation, as well the minutes of the meetings. It is intended that the final product will be able to appear as an applet form on the site, so will be a useful resource for future users wishing to investigate this problem.
10.2 Group Workspace Structure

10.3 Backup Procedure
Whilst CVS (Concurrent Version Systems) is being utilised to protect previous versions of code, in case a current working version of code becomes corrupt, a backup of all files is required as a precautionary act against system failure, or in the unlikely event that a portion of the network becomes inaccessible for a length of time. This backup will be made by copying the entire file structure from the /grpproj/alpha directory to a member of our group's PWF site (ts270@pwf.cam.ac.uk). The location of the backup was chosen so that it resides on a separate server, in this case the PWF server and not the Linux server. A backup copy, as described above, will be taken daily.