### Course Content DAA

posted Nov 16, 2013, 1:12 AM by Anupinder Singh   [ updated Nov 16, 2013, 1:20 AM ]

 Breadth first search(BFS) is the simplest graph traversal and search algorithm. In this a given graph G=( V , E ) we start traversing the graph from on particular vertex 's' and include all nodes that are joined by an edge to vertex 's'. Once done, we then include all additional nodes that are joined by an edge to any other node previously searched for vertex 's' and we continue in the way until no new nodes are encountered. Therefore this is a layer based search and at each layer we try to discover more connecting vertices. Also, It computes the distance from 's' to each reachable vertex with minimum number of edges on a path joining them. Thus, BFS is not only determining the nodes that 's' can reach, but it also computing the shortest path to them. BFS also produces a tree 'T' with root 's' that contains all reachable vertices. For any vertex 'v' reachable from s, the simple path in the breadth-first tree from 's' to 'v' corresponds to a “shortest path” from 's' to 'v' in G, that is, a path containing the smallest number of edges. BFS algorithm works on both directed and undirected graphs.For defining an algorithm of BFS, we assume following inputs- Graph G =(V , E) and is represented using adjacency list.- 's' will be denoting the source vertex(root).- To keep track of progress, BFS color each vertex green, grey and black. All vertices start out green and later become first grey then black. So to store the color of each vertex 'u' Є 'V' in the attribute 'u.color'- To store the predecessor of 'u' in attribute 'u.π'. In case of no predecessor it will be denoted NIL.- To store the distance(in terms of level) attribute 'u.d' will be used. This distance will be from the root(source).- A first-in, first-out queue 'Q' will be used to manage the grey vertices. Below is the algorithm of BFS with explanation  Let us solve following example for BFS and traverse through the directed graphAbove example explains how the vertices color is changed and how the queue will look like when the BFS algorithm will execute as program.Few more applications of BFS are as: Flood fill in image processing.Finding all nodes within one connected component.Copying collection using Cheney's algorithm.Finding shortest path between two nodes (as shown in above example, it there are weights associated with each edge).Testing a graph for bipartiteness. etc.

#### Introduction to Graphs

posted Nov 15, 2013, 11:43 AM by Anupinder Singh   [ updated Mar 3, 2014, 2:13 AM by Anupinder Singh ]

Graph is basically a non-linear data structure. A graph consists of a set of objects where some pairs of objects are connected by links. The interconnected objects are called as vertices(nodes), and the links that connect some pairs of vertices are called edges. The edges connecting vertices can be directed or undirected. Therefore two types of graphs exist:
• Directed Graph : Hold directed edges between vertices
• Undirected Graph : Hold undirected edges between vertices.

So formally we can say that, A graph G = ( V , E ) consists of two things:
• A set 'V' of elements called vertices(nodes).
• A set 'E' of edges such that each edge 'e' in 'E' is identified with a unique unordered pair [u,v] of nodes in 'V', denoted as 'e'=[u,v]
Examples

Now above figure shows a graph consisting of 5 round object. Therefore if we define it as per above mentioned definition it will be written as:

G= ( V , E )

V = { 1,2,3,4,5} (as there are 5 numbered objects in this graph)

E = {[1,2] , [1,3] ,[2,1] , [2,4] , [3,1] , [3,4] , [3,5] , [4,2] , [4,3] ,[4,5] , [5,3] , [5,4] }

Now if you notice carefully the figure shows only 6 edges, where as set 'E' consists of 12 edges. This is because this graph is undirected graph, so each vertices connected to other vertices will also allows opposite relation to like [u,v] <-> [v,u].

Now lets consider another example with a few changes in same graph

G= ( V , E )

V = { 1,2,3,4,5} (as there are 5 numbered objects in this graph)

E = {[1,2] , [2,4] , [3,1] , [3,4] , [4,5] , [5,3]  }

Here set 'E' will now only contain those edges which directs to certain node, this is because it is a directed graph. for eg. there is a directed edge from node 1 to node 2, but reverse is not available there fore only one entry will be allowed in 'E' set.

Next important point about graphs is how they are stored in computer? this is a little tricky work. There are two standard ways to represent a graph G= (V,E) :
1. A Collection of Adjacency List : Used when 'E' is much less than '|V|2'.
2. An Adjacency Matrix : used when 'E'  is close to '|V|2'.

 Adjacency List for Undirected Graph mentioned in first figure of graph Adjacency List for directed Graph mentioned in second figure of graph

In above figures adjacency list of undirected and directed graphs are shown. If you are aware about data structures this adjacency list actually make used of Linked list data structure. In this we make an array of pointers and then point each index corresponding to each vertices(node) to a linked list containing all end node numbers of the edge.

 Adjacency matrix for Undirected Graph mentioned in first figure of graph Adjacency matrix for directed Graph mentioned in first figure of graph

In above figures adjacency matrix of undirected and directed graphs are shown. This List is basically an array of 'v' by 'v' size and fill value one on the intersection box of start node and end node. For example if we have and edge [1,2] and [2,1] in undirected graph(Left figure) then if you notice the intersection box of 'row 1' and 'column 2' is assigned value '1'. Similarly the intersection of  'row 2' and 'column 1' is also assigned value '1' and so on. Similar setup is used in directed graph.

Degree of Vertex

This is the most important term in graphs and is denoted differently for Directed and Undirected Graph. In case of Undirected graph the degree of vertex is the number of edges inclined to it. In case of Directed graphs the degree of vertex is categorized as: OutDegree(number of out going edges) and InDegree(number of incoming edges)

#### Dynamic Programming Technique

posted Nov 15, 2013, 5:44 AM by Anupinder Singh   [ updated Mar 3, 2014, 2:11 AM by Anupinder Singh ]

 Dynamic Programming Technique is similar to divide-and-conquer technique. Both techniques solve a problem by breaking it down into several sub-problems that can be solved recursively. The main difference between is that, Divide & Conquer approach partitions the problems into independent sub-problems, solve the sub-problems recursively, and then combine their solutions to solve the original problems. Where as dynamic programming is applicable when the sub-problems are not independent, that is, when sub-problems share subsubproblems. Also A dynamic programming algorithms solves every sub problem just once and then saves its answer in a table, there by avoiding the work of recomputing the answer every time the subsubproblems is encountered.Therefore "Dynamic programming is a applicable when sub problem are not independent ,that is,when sub problem share sub problems." As Greedy approach, Dynamic programming is typically applied to optimization problems and for them there can be many possible solutions and the requirement is to find the optimal solution among those. But, Dynamic programming approach is little different greedy approach. In greedy solutions are computed by making choices in serial forward way and in this no backtracking & revision of choices is done where as Dynamic programming computes its solution bottom up by producing them from smaller sub problems, and by trying many possibilities and choices before it arrives at the optimal set of choices. The Development of a dynamic-programming algorithm can be broken into a sequence of four steps:Divide, Sub problems: The main problems is divided into several smaller sub problems. In this the solution of the main problem is expressed in terms of the solution for the smaller sub problems. Basically it is all about characterizing the structure of an optimal solution and recursively define the value of an optimal solution.Table, Storage: The solution for each sub problem is stored in a table, so that it can be used many times whenever required.Combine, bottom-up Computation: The solution to main problem is obtained by combining the solutions of smaller sub problems. i.e. compute the value of an optimal solution in a bottom up fashion.Construct an optimal solution from computed information.(This step is optional and is required in case if some additional information is required after finding out optimal solution.)Now for any problem to be solved through dynamic programming approach it must follow the following conditions:Principle of Optimality: It states that for solving the master problem optimally, its sub problems should be solved optimally. It should be noted that not all the times each sub problem(s) is solved optimally , so in that case we should go  for optimal majority.Polynomial Breakup: For solving the main problem, the problem is divided into several sub problems and for efficient performance of dynamic programming the total number of sub problems to be solved should be at-most a polynomial number.Various algorithms which make use of Dynamic programming technique are as follows:Knapsack problem.Chain matrix multiplication.All pair shortest path.Travelling sales man problem.Tower of hanoi.Checker Board.Fibonacci Sequence.Assembly line scheduling.Optimal binary search trees.

#### Greedy Technique

posted Nov 14, 2013, 9:18 PM by Anupinder Singh   [ updated Mar 3, 2014, 2:08 AM by Anupinder Singh ]

 Greedy Technique uses iterative/recursive approach to solve an optimization problem and selects an optimal(best) solution. Greedy approach look for simple, easy-to-implement solutions to complex, multi step problems by deciding which next step will provide the most benefit. It is not always the greedy technique produce an optimal solution, but it helps in producing an approximate optimal solution in a reasonable time. Greedy generally works on heuristic approach.The greedy technique is one of the simplest approaches to solve the optimization problems in which we want to determine the local optimum of a given function by a sequence of steps where at each stage we can make a choice among a class of possible decisions. In this, the choice of the optimal decision is made in the information at hand without worrying about the effect these decisions may have in the future. The greedy choice property and optimal sub-problem are two ingredients in the problem that lend to a greedy strategy. It says that a globally optimal solution can be arrived at by making a locally optimal choice(with no Guarantee). OR we can say greedy method arrives at a solution by making a  sequence of choices, each of which simply looks the best at moment.Greedy Choice Property: It says, we can assemble a globally optimal solution by making locally optimal choices i.e. The greedy choice is always part of certain optimal solution.Optimal Sub-Problem: It says, an optimal solution to the problem contains within its optimal solution to sub-problems i.e. Global optimal solution is constructed from local optimal solutions.In optimization problems, there are two types of solutions:Feasible Solutions : these are not a clear optimal solution, but are close to optimal solution( can be said as approximate solution)Optimal Solutions : these are fully acceptable optimized solution for the current optimization problem.Any Greedy Algorithm will have following five components:Candidate Set: From which a solution is created.Selection Function: Chooses the best candidate to be added to the solution.Feasibility Function: Used to determine, if a candidate can be used to contribute to a solution.Objective Function: Assigns a value to a  solution or a partial solutionSolution Function: Indicates when we have discovered a complete solution.Greedy algorithms are easy to invent, implement and most of time they are efficient. However there are may problems that cannot be solved correctly by this approach and in many cases there is no guarantee that making locally optimal improvements in a locally optimal solution with produce the optimal global solution.Following are few algorithms that make use of greedy approach/technique:Knapsack problem.Kruskal's Algorithm.Prim's Algorithm.Dijkstra's  Algorithm.Huffman tree building.Traveling salesman problem. etc.

#### Divide and Conquer Technique

posted Nov 4, 2013, 10:13 PM by Anupinder Singh   [ updated Mar 3, 2014, 2:06 AM by Anupinder Singh ]

 Divide and Conquer approach basically works on breaking the problem into sub problems that are similar to the original problem but smaller in size & simpler to solve. once divided sub problems are solved recursively and then combine solutions of sub problems to create a solution to original problem.At each level of the recursion the divide and conquer approach follows three steps:Divide:  In this step whole problem is divided into several sub problems.Conquer: The sub problems are conquered by solving them recursively, only if they are small enough to be solved, otherwise step1 is executed.Combine: In this final step, the solution obtained by the sub problems are combined to create solution to the original problem.Divide and conquer approach has several advantages as follows:Solving conceptually difficult problems, it just require to divide them into sub problems. for example: Tower of HanoiDivide and Conquer approach helps in discovering efficient algorithms like Strassen's Algorithm for matrix multiplication.Divide and Conquer also adapts itself to execute in multiprocessor machines. There fore parallel execution of sub problems on different processors with shared memory helps in parallel execution if master problem, results in increased algorithm performance.Following are the problems which has become very efficient with use of Divide and Conquer approach:Binary Search.Finding the Maximum and Minimum.Merge Sort.Quick Sort.Selection Sort.Strassen's Matrix multiplication.Convex Hull.All above mentioned problems will be explained on other pages with concerned topics or you can click on provided link to directly read them.

#### Polynomial vs. Exponential Running Time

posted Nov 1, 2013, 1:49 AM by Anupinder Singh   [ updated Mar 3, 2014, 2:04 AM by Anupinder Singh ]

 The time complexity(generally referred as running time) of an algorithm is expressed as the amount of time taken by an algorithm for some size of the input to the problem. Big O notation is commonly used to express the time complexity of any algorithm as this suppresses the lower order terms and is described asymptotically. Time complexity is estimated by counting the operations(provided as instructions in a program) performed in an algorithm. Here each operation takes a fixed amount of time in execution. Generally time complexities are classified as constant, linear, logarithmic, polynomial, exponential etc. Among these the polynomial and exponential are the most prominently considered and defines the complexity of an algorithm. These two parameters for any algorithm are always influenced by size of input.Polynomial Running Time An algorithm is said to be solvable in polynomial time if the number of steps required to complete the algorithm for a given input is O(nk) for some non-negative integer k, where n is the complexity of the input. Polynomial-time algorithms are said to be "fast." Most familiar mathematical operations such as addition, subtraction, multiplication, and division, as well as computing square roots, powers, and logarithms, can be performed in polynomial time. Computing the digits of most interesting mathematical constants, including pi and e, can also be done in polynomial time.Ref:WolframMathWorldAll basic arithmetic operations ((i.e.) Addition, subtraction, multiplication,division), comparison operations, sort operations are considered as polynomial time algorithms.Exponential Running TimeThe set of problems which can be solved by an exponential time algorithms, but for which no polynomial time algorithms is known. An algorithm is said to be exponential time, if T(n) is upper bounded by 2poly(n), where poly(n) is some polynomial in n. More formally, an algorithm is exponential time if T(n) is bounded by O(2nk) for some constant k.Ref:WikiExamples of exact Exponential time algorithms can be read from following link of Computer Science AlgorithmsAlgorithms which have exponential time complexity grow much faster than polynomial algorithms. The difference you are probably looking for happens to be where the variable is in the equation that expresses the run time. Equations that show a polynomial time complexity have variables in the bases of their terms. Examples: n3 + 2n2 + 1. Notice n is in the base, NOT the exponent. In exponential equations, the variable is in the exponent. Examples: 2n. As said before, exponential time grows  much faster. If n is equal to 1000 (a reasonable input for an algorithm), then notice 10003 is 1 billion, and 21000 is simply huge! For a reference, there are about 280 hydrogen atoms in the sun, this is much more than 1 billion.for more reference you can check this PDF file

#### Asymptotic Notations

posted Oct 20, 2013, 10:53 PM by Anupinder Singh   [ updated Mar 3, 2014, 1:59 AM by Anupinder Singh ]

< Previous-Performance Analysis                                                                                                                                                                                            Next - Polynomial vs. Exponential Running Time>

Order of Growth: Whenever we work with algorithms, the most critical factor considered for them is their efficiency. And for this we are interested in analyzing how the running time increases when input size increases. when to algorithms are compared with respect to their behavior for large input size, the measure is know as Order of growth. The order of growth can be estimated by taking into account the dominant term of the running expression. The dominant term overpasses the values of the other terms when the input size is large.

For example:
1. Let us consider T(n) = an + b where a>0, If the input size 'n' is multiplied by 'k' then the dominant term in T(n) i.e. 'an'  is also multiplied by 'k'. This produces T(kn) = k(an)+b and we can say that T(n) has a linear order of growth.
2. Let us consider T(n) = an2+ bn + c where a>0, If the input size 'n' is multiplied by 'k' ,this produces T(kn) = (k2a)n2+ (kb)n + c, thus the dominant term is multiplied by k2 and we can say that T(n) has a quadratic order of growth. and so on for cubic, logarithmic etc.
Thus for efficiency with respect to large problems the analysis is made for large values of the input size and dominant terms plays and important role. The analysis that is conducted for large values of input size is called as asymptotic analysis.

In Asymptotic analysis it is considered that an algorithm 'A1' is better than algorithm 'A2' if the order of growth of the running time of the 'A1' is lower than that of 'A2'. Therefore asymptotic efficiency of algorithms are concerned with how the running time of an algorithm increases with the size of the input in the limit, as the size of the input increases without bound.

Asymptotic Notations: For every algorithm corresponding to efficiency analysis, we have three basic cases :
• Best Case
• Average Case
• Worst Case
And there are 5 notations to resemble them:

 S.No Asymptotic Notation Name Symbol Used Efficiency Case Resemblance Examples 1 Theta notation Θ Average Case (Tightly Bound) Θ(1), Θ(n), Θ(n2), Θ(n3), Θ(nn), Θ(log n), Θ(n log n) 2 Big oh notation O Worst Case (Tightly Bound) O(1), O(n), O(n2), O(n3), O(nn), O(log n), O(n log n) 3 Omega notation Ω Best Case (Tightly Bound) Ω(1), Ω(n), Ω(n2), Ω(n3), Ω(nn), Ω(log n), Ω(n log n) 4 Little oh notation o Worst Case (Loosely Bound) o(1), o(n), o(n2), o(n3), o(nn), o(log n), o(n log n) 5 Little omega notation ω Best Case (Loosely Bound) ω(1), ω(n), ω(n2), ω(n3), ω(nn), ω(log n), ω(n log n)

#### Performance Analysis

posted Oct 20, 2013, 10:52 PM by Anupinder Singh   [ updated Mar 3, 2014, 1:57 AM by Anupinder Singh ]

< Previous-What is an Algorithm                                                                                                                                                                                                                              Next - Asymptotic Notations >

Performance analysis of an algorithm depends upon two factors i.e. amount of memory used and amount of compute time consumed on any CPU. Formally they are notified as complexities in terms of:
• Space Complexity.
• Time Complexity.

Space Complexity of an algorithm is the amount of memory it needs to run to completion i.e. from start of execution to its termination. Space need by any algorithm is the sum of following components:

1. Fixed Component: This is independent of the characteristics of the inputs and outputs. This part includes: Instruction Space, Space of simple variables, fixed size component variables, and  constants variables.
2. Variable Component: This consist of the space needed by component variables whose size is dependent on the particular problems instances(Inputs/Outputs) being solved, the space needed by referenced variables and the recursion stack space is one of the most prominent components. Also this included the data structure components like Linked list, heap, trees, graphs etc.

Therefore the total space requirement of any algorithm 'A' can be provided as

Space(A)  = Fixed Components(A) + Variable Components(A)

Among both fixed and variable component the variable part is important to be determined accurately, so that the actual space requirement can be identified for an algorithm 'A'.  To identify the space complexity of any algorithm following steps can be followed:

1. Determine the variables which are instantiated by some default values.
2. Determine which instance characteristics should be used to measure the space requirement and this is will be problem specific.
3. Generally the choices are limited to quantities related to the number and magnitudes of the inputs to and outputs from the algorithms.
4. Sometimes more complex measures of the interrelationships among the data items can used.

Example: Space Complexity

`Algorithm Sum(number,size)\\ procedure will produce sum of all numbers provided in 'number' list`
`{`
`    ``    ``result=0.0;`
`    ``    ``for ``count = 1 to size do                    \\will repeat from 1,2,3,4,....size times`
`    ``    ``    ``    ``result= result + number[count];`
`    ``    ``return result;`
`}`

In above example, when calculating the space complexity we will be looking for both fixed and variable components. here we have

Fixed components
as 'result','count' and 'size' variable there for total space required is three(3) words.

Variable components
is characterized as the value stored in 'size' variable (suppose value store in variable 'size 'is 'n'). because this will decide the size of 'number' list and will also drive the for loop. therefore if the space used by size is one word then the total space required by 'number' variable will be 'n'(value stored in variable 'size').

therefore the space complexity can be written as Space(Sum) = 3 + n;

Time Complexity of an algorithm(basically when converted to program) is the amount of computer time it needs to run to completion. The time taken by a program is the sum of the compile time and the run/execution time . The compile time is independent of the instance(problem specific) characteristics. following factors effect the time complexity:
1. Characteristics of compiler used to compile the program.
2. Computer Machine on which the program is executed and physically clocked.
3. Multiuser execution system.
4. Number of program steps.
Therefore the again the time complexity consist of two components fixed(factor 1 only) and variable/instance(factor 2,3 & 4),  so for any algorithm 'A' it is provided as:

Time(A) = Fixed Time(A) + Instance Time(A)

Here the number of steps is the most prominent instance characteristics and The number of steps any program statement is assigned depends on the kind of statement like
• comments count as zero steps,
• an assignment statement which does not involve any calls to other algorithm is counted as one step,
• for iterative statements we consider the steps count only for the control part of the statement etc.
Therefore to calculate total number program of program steps we use following procedure. For this we build a table in which we list the total number of steps contributed by each statement. This is often arrived at by first determining the number of steps per execution of the statement and the frequency of each statement executed. This procedure is explained using an example.

Example
: Time Complexity

 ` Statement` ` Steps per execution` ` Frequency` ` Total Steps` ` Algorithm Sum(number,size)` ` 0` ` -` `0 ` ` {` ` 0` ` -` `0` ` ``        result=0.0;` ` 1` ` 1` `1` `         for count = 1 to size do` ` 1` ` size+1` `size + 1` `                 result= result + number[count];` ` 1` ` size` `size` `         return result;` ` 1` ` 1` `1` ` }` ` 0` ` -` `0` ` Total` ` ` ` ` `2size + 3`

In above example if you analyze carefully frequency of "`for count = 1 to size do`" it is 'size +1' this is because the statement will be executed one time more die to condition check for false situation of condition provided in for statement. Now once the total steps are calculated they will resemble the instance characteristics in time complexity of algorithm. Also the repeated compile time of an algorithm will also be constant every time we compile the same set of instructions so we can consider this time as constant 'C'. Therefore the time complexity can be expressed as: Time(Sum) = C + (2size +3)

So in this way both the Space complexity and Time complexity can be calculated. Combination of both complexity comprises the Performance analysis of any algorithm and can not be used independently. Both these complexities also helps in defining parameters on basis of which we optimize algorithms.

< Previous-What is an Algorithm                                                                                                                                                                                                                              Next - Asymptotic Notations >

#### What is an Algorithm?

posted Oct 20, 2013, 10:51 PM by Anupinder Singh   [ updated Mar 3, 2014, 2:29 AM by Anupinder Singh ]

 Next - Performance Analysis >DefinitionsThere are various definitions available for term Algorithm as:"A set of rules for solving a problem in a finite number of steps" - Dictionary"A set of steps that are followed in order to solve a mathematical problem or to complete a computer process" - Merriam-Webster"An Algorithm is any well defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values as output"- refer to References"A process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer." - GoogleAny of the above mentioned definition will work, and even all definitions provides a similar summary i.e. a set of well defined steps to solve a defined problem corresponding to provided inputs.Five characteristics of an Algorithm are:Input: Accept Zero or more inputs externally.Output: Must produce at least one output as result of procedure.Definiteness: Each instruction must be clear and unambiguous.Finiteness: If all the instructions are traced in algorithm, then for all cases the algorithm must terminate after a finite number of steps.Effectiveness: Every instruction must be very basic so that it can be carried out  and must be feasible to perform on any machine. One important point about algorithm, it is not a computer program. Instead it is a procedure written as pseudo code or English language type description, providing information about what calculations are to be performed and in which flow.For Example : To add two numbers we can have an algorithm like:English language Type`    1. Ask user to enter two integer numbers.``    2. Add both the integer numbers.``    3. Print the final sum value of two numbers`ORPseudo Code Type` ``   ``Algorithm`` SUM``    ``    ``Input: ``Two Integer Numbers as ONE, TWO``   ``    `` ``Output``: sum of ONE & TWO as RESULT``    ``    ``ONE <- User input required``    ``    ``TWO <- User input required``    ``    ``RESULT <- ONE + TWO``    ``    ``return RESULT`ORFlowchart RepresentationIn case you want to show it visually it can be represented as:Even for some cases all the three types of defining the algorithm is used. I depends on what kind of document you are working on.Next - Performance Analysis >

1-9 of 9