Designing Algorithms Csci 107 Lecture 4


Designing Algorithms

Csci 107

Lecture 4

Outline

Last time

• Computing 1+2+…+n

Adding 2 n-digit numbers

Today: More algorithms

Sequential search

• Variations of sequential search

Pattern matching

A Search Algorithm

Problem statement: Write a pseudocode algorithm to find the location of a target value in a list of values.

Input: a list of values and the target value

Output: the location of the target value, or else a message that the value does not appear in the list.

Variables:

The (sequential) search algorithm (Fig 2.9)

Variables: target, n, list (a list of n values)

Get the value of target, n, and the list of n values
Set index to 1
Set found to false
Repeat until found = true or index > n
    If the value of list[index] = target then
        Output the index
        Set found to true
    else
        Increment the index by 1
If found is false then
    Output a message that target was not found
Stop
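The pseudocode above can be sketched in Python as follows. This is only an illustration: the function name `sequential_search` and the choice to return `None` instead of printing a "not found" message are assumptions, not part of Fig 2.9. Positions are 1-based to match the slides, even though Python lists are 0-based.

```python
def sequential_search(values, target):
    """Return the 1-based position of the first occurrence of target, or None."""
    n = len(values)
    index = 1
    found = False
    # Mirrors "Repeat until found = true or index > n"
    while not found and index <= n:
        if values[index - 1] == target:  # Python lists are 0-based
            found = True
        else:
            index += 1
    return index if found else None


print(sequential_search([7, 3, 9, 3], 9))  # 3
print(sequential_search([7, 3, 9, 3], 5))  # None
```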

Variations of sequential search

• Modify the sequential search algorithm in order

– To find all occurrences of target in the list and print the positions where they occur

– To count the number of occurrences of target in the list

– To count how many elements in the list are larger than target
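One possible way to approach these three variations is sketched below in Python. The function names (`find_all`, `count_occurrences`, `count_larger`) are hypothetical, and the functions return their results rather than printing them. The key change from the basic search is that the loop no longer stops at the first match; it always visits every element.

```python
def find_all(values, target):
    """Return the 1-based positions of every occurrence of target."""
    positions = []
    for i, value in enumerate(values, start=1):
        if value == target:
            positions.append(i)
    return positions


def count_occurrences(values, target):
    """Count how many elements are equal to target."""
    count = 0
    for value in values:
        if value == target:
            count += 1
    return count


def count_larger(values, target):
    """Count how many elements are strictly larger than target."""
    count = 0
    for value in values:
        if value > target:
            count += 1
    return count


data = [3, 8, 3, 1, 8, 3]
print(find_all(data, 3))           # [1, 3, 6]
print(count_occurrences(data, 3))  # 3
print(count_larger(data, 3))       # 2
```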

Iterating through a list

• Assume the input list is stored in a1, a2, …, an

• In general, an algorithm that must visit every element of the list in order will look something like this:

Set i = 1
Repeat until (i > n)
    <do something with element ai>
    Set i = i+1

More algorithms

• Write algorithms to find

– the largest number in a list of numbers (and the position where it occurs)

– the smallest number in a list of numbers (and the position where it occurs)

– the range of a list of numbers

• Range = largest - smallest

– the average of a list of numbers

– the sum of a list of numbers
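As a rough sketch, each of these can be written as a single pass over the list. The function names below are hypothetical, the list is assumed to be non-empty, and when the largest or smallest value occurs more than once the position of its first occurrence is reported (the slides do not say which one to pick).

```python
def largest_with_position(values):
    """Return (largest value, 1-based position of its first occurrence)."""
    largest, position = values[0], 1
    for i in range(2, len(values) + 1):
        if values[i - 1] > largest:
            largest, position = values[i - 1], i
    return largest, position


def smallest_with_position(values):
    """Return (smallest value, 1-based position of its first occurrence)."""
    smallest, position = values[0], 1
    for i in range(2, len(values) + 1):
        if values[i - 1] < smallest:
            smallest, position = values[i - 1], i
    return smallest, position


def total(values):
    """Sum of the list, accumulated one element at a time."""
    running_sum = 0
    for value in values:
        running_sum += value
    return running_sum


def average(values):
    """Arithmetic average: sum divided by the number of elements."""
    return total(values) / len(values)


def value_range(values):
    """Range = largest - smallest."""
    return largest_with_position(values)[0] - smallest_with_position(values)[0]


numbers = [4, 9, 2, 9, 5]
print(largest_with_position(numbers))   # (9, 2)
print(smallest_with_position(numbers))  # (2, 3)
print(value_range(numbers))             # 7
print(total(numbers))                   # 29
print(average(numbers))                 # 5.8
```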

A Search Application in Bioinformatics

• Human genome: sequence of billions of nucleotides

• Gene

– Determines human behavior

– Sequence of tens of thousands of nucleotides {A, C, G, T}

– The sequence is not fully known, only a portion of it

• Problem: How do we locate a gene in the human genome?

Genome: …TCAGGCTAATCGTAGG…

Gene probe: TAATC

Idea: Find all matches of the probe within the genome and then examine the nucleotides in that neighborhood

A Search Application in Bioinformatics

• Problem:

– Suppose we have a text T = TCAGGCTAATCGTAGG and a pattern P = TA. Design an algorithm that searches T to find the position of every instance of P that appears in T.

• E.g., for this text, the algorithm should return the answer:

There is a match at position 7

There is a match at position 13

• This problem is similar to sequential search, except that for every possible starting position, every character of P must be compared with the corresponding character of T.

Pattern Matching

• Input

– Text of n characters T1, T2, …, Tn

– Pattern of m (m < n) characters P1, P2, …, Pm

• Output:

– Location (index) of every occurrence of pattern within text

• Algorithm:

– What is the idea?

Pattern Matching

• Algorithm idea:

– Check if pattern matches starting at position 1

– Then check if it matches starting at position 2

– …and so on

• How to check if pattern matches text starting at position k?

– Check that every character of pattern matches corresponding character of text

• How many loops will you need?
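The check at a single starting position k can be sketched as a small helper. The name `matches_at` is hypothetical, and positions are 1-based as on the slides. It needs one loop over the pattern; trying every starting position would wrap this in a second, outer loop.

```python
def matches_at(text, pattern, k):
    """Return True if pattern occurs in text starting at 1-based position k."""
    m = len(pattern)
    # The pattern must fit entirely inside the text.
    if k < 1 or k + m - 1 > len(text):
        return False
    for i in range(1, m + 1):
        # Compare P_i with T_(k+(i-1)); subtract 1 for Python's 0-based indexing.
        if pattern[i - 1] != text[k + (i - 1) - 1]:
            return False
    return True


print(matches_at("TCAGGCTAATCGTAGG", "TAATC", 7))  # True
print(matches_at("TCAGGCTAATCGTAGG", "TAATC", 8))  # False
```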

Pattern Matching

• Algorithm idea

– Get input (text and pattern)

– Set starting location k to 1

– Repeat until the end of the text is reached

• Attempt to match every character in the pattern beginning at position k in the text

• If there was a match, print k

• Add 1 to k

– Stop

• Question: is this an algorithm?

– Yes, at a high level of abstraction

– Now we need to write it in pseudocode

Pattern Matching Algorithm (Fig. 2.12)

Variables: n, m, T1 T2 … Tn, P1 P2 … Pm, k, mismatch

Get values for n, m, the text T1 T2 … Tn and the pattern P1 P2 … Pm
Set k = 1
Repeat until k > n-m+1
    Set i to 1
    Set mismatch = “NO”
    Repeat until either (i > m) or (mismatch = “YES”)
        If P[i] ≠ T[k+(i-1)] then
            Set mismatch = “YES”
        else
            Increment i by 1
    If mismatch = “NO” then
        Print the message “There is a match at position” k
    Increment k by 1
Stop
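A Python sketch of the same two-loop structure is shown below. The function name `find_matches` is an assumption, a boolean stands in for the “YES”/“NO” mismatch flag, and the positions are collected in a list before printing. The pattern is assumed to be non-empty.

```python
def find_matches(text, pattern):
    """Return the 1-based starting positions of every occurrence of pattern."""
    n, m = len(text), len(pattern)
    positions = []
    k = 1
    # Outer loop: every starting position where the pattern still fits.
    while k <= n - m + 1:
        i = 1
        mismatch = False
        # Inner loop: compare P_i with T_(k+(i-1)) until a mismatch or the end.
        while i <= m and not mismatch:
            if pattern[i - 1] != text[k + (i - 1) - 1]:  # 0-based indexing
                mismatch = True
            else:
                i += 1
        if not mismatch:
            positions.append(k)
        k += 1
    return positions


for k in find_matches("TCAGGCTAATCGTAGG", "TA"):
    print("There is a match at position", k)
# There is a match at position 7
# There is a match at position 13
```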

Variations on the pattern matching algorithm

How would you modify the algorithm in order to

• Find only the first match for P in T.

• Find only the last match for P in T.
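One way to approach both variations is sketched below; it is not the only possible answer. The names `first_match` and `last_match` are hypothetical, and a string-slice comparison stands in for the inner character-by-character loop to keep the sketch short. The first match stops the scan as soon as a match is found; the last match scans the starting positions from right to left and stops at the first hit.

```python
def _matches_at(text, pattern, k):
    # Slice comparison replaces the inner loop; k is a 1-based position.
    return text[k - 1:k - 1 + len(pattern)] == pattern


def first_match(text, pattern):
    """Return the 1-based position of the first match, or None."""
    for k in range(1, len(text) - len(pattern) + 2):
        if _matches_at(text, pattern, k):
            return k  # stop at the first match
    return None


def last_match(text, pattern):
    """Return the 1-based position of the last match, or None."""
    # Scan starting positions from n-m+1 down to 1.
    for k in range(len(text) - len(pattern) + 1, 0, -1):
        if _matches_at(text, pattern, k):
            return k
    return None


print(first_match("TCAGGCTAATCGTAGG", "TA"))  # 7
print(last_match("TCAGGCTAATCGTAGG", "TA"))   # 13
```

An alternative for the last match is to keep the left-to-right scan and overwrite a single variable each time a match is found, printing it only after the loop ends.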
