15-122 Principles of Imperative Computation
Recitation 5 - Wed Jan 26

Reviewing is_sorted
Reviewing is_in
Analyzing linear search

Reviewing is_sorted

The function is_sorted is what we call a specification function. This function isn't meant to be run on its own to solve a problem. Instead it's typically used in an annotation for another function to check for correctness when we need it. For example, when we do linear search on a sorted array, how do we know that the array is sorted when the function starts? We can call this specification function to check it for us, but only if we're doing dynamic runtime testing using the -d option.

The following function returns true if the first n elements of the array A are sorted in non-decreasing order (i.e. for any element, the next element is either the same or increasing). It returns false otherwise. When we talked about an array being sorted, we will usually mean this unless we specify otherwise.

bool is_sorted(int[] A, int n)
//@requires 0 <= n && n <= \length(A);
{
    for (int i = 0; i < n-1; i++)
    //@loop_invariant n == 0 || (0 <= i && i <= n-1);
    {
        if (!(A[i] <= A[i+1])) return false;
    }
    return true;
}

Some points to consider:

Remember that the loop invariant must include the value of the loop variable i when the loop condition is tested the last time (i.e. when the loop is about to terminate.

In this case, since n could be 0, we need to explicitly test for this otherwise the rest of the invariant would be false and the code would cause an annotation failure. In this case, the order that we test the two separate conditions for n and i does not matter, even with short-circuit evaluation. But we tend to write the special case first (n == 0) and then the general case next.

The return statement in C0 returns immediately from the function, even if additional statements follow it. So if we find a pair of adjacent array values that are not in the correct order, we don't need to check the rest of the array since the array can't possibly be sorted, so we immediately return false. If we get through the whole loop without returning false, then when we exit, we return true.

Since this is being treated as a specification function, we generally do not write postconditions for this function.

Reviewing is_in

The function is_in is also a specification function, to be used to test assertions in other functions. Effectively, this function is the simplest form of a linear search which assumes nothing about the order of the data.

The following function returns true if the integer x is in the integer array A in one of the first n positions; false otherwise.

bool is_in(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
{
    for (int i = 0; i < n; i++)
    //@loop_invariant 0 <= i && i <= n;
    {
        if (A[i] == x) return true;
    }
    return false;
}

Some points to consider:

In this function, the loop invariant can have i going up to n since we may have to examine every element of the array up to position n-1.

Note that we don't have to check for n == 0 in this loop invariant. (Why?)

Whenever you write an if statement, you don't need to have an else statement with it. You only need an else statement if you need to do something if the condition is not true. So PLEASE do NOT write things like this:

  if (A[i] == x) return true;
  else x = x;                      // OUCH, THIS HURTS MY HEAD!

  if (!(A[i] == x)) i = i + 0;     // AGONY! AGONY!
  else return true; 

In these two specification functions, the variable i is defined locally for the loop only since we declare it in the for construct. We cannot access i once the loop terminates. If you need to access i for some reason after the loop terminates, declare the variable i as an int before you get to the loop, and then just write for (i = 0; ...etc to start your loop.

Linear Search: Reasoning with Invariants (Again)

Let's take another loop at linear search, assuming that the array we're given is already sorted:

int linsearch(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A,n);
{   
    for (int i = 0; i < n && A[i] <= x; i++)
    //@loop_invariant 0 <= i && i <= n;
    //@loop_invariant i == 0 || A[i-1] < x;
    {
        if (A[i] == x) return i;
    }
    return -1;
}

This function returns the index of the first occurrence of x in the first n elements of the array A, or -1 if not found.

Let's look at the second loop invariant more carefully. First, it uses a programming technique called short-circuit evaluation (or sometimes called lazy evaluation) where the second condition is evaluated at runtime only if necessary. In this example, if i is equal to 0, then the invariant is true so there is no need to test the second condition. In fact, this is desired since A[i-1] will yield an array index out of bounds exception when i equals 0. In general, if we have the condition A || B, if A is true, B is not evaluated. If we have the condition A && B, if A is false, B is not evaluated. So be careful, in programming languages that provide short-circuit evaluation, A || B does not always yield the same result as B || A. The order does matter sometimes, and we will take advantage of this sometimes.

Now let's look at the invariant:

i == 0 || A[i-1] < x

If this condition is true at the start of a loop iteration, is it true at the end of an iteration?

When we test the loop condition the first time, the invariant holds trivially. After the each iteration is complete, i' = i + 1, so we wish to show that

A[i'-1] < x ==> A[i+1-1] < x ==> A[i] < x
(Note: i==0 is only relevant for the start of the first iteration, otherwise i==0 is false and we need to test the 2nd condition.)

The loop iteration starts with A[i] <= x and to get to the end of the iteration (without returning from the function), A[i] != x. Thus, A[i] < x.

Now, let's determine appropriate postconditions for this function based on what we know about the preconditions and the loop invariants.

First, we know that the result that gets returned is either -1 or an integer between 0 and n-1 inclusive. So we can write:

//@ensures -1 <= \result && \result < n;

Next, we know that if the returned result is -1, then the value stored in x is not in the array in the first n positions. If the returned result is in the range 0 to n-1 inclusive, then we know that this is the first occurrence of x, so x cannot be in positions 0 through \result - 1. Thus, for our next postcondition, we have to satisfy one of the two conditions above, so we can write:

  /*@ensures (\result == -1 && !is_in(x,A,n)) 
             || (A[\result] == x && !is_in(x,A,\result));  
    @*/

Note the special syntax for an annotation that takes up more than one line. (Why did we choose to use one more line here and not just write the entire precondition in one line?)

Exercise: Rewrite the postcondition above so that it doesn't call is_in if A[\result] == x. Does this improve the code? Why or why not?


written by Tom Cortina, 1/27/11