teaching machines

CS 145 Lecture 26 – Arrays Cont’d

November 9, 2016. Filed under cs145, fall 2016, lectures.

Dear students,

Today we continue looking at the data that we collected last time:

  1. The number of children your grandparents had (i.e., the number of parents you have plus their brothers and sisters). For example, I have two parents, three uncles, and two aunts, so I’d report 7.
  2. The number of children your parents had, including any half- or step-siblings, and including you. For example, I have just one brother, so I’d report 2.

We collected this data in a plain text file, and we will process it to compute some statistics.

To calculate the relationship between the two numbers, we will essentially ask this question: “Is the number of children in your family a function of the number of children in your parents’ families?” We will compute a trend line—a linear regression—that gives such a function, and we will see if it’s a good fit or not.

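Linear regression is computed with the following formulae, where meanX and meanY are the means of the x and y values, meanXY is the mean of the products x * y, and meanXX is the mean of the squares x * x: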

    meanXY - meanX * meanY
m = ----------------------
    meanXX - meanX * meanX

b = meanY - m * meanX

y = mx + b
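
To make the formulae concrete, here is a minimal sketch that applies them to a tiny made-up data set. The class name `TrendLine`, the method `fit`, and the sample numbers are all hypothetical and chosen only for illustration; they are not the data we collected.

```java
// Hypothetical helper class for illustration; not part of the lecture code.
class TrendLine {
  // Returns {m, b} for the line y = m * x + b, using the mean-based formulae.
  static double[] fit(int[] xs, int[] ys) {
    double sumX = 0, sumY = 0, sumXX = 0, sumXY = 0;
    for (int i = 0; i < xs.length; ++i) {
      sumX += xs[i];
      sumY += ys[i];
      sumXX += xs[i] * xs[i];
      sumXY += xs[i] * ys[i];
    }

    double n = xs.length;
    double meanX = sumX / n;
    double meanY = sumY / n;
    double meanXX = sumXX / n;
    double meanXY = sumXY / n;

    double m = (meanXY - meanX * meanY) / (meanXX - meanX * meanX);
    double b = meanY - m * meanX;
    return new double[] {m, b};
  }

  public static void main(String[] args) {
    // Made-up points that lie exactly on y = 2x + 1.
    int[] xs = {1, 2, 3};
    int[] ys = {3, 5, 7};
    double[] line = fit(xs, ys);
    System.out.printf("y = %f * x + %f%n", line[0], line[1]);
  }
}
```

Because the three sample points sit exactly on a line, the slope should come out as 2 and the intercept as 1, up to floating-point rounding. Real survey data will be much noisier.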

We’ll plot the results and see how good a model this function is.
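
A plot is how we will judge the fit in class. If you also want a single number, one common measure, which the lecture code does not compute, is Pearson's correlation coefficient r. The sketch below is an assumption-laden illustration: the class name `FitQuality`, the method `correlation`, and the sample data are hypothetical. It reuses the same kinds of means as the trend line and adds meanYY, the mean of y * y.

```java
// Hypothetical helper class for illustration; not part of the lecture code.
class FitQuality {
  // Pearson's correlation coefficient r. Values near 1 or -1 suggest the
  // points hug a line; values near 0 suggest little linear relationship.
  static double correlation(int[] xs, int[] ys) {
    double sumX = 0, sumY = 0, sumXX = 0, sumYY = 0, sumXY = 0;
    for (int i = 0; i < xs.length; ++i) {
      sumX += xs[i];
      sumY += ys[i];
      sumXX += xs[i] * xs[i];
      sumYY += ys[i] * ys[i];
      sumXY += xs[i] * ys[i];
    }

    double n = xs.length;
    double meanX = sumX / n;
    double meanY = sumY / n;
    double meanXX = sumXX / n;
    double meanYY = sumYY / n;
    double meanXY = sumXY / n;

    double covariance = meanXY - meanX * meanY;
    double varianceX = meanXX - meanX * meanX;
    double varianceY = meanYY - meanY * meanY;

    // Undefined (division by zero) if all xs or all ys are identical.
    return covariance / Math.sqrt(varianceX * varianceY);
  }

  public static void main(String[] args) {
    // Made-up points that only loosely follow a line.
    int[] xs = {1, 2, 3, 4};
    int[] ys = {2, 1, 4, 3};
    System.out.printf("r = %f%n", correlation(xs, ys));
  }
}
```

For these made-up points, r works out to 0.6, a moderate positive relationship. Points that lie exactly on a rising line give r = 1.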

Next we’ll examine some other data that belongs to us: our birthdays! We will do a quick check of the canonical birthday problem:

Put n people in a room. What’s the likelihood that all have different birthdays? At what n does it flip from unlikely to likely that there’s a shared birthday?
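
Here is a rough sketch of how the theoretical answer could be computed. It assumes 365 equally likely birthdays and ignores leap days, which is a simplification. The class name `BirthdayOdds` and its methods are hypothetical and not part of the code we wrote in class.

```java
// Hypothetical helper class for illustration; not part of the lecture code.
class BirthdayOdds {
  // Probability that n people all have different birthdays, assuming 365
  // equally likely days. Person i must avoid the i days already taken.
  static double allDifferent(int n) {
    if (n > 365) {
      return 0.0; // more people than days, so a repeat is certain
    }
    double p = 1.0;
    for (int i = 0; i < n; ++i) {
      p *= (365 - i) / 365.0;
    }
    return p;
  }

  // Smallest n at which a shared birthday is more likely than not.
  static int tippingPoint() {
    int n = 1;
    while (1.0 - allDifferent(n) <= 0.5) {
      ++n;
    }
    return n;
  }

  public static void main(String[] args) {
    int n = tippingPoint();
    System.out.println("A shared birthday becomes likely at n = " + n);
    System.out.printf("Chance of a shared birthday: %f%n", 1.0 - allDifferent(n));
  }
}
```

Under these assumptions the flip happens at n = 23, where the chance of a shared birthday is a little over one half.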

Do we have any shared birthdays in this class? We’ll find out. Arrays will help.

Here’s your TODO list to complete before we meet again:

See you next class!

Sincerely,

P.S. Here’s the code we wrote together…

Generations.java

package lecture1109;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Scanner;

public class Generations {
  public static void main(String[] args) throws FileNotFoundException {
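    File inFile = new File("/Users/johnch/numbers.csv");
    // Scanner splits on whitespace by default, so this assumes the file holds
    // whitespace-separated integers: the sample count first, then one pair per sample.
    Scanner in = new Scanner(inFile);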

    int nsamples = in.nextInt();
    int[] oldChildren = new int[nsamples];
    int[] youngChildren = new int[nsamples];

    for (int i = 0; i < nsamples; ++i) {
      oldChildren[i] = in.nextInt();
      youngChildren[i] = in.nextInt();
    }

    in.close();

//    for (int i = 0; i < oldChildren.length; ++i) {
//      System.out.println(oldChildren[i]);
//    }
//    System.out.println(Arrays.toString(oldChildren));
    
    // Running sums needed by the regression formulae.
    int sumXX = 0;
    int sumXY = 0;
    int sumX = 0;
    int sumY = 0;

    for (int i = 0; i < youngChildren.length; ++i) {
      sumY += youngChildren[i];
      sumX += oldChildren[i];
      sumXX += oldChildren[i] * oldChildren[i];
      sumXY += oldChildren[i] * youngChildren[i];
      // Echo each pair as x,y.
      System.out.println(oldChildren[i] + "," + youngChildren[i]);
    }

    // Cast to double so the divisions are not truncated to integers.
    double meanX = sumX / (double) nsamples;
    double meanY = sumY / (double) nsamples;
    double meanXX = sumXX / (double) nsamples;
    double meanXY = sumXY / (double) nsamples;

    // Slope and intercept of the trend line. If every x is identical, the
    // denominator is 0 and the slope is undefined.
    double m = (meanXY - meanX * meanY) / (meanXX - meanX * meanX);
    double b = meanY - m * meanX;

    System.out.printf("y = %f * x + %f%n", m, b);
  }
}

Birthdays.java

package lecture1109;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Scanner;

public class Birthdays {
  public static void main(String[] args) throws FileNotFoundException {
    Scanner in = new Scanner(new File("/Users/johnch/birthdays.csv"));

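    // Pretend every month has 31 days so that each (month, day) pair maps to
    // its own slot. Slots for nonexistent dates like February 31 simply stay 0.
    int[] counts = new int[31 * 12];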

    while (in.hasNextInt()) {
      int month = in.nextInt();
      int day = in.nextInt();

      // Convert the 1-based month and day into a 0-based index into counts.
      int daysBeforeThisMonth = (month - 1) * 31;
      int i = daysBeforeThisMonth + day - 1;

      // increment that day's counter
      counts[i]++;
    }

    in.close();

    System.out.println(Arrays.toString(counts));

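    // Report every date shared by more than one person, undoing the index
    // mapping to recover the 1-based month and day.
    for (int i = 0; i < counts.length; ++i) {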
      if (counts[i] > 1) {
        int month = i / 31 + 1;
        int day = i % 31 + 1;
        System.out.println(month + " " + day);
      }
    }
  }
}