AP Biology
How to DO Science

AP Biology Lab Expectations
• AP students are expected to have already acquired basic lab and math skills
• The new curriculum requires that students receive little guidance when conducting lab work
• Students are expected to perform and report experiments just like a research scientist
• We will follow the basics of the scientific method

Basic Scientific Method
• Background information
• Hypothesis
• Experimental procedure/design
• Results
• Conclusion

Universally accepted characteristics of an investigation/experiment
• You must know something about the topic before developing the investigation
  – Classroom and research knowledge helps you come up with questions
• Hypothesis: clear, in correct format
• Null Hypothesis (Ho): the hypothesis of no difference or no effect
  – Your experiment will be trying to disprove your Ho
• Alternative Hypothesis (Ha): the hypothesis that the experiment will test
  – You are trying to support your Ha

Universally accepted characteristics of an investigation/experiment
• Independent variable: the “cause”, what you are testing the effect of, what you are doing differently to your groups
• Dependent variable: the effect, what you are measuring, the response to what you are doing to each group
• Controlled variables: other things that could affect your results if they aren’t controlled/the same for all groups
• Experimental group(s)
• Control group

Preparation for our first inquiry lab
• Learn how to evaluate research articles (the sections will be the same in your lab write-ups)
• Become familiar with online sources of primary and secondary articles to use in your own work
• Practice creating and interpreting graphs
  – Use Excel to properly graph data
• Learn how to use proper statistical calculations
  – Use Excel to perform statistical analysis of data
• Practice the lab write-up and mini poster
• Practice the lab skills we will use all year
• Perform our first official AP Biology inquiry lab

You must become proficient at reading and understanding scientific papers
• You will be writing your own!
• Primary vs. secondary sources in science
• Primary Research Articles: the document a scientist uses to communicate results to other scientists
  – Submitted to a peer-reviewed journal
  – Original report, includes detailed methods and results
• Secondary Sources: usually a summary of a research project, written by a scientist or journalist
• We will concentrate on primary sources
• Can you tell the difference?

Sources of Primary Research Articles
• Some will be provided in class
• The journal Science: the father of all journals
• Mostly you will be looking for sources online
• Need to refine your searching skills
• www.biomedcentral.com
• www.ncbi.nlm.nih.gov/pubmed
• www.highwire.org
• Can go to journal websites directly and search (usually just abstracts)

Reading primary sources can be tough
• Very technical
• The experiment is testing a hypothesis, and the authors are trying to show that the results support it
• You will be trying to analyze whether the research is valid
• You will be looking up many words that you don’t know in order to understand what you are reading (real scientists do this….are you above them?)
Anatomy of a primary source
Reference sample provided
• Citation
• Abstract: summary of purpose, design, results, conclusions
• Introduction: aka literature review, will include the hypothesis
• Materials and Methods: independent, dependent and controlled variables, control groups, experimental groups, repeatability
• Results: present and analyze data, charts, graphs, statistical calculations and tests, statistical significance of results. No describing or giving reasons for the data. JUST THE FACTS/DATA
• Discussion: explain what the results and/or findings mean, support or refute the hypothesis, sources of error or potential criticism, connections to other projects, implications, next steps

The Basics of Graphing
• Most covered in provided packet
• Independent variable on X-axis
• Dependent variable on Y-axis
• MUST LABEL AXES, INCLUDING UNITS
• Graph title
• Legend
• Short explanation so it’s super duper clear what the graph represents
• Bar graphs: These are best used to show numeric data that represent discrete items or experiments.
• Line graphs: These are best used to represent data that are samples from continuous phenomena.
• Scatter plot: These graphs show the relationship between two measured variables as a scatter of individual points, each representing an item with its position determined along the X and Y axes by its values for the two variables. The points of a scatter plot are never connected, but a regression line (a “best fit” line) is often plotted, showing how one measurement varies in relation to the other.

Little reason to graph by hand….
• Microsoft Excel is widely accepted when it comes to data manipulation and graphing
• The only trick is to enter your data in a way that the automatic graphing function recognizes
• You should have some sense of what the graph should look like so you can recognize problems……
• Computers aren’t always right
• Practice making graphs using Excel
• Hubbard Brook data manipulation activity

Linear Regression
• Analyzes the association between two variables: how changes in one variable go along with changes in the other variable
• Fits a straight line to the data and gives values of slope and intercept
• y = mx + c (m = slope, c = intercept) gives you a formula from which you can predict values
• Use the trendline feature of Excel

Statistics
• A branch of mathematics that provides techniques to analyze whether or not your data is significant (meaningful)
• Statistical applications are based on probability statements
• “I am 95% confident that I have proven my hypothesis”
• Nothing is 100% “proved” with statistics
• Statistics are reported
• Statistics report the probability that similar results would occur if you repeated the experiment

First Hurdle
• Decide which statistical test to use
• Need to decide this when designing your experiment
• That way you collect the correct amount of data and can perform a valid statistical analysis

Statistics deals with numbers
• Need to know the nature of the numbers collected
  – Continuous variables: the type of numbers associated with measuring or weighing; any value in a continuous interval of measurement.
    • Examples: weight of students, height of plants, time to flowering
    WHAT KIND OF GRAPH WOULD YOU USE FOR THESE?
  – Discrete variables: the type of numbers that are counted or categorical
    • Examples: numbers of boys, girls, insects, plants
    WHAT KIND OF GRAPH WOULD YOU USE FOR THESE?

Can you figure out…
• Which type of numbers (discrete or continuous)?
  – Numbers of persons preferring Brand X in 5 different towns
  – The weights of high school seniors
  – The lengths of oak leaves
  – The number of seeds germinating
  – 35 tall and 12 dwarf pea plants
  – Answers: all are discrete except the 2nd and 3rd examples, which are continuous.

Populations and Samples
• Population: includes all members of a group
  – Example: all 9th grade students in America
  – Example: number of 9th grade students at EHS
• Sample
  – Used to make inferences about large populations
  – Samples are a selection of the population
  – Example: 6th period Biology
• Why the need for statistics?
  – Statistics are used to describe sample populations as estimators of the corresponding population
  – Many times, finding complete information about a population is costly and time consuming. We can use samples to represent a population.
  – NEED TO MAKE SURE YOUR SAMPLE IS NOT BIASED: RANDOMLY SELECTED, etc.

Normal Distribution
• Most of the time we will assume that the measurements we take of a sample population fall into a normal distribution
• Typical of natural phenomena
• Frequency of data points in a population

Distribution Chart of Heights of 100 Control Plants
Class (height of plants, cm)    Number of plants in each class
0.0-0.9                         3
1.0-1.9                         10
2.0-2.9                         21
3.0-3.9                         30
4.0-4.9                         20
5.0-5.9                         14
6.0-6.9                         2

[Histogram: Number of Plants in each Class. X-axis: height class (cm), 0.0-0.9 through 6.0-6.9. Y-axis: number of plants in each class, 0 to 35.]

• This is called a “normal” curve or a bell curve
• This is an “idealized” curve: it is theoretical, based on an infinite number of measurements, and is derived from a sample
• Fifty percent of the distribution lies to the left of the mean and fifty percent lies to the right of the mean.
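The frequency table above can be turned into a quick check of the bell shape. The sketch below is written in Python, which is an assumption on our part since the course itself uses Excel. It prints a text histogram and estimates the mean height from class midpoints. The estimate is only approximate because every plant is treated as sitting at the middle of its class. The names `classes` and `class_midpoint` are made up for this illustration.

```python
# Frequency table from the "Heights of 100 Control Plants" chart.
# Each entry is (height class in cm, number of plants in that class).
classes = [
    ("0.0-0.9", 3),
    ("1.0-1.9", 10),
    ("2.0-2.9", 21),
    ("3.0-3.9", 30),
    ("4.0-4.9", 20),
    ("5.0-5.9", 14),
    ("6.0-6.9", 2),
]


def class_midpoint(label):
    """Return the middle of a class such as '3.0-3.9' (assumed format)."""
    low, high = (float(part) for part in label.split("-"))
    return (low + high) / 2


total_plants = sum(count for _, count in classes)

# Grouped-data estimate: weight each class midpoint by its plant count.
estimated_mean = sum(class_midpoint(label) * count
                     for label, count in classes) / total_plants

# The class with the most plants (the peak of the bell).
modal_class = max(classes, key=lambda item: item[1])[0]

# Text histogram: one '#' per plant.
for label, count in classes:
    print(f"{label} cm | {'#' * count} ({count})")

print(f"Total plants: {total_plants}")
print(f"Estimated mean height: {estimated_mean:.2f} cm")
print(f"Tallest bar: {modal_class} cm")
```

With this table the estimated mean is about 3.49 cm, which falls in the tallest bar (3.0-3.9 cm), as expected for a roughly bell-shaped distribution.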
Descriptive Statistics
• Mean
• Median
• Mode
• Standard Deviation
• Standard Error

Mode and Median
• Mode: the most frequently seen value (if no numbers repeat then there is no mode)
• Median: the middle number when the data are put in order
  – If you have an odd number of data then the median is the value in the middle of the set
  – If you have an even number of data then the median is the average of the two middle values in the set.

Mean (aka average)
Is the mean always an accurate representation of your data?
• If you collect this data: 5,5,4,5,5,6,5,5,5,5 mean = 5
• If you collect this data: 1,1,1,1,1,9,9,9,9,9 mean = 5
Which mean is really reflective of your measurements? More reliable?
Usually need to add a calculation that determines the spread of numbers around the mean…….

Measures of Variance…
• Range = largest value - smallest value
  – Not really all that valuable…..
• Variance = sum of (each value - mean)^2 / (n - 1)
  – Mathematically expresses the degree of variation of scores from the mean
  – Large variance = individual scores deviate a lot from the mean
  – Small variance = scores deviate very little from the mean
• Standard deviation = square root of the variance

Remember our original means….
• 5,5,4,5,5,6,5,5,5,5 mean = 5
  – Variance = 0.222
  – Std Dev = 0.47 (this means a typical measurement varies about +/- 0.47 from the mean)
• 1,1,1,1,1,9,9,9,9,9 mean = 5
  – Variance = 17.8
  – Std Dev = 4.22 (this means a typical measurement varies about +/- 4.22 from the mean)
Which mean is more reliable?
• The spread of a normal distribution is controlled by the standard deviation, σ. The smaller the standard deviation, the more concentrated the data.

So how does our sample mean compare to the actual population mean?
• Use standard error
• This relates the standard deviation to the size of your sample: standard error = standard deviation / square root of n
• Larger samples = lower standard error (more like the overall population)
• Smaller samples = higher standard error (not sure if really representative of the actual population)

Practice….
• Mini experiments
• Using Excel to make calculations/statistics

Comparative Statistics
• Compare two sets of data to see if they are basically the same or if one set is significantly different from the other
• Gives us a probability (P) of seeing a difference at least this large if the null hypothesis is true (that both groups are the same)
• Differences are significant if they have a P of less than 5% (we reject the null hypothesis…aka accept the Ha)
• If P is larger than 5% then there is no significant difference between the two groups (we fail to reject the null hypothesis)

Chi Square
• Used with discrete values
• Phenotypes, choice chambers, etc.
• Not used with continuous variables
• Chi square = sum of (O - E)^2 / E, where O = observed values and E = expected values
• Gives you a value that you can then compare to a critical value on a chart

For Example…
• If we toss a coin 200 times, we would expect to see 100 heads and 100 tails
• These are our expected values
• When we actually toss the coin, we see 82 heads and 118 tails
• These are our observed values
• Ho = There is no significant difference between expected and observed; the difference is due to random chance
• Ha = There is a significant difference between expected and observed
• Are the differences between observed and expected a significant difference or a difference due to chance?
• Plug into the Chi square formula
• Chi square = (82 - 100)^2/100 + (118 - 100)^2/100 = 3.24 + 3.24 = 6.48

Interpreting a chi square
• Next calculate degrees of freedom
• Degrees of freedom = # of events, trials, phenotypes - 1
• Example: 2 values - 1 = 1
• Refer to the Chi square chart
• Generally use the column labeled 0.05 (if the null hypothesis is true, a chi square value this large would come up by random chance only 5% of the time).
• Any calculated value that is larger than the critical value means you reject your null hypothesis: there is a difference between observed and expected values.
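The coin-toss example above can be checked with a short Python sketch (Python is our assumption here; the slides themselves use a printed chi-square chart). The function name `chi_square` and the small lookup table `CRITICAL_0_05` are hypothetical helpers for this illustration. The table holds the standard 0.05-column critical values for 1 to 5 degrees of freedom.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical values from the 0.05 column of a standard chi-square chart,
# keyed by degrees of freedom (only the first five rows are included).
CRITICAL_0_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

# Coin tossed 200 times: heads, tails.
observed = [82, 118]
expected = [100, 100]

chi2 = chi_square(observed, expected)
df = len(observed) - 1                 # number of categories - 1
critical = CRITICAL_0_05[df]
reject_null = chi2 > critical          # larger than critical -> reject Ho

print(f"chi square = {chi2:.2f}, degrees of freedom = {df}")
print(f"critical value (0.05) = {critical}")
print("Reject Ho" if reject_null else "Fail to reject Ho")
```

For the coin toss this gives a chi square of 6.48 with 1 degree of freedom, which is larger than the 0.05 critical value of 3.841. The same function works for any number of categories, as long as `observed` and `expected` are listed in the same order.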
How to use a chi square chart

Results
• According to the Chi chart, we can reject our null hypothesis (accept Ha): our value of 6.48 is larger than the critical value of 3.84 for 1 degree of freedom in the 0.05 column
• The differences we see are unlikely to be due to random chance. They are significant differences, and something else is going on here to affect the outcome.

So try it…..
• Toss a die 120 times
• 1 = 22
• 2 = 18
• 3 = 20
• 4 = 23
• 5 = 18
• 6 = 19
• Answer: chi square = 1.1 (expected value is 20 for each face)

Student’s t-test
• Most common; used when there are two sets of normally distributed data to compare
• Compares the sample means to each other
• Use the equation to calculate the t value
• Once you have the t value, you need the degrees of freedom
• Degrees of freedom = total number of data points in both sets - 2
• Then refer to the t value table to find the critical value
• Ho = There is no significant difference between the means
• Ha = There is a significant difference between the means
Values in the table are called critical values. Your t value must meet or exceed the critical value in order to reject the null hypothesis.
So if you calculated a t value of 2.34 and had 11 degrees of freedom…..

Best to use Excel for this test….
• TTEST(range1, range2, tails, type)
• Use 2 tails and type 2
• Range 1 = cells that contain data set 1
• Range 2 = cells that contain data set 2
• Returns the P value = the probability of seeing a difference this large if the two means are really the same (not significantly different)

Pea Plant Normal Distribution Curve with Std Dev

The Normal Curve and Standard Deviation
A normal curve: each vertical line is a unit of standard deviation
• 68% of values fall within +1 or -1 std dev units of the mean
• 95% of values fall within +2 and -2 std dev units
• Nearly all members (>99%) fall within 3 std dev units
http://classes.kumc.edu/sah/resources/sensory_processing/images/bell_curve.gif
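To tie the descriptive statistics and t-test slides together, here is a minimal Python sketch (Python is an assumption; the course itself uses Excel). It recomputes the mean, variance, standard deviation and standard error for the two data sets from the "Remember our original means" slide, then calculates a t value for them. The helpers `describe` and `pooled_t_value` are hypothetical names for this illustration. The t value uses the pooled, equal-variance formula, which we assume matches what Excel's type 2 option does.

```python
import math
import statistics


def describe(data):
    """Mean, sample variance (n - 1), standard deviation, standard error."""
    n = len(data)
    std_dev = statistics.stdev(data)            # square root of the variance
    return {
        "mean": statistics.mean(data),
        "variance": statistics.variance(data),  # divides by n - 1
        "std_dev": std_dev,
        "std_error": std_dev / math.sqrt(n),    # std dev relative to sample size
    }


def pooled_t_value(a, b):
    """Two-sample t value assuming equal variances; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb))
    return t, df


tight = [5, 5, 4, 5, 5, 6, 5, 5, 5, 5]
spread = [1, 1, 1, 1, 1, 9, 9, 9, 9, 9]

tight_stats = describe(tight)
spread_stats = describe(spread)
t_value, df = pooled_t_value(tight, spread)

for name, stats in (("tight", tight_stats), ("spread", spread_stats)):
    print(name, {key: round(value, 3) for key, value in stats.items()})
print(f"t = {t_value:.2f}, degrees of freedom = {df}")
```

Both sets have a mean of 5, so the t value is 0 and the t-test finds no difference between the means, even though the standard deviations (about 0.47 and 4.22) show that the second set is far more spread out. The t-test compares means only, which is why the spread should be reported alongside it.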