A nursery trial tested 5 different seed treatment methods for a flowering tree, with 6 seeds treated by each method and raised as seedlings. The seedling heights were measured one year later and are given in the table below. Apart from the treatment method, all seedling-raising conditions were the same, and seedling height is assumed to be approximately normally distributed with equal variance across groups. Judge, at the 95% confidence level (significance level 0.05), whether the seed treatment method has a significant effect on seedling growth.
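For reference, the problem can be written as a one-way analysis of variance. This is the standard textbook formulation, not something specific to this data. The symbols \mu_i (mean height under method i), x_{ij} (height of seedling j under method i), k = 5, n_i = 6 and N = 30 are notation introduced here for illustration.

```latex
\begin{gather}
H_0:\ \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5
\qquad
H_1:\ \text{not all } \mu_i \text{ are equal} \\
S_b^2 = \frac{1}{k-1}\sum_{i=1}^{k} n_i\,(\bar{x}_i - \bar{x})^2,
\qquad
S_w^2 = \frac{1}{N-k}\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \\
F = \frac{S_b^2}{S_w^2} \sim F(k-1,\ N-k) = F(4,\ 25) \quad \text{under } H_0
\end{gather}
```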
The copied data comes in an awkward format, so I wrote a short Python script to convert it:
import csv

# Each line of the text file holds one treatment group's heights, separated by commas.
with open('C://Users/Administrator/Desktop/Analysis of variance.txt', 'r') as f, \
        open('C://Users/Administrator/Desktop/Analysis of variance.csv', 'wt', newline='', encoding='utf-8') as csvfile:
    writer = csv.writer(csvfile)
    # i is the group number: 1 for the first line, 2 for the second, and so on.
    for i, fs in enumerate(f, start=1):
        contents = fs.strip().split(',')
        for content in contents:
            # One row per seedling: (height, group number).
            writer.writerow((content, i))
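To see what this conversion produces without touching any files, here is a minimal sketch that applies the same logic to an in-memory string. The sample heights and the helper name `to_long_rows` are hypothetical and are not the article's data.

```python
import csv
import io

# Hypothetical sample input: one line per treatment method, heights separated
# by commas. These numbers are made up and are NOT the article's data.
sample_text = "10.1,11.2,9.8\n12.0,12.5,11.7\n"


def to_long_rows(lines):
    """Turn each comma-separated line into (value, group_number) pairs."""
    rows = []
    for group, line in enumerate(lines, start=1):
        for value in line.strip().split(','):
            rows.append((value, group))
    return rows


rows = to_long_rows(io.StringIO(sample_text))

# Write the rows with csv.writer, as the script does, but into memory.
buffer = io.StringIO(newline='')
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

Each output row holds one height followed by the number of the line (group) it came from.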
After conversion, each row holds one seedling height and its group number (1 to 5). This long format is convenient for running an analysis of variance in Python:
import pandas as pd
from scipy import stats

# Columns: 'value' is the seedling height, 'group' is the treatment method (1 to 5).
df = pd.read_excel('C:/Users/Administrator/Desktop/Analysis of variance.xls', header=None, names=['value', 'group'])

# One Series of heights per treatment group.
args = [df[df['group'] == g]['value'] for g in range(1, 6)]

# One-way ANOVA: returns the F statistic and its p-value.
f, p = stats.f_oneway(*args)
print(f, p)
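To make clear what `stats.f_oneway` computes, the F statistic can also be worked out by hand as the ratio of the between-group mean square to the within-group mean square. The sketch below uses only the standard library. The function name `one_way_anova_f` and the sample heights are hypothetical, chosen so the arithmetic is easy to follow, and are not the article's data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Between-group sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    s_b2 = ss_between / df_between  # between-group mean square
    s_w2 = ss_within / df_within    # within-group mean square
    return s_b2 / s_w2, df_between, df_within


# Hypothetical data: 5 methods x 6 seedlings, NOT the article's measurements.
groups = [[m - 1, m - 1, m, m, m + 1, m + 1] for m in (10, 11, 12, 13, 14)]
f_value, df_b, df_w = one_way_anova_f(groups)
print(f"F = {f_value:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

With 5 groups of 6 values the degrees of freedom come out as (4, 25), matching the F table entry used for this problem.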
The result is shown in the figure:
From the F table, $F_{0.05}(4, 25) = 2.76$. Since $F = S_b^2 / S_w^2 = 4.38 > F_{0.05}(4, 25) = 2.76$, the null hypothesis $H_0$ is rejected: the differences in seedling height growth caused by the different treatment methods are significant.
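The table lookup can also be done in code. The sketch below assumes SciPy is installed and uses its F distribution (`scipy.stats.f`) to get the critical value and the p-value for the reported statistic. The variable names are illustrative, and 4.38 is the F value quoted above rather than one recomputed from the data.

```python
from scipy import stats

alpha = 0.05
df_between, df_within = 4, 25  # 5 methods - 1, and 30 seeds - 5 methods
f_observed = 4.38              # F statistic reported in the text

# Upper 5% point of the F(4, 25) distribution (the table value).
f_critical = stats.f.ppf(1 - alpha, df_between, df_within)
# Probability of an F at least this large if H0 were true.
p_value = stats.f.sf(f_observed, df_between, df_within)

reject_h0 = f_observed > f_critical
print(f"critical value = {f_critical:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

Comparing F with the critical value and comparing the p-value with 0.05 are equivalent decision rules, so either check leads to the same conclusion.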