Everyone knows that databases have a GROUP BY operation. Today I will talk about the groupby method of a pandas DataFrame.
I will use the data above as an example: first read in the data, then aggregate it with groupby. (The data was collected over a period of time from articles in Jianshu's IT/Internet section.)
import pandas as pd
import pymysql

conn = pymysql.connect(host='localhost', user='root', passwd='123456', db='test', port=3306, charset='utf8')
jianshu = pd.read_sql('select * from jianshu1', conn)
group_user = jianshu.groupby('user')
group_user.groups
The output is a dict that maps each user id to the index labels of that user's rows, along with the dtype of the index. The number of users can be calculated with the following code.
len(group_user.groups)
# result: 543
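To make the shape of groups easier to see, here is a minimal sketch on a tiny hand-made DataFrame. The user names and the title column are invented for illustration and are not taken from the Jianshu data.

```python
import pandas as pd

# Toy data; user names and titles are made up for illustration.
toy = pd.DataFrame({
    'user': ['ann', 'bob', 'ann', 'cat', 'bob', 'ann'],
    'title': ['t1', 't2', 't3', 't4', 't5', 't6'],
})

toy_groups = toy.groupby('user')

# .groups maps each group key to the index labels of its rows.
groups_as_lists = {user: list(idx) for user, idx in toy_groups.groups.items()}
print(groups_as_lists)

# Number of distinct users: length of the mapping, or the ngroups attribute.
n_users = len(toy_groups.groups)
print(n_users, toy_groups.ngroups)
```

On a large table, ngroups is probably the cheaper way to get the count, since it does not need the full mapping.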
The size method counts the rows in each group, that is, the number of records per user:
size_user = group_user.size()
size_user
Sort the counts in descending order and take the top ten users.
sort_user = size_user.sort_values(ascending=False)
sort_user[0:10]
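The count-then-sort steps can be tried on a small example. This is only a sketch with invented user names; it also shows value_counts, which as far as I know gives the same per-user counts in one call.

```python
import pandas as pd

# Toy data; the user names are made up for illustration.
toy = pd.DataFrame({'user': ['ann', 'bob', 'ann', 'cat', 'bob', 'ann']})

# size() counts rows per group; sort_values orders the counts, largest first.
counts = toy.groupby('user').size().sort_values(ascending=False)

# head(2) takes the first two rows, like counts[0:2].
top2 = counts.head(2)
print(top2)

# value_counts() counts occurrences per user and sorts descending by default.
same_counts = toy['user'].value_counts()
```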
import charts

series = [
    {'name': 'Apple', 'data': [10], 'type': 'column'},
    {'name': 'Android', 'data': [5], 'type': 'column'},
    {'name': 'Other', 'data': [5], 'type': 'column'}
]
charts.plot(series, show='inline')
We need to convert the data into a structure that Highcharts can recognize, a list with one dict per series as in the example above, and then plot it.
a = sort_user[0:10]  # top ten users
series1 = []
for i in a.index:
    data = {
        'name': i,
        'data': [int(a[i])],  # plain int, so the value can be serialized
        'type': 'column'
    }
    series1.append(data)
charts.plot(series1, options=dict(title=dict(text='Top ten users submitting articles')))
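The same reshaping can also be written as a small helper. This is a sketch, not part of the charts library: to_column_series is a hypothetical name, and the counts passed in are invented. It accepts anything with an items() method, so a plain dict or a pandas Series should both work.

```python
import json


def to_column_series(counts):
    """Turn a name -> count mapping into a list of column-series dicts."""
    series = []
    for name, value in counts.items():
        series.append({
            'name': str(name),
            'data': [int(value)],  # plain int so the value serializes to JSON
            'type': 'column',
        })
    return series


# Hypothetical counts standing in for the top-ten users.
example = to_column_series({'ann': 3, 'bob': 2})

# The result is plain Python data, so it can be dumped to JSON.
payload = json.dumps(example)
print(payload)
```

Converting each value with int() avoids passing NumPy integer types to the JSON encoder, which may reject them.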
In the plotting code above, a is the top ten users, that is, sort_user[0:10]. Finally, I wish all mothers a happy holiday.