I. Introduction to DataFrame
1. Import pandas and matplotlib.pyplot
import pandas as pd
import matplotlib.pyplot as plt
2. Read facebook.csv, set the first column as the index, and convert the index to datetime format
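# Use the first CSV column (Date) as the row index
fb = pd.read_csv('data/facebook.csv', index_col=0)
# Convert the string index into a DatetimeIndex
fb.index = pd.to_datetime(fb.index)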
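The read-then-convert step can be tried without the CSV file. The sketch below is a minimal illustration that assumes a tiny inline CSV (three rounded, illustrative rows, not the real file) and the hypothetical names `demo_a` and `demo_b`. It also shows `parse_dates=True`, which asks `read_csv` to parse the index column as dates in a single call.

```python
import io

import pandas as pd

# Illustrative stand-in for data/facebook.csv (rounded sample rows, not the real file)
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2014-12-31,20.40,20.51,19.99,20.05,19.46,4157500
2015-01-02,20.13,20.28,19.81,20.13,19.54,2842000
2015-01-05,20.13,20.19,19.70,19.79,19.21,4948800
"""

# Option A: read first, then convert the index (the two-step approach)
demo_a = pd.read_csv(io.StringIO(csv_text), index_col=0)
demo_a.index = pd.to_datetime(demo_a.index)

# Option B: let read_csv parse the index column as dates directly
demo_b = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# Both frames should now carry a DatetimeIndex
print(type(demo_a.index).__name__)
print(type(demo_b.index).__name__)
```

Either option should work for a file whose first column holds ISO-style dates. The two-step form makes the conversion explicit, which can be easier to debug when some dates fail to parse.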
3. Get the first n rows of the data; the default is the first 5 rows
print(fb.head())
Date | Open | High | Low | Close | Adj Close | Volume |
---|---|---|---|---|---|---|
2014-12-31 | 20.400000 | 20.510000 | 19.990000 | 20.049999 | 19.459270 | 4157500 |
2015-01-02 | 20.129999 | 20.280001 | 19.809999 | 20.129999 | 19.536913 | 2842000 |
2015-01-05 | 20.129999 | 20.190001 | 19.700001 | 19.790001 | 19.206934 | 4948800 |
2015-01-06 | 19.820000 | 19.840000 | 19.170000 | 19.190001 | 18.624611 | 4944100 |
2015-01-07 | 19.330000 | 19.500000 | 19.080000 | 19.139999 | 18.576082 | 8045200 |
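As a small sketch of how the `n` argument behaves, the example below uses a hypothetical 8-row frame named `demo` with made-up values. It also calls `tail()`, which does the same from the end of the data.

```python
import pandas as pd

# Hypothetical 8-row frame; the values are made up for illustration
demo = pd.DataFrame(
    {"Close": range(10, 18)},
    index=pd.date_range("2015-01-01", periods=8, freq="D"),
)

first_five = demo.head()    # no argument: the default n=5
first_two = demo.head(2)    # explicit n
last_three = demo.tail(3)   # tail() takes rows from the end instead

print(len(first_five), len(first_two), len(last_three))  # 5 2 3
```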
4. Get the dimensions of the data
print(fb.shape)
Output: (780, 6)
Meaning: the data has 780 rows and 6 columns (Date is the index, so it is not counted as a column)
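To illustrate why the index is not counted in `shape`, here is a minimal sketch on a made-up 3-row frame. The name `demo` and its values are assumptions for illustration only.

```python
import pandas as pd

# Made-up frame with a Date index and two data columns
demo = pd.DataFrame(
    {"Open": [20.4, 20.13, 20.13], "Close": [20.05, 20.13, 19.79]},
    index=pd.to_datetime(["2014-12-31", "2015-01-02", "2015-01-05"]),
)
demo.index.name = "Date"

# shape is a (rows, columns) tuple, so it can be unpacked
n_rows, n_cols = demo.shape
print(n_rows, n_cols)   # 3 2
print(len(demo))        # 3, the row count only

# reset_index() turns Date into a regular column, so the column count grows by one
print(demo.reset_index().shape)   # (3, 3)
```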
5. Generate summary statistics
print(fb.describe())
Statistic | Open | High | Low | Close | Adj Close | Volume |
---|---|---|---|---|---|---|
count | 780.000000 | 780.000000 | 780.000000 | 780.000000 | 780.000000 | 7.800000e+02 |
mean | 80.212705 | 81.285654 | 79.022397 | 80.264897 | 79.914215 | 1.204453e+07 |
std | 64.226121 | 65.048907 | 63.190963 | 64.198375 | 64.327846 | 8.221848e+06 |
min | 19.250000 | 19.500000 | 18.940001 | 19.139999 | 18.576082 | 1.311200e+06 |
25% | 25.525000 | 26.085000 | 24.845000 | 25.475000 | 25.134512 | 7.215200e+06 |
50% | 53.379999 | 54.034999 | 52.930000 | 53.420000 | 53.035403 | 9.728700e+06 |
75% | 113.322502 | 115.779999 | 110.297499 | 113.702501 | 113.261238 | 1.408885e+07 |
max | 245.770004 | 249.270004 | 244.449997 | 246.850006 | 246.850006 | 9.232320e+07 |
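The result of `describe()` is itself a DataFrame, so individual statistics can be looked up by label. The sketch below assumes a made-up 4-row frame named `demo`; the `percentiles` argument is an optional parameter of `describe()` that changes which percentile rows are reported.

```python
import pandas as pd

# Made-up numbers, chosen so the statistics are easy to check by hand
demo = pd.DataFrame(
    {"Close": [10.0, 20.0, 30.0, 40.0], "Volume": [100, 200, 300, 400]}
)

stats = demo.describe()

# Look up single statistics by row label and column name
print(stats.loc["mean", "Close"])   # 25.0
print(stats.loc["50%", "Close"])    # 25.0 (the median)

# The same values are available directly from the column
print(demo["Close"].mean(), demo["Close"].median())

# Request different percentiles; the median (50%) is still included
custom = demo.describe(percentiles=[0.1, 0.9])
print(custom.index.tolist())
```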
6. Extract the 2015 data
fb_2015 = fb.loc['2015-01-01':'2015-12-31']
Open 2.288000e+01
High 2.311000e+01
Low 2.273000e+01
Close 2.297000e+01
Adj Close 2.237908e+01
Volume 5.923900e+06
Name: 2015-03-16, dtype: float64
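The Series shown above is named `2015-03-16`, which suggests it came from selecting a single date, although that statement is not shown here. As a hedged sketch of label-based date selection with `loc`, the example below uses a made-up 5-row frame named `demo`. Only the 22.97 close for 2015-03-16 is taken from the output above; the other values are illustrative.

```python
import pandas as pd

# Made-up frame spanning the end of 2014 to the start of 2016
idx = pd.to_datetime(
    ["2014-12-31", "2015-01-02", "2015-03-16", "2015-12-31", "2016-01-04"]
)
demo = pd.DataFrame({"Close": [20.05, 20.13, 22.97, 32.96, 32.37]}, index=idx)

# Label slices on a sorted DatetimeIndex include both endpoints
by_range = demo.loc["2015-01-01":"2015-12-31"]

# Partial string indexing: a year string selects every row in that year
by_year = demo.loc["2015"]

# A single timestamp label returns one row as a Series
one_day = demo.loc[pd.Timestamp("2015-03-16")]

print(len(by_range), len(by_year))
print(one_day["Close"])
```

Unlike positional slicing, the end label is included here, so `'2015-12-31'` is part of the result when that date exists in the index.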
7. Extract the value in the first row and first column
print(fb.iloc[0, 0])
20.4
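`iloc` selects by integer position, while `loc` selects by label. The following sketch contrasts the two on a made-up 2-row frame; the name `demo` and the variable names are assumptions for illustration.

```python
import pandas as pd

# Made-up frame with two rows and two columns
demo = pd.DataFrame(
    {"Open": [20.4, 20.13], "High": [20.51, 20.28]},
    index=pd.to_datetime(["2014-12-31", "2015-01-02"]),
)

by_position = demo.iloc[0, 0]                            # row 0, column 0
by_label = demo.loc[pd.Timestamp("2014-12-31"), "Open"]  # same cell, by label
scalar_only = demo.iat[0, 0]                             # scalar-only accessor

first_row = demo.iloc[0]       # whole first row as a Series
first_col = demo.iloc[:, 0]    # whole first column as a Series

print(by_position, by_label, scalar_only)   # 20.4 20.4 20.4
```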
8. Plot the data
plt.figure(figsize=(10, 8))
fb['Close'].plot()
plt.margins(x=0)
plt.show()
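A variation on the plot above, sketched with made-up data: it uses an explicit `Axes` object to add a title and axis labels, and saves the figure to a file instead of opening a window. The file name `close_demo.png` and the frame `demo` are hypothetical, and the `Agg` backend is chosen only so the script can run without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no window is opened

import matplotlib.pyplot as plt
import pandas as pd

# Made-up closing prices for 30 consecutive days
idx = pd.date_range("2015-01-01", periods=30, freq="D")
demo = pd.DataFrame({"Close": [20 + 0.1 * i for i in range(30)]}, index=idx)

fig, ax = plt.subplots(figsize=(10, 8))
demo["Close"].plot(ax=ax, label="Close")   # draw onto the explicit Axes
ax.set_title("Closing price (made-up data)")
ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.margins(x=0)                            # no horizontal padding around the line
ax.legend()

fig.savefig("close_demo.png")              # hypothetical output file name
plt.close(fig)
```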
