pd.DatetimeIndex()
pd.DatetimeIndex() builds a timestamp index directly and accepts str and datetime.datetime values. A single timestamp has the type Timestamp, and a collection of timestamps has the type DatetimeIndex. For example:
import pandas as pd

rng = pd.DatetimeIndex(['12/1/2017', '12/2/2017', '12/3/2017', '12/4/2017', '12/5/2017'])
print(rng, type(rng))        # the whole index is a DatetimeIndex
print(rng[0], type(rng[0]))  # a single element is a Timestamp
>>>
DatetimeIndex(['2017-12-01', '2017-12-02', '2017-12-03', '2017-12-04', '2017-12-05'], dtype='datetime64[ns]', freq=None) <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
2017-12-01 00:00:00 <class 'pandas._libs.tslibs.timestamps.Timestamp'>
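The text says that datetime.datetime values are accepted as well as strings, which the example above does not show. The sketch below builds the same index both ways and compares the results element by element. The names from_str and from_dt are illustrative only.

```python
from datetime import datetime

import pandas as pd

# Build the index from date strings.
from_str = pd.DatetimeIndex(['2017-12-01', '2017-12-02', '2017-12-03'])

# Build the same index from datetime.datetime objects.
from_dt = pd.DatetimeIndex([
    datetime(2017, 12, 1),
    datetime(2017, 12, 2),
    datetime(2017, 12, 3),
])

# Element-wise comparison: both inputs describe the same timestamps.
print((from_str == from_dt).all())

# Whatever the input type, a single element comes back as a Timestamp.
print(type(from_dt[0]).__name__)
```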
A Series that uses a DatetimeIndex as its index is a time series. For example:
import numpy as np

st = pd.Series(np.random.rand(len(rng)), index=rng)  # random values indexed by date
print(st, type(st))
print(st.index)
>>>
2017-12-01    0.081920
2017-12-02    0.921781
2017-12-03    0.489779
2017-12-04    0.257632
2017-12-05    0.805373
dtype: float64 <class 'pandas.core.series.Series'>
DatetimeIndex(['2017-12-01', '2017-12-02', '2017-12-03', '2017-12-04', '2017-12-05'], dtype='datetime64[ns]', freq=None)
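One practical consequence of a date index is that rows can be selected by date label. The following is a minimal sketch with fixed values instead of random ones so the result is predictable. The names one_day and window are illustrative.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2017-12-01', periods=5)
st = pd.Series(np.arange(5, dtype=float), index=rng)

# Select a single day by its date string.
one_day = st.loc['2017-12-02']

# Slice a range of days; both endpoints are included.
window = st.loc['2017-12-02':'2017-12-04']

print(one_day)
print(window)
```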
pd.date_range() generates a date range in one of two ways (the default frequency is one day): from a start date and an end date, or from a start or end date plus a number of periods.
For example:
date1 = pd.date_range('2017/1/1', '2017/10/1', normalize=True)
print(date1)
date2 = pd.date_range(start='1/1/2017', periods=10)
print(date2)
date3 = pd.date_range(end='1/30/2017 15:00:00', periods=10, normalize=True)  # end includes hour, minute, and second
print(date3)
>>>
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04',
               '2017-01-05', '2017-01-06', '2017-01-07', '2017-01-08',
               '2017-01-09', '2017-01-10',
               ...
               '2017-09-22', '2017-09-23', '2017-09-24', '2017-09-25',
               '2017-09-26', '2017-09-27', '2017-09-28', '2017-09-29',
               '2017-09-30', '2017-10-01'],
              dtype='datetime64[ns]', length=274, freq='D')
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04',
               '2017-01-05', '2017-01-06', '2017-01-07', '2017-01-08',
               '2017-01-09', '2017-01-10'],
              dtype='datetime64[ns]', freq='D')
DatetimeIndex(['2017-01-21', '2017-01-22', '2017-01-23', '2017-01-24',
               '2017-01-25', '2017-01-26', '2017-01-27', '2017-01-28',
               '2017-01-29', '2017-01-30'],
              dtype='datetime64[ns]', freq='D')
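To make the relationship between the calling styles concrete, this small sketch describes the same ten days three times: with start and end, with start and periods, and with end and periods. The variable names a, b, and c are illustrative.

```python
import pandas as pd

# Same ten days, specified three different ways.
a = pd.date_range(start='2017-01-01', end='2017-01-10')
b = pd.date_range(start='2017-01-01', periods=10)
c = pd.date_range(end='2017-01-10', periods=10)

# All three calls produce identical timestamps.
print((a == b).all(), (a == c).all())
```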
pd.date_range(start=None, end=None, periods=None, freq='D', tz=None, normalize=False, name=None, closed=None, **kwargs)
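The meanings of the commonly used parameters are as follows:
start: the first date of the range
end: the last date of the range
periods: the number of timestamps to generate
freq: the spacing between timestamps (default 'D', one calendar day)
tz: the time zone of the result
normalize: reset the start and end dates to midnight before generating the range
name: the name of the resulting index
An example that uses the normalize and name parameters: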
rng4 = pd.date_range(start='1/1/2017 15:30', periods=10, name='hello world!', normalize=True)
print(rng4)
>>>
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04',
               '2017-01-05', '2017-01-06', '2017-01-07', '2017-01-08',
               '2017-01-09', '2017-01-10'],
              dtype='datetime64[ns]', name='hello world!', freq='D')
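The effect of normalize is easiest to see side by side. In this sketch the same start value with a time of day is used twice, once with the default normalize=False and once with normalize=True. The names raw and norm are illustrative.

```python
import pandas as pd

# Without normalize, every timestamp keeps the 15:30 time of day.
raw = pd.date_range(start='2017-01-01 15:30', periods=3)

# With normalize=True, the start is reset to midnight first.
norm = pd.date_range(start='2017-01-01 15:30', periods=3, normalize=True)

print(raw)
print(norm)
```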
The basic usage is as follows:
# Note: pandas 2.2 and later deprecate 'H', 'T', 'S', 'L', 'U' in favor of 'h', 'min', 's', 'ms', 'us'
print(pd.date_range('2017/1/1', '2017/1/4'))  # default freq='D': every calendar day
print(pd.date_range('2017/1/1', '2017/1/4', freq='B'))  # B: every business day
print(pd.date_range('2017/1/1', '2017/1/2', freq='H'))  # H: every hour
print(pd.date_range('2017/1/1 12:00', '2017/1/1 12:10', freq='T'))  # T/MIN: every minute
print(pd.date_range('2017/1/1 12:00:00', '2017/1/1 12:00:10', freq='S'))  # S: every second
print(pd.date_range('2017/1/1 12:00:00', '2017/1/1 12:00:10', freq='L'))  # L: every millisecond (one thousandth of a second)
print(pd.date_range('2017/1/1 12:00:00', '2017/1/1 12:00:10', freq='U'))  # U: every microsecond (one millionth of a second)
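As a quick check on how these frequencies behave, the sketch below counts the timestamps each one produces. It uses the lowercase spellings 'h' and 'min', on the assumption that a recent pandas is installed. Both endpoints of a range are included by default, which is why the hourly range over one day has 25 entries.

```python
import pandas as pd

days = pd.date_range('2017-01-01', '2017-01-04')             # every calendar day
bdays = pd.date_range('2017-01-01', '2017-01-04', freq='B')  # business days only
hours = pd.date_range('2017-01-01', '2017-01-02', freq='h')  # every hour
minutes = pd.date_range('2017-01-01 12:00', '2017-01-01 12:10', freq='min')

# 2017-01-01 is a Sunday, so the business-day range skips it.
print(len(days), len(bdays), len(hours), len(minutes))
```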
Advanced usage is as follows:
print(pd.date_range('2017/1/1', '2017/2/1', freq='W-MON'))
# W-MON: every week, on the specified day of the week (here, Monday)
# Day-of-week abbreviations: MON/TUE/WED/THU/FRI/SAT/SUN
print(pd.date_range('2017/1/1', '2017/5/1', freq='WOM-2MON'))
# WOM-2MON: the n-th given weekday of each month (here, the second Monday of each month)
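The two weekly anchors can be checked with a short sketch: every date from 'W-MON' should fall on a Monday, and every date from 'WOM-2MON' should be a Monday between the 8th and the 14th of its month. The names weekly and second_mondays are illustrative.

```python
import pandas as pd

weekly = pd.date_range('2017-01-01', '2017-02-01', freq='W-MON')
second_mondays = pd.date_range('2017-01-01', '2017-05-01', freq='WOM-2MON')

# dayofweek is 0 for Monday.
print(weekly.dayofweek.tolist())
print(second_mondays.strftime('%Y-%m-%d').tolist())
```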
Generate calendar days with specified frequency:
# Note: pandas 2.2 and later deprecate 'M', 'Q', 'A' in favor of 'ME', 'QE', 'YE'
print(pd.date_range('2017', '2018', freq='M'))
print(pd.date_range('2017', '2020', freq='Q-DEC'))
print(pd.date_range('2017', '2020', freq='A-DEC'))
print('------')
# M: the last calendar day of each month
# Q-<month>: quarters end in the specified month; gives the last calendar day of the final month of each quarter
# A-<month>: the last calendar day of the specified month, every year
# Month abbreviations: JAN/FEB/MAR/APR/MAY/JUN/JUL/AUG/SEP/OCT/NOV/DEC
# So Q-<month> has only three distinct cases: quarters ending in months 1-4-7-10, 2-5-8-11, or 3-6-9-12
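Because the spelling of these aliases differs between pandas versions, one version-independent option is to pass offset objects to freq instead of strings. The sketch below assumes that MonthEnd, QuarterEnd(startingMonth=12), and YearEnd(month=12) from pandas.tseries.offsets are acceptable stand-ins for 'M', 'Q-DEC', and 'A-DEC'.

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd, QuarterEnd, YearEnd

# Last calendar day of every month in 2017.
month_ends = pd.date_range('2017-01-01', '2018-01-01', freq=MonthEnd())

# Quarter ends for a year that closes in December.
quarter_ends = pd.date_range('2017-01-01', '2018-01-01', freq=QuarterEnd(startingMonth=12))

# Last calendar day of December, every year.
year_ends = pd.date_range('2017-01-01', '2020-01-01', freq=YearEnd(month=12))

print(month_ends)
print(quarter_ends)
print(year_ends)
```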
Generate business days with a specified frequency:
# Note: pandas 2.2 and later deprecate 'BM', 'BQ', 'BA' in favor of 'BME', 'BQE', 'BYE'
print(pd.date_range('2017', '2018', freq='BM'))
print(pd.date_range('2017', '2020', freq='BQ-DEC'))
print(pd.date_range('2017', '2020', freq='BA-DEC'))
print('------')
# BM: the last business day of each month
# BQ-<month>: quarters end in the specified month; gives the last business day of the final month of each quarter
# BA-<month>: the last business day of the specified month, every year
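The difference between calendar and business month ends only shows up when a month ends on a weekend. The sketch below lines up the two for 2017 and keeps the months where they disagree. It passes the offset objects MonthEnd and BMonthEnd to freq to stay independent of alias spelling; that substitution is an assumption, not the original code.

```python
import pandas as pd
from pandas.tseries.offsets import BMonthEnd, MonthEnd

cal = pd.date_range('2017-01-01', '2017-12-31', freq=MonthEnd())   # calendar month ends
biz = pd.date_range('2017-01-01', '2017-12-31', freq=BMonthEnd())  # business month ends

# Keep the business month ends that differ from the calendar ones.
moved = biz[biz != cal]
print(moved.strftime('%Y-%m-%d').tolist())
```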
Generate period-start dates with a specified frequency:
# Note: pandas 2.2 and later deprecate 'AS' and 'BAS' in favor of 'YS' and 'BYS'
print(pd.date_range('2017', '2018', freq='MS'))
print(pd.date_range('2017', '2020', freq='QS-DEC'))
print(pd.date_range('2017', '2020', freq='AS-DEC'))
print('------')
# MS: the first calendar day of each month
# QS-<month>: quarters end in the specified month; gives the first calendar day of the final month of each quarter
# AS-<month>: the first calendar day of the specified month, every year
print(pd.date_range('2017', '2018', freq='BMS'))
print(pd.date_range('2017', '2020', freq='BQS-DEC'))
print(pd.date_range('2017', '2020', freq='BAS-DEC'))
print('------')
# BMS: the first business day of each month
# BQS-<month>: quarters end in the specified month; gives the first business day of the final month of each quarter
# BAS-<month>: the first business day of the specified month, every year
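A short sketch of how 'MS' and 'BMS' differ: the business variant moves to the following Monday whenever the first of the month lands on a weekend. The names starts, biz_starts, and moved are illustrative.

```python
import pandas as pd

starts = pd.date_range('2017-01-01', '2017-12-31', freq='MS')       # first calendar day
biz_starts = pd.date_range('2017-01-01', '2017-12-31', freq='BMS')  # first business day

# Months in 2017 whose first day fell on a weekend.
moved = biz_starts[biz_starts != starts]
print(moved.strftime('%Y-%m-%d').tolist())
```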
Generate a time series with a specified composite frequency:
print(pd.date_range('2017/1/1', '2017/2/1', freq='7D'))  # every 7 days
print(pd.date_range('2017/1/1', '2017/1/2', freq='2h30min'))  # every 2 hours 30 minutes
print(pd.date_range('2017', '2018', freq='2M'))  # the last calendar day of every second month
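To confirm that a composite frequency really produces evenly spaced timestamps, the sketch below subtracts each timestamp from its successor and compares the gaps with the expected Timedelta. The names weekly, steps, and gaps are illustrative.

```python
import pandas as pd

weekly = pd.date_range('2017-01-01', '2017-02-01', freq='7D')
steps = pd.date_range('2017-01-01', '2017-01-02', freq='2h30min')

# Difference between consecutive timestamps.
gaps = steps[1:] - steps[:-1]

print(len(weekly), len(steps))
print((gaps == pd.Timedelta(hours=2, minutes=30)).all())
```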
How do you convert a time series with a daily frequency into one with a finer frequency? Use asfreq():
ts = pd.Series(np.random.rand(4), index=pd.date_range('20170101', '20170104'))
print(ts)
print(ts.asfreq('4H', method='ffill'))
# Change the frequency, here from D to 4H (pandas 2.2 and later prefer the spelling '4h')
# method: how to fill the new rows. None leaves them as NaN, ffill copies the previous value, and bfill copies the next value
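The role of method becomes clearer when the result is compared with and without it. This sketch uses fixed values instead of random ones and the lowercase '4h' spelling, assuming a recent pandas. The names sparse and filled are illustrative.

```python
import pandas as pd

ts = pd.Series([1.0, 2.0, 3.0, 4.0], index=pd.date_range('2017-01-01', '2017-01-04'))

# Without a fill method, the new 4-hour rows are NaN.
sparse = ts.asfreq('4h')

# With ffill, each new row repeats the most recent daily value.
filled = ts.asfreq('4h', method='ffill')

print(sparse.count(), len(sparse))
print(filled.head(7))
```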
The following example shifts the values to lead or lag the data while the index stays fixed:
ts = pd.Series(np.random.rand(4), index=pd.date_range('20170101', '20170104'))
print(ts)
print(ts.shift(2))
print(ts.shift(-2))
print('------')
# Positive number: the values move backward (lagging); negative number: the values move forward (leading)
>>>
2017-01-01    0.575076
2017-01-02    0.514981
2017-01-03    0.221506
2017-01-04    0.410396
Freq: D, dtype: float64
2017-01-01         NaN
2017-01-02         NaN
2017-01-03    0.575076
2017-01-04    0.514981
Freq: D, dtype: float64
2017-01-01    0.221506
2017-01-02    0.410396
2017-01-03         NaN
2017-01-04         NaN
Freq: D, dtype: float64
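A common use of a one-step lag is computing the change from one day to the next. The following sketch, with fixed values chosen for illustration, subtracts the lagged series from the original and compares the result with the built-in diff().

```python
import pandas as pd

ts = pd.Series([1.0, 3.0, 6.0, 10.0], index=pd.date_range('2017-01-01', periods=4))

# shift(1) lags the values by one row, so this is today's value minus yesterday's.
change = ts - ts.shift(1)

print(change)
print(ts.diff())  # same result from the dedicated method
```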
Passing the freq parameter shifts the index timestamps instead of the values:
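print(ts.shift(2, freq='D'))  # move every timestamp forward by 2 days
print(ts.shift(2, freq='T'))  # move every timestamp forward by 2 minutes (pandas 2.2 and later prefer 'min')
# With the freq parameter, the timestamps are shifted and the values are left unchanged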
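The two kinds of shift can be contrasted directly. In this sketch, with fixed values for predictability, shifting without freq introduces NaN and keeps the index, while shifting with freq keeps every value and moves the index. It uses 'min' in place of 'T', assuming a recent pandas.

```python
import pandas as pd

ts = pd.Series([1.0, 2.0, 3.0, 4.0], index=pd.date_range('2017-01-01', periods=4))

by_value = ts.shift(2)                # values move, index stays
by_index = ts.shift(2, freq='D')      # index moves by 2 days, values stay
by_minutes = ts.shift(2, freq='min')  # index moves by 2 minutes, values stay

print(by_value)
print(by_index)
print(by_minutes)
```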