Data Science | These time series show operations

Pandas timestamp index: DatetimeIndex

pd.DatetimeIndex() and TimeSeries time series

pd.DatetimeIndex() builds a timestamp index directly and accepts str and datetime.datetime values. A single timestamp has type Timestamp, and a collection of timestamps has type DatetimeIndex. For example:

import numpy as np
import pandas as pd

rng = pd.DatetimeIndex(['12/1/2017','12/2/2017','12/3/2017','12/4/2017','12/5/2017'])
print(rng,type(rng))
print(rng[0],type(rng[0]))
>>>
DatetimeIndex(['2017-12-01', '2017-12-02', '2017-12-03', '2017-12-04',
               '2017-12-05'],
              dtype='datetime64[ns]', freq=None) <class'pandas.core.indexes.datetimes.DatetimeIndex'>
2017-12-01 00:00:00 <class'pandas._libs.tslibs.timestamps.Timestamp'>
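
The example above builds the index from strings only. As a minimal sketch of the datetime.datetime input mentioned in the text (the variable names from_str and from_dt are illustrative, not from the original), the same index can be built both ways and compared:

```python
from datetime import datetime

import pandas as pd

# Build the same three-day index twice: from strings and from datetime objects.
from_str = pd.DatetimeIndex(['2017-12-01', '2017-12-02', '2017-12-03'])
from_dt = pd.DatetimeIndex([datetime(2017, 12, 1),
                            datetime(2017, 12, 2),
                            datetime(2017, 12, 3)])

# Compare element by element; each element is a pandas Timestamp.
same = list(from_str) == list(from_dt)
print(same)
print(type(from_dt), type(from_dt[0]))
```
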

What is a time series (TimeSeries)?

A Series whose index is a DatetimeIndex is a time series (TimeSeries). For example:

st = pd.Series(np.random.rand(len(rng)), index = rng)
print(st,type(st))
print(st.index)
>>>
2017-12-01 0.081920
2017-12-02 0.921781
2017-12-03 0.489779
2017-12-04 0.257632
2017-12-05 0.805373
dtype: float64 <class'pandas.core.series.Series'>
DatetimeIndex(['2017-12-01', '2017-12-02', '2017-12-03', '2017-12-04',
               '2017-12-05'],
              dtype='datetime64[ns]', freq=None)

pd.date_range(): generating a date range

pd.date_range() generates a date range in two ways (the default frequency is one calendar day):

  • Start time (start) + end time (end)
  • Start time (start) or end time (end) + number of timestamps (periods)

For example:

date1 = pd.date_range('2017/1/1','2017/10/1',normalize=True)
print(date1)
date2 = pd.date_range(start = '1/1/2017', periods = 10)
print(date2)
date3 = pd.date_range(end = '1/30/2017 15:00:00', periods = 10,normalize=True) # end includes a time of day; normalize=True resets it to midnight
print(date3)
>>>
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04',
               '2017-01-05', '2017-01-06', '2017-01-07', '2017-01-08',
               '2017-01-09', '2017-01-10',
               ...
               '2017-09-22', '2017-09-23', '2017-09-24', '2017-09-25',
               '2017-09-26', '2017-09-27', '2017-09-28', '2017-09-29',
               '2017-09-30', '2017-10-01'],
              dtype='datetime64[ns]', length=274, freq='D')
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04',
               '2017-01-05', '2017-01-06', '2017-01-07', '2017-01-08',
               '2017-01-09', '2017-01-10'],
              dtype='datetime64[ns]', freq='D')
DatetimeIndex(['2017-01-21', '2017-01-22', '2017-01-23', '2017-01-24',
               '2017-01-25', '2017-01-26', '2017-01-27', '2017-01-28',
               '2017-01-29', '2017-01-30'],
              dtype='datetime64[ns]', freq='D')
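
The two ways of specifying a range describe the same thing from different ends. A small sketch (the variable names are illustrative) that checks start + end, start + periods, and end + periods all produce the same ten days:

```python
import pandas as pd

# Three descriptions of the same ten calendar days.
by_bounds = pd.date_range(start='2017-01-01', end='2017-01-10')
from_start = pd.date_range(start='2017-01-01', periods=10)
from_end = pd.date_range(end='2017-01-10', periods=10)

# Compare the generated timestamps element by element.
same = list(by_bounds) == list(from_start) == list(from_end)
print(len(by_bounds), same)
```
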
pd.date_range(start=None, end=None, periods=None, freq='D', tz=None, normalize=False, name=None, closed=None, **kwargs)

The meanings of commonly used parameters are as follows:

  • start: start time
  • end: end time
  • periods: number of timestamps to generate
  • freq: frequency; pd.date_range() defaults to calendar day ('D'), pd.bdate_range() defaults to business day ('B')
  • tz: time zone
  • normalize: reset the start/end timestamps to midnight before generating the range
  • closed: which endpoints to include. None (the default) includes both start and end; 'left' includes start but excludes end; 'right' excludes start but includes end. pandas 1.4 deprecated closed in favor of inclusive ('both', 'neither', 'left', 'right'), and pandas 2.0 removed it.
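
A short sketch of three of the parameters above: tz, normalize, and the business-day default of pd.bdate_range(). The choice of 'UTC' and the variable names are assumptions made for illustration.

```python
import pandas as pd

# tz: attach a time zone to every generated timestamp.
utc_days = pd.date_range('2017-01-01', periods=3, tz='UTC')
print(utc_days.tz)

# normalize: reset the time of day in start/end to midnight before generating.
raw = pd.date_range('2017-01-01 15:30', periods=3)
floored = pd.date_range('2017-01-01 15:30', periods=3, normalize=True)
print(raw[0], floored[0])

# pd.bdate_range(): same call shape, but the default frequency is business day.
# 2017-01-01 was a Sunday, so the range begins on Monday 2017-01-02.
workdays = pd.bdate_range('2017-01-01', periods=5)
print(list(workdays.day_name()))
```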

An example that uses the normalize and name parameters:

rng4 = pd.date_range(start = '1/1/2017 15:30', periods = 10, name ='hello world!', normalize = True)
print(rng4)
>>>
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04',
               '2017-01-05', '2017-01-06', '2017-01-07', '2017-01-08',
               '2017-01-09', '2017-01-10'],
              dtype='datetime64[ns]', name='hello world!', freq='D')

Use of freq (1): generating fixed-frequency time series

The basic usage is as follows:

print(pd.date_range('2017/1/1','2017/1/4')) # default freq ='D': every calendar day
print(pd.date_range('2017/1/1','2017/1/4', freq ='B')) # B: every working day
print(pd.date_range('2017/1/1','2017/1/2', freq ='H')) # H: every hour
print(pd.date_range('2017/1/1 12:00','2017/1/1 12:10', freq ='T')) # T/MIN: every minute
print(pd.date_range('2017/1/1 12:00:00','2017/1/1 12:00:10', freq ='S')) # S: every second
print(pd.date_range('2017/1/1 12:00:00','2017/1/1 12:00:10', freq ='L')) # L: every millisecond (one thousandth of a second)
print(pd.date_range('2017/1/1 12:00:00','2017/1/1 12:00:10', freq ='U')) # U: every microsecond (one millionth of a second)
# Note: pandas 2.2 and later deprecate H, T, S, L and U in favor of h, min, s, ms and us

Advanced usage is as follows:

print(pd.date_range('2017/1/1','2017/2/1', freq ='W-MON'))
# W-MON: weekly, on the specified day of the week (here every Monday)
# Day of the week abbreviations: MON/TUE/WED/THU/FRI/SAT/SUN

print(pd.date_range('2017/1/1','2017/5/1', freq ='WOM-2MON'))
# WOM-2MON: a given week of each month; here the second Monday of each month
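
One way to convince yourself of what W-MON and WOM-2MON produce is to inspect the weekday and day-of-month of the generated dates. A minimal sketch (variable names are illustrative):

```python
import pandas as pd

weekly = pd.date_range('2017-01-01', '2017-02-01', freq='W-MON')
second_mondays = pd.date_range('2017-01-01', '2017-05-01', freq='WOM-2MON')

# Every generated date should fall on a Monday.
print(list(weekly.day_name().unique()))
print(list(second_mondays.day_name().unique()))

# The second Monday of a month always falls on day 8 through 14.
print(list(second_mondays.day))
```
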

Use of freq (2): month, quarter, and year frequencies

Generate calendar days with specified frequency:

print(pd.date_range('2017','2018', freq ='M'))  
print(pd.date_range('2017','2020', freq ='Q-DEC'))  
print(pd.date_range('2017','2020', freq ='A-DEC'))
print('------')
# M: the last calendar day of each month
# Q-month: quarterly, with the specified month as the end of a quarter; gives the last calendar day of each quarter's final month
# A-month: the last calendar day of the specified month each year
# Month abbreviations: JAN/FEB/MAR/APR/MAY/JUN/JUL/AUG/SEP/OCT/NOV/DEC
# So Q-month has only three distinct cases: 1-4-7-10, 2-5-8-11, 3-6-9-12
# Note: pandas 2.2 and later deprecate M, Q and A in favor of ME, QE and YE

Generate working days with specified frequency:

print(pd.date_range('2017','2018', freq ='BM'))  
print(pd.date_range('2017','2020', freq ='BQ-DEC'))  
print(pd.date_range('2017','2020', freq ='BA-DEC'))
print('------')
# BM: the last working day of each month
# BQ-month: quarterly, with the specified month as the end of a quarter; gives the last working day of each quarter's final month
# BA-month: the last working day of the specified month each year

Generate the first calendar day or first working day of each period:

print(pd.date_range('2017','2018', freq ='MS'))  
print(pd.date_range('2017','2020', freq ='QS-DEC'))  
print(pd.date_range('2017','2020', freq ='AS-DEC'))
print('------')
# MS: the first calendar day of each month
# QS-month: quarterly, with the specified month as the end of a quarter; gives the first calendar day of each quarter's final month
# AS-month: the first calendar day of the specified month each year

print(pd.date_range('2017','2018', freq ='BMS'))  
print(pd.date_range('2017','2020', freq ='BQS-DEC'))  
print(pd.date_range('2017','2020', freq ='BAS-DEC'))
print('------')
# BMS: the first working day of each month
# BQS-month: quarterly, with the specified month as the end of a quarter; gives the first working day of each quarter's final month
# BAS-month: the first working day of the specified month each year
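
The difference between the calendar-day and working-day variants only shows up in months that start or end on a weekend. The sketch below compares them for 2017. It passes offset objects from pd.offsets as freq instead of alias strings; to my understanding MonthEnd, BMonthEnd, MonthBegin and BMonthBegin correspond to the M, BM, MS and BMS aliases above, and they avoid spelling differences between pandas versions.

```python
import pandas as pd

start, end = '2017-01-01', '2017-12-31'

# Offset objects standing in for the aliases M, BM, MS and BMS.
month_end = pd.date_range(start, end, freq=pd.offsets.MonthEnd())
bmonth_end = pd.date_range(start, end, freq=pd.offsets.BMonthEnd())
month_start = pd.date_range(start, end, freq=pd.offsets.MonthBegin())
bmonth_start = pd.date_range(start, end, freq=pd.offsets.BMonthBegin())

# Keep only the months where the working-day variant differs.
end_diffs = [(c.date(), b.date()) for c, b in zip(month_end, bmonth_end) if c != b]
start_diffs = [(c.date(), b.date()) for c, b in zip(month_start, bmonth_start) if c != b]

for cal, biz in end_diffs:
    print('month end  ', cal, '->', biz)
for cal, biz in start_diffs:
    print('month start', cal, '->', biz)
```
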

Use of freq (3): compound frequencies

Generate a time series with a specified composite frequency:

print(pd.date_range('2017/1/1','2017/2/1', freq = '7D')) # 7 days
print(pd.date_range('2017/1/1','2017/1/2', freq = '2h30min')) # 2 hours 30 minutes
print(pd.date_range('2017','2018', freq = '2M')) # The last calendar day of every second month
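
A compound alias such as '2h30min' is simply a fixed step of 150 minutes. As a sketch (variable names are illustrative), the same range can be produced by passing a pd.Timedelta as freq, which date_range also accepts:

```python
import pandas as pd

by_alias = pd.date_range('2017-01-01', '2017-01-02', freq='2h30min')
by_delta = pd.date_range('2017-01-01', '2017-01-02',
                         freq=pd.Timedelta(hours=2, minutes=30))

# Step between consecutive timestamps, and whether both ranges agree.
step = by_alias[1] - by_alias[0]
same = list(by_alias) == list(by_delta)
print(len(by_alias), step, same)
```
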

asfreq: frequency conversion

How to modify a time series with a frequency interval of days to a time series with a smaller unit interval?

ts = pd.Series(np.random.rand(4),
              index = pd.date_range('20170101','20170104'))
print(ts)
print(ts.asfreq('4H',method ='ffill'))
# Change the frequency, here from D to 4H
# method: fill mode for the newly created rows; None leaves them as NaN, ffill fills with the previous value, bfill fills with the next value
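
With random data it is hard to see what each fill mode does, so here is a sketch with fixed values 1.0 to 4.0 (an assumption made for readability). It uses the lowercase alias '4h', which newer pandas releases prefer over '4H'.

```python
import pandas as pd

ts = pd.Series([1.0, 2.0, 3.0, 4.0],
               index=pd.date_range('2017-01-01', '2017-01-04'))

plain = ts.asfreq('4h')                     # no fill: the new rows are NaN
forward = ts.asfreq('4h', method='ffill')   # carry the previous value forward
backward = ts.asfreq('4h', method='bfill')  # pull the next value backward

# Look at the first newly created row, 04:00 on the first day.
probe = pd.Timestamp('2017-01-01 04:00')
print(len(plain), int(plain.notna().sum()))
print(plain.loc[probe], forward.loc[probe], backward.loc[probe])
```
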

How to lead/lag data?

The following example shifts the values forward or backward while the index stays in place:

ts = pd.Series(np.random.rand(4),
              index = pd.date_range('20170101','20170104'))
print(ts)
print(ts.shift(2))
print(ts.shift(-2))
print('------')
# Positive number: values move to later dates (lag); negative number: values move to earlier dates (lead)
>>>
2017-01-01 0.575076
2017-01-02 0.514981
2017-01-03 0.221506
2017-01-04 0.410396
Freq: D, dtype: float64
2017-01-01 NaN
2017-01-02 NaN
2017-01-03 0.575076
2017-01-04 0.514981
Freq: D, dtype: float64
2017-01-01 0.221506
2017-01-02 0.410396
2017-01-03 NaN
2017-01-04 NaN
Freq: D, dtype: float64
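
A typical reason to lag a series is to compare each value with the one before it. A minimal sketch with made-up values (10, 12, 9, 15 are assumptions for illustration):

```python
import pandas as pd

ts = pd.Series([10.0, 12.0, 9.0, 15.0],
               index=pd.date_range('2017-01-01', '2017-01-04'))

previous = ts.shift(1)   # yesterday's value aligned with today's date
change = ts - previous   # first row is NaN because no earlier value exists
print(change)
```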

Passing the freq parameter shifts the index timestamps instead of the values:

print(ts.shift(2, freq ='D'))
print(ts.shift(2, freq ='T'))
# With freq, the timestamps shift (by 2 days, then by 2 minutes) and each value stays attached to its timestamp
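
The two kinds of shift are easy to confuse, so here is a side-by-side sketch with fixed values (the data and variable names are assumptions for illustration):

```python
import pandas as pd

ts = pd.Series([10.0, 12.0, 9.0, 15.0],
               index=pd.date_range('2017-01-01', '2017-01-04'))

by_value = ts.shift(2)            # index fixed, values move down, NaN appears
by_index = ts.shift(2, freq='D')  # values fixed, index moves two days later

print(by_value.index[0], by_index.index[0])
print(int(by_value.isna().sum()), int(by_index.isna().sum()))
```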

Consolidation exercises

  1. Assignment 1: Please output the following time series
  2. Assignment 2: Create a time series ts1 as required and convert it to ts2

Reference: Data Science | These time series show operations, Cloud + Community, Tencent Cloud. https://cloud.tencent.com/developer/article/1518483