How to aggregate Pandas DateTime objects to int

I have this DataFrame structure

   id                 time  number
0   1  1970-01-01 00:00:00       1
1   2  1970-01-02 00:00:00       2
2   1  1970-01-03 00:00:00       2

I want to group by id and aggregate the time column (datetime64 dtype) into an int representing the elapsed time, and I have the following code:

import numpy as np

def interval(a):
    # Whole days between the earliest and latest timestamp in the group
    return (np.max(a) - np.min(a)).days

_df = df.groupby(['id'], as_index=False).agg(
    {
        "number": np.sum,
        "time": interval,
    }
)

The source column time has a datetime64 dtype, but the aggregated value is an int. As a result, the values in the time column of _df are converted from int back to datetime, for example 1970-01-01 00:00:00.000000000.

How can I get the time column of the aggregated DataFrame as the correct int result?

You can try dividing the timedelta by np.timedelta64(1, 'D') to convert it to days as a float, and then casting the column to int with astype:

def interval(a):
    # Dividing a timedelta by one day gives the span as a float number of days
    return (np.max(a) - np.min(a)) / np.timedelta64(1, 'D')

_df = df.groupby(['id'], as_index=False).agg(
    {
        "number": np.sum,
        "time": interval,
    }
)
# Cast the float day counts to int
_df['time'] = _df['time'].astype(int)
print(_df)

   id  number  time
0   1       3     2
1   2       2     0
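As an alternative to the approach above, here is a minimal sketch that avoids returning a non-datetime value from agg on the time column at all: it takes the per-group max and min separately, subtracts them outside the aggregation, and reads the day count from the resulting timedelta. The sample frame is rebuilt from the question, and the names grouped, span and result are only illustrative.

```python
import pandas as pd

# Sample data mirroring the frame in the question
df = pd.DataFrame({
    "id": [1, 2, 1],
    "time": pd.to_datetime(["1970-01-01", "1970-01-02", "1970-01-03"]),
    "number": [1, 2, 2],
})

grouped = df.groupby("id")

# Subtracting the per-group min from the per-group max gives a timedelta Series,
# so no integer is ever handed back through the datetime column
span = grouped["time"].max() - grouped["time"].min()

result = pd.DataFrame({
    "number": grouped["number"].sum(),
    "time": span.dt.days,  # whole days as integers
}).reset_index()

print(result)
```

Because the subtraction happens after the aggregation, the time column in result should come out with an integer dtype without a separate astype step.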
