I have this DataFrame structure:
   id                 time  number
0   1  1970-01-01 00:00:00       1
1   2  1970-01-02 00:00:00       2
2   1  1970-01-03 00:00:00       2
I want to group by id and aggregate the time column (a datetime dtype) into an int representing the elapsed time in days, and I have the following code:
import numpy as np

def interval(a):
    # Number of whole days between the earliest and latest timestamp
    return (np.max(a) - np.min(a)).days

_df = df.groupby(['id'], as_index=False).agg(
    {
        "number": np.sum,
        "time": interval,
    }
)
The source time column has a datetime dtype, but the aggregated value is an int. This causes the values in the time column of _df to be converted from int back to datetime, for example 1970-01-01 00:00:00.000000000.
Can you tell me how to get the time column of the aggregated DataFrame as int, which is the correct result?
You can try dividing by np.timedelta64(1, 'D') to convert the timedelta to days as a float, and then cast the result to an integer with astype:
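def interval(a):
    # Dividing by a one-day timedelta gives the span in days as a float
    a = (np.max(a) - np.min(a)) / np.timedelta64(1, 'D')
    return a

_df = df.groupby(['id'], as_index=False).agg(
    {
        "number": np.sum,
        "time": interval,
    }
)
# Cast the float day counts to int
_df['time'] = _df['time'].astype(int)
print(_df)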
   id  number  time
0   1       3     2
1   2       2     0
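
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and, as an alternative to the custom interval function, subtracts the per-group min from the per-group max and reads the day count with the .dt.days accessor. This is a swapped-in variant, not the approach from the answer above; the names grouped, span_days and result are made up for illustration, and it assumes the time column is already datetime64.

```python
import pandas as pd

# Sample data from the question; "time" is parsed to datetime64
df = pd.DataFrame(
    {
        "id": [1, 2, 1],
        "time": pd.to_datetime(["1970-01-01", "1970-01-02", "1970-01-03"]),
        "number": [1, 2, 2],
    }
)

grouped = df.groupby("id")

# max - min per group gives a timedelta Series indexed by id;
# .dt.days extracts the whole-day component as integers
span_days = (grouped["time"].max() - grouped["time"].min()).dt.days

# Combine the two aggregates (both indexed by id) and turn id back into a column
result = pd.DataFrame(
    {"number": grouped["number"].sum(), "time": span_days}
).reset_index()

print(result)
```

With the sample data, id 1 should come out with number 3 and time 2, and id 2 with number 2 and time 0, matching the output above. Because the time span is computed outside of agg, the result never passes through the datetime column's dtype, so no astype cast should be needed.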