Leetcode #1142: User Activity for the Past 30 Days II
In this guide, we solve Leetcode #1142, User Activity for the Past 30 Days II, from Python by running a SQL query through DuckDB, and focus on the core idea behind that query.
You will see the intuition, the step-by-step method, and a compact implementation you can use in interviews.

Problem Statement
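Table: Activity
+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user_id       | int     |
| session_id    | int     |
| activity_date | date    |
| activity_type | enum    |
+---------------+---------+
This table may have duplicate rows.
The activity_type column is an ENUM (category) of type ('open_session', 'end_session', 'scroll_down', 'send_message').
Task: report the average number of sessions per user over the 30-day period ending 2019-07-27 inclusive, rounded to 2 decimal places. A session counts for a user if it has at least one activity in that period.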
Quick Facts
- Difficulty: Easy
- Premium: Yes
- Tags: Database
Intuition
The task is relational in nature: filter rows to a date window, count distinct sessions per user, then average those counts.
Registering the DataFrame as a DuckDB table lets us express this directly in SQL while keeping a Python function signature.
Approach
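Register the input DataFrame as a table, count distinct sessions per user inside the 30-day window, and average those counts.
Name the single output column average_sessions_per_user.
Steps:
- Register the Activity DataFrame with DuckDB.
- Keep rows whose activity_date falls in the 30 days ending 2019-07-27 (inclusive).
- Group by user_id and count distinct session_id values.
- Average the per-user counts, round to 2 decimals, and return 0 if no rows qualify.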
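The same steps can also be sketched in plain pandas, without SQL. This is a minimal sketch rather than the guide's solution: the function name average_sessions_per_user, its end_date and days parameters, and the sample rows are assumptions made for illustration.

```python
import pandas as pd


def average_sessions_per_user(
    activity: pd.DataFrame, end_date: str = "2019-07-27", days: int = 30
) -> pd.DataFrame:
    # Hypothetical helper: pandas-only version of the filter/group/average steps.
    end = pd.Timestamp(end_date)
    start = end - pd.Timedelta(days=days - 1)  # inclusive 30-day window
    dates = pd.to_datetime(activity["activity_date"])
    in_window = activity[(dates >= start) & (dates <= end)]
    # nunique ignores duplicate rows and repeated activities within a session.
    sessions = in_window.groupby("user_id")["session_id"].nunique()
    # The mean of an empty Series is NaN, so fall back to 0 explicitly.
    avg = round(float(sessions.mean()), 2) if len(sessions) else 0.0
    return pd.DataFrame({"average_sessions_per_user": [avg]})


# Made-up sample rows; user 4's only activity lies outside the window.
sample = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3, 4],
    "session_id": [1, 1, 4, 2, 5, 3],
    "activity_date": ["2019-07-20", "2019-07-20", "2019-07-21",
                      "2019-07-21", "2019-07-21", "2019-06-25"],
    "activity_type": ["open_session", "end_session", "open_session",
                      "open_session", "open_session", "open_session"],
})
result = average_sessions_per_user(sample)
print(result)
```

Users 1 and 2 contribute one session each and user 3 contributes two, so the average over the three users with qualifying activity is 1.33.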
Example
The table shows the user activities for a social media website.
Note that each session belongs to exactly one user.
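To make the expected arithmetic concrete, here is a small hand-checkable walk-through in plain Python. The rows are illustrative data made up for this sketch, not taken from the problem page, and the variable names are assumptions.

```python
from collections import defaultdict
from datetime import date

# Illustrative rows: (user_id, session_id, activity_date, activity_type).
rows = [
    (1, 1, date(2019, 7, 20), "open_session"),
    (1, 1, date(2019, 7, 20), "end_session"),
    (2, 4, date(2019, 7, 20), "open_session"),
    (2, 4, date(2019, 7, 21), "end_session"),
    (3, 2, date(2019, 7, 21), "open_session"),
    (3, 5, date(2019, 7, 21), "open_session"),
    (4, 3, date(2019, 6, 25), "open_session"),  # 32 days before the end date
]

window_end = date(2019, 7, 27)
sessions_by_user = defaultdict(set)
for user_id, session_id, activity_date, _ in rows:
    age_in_days = (window_end - activity_date).days
    if 0 <= age_in_days < 30:  # inside the inclusive 30-day window
        sessions_by_user[user_id].add(session_id)

# Distinct sessions per user, then the average over users with activity.
counts = {user: len(s) for user, s in sessions_by_user.items()}
average = round(sum(counts.values()) / len(counts), 2) if counts else 0
print(counts)   # {1: 1, 2: 1, 3: 2}
print(average)  # 1.33
```

User 4 does not appear in the denominator because none of that user's activity falls inside the window.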
Python Solution
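import duckdb
import pandas as pd


def solution(activity: pd.DataFrame) -> pd.DataFrame:
    con = duckdb.connect()
    # Expose the DataFrame to SQL under the table name "Activity".
    con.register("Activity", activity)
    return con.execute("""
        WITH T AS (
            -- Distinct sessions per user within the 30 days ending 2019-07-27.
            SELECT COUNT(DISTINCT session_id) AS sessions
            FROM Activity
            WHERE CAST(activity_date AS DATE) <= DATE '2019-07-27'
              -- DuckDB's DATEDIFF takes (part, start, end), unlike MySQL's two-argument form.
              AND DATEDIFF('day', CAST(activity_date AS DATE), DATE '2019-07-27') < 30
            GROUP BY user_id
        )
        -- AVG over zero rows is NULL, so fall back to 0.
        SELECT IFNULL(ROUND(AVG(sessions), 2), 0) AS average_sessions_per_user
        FROM T;
    """).df()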
Complexity
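For n rows, the filter is a single O(n) pass. The grouped COUNT(DISTINCT session_id) is typically hash-based, giving O(n) expected time; an engine that sorts instead takes O(n log n). The space complexity is O(n) for the distinct (user_id, session_id) pairs.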
Edge Cases and Pitfalls
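Duplicate rows and multiple activities within one session must not inflate the count, which is why the query uses COUNT(DISTINCT session_id). If no activity falls inside the window, AVG returns NULL and IFNULL converts it to 0. The window includes 2019-07-27 and spans 30 days, so 2019-06-28 is the earliest date counted; rows dated after 2019-07-27 are excluded by the first WHERE condition.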
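The window boundaries are easy to get wrong by one day, so a quick check with the standard library can help. This is a sketch under the assumption that the window is the 30 days ending 2019-07-27 inclusive; WINDOW_END, WINDOW_DAYS, and in_window are names introduced here for illustration.

```python
from datetime import date, timedelta

WINDOW_END = date(2019, 7, 27)
WINDOW_DAYS = 30


def in_window(activity_date: date) -> bool:
    # Same test as the query: not after the end date, and fewer than 30 days old.
    age = (WINDOW_END - activity_date).days
    return 0 <= age < WINDOW_DAYS


first_day = WINDOW_END - timedelta(days=WINDOW_DAYS - 1)
print(first_day)                     # 2019-06-28
print(in_window(date(2019, 6, 28)))  # True: first day of the window
print(in_window(date(2019, 6, 27)))  # False: exactly 30 days before the end
print(in_window(date(2019, 7, 28)))  # False: after the end date
```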
Summary
The solution filters to the 30-day window, counts distinct sessions per user, and averages those counts, all in one SQL query executed through DuckDB from a Python function.