Tags: AI & ML, data-science, python, mini-project

Proactive Retention - Using Activity Logs to Predict Churn

By Blake Marterella

Executive Summary (TL;DR)

  • Objective: Predict customer satisfaction and renewal likelihood from on-site activity, paired with a human-readable summary.
  • Key Results:
    1. A 2-3 sentence activity summary.
    2. A confidence score (0-1) for renewal likelihood.
    3. A pattern overview that surfaces likely pain points.
  • Why it matters: The customer success team gets a clear, early warning of at-risk customers, reducing last-minute trial cancellations and improving retention.

1. Background

Free-trial churn was spiking: many users canceled during the trial or right before it ended. The pattern of actions users took (or didn't take) during the trial could provide a measure of their engagement and satisfaction. By translating raw activity into readable stories, customer success could intervene earlier and more effectively, and the overall site experience could be improved.

Objectives and Desired Outcomes

  • Predictive Clarity: Produce a confidence score indicating the likelihood a user will renew.
  • Actionable Narrative: Pair the score with a short, human-readable summary of what the user actually did.
  • Root-Cause Clues: Provide a pattern overview that highlights likely frustrations (e.g., repeated edits without approvals, content mismatched with the business description, etc.)
  • Business Impact: Enable targeted outreach, sharper demo coaching, and a measurable impact on trial-to-paid conversion rates.

2. Data Analysis

End-to-end overview of data processing steps

2.1 Data Pre-Processing

Cleaning

A .csv file was provided with the raw user activity logs. Each row represented a user action, with columns: UserID, Timestamp, Page, and Log (the raw text of the action that occurred). The initial file contained over 30,000 rows. The data cleaning process included:

  • Removing rows with essential missing values (UserID, Timestamp, or Log columns)
  • Populating missing values in non-essential columns (such as Browser or Page) with "Unknown"
  • Converting column data types
  • Ensuring that UserIDs were unique and consistent
  • Filtering out test-account activity

Additionally, there was a layer of row-level cleanup required. Due to an error in the logging system, some actions were logged back-to-back with slightly different timestamps despite being the same action. To remove these redundant rows, I sorted the data by UserID and Timestamp, then removed any rows with the same Log and UserID that occurred within 5 seconds of each other, saving only the first instance. This cleanup step removed approximately 11,500 rows.
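The dedup step above can be sketched in pandas; the column names follow the schema described earlier, and the sample rows are illustrative, not real log data:

```python
import pandas as pd

# Toy frame mimicking the log schema: same action logged twice within 5 seconds.
df = pd.DataFrame({
    "UserID": [1, 1, 1, 2],
    "Timestamp": pd.to_datetime([
        "2024-01-01 10:00:00", "2024-01-01 10:00:03",   # near-duplicate pair
        "2024-01-01 10:00:30", "2024-01-01 10:00:00",
    ]),
    "Log": ["Clicked Save", "Clicked Save", "Clicked Save", "Clicked Save"],
})

df = df.sort_values(["UserID", "Timestamp"])
# Time elapsed since the previous identical action by the same user
gap = df.groupby(["UserID", "Log"])["Timestamp"].diff()
# Keep the first instance; drop repeats logged within 5 seconds of the prior one
deduped = df[gap.isna() | (gap > pd.Timedelta(seconds=5))]
```

Grouping by both UserID and Log means only identical messages are compared, so two genuinely different actions seconds apart are never dropped.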

Categorizing Raw Logs

The Log column is easy for humans to read, but that readability can mask important details that need to be analyzed independently. Think about the variety of log messages that can fall under the umbrella of "User Sign Up"... enter coupon code, sign up with Google, confirm password, etc. Using a lookup table I created and regex patterns, I segmented the Log column into 3 separate columns: Action Category, Action Type, and Action Detail.

Example of how a raw log is segmented with regex pattern

In total there were 9 action categories:

  • setup: User is setting up their account
  • navigation: User is moving between pages, modals, etc.
  • conversion: User is completing actions directly tied to signup, cancellation, or entering a coupon code
  • personalizing: User is changing features (topics, CTAs, etc.) or setting a preference
  • learning: User is following a tutorial or submitting feedback
  • posting: System is posting to social media
  • editing: User is editing a post
  • creating: User is creating a new post
  • scheduling: User is scheduling a post

2.2 Feature Engineering

Sessionization

To understand user behavior, I needed to group actions into sessions. A session is defined as a series of actions taken by a user within a specific time frame. I chose a 30-minute inactivity threshold to delineate sessions. If a user was inactive for more than 30 minutes, any subsequent action would be considered the start of a new session.
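A minimal sketch of this sessionization rule, assuming the cleaned frame has UserID and Timestamp columns (the SessionID column name is my own):

```python
import pandas as pd

def sessionize(df: pd.DataFrame, gap_minutes: int = 30) -> pd.DataFrame:
    """Assign a per-user SessionID; a gap over `gap_minutes` starts a new session."""
    df = df.sort_values(["UserID", "Timestamp"]).copy()
    gap = df.groupby("UserID")["Timestamp"].diff()
    # A user's first action, or any action after a long gap, opens a session
    new_session = gap.isna() | (gap > pd.Timedelta(minutes=gap_minutes))
    df["SessionID"] = new_session.groupby(df["UserID"]).cumsum()
    return df

df = pd.DataFrame({
    "UserID": [1, 1, 1],
    "Timestamp": pd.to_datetime(
        ["2024-01-01 10:00", "2024-01-01 10:10", "2024-01-01 11:00"]),
})
print(sessionize(df)["SessionID"].tolist())
# -> [1, 1, 2]  (the 50-minute gap starts session 2)
```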

User Action Metrics

Sessions can also be used to determine how long a user spent on important tasks, such as customizing their profile or completing the sign-up process. Beyond session-level metrics, I also calculated user-level metrics, such as:

  • Total number of sessions
  • Average session duration
  • Total number of actions
  • Actions per session
  • How many actions of each type/category
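The user-level rollup above can be sketched as a pandas aggregation; this assumes a sessionized frame with UserID, SessionID, Timestamp, and ActionCategory columns (the metric column names are my own shorthand):

```python
import pandas as pd

def user_metrics(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse sessionized action logs into one feature row per user."""
    # Session duration: last action minus first action, in minutes
    sessions = df.groupby(["UserID", "SessionID"])["Timestamp"].agg(["min", "max"])
    sessions["duration_min"] = (sessions["max"] - sessions["min"]).dt.total_seconds() / 60

    per_user = pd.DataFrame({
        "total_sessions": sessions.groupby("UserID").size(),
        "avg_session_min": sessions.groupby("UserID")["duration_min"].mean(),
        "total_actions": df.groupby("UserID").size(),
    })
    per_user["actions_per_session"] = per_user["total_actions"] / per_user["total_sessions"]
    # One count column per action category (setup, editing, posting, ...)
    cat_counts = df.pivot_table(index="UserID", columns="ActionCategory",
                                aggfunc="size", fill_value=0)
    return per_user.join(cat_counts)

df = pd.DataFrame({
    "UserID": [1, 1, 1],
    "SessionID": [1, 1, 2],
    "Timestamp": pd.to_datetime(
        ["2024-01-01 10:00", "2024-01-01 10:10", "2024-01-01 12:00"]),
    "ActionCategory": ["editing", "editing", "posting"],
})
metrics = user_metrics(df)
```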

2.3 Training Dataset

The final training dataset had 27 features that captured user behavior and engagement, encapsulating the user's journey through the trial period in a structured and quantifiable format.

Rows in raw dataset, transformed into the training dataset

The final step was labeling the training dataset with a binary Converted column, indicating whether the user renewed (1) or churned (0).

3. Model and Evaluation

The model was a logistic regression classifier, which is well-suited for binary classification tasks like predicting user churn. It was trained on 80% of the dataset, with 20% held out for testing. The result was a model that could predict the likelihood of a user renewing their subscription based on their trial activity with an accuracy of 77%.
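The train/test split and scoring pipeline follows the standard scikit-learn pattern; the synthetic 27-feature matrix below is a stand-in for the real training dataset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 27-feature training set with a Converted label
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 27))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# 80/20 split, stratified so both classes appear in train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))

# predict_proba yields the per-user renewal confidence score in [0, 1]
scores = model.predict_proba(X_test)[:, 1]
```

The second column of `predict_proba` is what gets surfaced to the customer success team as the confidence score.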

Confusion Matrix
Using a sample set of users currently in trial, calculate each confidence score

4. Human Readable Insight

The second deliverable was a human-readable summary of the user's activity during the trial period. This was accomplished by iteratively refining a prompt for OpenAI's o1 model to generate a 2-3 sentence summary based on the user's activity metrics and the confidence score from the model. The prompt was engineered to highlight key actions, engagement levels, and potential pain points. To interface with the model more efficiently, I used the openai Python library and an assistant to manage the conversation and generate the summaries directly from my Jupyter notebook.
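A sketch of how the prompt could be assembled from the metrics and score; the prompt wording, metric names, and model identifier are illustrative reconstructions, not the exact ones used:

```python
def build_summary_messages(metrics: dict, score: float) -> list:
    """Assemble chat messages asking for a 2-3 sentence activity summary."""
    system = ("You summarize trial-user activity for a customer success team. "
              "Write 2-3 sentences highlighting key actions, engagement level, "
              "and potential pain points.")
    user = (f"Renewal confidence score: {score:.2f}\n"
            + "\n".join(f"{k}: {v}" for k, v in metrics.items()))
    return [{"role": "system", "content": system},
            {"role": "user", "content": user}]

messages = build_summary_messages(
    {"total_sessions": 6, "actions_per_session": 14, "editing": 22, "posting": 0},
    score=0.31)

# With the openai library and an API key configured, the call would look like:
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.chat.completions.create(model="o1", messages=messages)
#   print(resp.choices[0].message.content)
```

Passing the numeric score alongside the raw metrics lets the model frame the same activity differently for likely renewers versus likely churners.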

User Case study likely to renew and likely to churn
User Case study likely to renew and likely to churn

Fun fact: During the weekly company all-hands meeting, the customer success team leads the meeting by describing an outstanding user experience from the past week. The week that I demoed this project, unbeknownst to me, they picked User X as their example of a satisfied user. The summary generated by the model was spot-on, capturing the user's journey and satisfaction perfectly.

5. Future Development

  • Query dataset in NotebookLM
  • NLP to classify raw logs
  • Process more logs to expand the training data and improve the model
  • Automate prompt engineering for user summaries