Feature/cppa slack tracker private by AuraMindNest · Pull Request #139 · cppalliance/boost-data-collector

AuraMindNest · 2026-04-08T21:21:17Z

Summary by CodeRabbit

Release Notes

New Features
- Added support for syncing private channels and direct messages using Slack OAuth user tokens.
- Introduced a Slack OAuth authentication server for secure user credential management.
Configuration
- Added environment variables for OAuth setup, including Slack credentials and token storage paths.
Admin Updates
- Added admin interface for managing private channel, message, and membership records.

coderabbitai · 2026-04-08T21:21:27Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6b683f3c-aee2-4322-bf46-5fb551941284

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

📝 Walkthrough

Walkthrough

This PR introduces comprehensive support for non-public Slack channels (private channels, MPIMs, and direct messages/IMs) alongside a new Slack OAuth server for managing user-specific tokens. It adds database models for private channel tracking, enhanced message/channel syncing across both public and private spaces, user OAuth token persistence and loading, and configuration infrastructure for OAuth credentials and runtime parameters.

Changes

Cohort / File(s)	Summary
Configuration & Environment `.env.example`, `.gitignore`, `requirements.txt`, `requirements-dev.txt`	Added OAuth configuration variables (client credentials, redirect URI, scopes, token storage), ignored credential directory, and runtime dependencies (`httpx`, `starlette`, `uvicorn`, `python-dotenv`).
Models & Database `cppa_slack_tracker/models.py`, `cppa_slack_tracker/migrations/0004_alter_slackteam_team_id_slackchannelprivate_and_more.py`, `conftest.py`	Created new `SlackChannelPrivate`, `SlackMessagePrivate`, `SlackChannelMembershipPrivate`, and `SlackChannelMembershipChangeLogPrivate` models in `slack_private` schema; altered `SlackTeam.team_id` to be unique; added Django signal handler to create schema on PostgreSQL connection.
Service Layer `cppa_slack_tracker/services.py`, `cppa_slack_tracker/user_tokens.py`	Extended `get_or_create_slack_channel` to return both public and private channel records; added private membership and message persistence functions (`add_channel_membership_change_private`, `sync_channel_memberships_private`, `save_slack_message_private`); introduced user OAuth token loading and iteration utilities.
Sync & Channel Processing `cppa_slack_tracker/sync/sync_channel.py`, `cppa_slack_tracker/sync/sync_channel_user.py`, `cppa_slack_tracker/sync/sync_message.py`, `cppa_slack_tracker/sync/__init__.py`	Updated channel syncing to include private channels, MPIMs, and IMs by default; added `get_private_channels_to_sync` function; extended message sync to handle both public and private channels; widened type annotations to accept both channel variants; added optional client parameter for user-authenticated API calls.
Management Command `cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py`	Extended `sync_messages` control flow to sync private channels and IM conversations; added `_sync_im_messages_user_tokens` helper to iterate user OAuth tokens and sync per-user direct message history; updated channel mapping to support private channel message persistence.
OAuth Server `cppa_slack_tracker/slack_oauth_server.py`	Implemented standalone Starlette-based OAuth helper server with routes for Slack redirect, callback handling, token persistence to JSON, and authorized user listing; supports CSRF state tracking and atomic token file updates.
Admin & API `cppa_slack_tracker/admin.py`, `cppa_slack_tracker/fetcher.py`	Added Django admin classes for private channel models with list/filter/search configurations; extended `fetch_channel_list` default types to include private channels, MPIMs, and IMs; added `fetch_im_channel_list_for_user` for user-token-based IM fetching.
Tests & Fixtures `cppa_slack_tracker/tests/fixtures.py`, `cppa_slack_tracker/tests/test_services.py`	Added test fixtures for private channels and IM channels; extended service tests to cover private channel creation, membership tracking, message persistence, and IM-specific exclusions.

Sequence Diagrams

sequenceDiagram
    participant User
    participant OAuthServer as OAuth Server
    participant SlackAPI as Slack API
    participant Database
    
    User->>OAuthServer: GET /slack/connect
    OAuthServer->>OAuthServer: Generate random state, store in memory
    OAuthServer->>User: Redirect to Slack OAuth authorize URL
    User->>SlackAPI: Authorize request with code
    SlackAPI->>User: Redirect to /slack/oauth/callback
    User->>OAuthServer: GET /slack/oauth/callback?code=...&state=...
    OAuthServer->>OAuthServer: Validate and consume state
    OAuthServer->>SlackAPI: POST oauth.v2.access with code
    SlackAPI->>OAuthServer: Return access_token + authed_user info
    OAuthServer->>Database: Persist user_id, team_id, access_token to JSON
    OAuthServer->>User: Success page or error page

sequenceDiagram
    participant Tracker as Sync Tracker
    participant FetchAPI as Fetch API
    participant TokenStore as Token Store
    participant SlackAPI as Slack API (User)
    participant Database
    
    Tracker->>TokenStore: Load user OAuth tokens for team
    loop For each user token
        Tracker->>FetchAPI: fetch_im_channel_list_for_user(token)
        FetchAPI->>SlackAPI: conversations.list with user token
        SlackAPI->>FetchAPI: IM channel list
        FetchAPI->>Tracker: List of IM channels
        Tracker->>Database: Upsert IM channels as SlackChannelPrivate
        Tracker->>SlackAPI: Fetch messages from IMs (user-authenticated)
        Tracker->>Database: Save messages to SlackMessagePrivate
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Team_id-based Slack bot tokens and test updates (#91) #95: Modifies cppa_slack_tracker/fetcher.py with changes to fetch_channel_list and Slack client handling, directly related to this PR's fetcher enhancements.

Suggested reviewers

jonathanMLDev
snowfox1003

Poem

🐰 Hopping through channels both public and dim,
Private messages now tracked to the brim,
With OAuth tokens and schemas so sleek,
User conversations are no longer weak!
From DMs to MPIMs, we gather them all, ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 72.73% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'Feature/cppa slack tracker private' is vague and uses generic phrasing that doesn't clearly convey the specific changes implemented (private channels, DMs, OAuth, schema changes, etc.).	Revise the title to be more specific and descriptive, such as 'Add support for private Slack channels and direct messages' or 'Implement private channel tracking and OAuth-based DM syncing'.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

AuraMindNest · 2026-04-09T18:00:53Z

@coderabbitai, full review.

coderabbitai · 2026-04-09T18:01:08Z

@AuraMindNest, I'll redo the entire review of the PR now. Please wait while I analyze all the changes.

🧠 Learnings used

Learnt from: snowfox1003
Repo: CppDigest/boost-data-collector PR: 117
File: cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py:390-396
Timestamp: 2026-03-17T20:30:16.963Z
Learning: In `cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py`, the Pinecone namespace is constructed as `f"slack-{team.team_id}"` (using `team_id`, not `team_name`). The team has agreed on this convention. Do not flag `team_name` usage in Pinecone namespace construction as an issue; the fix is already agreed upon.

Learnt from: leostar0412
Repo: CppDigest/boost-data-collector PR: 66
File: cppa_slack_tracker/workspace.py:90-98
Timestamp: 2026-02-25T01:35:06.517Z
Learning: In `cppa_slack_tracker/workspace.py`, the function `iter_existing_message_jsons()` accepts optional `team_slug` and `channel_slug` parameters, but in practice `channel_slug` is never provided without `team_slug` due to business logic constraints enforced by calling code.

✅ Actions performed

Full review triggered.

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (3)

cppa_slack_tracker/models.py (1)
130-141: Add documentation comments to all private models using the quote-injection pattern.

The db_table = 'slack_private"."...' pattern is used across 4 models (SlackChannelPrivate, SlackMessagePrivate, SlackChannelMembershipPrivate, and SlackChannelMembershipChangeLogPrivate), but the explanatory comment about Django's quoting behavior exists only in SlackChannelPrivate. This pattern is fragile and may break with Django version upgrades or certain management commands. Add the existing comment from SlackChannelPrivate to the other 3 models for consistency, and consider adding an integration test that validates the correct schema qualification in generated SQL.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cppa_slack_tracker/models.py` around lines 130 - 141, Copy the explanatory
comment that explains the db_table quote-injection workaround from the
SlackChannelPrivate model to the other three private models
(SlackMessagePrivate, SlackChannelMembershipPrivate,
SlackChannelMembershipChangeLogPrivate) directly above their db_table
assignments so the rationale is present alongside the db_table =
'slack_private"."...' lines; additionally, add a small integration test that
generates the SQL for these models (via Django migration SQL generation or
connection.schema_editor()/connection.ops methods) and asserts the
schema-qualified table name appears as expected to catch future Django changes
that might break the pattern.
cppa_slack_tracker/slack_oauth_server.py (2)
378-381: Redundant exception handler.

The try/except around _exchange_code is redundant since _exchange_code already handles all exceptions internally and returns {"ok": False, "error": ...}. The outer except at line 380 would only catch exceptions from await machinery, which is unlikely.

Consider removing or narrowing this handler to avoid masking unexpected errors.
♻️ Proposed simplification
-    try:
-        data = await _exchange_code(code)
-    except Exception as exc:
-        return _html_exchange_failed(f"Token exchange failed: {exc}")
+    data = await _exchange_code(code)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cppa_slack_tracker/slack_oauth_server.py` around lines 378 - 381, The outer
try/except around awaiting _exchange_code is redundant and may mask unexpected
errors; remove the try/except and directly await data = await
_exchange_code(code), then handle the returned result (check
data.get("ok")/data["error"]) and call _html_exchange_failed when the exchange
indicates failure; alternatively, if you must guard only for rare await-time
errors, catch only specific exceptions (e.g., asyncio.CancelledError) instead of
a broad Exception to avoid hiding bugs in _exchange_code or upstream.
46-50: In-memory state is not suitable for multi-process deployments.

_oauth_states is an in-memory dict. If the server runs with multiple workers (e.g., uvicorn --workers N), the state generated by one worker won't be visible to another, causing OAuth callbacks to fail with "Invalid or expired session."

This is fine for single-worker dev use but should be documented or addressed for production.

Would you like me to add a docstring note about single-worker limitations, or help implement a shared-state solution (e.g., Redis, file-based)?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cppa_slack_tracker/slack_oauth_server.py` around lines 46 - 50, The CSRF
state map _oauth_states is an in-memory dict and will break multi-process
deployments; replace it with a pluggable shared-state abstraction and/or
document the single-worker limitation. Create an OAuthStateStore interface with
methods like set_state(key, expiry), pop_state(key) -> expiry_or_none, and
cleanup_expired(); implement a RedisStateStore (using REDIS_URL env var) and
keep an InMemoryStateStore as a dev fallback, then change code that references
_oauth_states to call the store methods (e.g., where state is created and where
it is validated/popped). Alternatively, if you prefer a faster doc change, add a
clear module-level docstring near _oauth_states stating it is only safe for
single-worker/dev and recommend using the Redis-backed store in production.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py`:
- Around line 425-468: The loop currently syncs the same IM channel once per
authorized user token, causing duplicate syncs; fix this by tracking which IM
channel IDs have already been scheduled/synced and skipping duplicates: after
calling fetch_im_channel_list_for_user(user_slack_id, access_token) (or when
iterating im_channels) record each channel id in a shared set (e.g.,
seen_channel_ids) keyed by team and channel id and skip calling
get_or_create_slack_channel and sync_messages for any ch whose id is already in
seen_channel_ids; ensure the first encountered token still performs the sync (so
keep existing flow of get_slack_client, get_or_create_slack_channel,
sync_messages) but add the membership check/insert to avoid repeated syncs
across different tokens and preserve channel_id_opt filtering logic.

In `@cppa_slack_tracker/services.py`:
- Around line 553-556: The current branch that handles subtypes sets text to
_message_text_for_subtype(slack_message, subtype) or "" which discards the raw
slack_message["text"] for unhandled subtypes; change the logic in the same
conditional handling (the elif subtype branch) to use the subtype-specific
helper result if present, otherwise fall back to slack_message.get("text", "")
so unhandled subtypes preserve the original text (i.e., call
_message_text_for_subtype(slack_message, subtype) and if it returns falsy, use
slack_message.get("text", "") instead).

In `@cppa_slack_tracker/slack_oauth_server.py`:
- Around line 222-226: The _load_tokens function currently calls json.load on
TOKENS_FILE without handling malformed JSON; update _load_tokens to catch
json.JSONDecodeError (and optionally ValueError) raised by json.load, log or
process the error (use existing logger or raise a controlled exception), and
return an empty dict fallback instead of letting the exception propagate;
reference the _load_tokens function and TOKENS_FILE constant, mirroring the
approach used in user_tokens.py::load_slack_user_tokens to safely handle corrupt
token files.

In `@requirements.txt`:
- Around line 27-32: The starlette dependency in requirements.txt is vulnerable;
update the requirement "starlette>=0.37" to "starlette>=0.47.2" to ensure fixes
for GHSA-f96h-pmfr-66vw and GHSA-2c2j-9gv5-cj73 are included, then regenerate
any lockfiles or pinned dependency artifacts and run the test suite/CI to verify
the OAuth server (multipart/form-data handling) still works as expected.

---

Nitpick comments:
In `@cppa_slack_tracker/models.py`:
- Around line 130-141: Copy the explanatory comment that explains the db_table
quote-injection workaround from the SlackChannelPrivate model to the other three
private models (SlackMessagePrivate, SlackChannelMembershipPrivate,
SlackChannelMembershipChangeLogPrivate) directly above their db_table
assignments so the rationale is present alongside the db_table =
'slack_private"."...' lines; additionally, add a small integration test that
generates the SQL for these models (via Django migration SQL generation or
connection.schema_editor()/connection.ops methods) and asserts the
schema-qualified table name appears as expected to catch future Django changes
that might break the pattern.

In `@cppa_slack_tracker/slack_oauth_server.py`:
- Around line 378-381: The outer try/except around awaiting _exchange_code is
redundant and may mask unexpected errors; remove the try/except and directly
await data = await _exchange_code(code), then handle the returned result (check
data.get("ok")/data["error"]) and call _html_exchange_failed when the exchange
indicates failure; alternatively, if you must guard only for rare await-time
errors, catch only specific exceptions (e.g., asyncio.CancelledError) instead of
a broad Exception to avoid hiding bugs in _exchange_code or upstream.
- Around line 46-50: The CSRF state map _oauth_states is an in-memory dict and
will break multi-process deployments; replace it with a pluggable shared-state
abstraction and/or document the single-worker limitation. Create an
OAuthStateStore interface with methods like set_state(key, expiry),
pop_state(key) -> expiry_or_none, and cleanup_expired(); implement a
RedisStateStore (using REDIS_URL env var) and keep an InMemoryStateStore as a
dev fallback, then change code that references _oauth_states to call the store
methods (e.g., where state is created and where it is validated/popped).
Alternatively, if you prefer a faster doc change, add a clear module-level
docstring near _oauth_states stating it is only safe for single-worker/dev and
recommend using the Redis-backed store in production.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d92e9ee6-77a3-43f7-8f18-2bad47e207ff

📥 Commits

Reviewing files that changed from the base of the PR and between b59fc99 and e9c5b1c.

📒 Files selected for processing (19)

.env.example
.gitignore
conftest.py
cppa_slack_tracker/admin.py
cppa_slack_tracker/fetcher.py
cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py
cppa_slack_tracker/migrations/0004_alter_slackteam_team_id_slackchannelprivate_and_more.py
cppa_slack_tracker/models.py
cppa_slack_tracker/services.py
cppa_slack_tracker/slack_oauth_server.py
cppa_slack_tracker/sync/__init__.py
cppa_slack_tracker/sync/sync_channel.py
cppa_slack_tracker/sync/sync_channel_user.py
cppa_slack_tracker/sync/sync_message.py
cppa_slack_tracker/tests/fixtures.py
cppa_slack_tracker/tests/test_services.py
cppa_slack_tracker/user_tokens.py
requirements-dev.txt
requirements.txt

AuraMindNest added 3 commits April 8, 2026 15:20

Update the fetching for private channel.

a3a5ceb

Add schema migration for private data.

e18e69e

Implement some functions to save user im(dm) messages.

41b4f12

AuraMindNest mentioned this pull request Apr 9, 2026

Add cppa-slack-tracker(private) app in the project #87

Open

AuraMindNest added 2 commits April 9, 2026 11:08

Update due to CI failure.

11302e7

Update again

e9c5b1c

coderabbitai Bot reviewed Apr 9, 2026

View reviewed changes

Comment thread cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py

Comment thread cppa_slack_tracker/services.py

Comment thread cppa_slack_tracker/slack_oauth_server.py Outdated

Comment thread requirements.txt

AuraMindNest added 2 commits April 9, 2026 14:18

Update due to the coderabbitai review.

f58b052

Fix lint errors.

c06aabf

AuraMindNest requested a review from snowfox1003 April 9, 2026 20:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/cppa slack tracker private#139

Feature/cppa slack tracker private#139
AuraMindNest wants to merge 7 commits intocppalliance:developfrom
AuraMindNest:feature/cppa_slack_tracker_private

AuraMindNest commented Apr 8, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 8, 2026 •

edited

Loading

Review skipped

Walkthrough

Changes

Sequence Diagrams

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

AuraMindNest commented Apr 9, 2026

Uh oh!

coderabbitai Bot commented Apr 9, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

AuraMindNest commented Apr 8, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Sequence Diagrams

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

AuraMindNest commented Apr 9, 2026

Uh oh!

coderabbitai Bot commented Apr 9, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

AuraMindNest commented Apr 8, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 8, 2026 •

edited

Loading