Games Social Listening Demo
AI-powered player feedback analysis from Steam, Google Play, and Reddit using sentiment extraction and natural language insights.
Maintainers: Thomas Xu, Brendan Byam
🚀 What is Games Social Listening?
Games Social Listening is an end-to-end platform that transforms player feedback into actionable insights using AI. It:
- Ingests reviews and feedback from Steam, Google Play, and Reddit
- Translates content to English using AI translation
- Analyzes sentiment across 12 gameplay categories using AI
- Generates AI-powered reports tailored for different personas (Community Manager, Marketer, Game Designer)
- Visualizes insights through interactive dashboards and Genie Space natural language queries
📦 Installation
This solution uses a Databricks Asset Bundle (DAB), with automated setup via the Demo_Setup.ipynb notebook.
Prerequisites
- A Databricks workspace with Unity Catalog enabled
- You must be a Workspace Admin to set up the demo (after setup, normal users can run and use it).
- Add *.databricksapps.com as a domain allowed to embed AI/BI Dashboards
  - Settings > Workspace Admin > Security > External Access
- Databricks CLI installed (optional, for manual deployment)
- Serverless compute available
- SQL Warehouse for dashboard and app queries
Demo Quick Start (Recommended)
- Clone the repository to your Databricks workspace using a Git Folder.
- (Optional) To enable Steam and Reddit ingestion, see docs/CONFIGURATION.md#enable-ingestion-from-reddit-and-steam. This requires API secrets.
- Open the Demo_Setup.ipynb notebook and populate the widgets at the top with your desired values.
  - Create the catalog and schema if they do not already exist.
  - Note: The prefix can contain only lowercase letters, numbers, and hyphens, and cannot begin or end with a hyphen.
- Select 'Run All' to execute all cells in the notebook using serverless compute.
  - The notebook will automatically:
    - Deploy all resources (Jobs, Pipeline, Dashboard, App) via a Databricks Asset Bundle
    - Configure them with your workspace settings
    - Load sample data and run sentiment analysis (Pokemon Go from Google Play)
  - The run should take 10-15 minutes.
- Access the deployed app in your Databricks workspace.
  - Left sidebar > Compute > Apps
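The prefix rule noted in the steps above can be checked before running the notebook. The following is a minimal sketch, not code from the repository: the function name `is_valid_prefix` is hypothetical, and the pattern encodes only the rule as stated here (lowercase letters, numbers, and hyphens, with no hyphen at the start or end).

```python
import re

# Assumption: this pattern mirrors the rule stated in the Quick Start note.
# The notebook's own validation, if any, may differ.
_PREFIX_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_prefix(prefix: str) -> bool:
    """Return True if the prefix uses only lowercase letters, digits, and
    hyphens, and neither starts nor ends with a hyphen."""
    return _PREFIX_RE.fullmatch(prefix) is not None
```

For example, `is_valid_prefix("games-demo")` passes, while `"-games"`, `"games-"`, and `"Games"` are rejected.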
Alternative: Configure DAB Deployment
For production or custom deployments, see docs/CONFIGURATION.md#dab-deployment for CLI-based deployment.
Note that if you want to have multiple demo apps in the same Databricks workspace, make sure to update at least the bundle name in databricks.yml to prevent subsequent deploys from overwriting each other. If you still encounter errors, try deleting the .databricks directory (autogenerated by DAB validation/deployment) and retrying.
🏗️ Project Structure
cmeg_player_feedback_app/
├── databricks.yml # Databricks Asset Bundle configuration
├── Demo_Setup.ipynb # Automated installation notebook
├── resources/ # DAB resource definitions
│   ├── Games Social Listening - Job.job.yml
│   ├── Games Social Listening - Pipeline.pipeline.yml
│   ├── Games Social Listening - Dashboard.dashboard.yml
│   └── Games Social Listening - App.app.yml
├── src/
│   ├── Abstracted_Ingestion.ipynb # Multi-platform ingestion notebook
│   ├── Summary_Report_Generator.ipynb # AI report generation
│   ├── app/ # FastAPI web application
│   │   ├── main.py # App entry point
│   │   ├── config.yaml # App configuration
│   │   ├── routers/ # API endpoints
│   │   ├── utils/ # Helper functions
│   │   └── templates/ # UI templates
│   ├── pipeline/ # Spark Declarative Pipeline transformations
│   │   └── transformations/
│   │       ├── 01_ai_translation.py
│   │       ├── 02_ai_sentiment_extraction.py
│   │       ├── 03_parse_sentiment.py
│   │       └── 04_reporting_layer.py
│   ├── ingestion_utils/ # Platform-specific ingestors
│   │   ├── steam_ingestor.py
│   │   ├── google_play_ingestor.py
│   │   └── reddit_ingestor.py
│   └── config/ # Configuration files
│       └── config.yaml # Sentiment categories & personas
└── docs/ # Documentation
    └── CONFIGURATION.md # Customization & production guide
🔄 Demo Contents
The demo implements a 6-stage social listening analysis:
Stage 1: Ingestion
- Pulls user generated content/feedback from Steam, Google Play, or Reddit
- Sampling: Max 10K content records per source, sampled to 2K if exceeded
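The sampling rule above is stated only briefly, so the sketch below shows one possible reading: a source that returns more than 10K records is reduced to a random sample of 2K, and smaller result sets are kept whole. The function name `cap_records`, the constants, and the seeded random sampling are assumptions for illustration; the actual logic is described in src/ingestion_utils/README.md and may differ.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")

# Assumed thresholds, taken from the README text above.
MAX_RECORDS = 10_000
SAMPLE_SIZE = 2_000


def cap_records(records: Sequence[T], seed: int = 0) -> List[T]:
    """Keep everything up to MAX_RECORDS; otherwise return a random
    sample of SAMPLE_SIZE records (seeded for reproducibility)."""
    if len(records) <= MAX_RECORDS:
        return list(records)
    rng = random.Random(seed)
    return rng.sample(list(records), SAMPLE_SIZE)
```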
Stage 2: AI Translation
- Translates all content to English using ai_translate()
- Preserves original text for reference
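As a rough illustration of this stage, the sketch below builds a SQL statement that calls the Databricks `ai_translate(content, to_lang)` function and keeps the original column next to the translated one. The helper name, the table name, and the column names are hypothetical; the pipeline's real transformation lives in src/pipeline/transformations/01_ai_translation.py and may be structured differently.

```python
def build_translation_query(source_table: str, text_column: str = "content") -> str:
    """Build a query that adds an English translation column while
    preserving the original text.

    Assumption: identifiers come from trusted configuration, since they
    are interpolated directly into the SQL string.
    """
    return (
        f"SELECT *, ai_translate({text_column}, 'en') AS {text_column}_en "
        f"FROM {source_table}"
    )
```

In a Databricks notebook the resulting string could be passed to `spark.sql(...)`; outside Databricks it only produces the query text.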
Stage 3: Sentiment Extraction
- Uses Meta Llama 3.3 70B for AI sentiment analysis
- Extracts sentiment across categories and subtopics from user-generated content
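Model output usually needs to be parsed into rows before it can feed the reporting tables. The sketch below assumes the model returns JSON with a `sentiments` list whose items carry `category`, `subtopic`, and `sentiment` keys. That schema and the function name `parse_sentiment` are hypothetical; the format actually used is defined by the pipeline code in 02_ai_sentiment_extraction.py and 03_parse_sentiment.py.

```python
import json
from typing import Dict, List

# Assumed keys of each sentiment item (hypothetical schema).
_FIELDS = ("category", "subtopic", "sentiment")


def parse_sentiment(raw: str) -> List[Dict[str, str]]:
    """Parse a model response into flat rows, skipping malformed input
    instead of raising, so one bad response does not fail a batch."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, dict):
        return []
    items = payload.get("sentiments")
    if not isinstance(items, list):
        return []
    rows = []
    for item in items:
        # Keep only items that are objects with all expected keys.
        if isinstance(item, dict) and all(k in item for k in _FIELDS):
            rows.append({k: str(item[k]) for k in _FIELDS})
    return rows
```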
Stage 4: Reporting Layer Data
- Creates gold tables optimized for analytics
- Powers dashboard, app, and Genie Space
Stage 5: Summary Report
- Generates summary reports of sentiment analysis for 3 personas (Community Manager, Marketer, Game Designer)
Stage 6: Consolidates Insight and Actions in the App
- Add new games for sentiment analysis
- Review Summary Reports
- Ask natural language questions with Genie
- Drill into deeper insights with the dashboard embedded into the app
🎯 Deployed Components
| Component | Description |
|-----------|-------------|
| Spark Declarative Pipeline | 4-stage transformation: translation → sentiment extraction → parsing → gold tables |
| Orchestration Job | Daily/New Game content ingestion + pipeline execution + dashboard refresh + summary report generation for new games |
| AI/BI Dashboard | Interactive analytics with filters, visualizations, and drill-down into subtopic sentiment |
| Genie Space | Natural language queries on player feedback data |
| Weekly Summary Report Job | Updates the AI-generated summary reports for all tracked games |
| Databricks App | Add games and explore insights |
⚙️ Configuration and Customization
See docs/CONFIGURATION.md for:
- Customizing the demo for your own needs
- Productionalizing DAB and assets
📚 Documentation
Documentation Website:
https://databricks-industry-solutions.github.io/social-listening/
Documentation files in this repository:
- docs/CONFIGURATION.md - Customization, production deployment, and API keys setup
- src/app/README.md - Databricks App structure and configuration
- src/pipeline/README.md - Pipeline development and transformations
- src/ingestion_utils/README.md - Details on ingestion from platforms, sampling, and adding new platforms for ingestion
🎮 Supported Platforms
| Platform | Content Type | Identifier Format | API Key Required |
|----------|--------------|-------------------|------------------|
| Steam | Game Reviews | Steam App ID (e.g., 730 for CS:GO) | Yes |
| Google Play | App Reviews | Package name (e.g., com.nianticlabs.pokemongo) | No |
| Reddit | Subreddit Posts | Subreddit name (e.g., gaming) | Yes |
To enable Steam and Reddit ingestion, see CONFIGURATION.md.
You can also load data from your own bronze table in Unity Catalog, provided it is in the correct format. For more info see CONFIGURATION.md.
Want to add a new platform? The ingestion system uses an abstract DataIngestor class that makes it easy to add new sources (YouTube, TikTok, etc.). See src/ingestion_utils/README.md for a step-by-step guide.
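To give a feel for the pattern, here is a minimal sketch of an abstract ingestor with one toy subclass. Only the class name DataIngestor comes from this README; the `fetch` method, its parameters, the record fields, and the `InMemoryIngestor` subclass are hypothetical and are not the repository's actual interface. Follow src/ingestion_utils/README.md for the real one.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class DataIngestor(ABC):
    """Hypothetical base class: each platform implements fetch()."""

    platform: str = "unknown"

    @abstractmethod
    def fetch(self, identifier: str, limit: int) -> List[Dict[str, str]]:
        """Return up to `limit` content records for the given identifier."""


class InMemoryIngestor(DataIngestor):
    """Toy source used only to show how a new platform plugs in."""

    platform = "example"

    def __init__(self, records: List[Dict[str, str]]) -> None:
        self._records = records

    def fetch(self, identifier: str, limit: int) -> List[Dict[str, str]]:
        matching = [r for r in self._records if r["identifier"] == identifier]
        return [
            {"platform": self.platform, "content": r["content"]}
            for r in matching[:limit]
        ]
```

Because `fetch` is abstract, instantiating `DataIngestor` directly raises `TypeError`, which forces each new platform to supply its own implementation.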
Demo Teardown
To destroy all demo resources, uncomment the last two cells of the Demo_Setup.ipynb notebook and run both. They:
- Destroy resources managed by DAB
- Destroy Genie Space via API
⚠️ Disclaimer
Please note the code in this project is provided for your exploration only, and is not formally supported by Databricks with Service Level Agreements (SLAs). It is provided AS-IS and we do not make any guarantees of any kind. Please do not submit a support ticket relating to any issues arising from the use of this project.
📄 License
This project is licensed under the Databricks License. See licenses.md for more info.