Baselinr Quality Studio
Quality Studio is Baselinr's no-code web interface for configuring and managing your entire data quality setup. Configure connections, tables, profiling settings, validation rules, drift detection, and more, all through a visual interface. The Quality Studio also provides monitoring and analysis of profiling results, drift alerts, run history, and metrics across multi-warehouse environments.
Demo Documentation
The Quality Studio supports a demo mode that runs entirely on Cloudflare Pages without database dependencies:
- Demo Documentation Hub - Complete demo deployment guide
- Demo Mode Quick Start - Enable demo mode locally
- Demo Deployment Guide - Phased deployment approach
Try the Demo
Experience the Quality Studio with realistic sample data. The demo showcases all features including:
- Configuration management
- Profiling results visualization
- Drift detection alerts
- Validation results
- Root cause analysis
- Metrics dashboards
Note: The demo uses pre-generated sample data and runs in read-only mode.
Features
Core Features
- No-Code Configuration: Set up your entire data quality configuration through visual forms, with no YAML or JSON required
- Configuration Management: Visual editors for connections, storage, tables, profiling, validation rules, drift detection, and more
- Visual & YAML Editor: Split-view editor with real-time sync between visual forms and YAML configuration
- Run History: View past profiling runs with filtering and search
- Profiling Results: Detailed table and column-level metrics visualization
- Drift Detection: Monitor data drift events with severity indicators
- Validation Results: View and manage data quality validation results
- Root Cause Analysis: AI-powered correlation of anomalies with pipeline runs and upstream issues
- Metrics Overview: Aggregate KPIs and trends
- Multi-Warehouse Support: PostgreSQL, Snowflake, MySQL, BigQuery, Redshift, SQLite
- Export Functionality: Export data in JSON/CSV formats
- AI Chat Assistant: Conversational interface for data quality investigation
Technical Stack
Frontend:
- Next.js 14 (App Router)
- React 18
- Tailwind CSS
- Recharts for visualizations
- TanStack Query for data fetching
- Lucide React for icons
Backend:
- FastAPI
- SQLAlchemy
- Pydantic
- PostgreSQL
Project Structure
dashboard/
├── backend/                      # FastAPI backend
│   ├── main.py                   # API endpoints
│   ├── models.py                 # Pydantic models
│   ├── database.py               # Database client
│   ├── chat_models.py            # Chat API models
│   ├── chat_routes.py            # Chat API routes
│   ├── requirements.txt          # Python dependencies
│   └── sample_data_generator.py
├── frontend/                     # Next.js frontend
│   ├── app/                      # App router pages
│   │   ├── page.tsx              # Quality Studio overview
│   │   ├── runs/                 # Run history page
│   │   ├── drift/                # Drift alerts page
│   │   ├── tables/               # Table details page
│   │   ├── chat/                 # AI Chat page
│   │   └── metrics/              # Metrics page
│   ├── components/               # Reusable components
│   │   ├── Sidebar.tsx
│   │   ├── KPICard.tsx
│   │   ├── RunsTable.tsx
│   │   ├── DriftAlertsTable.tsx
│   │   ├── FilterPanel.tsx
│   │   └── chat/                 # Chat components
│   │       ├── ChatContainer.tsx
│   │       ├── ChatInput.tsx
│   │       └── ChatMessage.tsx
│   ├── types/                    # TypeScript types
│   │   ├── lineage.ts
│   │   └── chat.ts
│   ├── lib/                      # Utilities
│   │   └── api.ts                # API client
│   └── package.json
└── README.md                     # This file
Quick Start
Prerequisites
- Node.js 18+ and npm/yarn
- Python 3.10+
- PostgreSQL database (Baselinr storage)
- Existing Baselinr installation (Phase 1)
1. Backend Setup
cd dashboard/backend
# Install dependencies
pip install -r requirements.txt
# Set environment variables (or put the same values, without "export", in a .env file)
export BASELINR_DB_URL=postgresql://baselinr:baselinr@localhost:5433/baselinr
export API_HOST=0.0.0.0
export API_PORT=8000
# Generate sample data (optional)
python sample_data_generator.py
# Start the backend server
python main.py
# Or with uvicorn:
uvicorn main:app --reload --host 0.0.0.0 --port 8000
Backend will be available at: http://localhost:8000
2. Frontend Setup
cd dashboard/frontend
# Install dependencies
npm install
# or
yarn install
# Create .env.local file with:
# NEXT_PUBLIC_API_URL=http://localhost:8000
# Start the development server
npm run dev
# or
yarn dev
Frontend will be available at: http://localhost:3000
API Endpoints
Quality Studio Metrics
- GET /api/dashboard/metrics?warehouse=&days=30 - Get aggregate metrics
Run History
- GET /api/runs?warehouse=&schema=&table=&status=&days=30 - List profiling runs
- GET /api/runs/{run_id} - Get detailed run results
Drift Detection
- GET /api/drift?warehouse=&table=&severity=&days=30 - List drift alerts
Table Metrics
- GET /api/tables/{table_name}/metrics?schema=&warehouse= - Get table metrics
Warehouses
- GET /api/warehouses - List available warehouses
Export
- GET /api/export/runs?format=json&warehouse=&days=30 - Export runs
- GET /api/export/drift?format=json&warehouse=&days=30 - Export drift
Chat (AI Assistant)
- GET /api/chat/config - Get chat configuration status
- POST /api/chat/message - Send a message to the chat agent
- GET /api/chat/history/{session_id} - Get chat history for a session
- DELETE /api/chat/session/{session_id} - Clear a chat session
- GET /api/chat/tools - List available chat tools
- GET /api/chat/sessions - List active chat sessions
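The read endpoints above take their filters as query parameters. As a minimal sketch (not part of the Quality Studio codebase), the helper below assembles request URLs with Python's standard library. The names build_url and run_detail_url are hypothetical, the warehouse value "postgres" is only an illustrative filter value, and dropping empty filters instead of sending "warehouse=" is an assumption about what the backend accepts.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # local backend address from Quick Start


def build_url(path: str, base_url: str = BASE_URL, **filters) -> str:
    """Build an API URL, omitting filters that are None or empty strings.

    Hypothetical helper: it only assembles the URL and sends no request.
    """
    params = {k: v for k, v in filters.items() if v not in (None, "")}
    query = urlencode(params)
    url = base_url.rstrip("/") + path
    return f"{url}?{query}" if query else url


def run_detail_url(run_id: str, base_url: str = BASE_URL) -> str:
    """URL for GET /api/runs/{run_id}, with the run ID percent-encoded."""
    return build_url(f"/api/runs/{quote(run_id, safe='')}", base_url)


if __name__ == "__main__":
    # "postgres" is an illustrative value, not a documented warehouse name.
    print(build_url("/api/runs", warehouse="postgres", status="", days=30))
    print(build_url("/api/drift", severity="high", days=7))
    print(run_detail_url("run-123"))
```

The resulting string can be passed to urllib.request.urlopen and decoded with json.load. The response shapes are not documented in this README, so inspect them at http://localhost:8000/docs before relying on specific fields.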
Sample Data
To populate the Quality Studio with sample data for testing:
cd dashboard/backend
python sample_data_generator.py
This generates:
- 100 profiling runs across all warehouse types
- Column-level metrics for each run
- Drift events for ~30% of runs
Customization
Theme Colors
Modify tailwind.config.ts to customize colors:
colors: {
  primary: {
    500: '#0ea5e9', // Main brand color
    // ...
  },
}
Adding New Pages
- Create a new page in frontend/app/your-page/page.tsx
- Add navigation link in components/Sidebar.tsx
- Create API endpoint in backend/main.py if needed
Integration with Baselinr Phase 1
The dashboard connects to the Baselinr storage database to read:
- baselinr_runs: Run history and metadata
- baselinr_results: Column-level metrics
- baselinr_events: Drift detection events
- baselinr_table_state: Incremental profiling metadata (snapshot IDs, last decisions)
Ensure your Baselinr Phase 1 installation has created these tables.
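One way to confirm the tables are in place before starting the dashboard is a small preflight check. The sketch below is illustrative only: missing_tables is a hypothetical helper, and it assumes table names can be compared case-insensitively. It takes whatever list of table names you obtained from the storage database and reports which of the four tables listed above are absent.

```python
from typing import Iterable

# Table names listed in this README as read by the dashboard.
REQUIRED_TABLES = (
    "baselinr_runs",
    "baselinr_results",
    "baselinr_events",
    "baselinr_table_state",
)


def missing_tables(existing: Iterable[str]) -> list[str]:
    """Return the required Baselinr tables that are absent from `existing`."""
    # Assumption: names are compared case-insensitively.
    present = {name.lower() for name in existing}
    return [table for table in REQUIRED_TABLES if table not in present]


if __name__ == "__main__":
    # Example: a storage database without incremental profiling state.
    found = ["baselinr_runs", "baselinr_results", "baselinr_events"]
    print(missing_tables(found))
```

With SQLAlchemy, which the backend already depends on, the list of existing names can come from inspect(engine).get_table_names() on an engine created from BASELINR_DB_URL.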
Docker Setup (Optional)
TODO: Add Docker Compose configuration for easy deployment
Roadmap / Future Enhancements
- Real-time updates via WebSockets
- Advanced filtering and saved views
- Custom dashboards per user
- Alert notifications (email, Slack)
- Figma-based design refinements
- CSV export implementation
- Pagination for large datasets
- Dark mode support
- User authentication
Contributing
This is an internal MVP. For feature requests or bug reports, please contact the Baselinr team.
Environment Variables
Backend (.env)
BASELINR_DB_URL=postgresql://user:password@host:port/database
API_HOST=0.0.0.0
API_PORT=8000
CORS_ORIGINS=http://localhost:3000
# Chat/AI Configuration (optional)
LLM_ENABLED=true
LLM_PROVIDER=openai # or "anthropic"
LLM_MODEL=gpt-4o-mini # or "claude-3-5-sonnet-20241022"
OPENAI_API_KEY=sk-your-api-key
# ANTHROPIC_API_KEY=sk-ant-your-api-key # if using Anthropic
CHAT_MAX_ITERATIONS=5
CHAT_MAX_HISTORY=20
CHAT_TOOL_TIMEOUT=30
# Or use a config file
BASELINR_CONFIG=/path/to/config.yml
Frontend (.env.local)
NEXT_PUBLIC_API_URL=http://localhost:8000
NODE_ENV=development
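For readers who want to see how the core backend variables fit together, here is a rough sketch of reading them into a typed object. It is not the backend's actual loader. BackendSettings and load_settings are hypothetical names, and the sketch assumes that BASELINR_DB_URL is required when no BASELINR_CONFIG file is used, that the other values default to those shown above, and that CORS_ORIGINS is a comma-separated list.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class BackendSettings:
    db_url: str
    api_host: str
    api_port: int
    cors_origins: tuple[str, ...]


def load_settings(env: Mapping[str, str] = os.environ) -> BackendSettings:
    """Read the core backend variables from `env` (hypothetical loader).

    Assumptions: BASELINR_DB_URL is mandatory, the remaining values fall back
    to the defaults shown in this README, and CORS_ORIGINS is comma-separated.
    """
    db_url = env.get("BASELINR_DB_URL", "").strip()
    if not db_url:
        raise ValueError("BASELINR_DB_URL is not set")
    origins = env.get("CORS_ORIGINS", "http://localhost:3000")
    return BackendSettings(
        db_url=db_url,
        api_host=env.get("API_HOST", "0.0.0.0"),
        # int() raises ValueError if API_PORT is not numeric.
        api_port=int(env.get("API_PORT", "8000")),
        cors_origins=tuple(o.strip() for o in origins.split(",") if o.strip()),
    )


if __name__ == "__main__":
    example = {"BASELINR_DB_URL": "postgresql://user:password@host:5432/database"}
    print(load_settings(example))
```

Passing the environment as a mapping keeps the function easy to test: call it with a plain dict in tests and with no argument in a running process.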
Chat Feature
The Quality Studio includes an AI-powered chat assistant for data quality investigation.
Enabling Chat
- Set LLM_ENABLED=true in your environment
- Configure your LLM provider (OpenAI or Anthropic)
- Provide the appropriate API key
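The three steps above can be checked locally before starting the backend. This is a sketch under stated assumptions, not the backend's own validation: chat_config_problems is a hypothetical name, it assumes LLM_PROVIDER has no default, and it pairs each provider with the API-key variable shown in the Environment Variables section.

```python
from typing import Mapping

# Provider names from this README mapped to the API-key variable each needs.
PROVIDER_KEYS = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}


def chat_config_problems(env: Mapping[str, str]) -> list[str]:
    """List reasons the chat setup looks incomplete (empty list means OK)."""
    problems = []
    if env.get("LLM_ENABLED", "").strip().lower() != "true":
        problems.append("LLM_ENABLED is not 'true'")
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    key_var = PROVIDER_KEYS.get(provider)
    if key_var is None:
        # Assumption: the backend has no default provider.
        problems.append(f"LLM_PROVIDER must be one of {sorted(PROVIDER_KEYS)}")
    elif not env.get(key_var, "").strip():
        problems.append(f"{key_var} is missing for provider '{provider}'")
    return problems


if __name__ == "__main__":
    import os

    for problem in chat_config_problems(os.environ):
        print("chat config:", problem)
```

Once the backend is running, GET /api/chat/config reports the chat configuration status, so this check is only a convenience for catching mistakes earlier.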
Chat Capabilities
The chat assistant can:
- Query recent profiling runs
- Investigate drift events and anomalies
- Get table profiles and column statistics
- Compare runs and analyze trends
- Explore data lineage relationships
- Search across tables
Example Queries
- "What tables have been profiled recently?"
- "Show me high severity drift events"
- "Are there any anomalies I should investigate?"
- "Compare the last two runs for the customers table"
- "What's the trend for null rate in the email column?"
- "What are the upstream sources for orders table?"
Development
Backend Development
cd backend
uvicorn main:app --reload --host 0.0.0.0 --port 8000
Frontend Development
cd frontend
npm run dev
Visit:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
Production Build
Frontend
cd frontend
npm run build
npm start
Backend
cd backend
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4
Troubleshooting
Connection Errors
- Ensure Baselinr database is running
- Check BASELINR_DB_URL environment variable
- Verify database tables exist (baselinr_runs, baselinr_results, baselinr_events)
No Data Showing
- Run the sample data generator: python sample_data_generator.py
- Or run Baselinr profiling: baselinr profile --config config.yml
CORS Errors
- Check CORS_ORIGINS in backend includes frontend URL
- Verify NEXT_PUBLIC_API_URL in frontend points to backend
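A CORS mismatch is often a matter of two URLs that look alike but have different origins. The sketch below compares the frontend's origin against a CORS_ORIGINS value. The helper names are hypothetical, and treating CORS_ORIGINS as a comma-separated list is an assumption about how the backend parses it.

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Reduce a URL to its origin (scheme://host[:port]), lowercased."""
    parts = urlsplit(url.strip())
    return f"{parts.scheme}://{parts.netloc}".lower()


def origin_allowed(cors_origins: str, frontend_url: str) -> bool:
    """True if the frontend's origin appears in a comma-separated origin list."""
    allowed = {origin_of(o) for o in cors_origins.split(",") if o.strip()}
    return origin_of(frontend_url) in allowed


if __name__ == "__main__":
    cors = "http://localhost:3000"
    for url in ("http://localhost:3000/runs", "http://127.0.0.1:3000"):
        print(url, "allowed" if origin_allowed(cors, url) else "blocked")
```

Browsers treat http://localhost:3000 and http://127.0.0.1:3000 as different origins, so opening the frontend through one while CORS_ORIGINS lists the other is enough to trigger the error.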
License
Internal use only - Baselinr Project