Databricks Security Basics: Tokens, Users & Groups
You're back at ShopWave, our fictional retail company.
You've set up your first notebooks, clusters, and dashboards. Everything seems perfect, until your manager asks:
"How do we make sure only the right people can access sensitive data?"
Welcome to the world of Databricks Security.
Why Security Matters in Databricks
Databricks houses valuable business data, including:
- Customer PII
- Sales transactions
- Payment info
- ML models
- Inventory forecasts
Without proper security:
- Analysts might accidentally access restricted tables
- Notebooks could be shared outside the team
- Jobs and pipelines could be modified by unauthorized users
Security in Databricks ensures access is controlled, data is protected, and compliance is maintained.
Users: Who Can Log In?
A user is anyone with a Databricks account.
Each user has:
- Login credentials (email/password, SSO)
- Assigned roles
- Permissions to access workspace resources
At ShopWave:
- Alice is a data engineer
- Bob is a data scientist
- Carol is a business analyst
Each has different privileges according to their role.
Groups: Organize Users Efficiently
Instead of assigning permissions individually, Databricks uses groups:
- Engineers group: full cluster and notebook access
- Analysts group: read access to dashboards and tables
- Data scientists group: access to ML features and Delta tables
Benefits:
- Easier management for large teams
- Consistent access policies
- Quick onboarding of new employees
ShopWave creates groups for each department to simplify security management.
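The idea behind group-based access can be sketched in a few lines of plain Python. This is a conceptual model only, not the Databricks API: the group names, user names, and permission labels below are hypothetical and chosen to mirror the ShopWave example.

```python
# Conceptual model of group-based access. All names and permission labels
# are illustrative assumptions, not Databricks identifiers.
GROUP_PERMISSIONS = {
    "engineers": {"clusters:manage", "notebooks:edit"},
    "analysts": {"dashboards:read", "tables:read"},
    "data_scientists": {"ml:use", "tables:read"},
}

USER_GROUPS = {
    "alice": {"engineers"},
    "bob": {"data_scientists"},
    "carol": {"analysts"},
}


def effective_permissions(user: str) -> set:
    """Return the union of permissions from every group the user belongs to."""
    perms = set()
    for group in USER_GROUPS.get(user, set()):
        perms |= GROUP_PERMISSIONS.get(group, set())
    return perms


def onboard(user: str, group: str) -> None:
    """Onboarding is one membership change; no per-user grants are needed."""
    if group not in GROUP_PERMISSIONS:
        raise ValueError(f"unknown group: {group}")
    USER_GROUPS.setdefault(user, set()).add(group)
```

Adding a new analyst is then a single call such as `onboard("dave", "analysts")`, after which Dave has exactly the same access as Carol. That is the consistency benefit listed above.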
Personal Access Tokens: Programmatic Access
Sometimes, scripts or notebooks need to access Databricks without a password.
Enter personal access tokens:
- Used for API access
- Can be time-limited
- Can be revoked at any time
Example use cases at ShopWave:
- CI/CD pipelines fetching notebooks
- Automated ETL jobs reading Delta tables
- External apps running queries via the Databricks REST API
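As a rough illustration of how a token is used, the sketch below builds REST requests that carry the token as a bearer credential. It only constructs the requests and does not send them. The workspace URL and token value are placeholders, the helper names are hypothetical, and the `/api/2.0/token/create` and `/api/2.0/token/delete` paths and field names should be checked against the Token API reference for your workspace.

```python
import json
import urllib.request
from typing import Optional

# Placeholder workspace URL (an assumption for illustration only).
HOST = "https://example.cloud.databricks.com"


def build_request(host: str, token: str, path: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a REST call authenticated with a bearer token."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        url=f"{host.rstrip('/')}{path}",
        data=data,
        method="POST" if payload is not None else "GET",
    )
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def new_token_request(host: str, token: str, lifetime_seconds: int,
                      comment: str) -> urllib.request.Request:
    """Request a time-limited token (hypothetical helper)."""
    body = {"lifetime_seconds": lifetime_seconds, "comment": comment}
    return build_request(host, token, "/api/2.0/token/create", body)


def revoke_token_request(host: str, token: str,
                         token_id: str) -> urllib.request.Request:
    """Request revocation of a token by its ID (hypothetical helper)."""
    return build_request(host, token, "/api/2.0/token/delete",
                         {"token_id": token_id})
```

In a real pipeline the token would typically come from a secret store or an environment variable rather than source code, and the request would be sent with `urllib.request.urlopen(req)`.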
Access Control Levels
Databricks provides layered access control:
| Level | Description |
|---|---|
| Workspace | Who can see notebooks, folders, repos |
| Cluster | Who can start, edit, or terminate clusters |
| Data / Tables | Who can read, write, or manage Delta tables |
| Jobs | Who can create, schedule, or run jobs |
| Account-level | Admins controlling global workspace settings |
ShopWave enforces the principle of least privilege: each user gets only the access needed for their job.
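To make least privilege concrete, here is a minimal sketch that builds the body of a permissions update granting levels to groups rather than to individuals. It assumes the shape of the Databricks Permissions API (a path under `/api/2.0/permissions/` and an `access_control_list` of `group_name` and `permission_level` entries). The permission level names are listed from memory and may be incomplete, and the function name and cluster ID are hypothetical.

```python
# Permission levels per object type. These are assumptions to verify against
# the Permissions API reference before use; the list may be incomplete.
ALLOWED_LEVELS = {
    "clusters": {"CAN_ATTACH_TO", "CAN_RESTART", "CAN_MANAGE"},
    "jobs": {"CAN_VIEW", "CAN_MANAGE_RUN", "CAN_MANAGE"},
    "notebooks": {"CAN_READ", "CAN_RUN", "CAN_EDIT", "CAN_MANAGE"},
}


def build_acl_update(object_type: str, object_id: str, group_levels: dict):
    """Return (path, body) for a permissions update; nothing is sent."""
    allowed = ALLOWED_LEVELS.get(object_type)
    if allowed is None:
        raise ValueError(f"unsupported object type: {object_type}")
    acl = []
    # Sort by group name so the payload is deterministic.
    for group, level in sorted(group_levels.items()):
        if level not in allowed:
            raise ValueError(f"{level} is not valid for {object_type}")
        acl.append({"group_name": group, "permission_level": level})
    path = f"/api/2.0/permissions/{object_type}/{object_id}"
    return path, {"access_control_list": acl}
```

For example, `build_acl_update("clusters", "1234-567890-abc123", {"engineers": "CAN_MANAGE", "analysts": "CAN_ATTACH_TO"})` lets engineers manage a cluster while analysts can only attach to it. Rejecting unknown levels up front catches typos before they reach the workspace.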
Security Best Practices
- Use groups instead of individual permissions
- Enable SSO (Single Sign-On) for authentication
- Rotate personal access tokens regularly
- Audit workspace activity using Databricks audit logs (queryable through system tables when Unity Catalog is enabled)
- Enforce multi-factor authentication (MFA)
- Apply table-level and row-level security for sensitive data
Following these practices prevents accidental leaks and ensures compliance.
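Token rotation is easy to automate once token metadata is available. The sketch below flags tokens that are older than a chosen age or that never expire. It assumes each record has `token_id`, `creation_time`, and `expiry_time` fields in epoch milliseconds, with `-1` meaning no expiry, which is how I recall the token list response being shaped. Treat those field names, the function name, and the 90-day threshold as assumptions.

```python
from datetime import datetime, timedelta, timezone


def tokens_needing_rotation(token_infos, now, max_age=timedelta(days=90)):
    """Return IDs of tokens that are too old or have no expiry set.

    Field names and the -1 "never expires" convention are assumptions
    about the token metadata format.
    """
    stale = []
    for info in token_infos:
        created = datetime.fromtimestamp(info["creation_time"] / 1000,
                                         tz=timezone.utc)
        never_expires = info.get("expiry_time", -1) == -1
        if never_expires or now - created > max_age:
            stale.append(info["token_id"])
    return stale
```

A scheduled job could run this check against the current token list and notify owners, or revoke the flagged tokens outright if policy allows.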
Story Recap: ShopWave Security in Action
- Alice (Engineer) runs ETL jobs on clusters; she belongs to the Engineers group
- Bob (Data Scientist) trains ML models; he belongs to the Data Scientists group
- Carol (Analyst) queries dashboards; she belongs to the Analysts group
- Tokens are issued for API automation and revoked when no longer needed
- An admin monitors access to ensure everyone follows least privilege
Result: ShopWave keeps data safe, while teams remain productive.
Quick Summary
- Users are individual Databricks accounts; Groups manage access collectively.
- Personal Access Tokens allow secure programmatic access.
- Access control layers include workspace, clusters, tables, and jobs.
- Security best practices: SSO, MFA, auditing, least privilege, and token rotation.
- Proper security ensures data protection, compliance, and team productivity.
Coming Next
Databricks DBFS: Internal File System Explained