How to Organize Projects in Databricks: Best Folder Strategy
Welcome back to ShopWave, our fictional retail company.
Your manager asks:
"Our workspace is messy! How do we organize projects so everyone can find things easily?"
Let's walk through best practices for organizing Databricks projects in a story-based, beginner-friendly way.
Why Project Organization Matters
Without a proper structure:
- Notebooks get lost
- Teams overwrite each other's work
- Jobs and pipelines become hard to maintain
- Collaboration slows down
With a good structure, ShopWave:
- Finds ETL notebooks quickly
- Tracks ML experiments
- Shares dashboards efficiently
- Maintains clear permissions for sensitive data
Recommended Folder Structure
Here's a practical structure for Databricks projects:
/Workspace
├── /Users
│   └── /<username>
│       └── /personal_notebooks
├── /Shared
│   ├── /ETL
│   ├── /ML
│   ├── /SQL
│   └── /Dashboards
├── /Repos
│   └── /git_repos
└── /Projects
    ├── /Project_A
    │   ├── /Data
    │   ├── /Notebooks
    │   ├── /Models
    │   └── /Jobs
    └── /Project_B
        ├── /Data
        ├── /Notebooks
        ├── /Models
        └── /Jobs
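The layout above can also be written down as data, which makes it easier to review or to hand to whatever tool creates the folders. Here is a minimal sketch in plain Python. The names `SHARED_AREAS`, `PROJECT_SUBFOLDERS`, and `scaffold_paths` are hypothetical and introduced only for illustration, and the function just builds path strings; it does not call any Databricks API.

```python
from typing import List

# Assumption: these constants mirror the folder tree in this article;
# the names are illustrative and not part of any Databricks library.
SHARED_AREAS = ["ETL", "ML", "SQL", "Dashboards"]
PROJECT_SUBFOLDERS = ["Data", "Notebooks", "Models", "Jobs"]


def scaffold_paths(projects: List[str], root: str = "/Workspace") -> List[str]:
    """Return the shared and per-project folder paths for the layout."""
    # Shared, function-based folders come first.
    paths = [f"{root}/Shared/{area}" for area in SHARED_AREAS]
    # Then one identical set of subfolders per project.
    for project in projects:
        for sub in PROJECT_SUBFOLDERS:
            paths.append(f"{root}/Projects/{project}/{sub}")
    return paths
```

Calling `scaffold_paths(["Project_A", "Project_B"])` returns twelve paths: four shared folders plus four subfolders for each project. Each path could then be created through the UI, the CLI, or an SDK, whichever your team already uses.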
Folder Explanation
1. /Users/<username>/personal_notebooks
- Personal experiments and practice notebooks
- Safe to try new code without affecting team projects
2. /Shared
- Common notebooks and resources for the team
- Subfolders by function: ETL, ML, SQL, Dashboards
- Everyone can collaborate, but with controlled permissions
3. /Repos
- Git-integrated folders for version-controlled projects
- Sync notebooks with GitHub, GitLab, or Bitbucket
- Ideal for reproducibility and CI/CD pipelines
4. /Projects/<Project_Name>
- Full project-level structure
- Includes data, notebooks, models, and jobs
- Keeps production-ready code organized
- Easy to assign role-based access control (RBAC) and monitor activity
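One way to keep the four areas straight is to treat the top-level folder as a label. The sketch below is plain Python and assumes a path may or may not start with /Workspace. `folder_area` is a hypothetical helper written for this article, not a Databricks function.

```python
def folder_area(path: str) -> str:
    """Classify a workspace path as personal, shared, repos, project, or other."""
    parts = [p for p in path.split("/") if p]
    # Accept both "/Workspace/Shared/ETL" and "/Shared/ETL".
    if parts and parts[0] == "Workspace":
        parts = parts[1:]
    if not parts:
        return "other"
    # Assumption: the labels on the right are illustrative names for the
    # four areas described in this article.
    areas = {
        "Users": "personal",
        "Shared": "shared",
        "Repos": "repos",
        "Projects": "project",
    }
    return areas.get(parts[0], "other")
```

A check like this could run in a review script to flag notebooks that live outside the agreed areas, since anything that comes back as "other" is worth a second look.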
Best Practices for Project Organization
- Use descriptive folder names: avoids confusion
- Separate personal vs shared work: prevents accidental edits
- Organize by project: for example ETL, ML, or BI dashboards
- Integrate with Git: enables version control and collaboration
- Set access permissions at folder level: follows the least privilege principle
- Archive old projects: reduces clutter and storage cost
ShopWave Tip: Assign one project lead to maintain folder consistency.
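To make the least privilege idea concrete, a project lead could keep the intended folder permissions in a small table and check it for grants that are broader than they should be. The following is only a sketch. The group names and the `PLAN` table are invented for illustration, and the permission level names are an assumption that you should verify against your own workspace's permission settings.

```python
from typing import Dict, List, Tuple

# Assumption: ordered from least to most access. Check the exact level
# names against your own workspace before relying on them.
LEVELS = ["CAN_READ", "CAN_RUN", "CAN_EDIT", "CAN_MANAGE"]

# Hypothetical plan: (folder, group) -> level. Group names are made up.
PLAN: Dict[Tuple[str, str], str] = {
    ("/Shared/ETL", "etl_team"): "CAN_EDIT",
    ("/Shared/Dashboards", "analytics_team"): "CAN_EDIT",
    ("/Shared/Dashboards", "all_employees"): "CAN_READ",
    ("/Projects/RecommendationEngine", "ml_team"): "CAN_MANAGE",
}


def over_privileged(
    plan: Dict[Tuple[str, str], str], group: str, ceiling: str
) -> List[str]:
    """List folders where `group` is granted more access than `ceiling`."""
    limit = LEVELS.index(ceiling)
    return sorted(
        folder
        for (folder, grp), level in plan.items()
        if grp == group and LEVELS.index(level) > limit
    )
```

For example, `over_privileged(PLAN, "all_employees", "CAN_READ")` returns an empty list, which means the broad group never gets more than read access in this plan.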
Real Business Example: ShopWave
- ETL Team: Saves notebooks in /Shared/ETL
- ML Team: Stores trained models in /Projects/RecommendationEngine/Models
- Analytics Team: Dashboards in /Shared/Dashboards
- New Employees: Start in /Users/<username>/personal_notebooks before moving notebooks to shared folders
Result: Teams work efficiently without overwriting each other, and admins can manage access easily.
Quick Summary
- Organize Databricks projects by personal, shared, and project folders
- Use /Users, /Shared, /Repos, and /Projects for structure
- Best practices: descriptive names, separate personal vs shared, Git integration, access control, archive old projects
- Helps teams collaborate, maintain reproducibility, and reduce clutter
Coming Next
Databricks Serverless Compute: When & Why to Use