How to Organize Projects in Databricks — Best Folder Strategy

Welcome back to ShopWave, our fictional retail company.
Your manager asks:

“Our workspace is messy! How do we organize projects so everyone can find things easily?”

Let’s walk through best practices for organizing Databricks projects in a story-based, beginner-friendly way.

🏗️ Why Project Organization Matters

Without a proper structure:

Notebooks get lost
Teams overwrite each other’s work
Jobs and pipelines become hard to maintain
Collaboration slows down

With a good structure, ShopWave:

Finds ETL notebooks quickly
Tracks ML experiments
Shares dashboards efficiently
Maintains clear permissions for sensitive data

🗂️ Recommended Folder Structure

Here’s a structure that works well for many Databricks teams (`/Projects` is a folder you create yourself, not a Databricks default):

/Workspace
├── /Users
│    └── /<username>
│         └── /personal_notebooks
├── /Shared
│    ├── /ETL
│    ├── /ML
│    ├── /SQL
│    └── /Dashboards
├── /Repos
│    └── /git_repos
└── /Projects
     ├── /Project_A
     │    ├── /Data
     │    ├── /Notebooks
     │    ├── /Models
     │    └── /Jobs
     └── /Project_B
          ├── /Data
          ├── /Notebooks
          ├── /Models
          └── /Jobs
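
To make the tree concrete, here is a minimal Python sketch that lists the shared and project folders from the layout above and hands each path to a folder-creating function. The helper names `planned_folders` and `create_folders` are hypothetical, and `/Users` and `/Repos` are left out on the assumption that the workspace manages those itself.

```python
from typing import Callable, Iterable, List

# Subfolder names taken from the tree above.
SHARED_AREAS = ("ETL", "ML", "SQL", "Dashboards")
PROJECT_SUBFOLDERS = ("Data", "Notebooks", "Models", "Jobs")


def planned_folders(projects: Iterable[str]) -> List[str]:
    """Return every /Shared and /Projects folder path the layout calls for."""
    paths = [f"/Shared/{area}" for area in SHARED_AREAS]
    for project in projects:
        paths.extend(f"/Projects/{project}/{sub}" for sub in PROJECT_SUBFOLDERS)
    return paths


def create_folders(paths: Iterable[str], mkdirs: Callable[[str], None]) -> int:
    """Call mkdirs once per path and return how many paths were requested."""
    count = 0
    for path in paths:
        mkdirs(path)
        count += 1
    return count


if __name__ == "__main__":
    # Dry run: print each path instead of creating anything.
    create_folders(planned_folders(["Project_A", "Project_B"]), print)
```

Passing `print` gives a safe dry run. In a real workspace, `mkdirs` could be something like the `workspace.mkdirs` method of a `WorkspaceClient` from the Databricks SDK for Python; that wiring is an assumption here and is worth checking against your SDK version before use.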

🔹 Folder Explanation

1️⃣ `/Users/<username>/personal_notebooks`

Personal experiments and practice notebooks
Safe to try new code without affecting team projects

2️⃣ `/Shared`

Common notebooks and resources for the team
Subfolders by function: ETL, ML, SQL, Dashboards
Everyone can collaborate, but with controlled permissions

3️⃣ `/Repos`

Git-integrated folders (newer Databricks docs call these Git folders) for version-controlled projects
Sync notebooks with GitHub, GitLab, or Bitbucket
Ideal for reproducibility and CI/CD pipelines

4️⃣ `/Projects/<Project_Name>`

Full project-level structure
Includes data, notebooks, models, and jobs
Keeps production-ready code organized
Easy to assign folder-level permissions and monitor activity

🧩 Best Practices for Project Organization

Use descriptive folder names → avoids confusion
Separate personal vs shared work → prevents accidental edits
Organize by project or function → ETL, ML, and BI dashboards each get a clear home
Integrate with Git → version control and collaboration
Set access permissions at folder level → least privilege principle
Archive old projects → reduces clutter, and can trim storage cost if their data is archived too

ShopWave Tip: Assign one project lead to maintain folder consistency.
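
A project lead could back up that consistency check with a small script. The sketch below is one possible approach, not a Databricks feature: it sorts workspace paths into the four areas used in this article and flags anything that lives outside them. The function names `classify` and `misplaced` are hypothetical.

```python
from typing import Iterable, List

# Top-level folders from this article, mapped to the kind of work they hold.
AREAS = {
    "Users": "personal",
    "Shared": "shared",
    "Repos": "git",
    "Projects": "project",
}


def classify(path: str) -> str:
    """Return the area a workspace path belongs to, or 'unknown'."""
    parts = [p for p in path.split("/") if p]
    # Tolerate paths written with a leading /Workspace prefix.
    if parts and parts[0] == "Workspace":
        parts = parts[1:]
    if not parts:
        return "unknown"
    return AREAS.get(parts[0], "unknown")


def misplaced(paths: Iterable[str]) -> List[str]:
    """Return the paths that sit outside the agreed top-level folders."""
    return [p for p in paths if classify(p) == "unknown"]


if __name__ == "__main__":
    sample = [
        "/Users/alice/personal_notebooks/scratch",
        "/Shared/ETL/load_orders",
        "/old_stuff/untitled_notebook",
    ]
    for path in misplaced(sample):
        print("Needs a home:", path)
```

The sample paths are made up for illustration. In practice the list would come from whatever workspace listing you already have.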

🏢 Real Business Example — ShopWave

ETL Team: Saves notebooks in `/Shared/ETL`
ML Team: Stores trained models in `/Projects/RecommendationEngine/Models`
Analytics Team: Dashboards in `/Shared/Dashboards`
New Employees: Start in `/Users/<username>/personal_notebooks` before moving notebooks to shared folders

Result: Teams work efficiently without overwriting each other, and admins can manage access easily.
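
One way to keep that access manageable is to write the folder permissions down as data before applying them. The sketch below assumes hypothetical group names (`etl-team`, `ml-team`, `analytics-team`, `all-employees`) and assumes a folder's permissions carry down to everything inside it. The level names follow the Databricks workspace permission levels, but the plan itself is plain Python and calls no Databricks API.

```python
from typing import List, Optional, Tuple

# Permission levels, ordered weakest to strongest.
LEVELS = ("CAN_READ", "CAN_RUN", "CAN_EDIT", "CAN_MANAGE")

# (folder, group, level) entries; group names are hypothetical.
PLAN: List[Tuple[str, str, str]] = [
    ("/Shared/ETL", "etl-team", "CAN_EDIT"),
    ("/Shared/ETL", "all-employees", "CAN_READ"),
    ("/Projects/RecommendationEngine", "ml-team", "CAN_EDIT"),
    ("/Shared/Dashboards", "analytics-team", "CAN_EDIT"),
    ("/Shared/Dashboards", "all-employees", "CAN_READ"),
]


def effective_level(
    plan: List[Tuple[str, str, str]], group: str, path: str
) -> Optional[str]:
    """Return the strongest level the group holds on path via any parent folder."""
    best: Optional[str] = None
    for folder, grp, level in plan:
        if grp != group:
            continue
        # Match the folder itself or anything nested under it.
        if path == folder or path.startswith(folder + "/"):
            if best is None or LEVELS.index(level) > LEVELS.index(best):
                best = level
    return best


if __name__ == "__main__":
    print(effective_level(PLAN, "ml-team", "/Projects/RecommendationEngine/Models"))
    print(effective_level(PLAN, "etl-team", "/Shared/Dashboards"))
```

A `None` result means the plan grants that group nothing on the path, which is the least-privilege default: access exists only where an entry says so.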

🏁 Quick Summary

Organize Databricks projects into personal, shared, and project folders
Use `/Users`, `/Shared`, `/Repos`, and `/Projects` for structure
Best practices: descriptive names, separate personal vs shared, Git integration, access control, archive old projects
Helps teams collaborate, maintain reproducibility, and reduce clutter

🚀 Coming Next

👉 Databricks Serverless Compute — When & Why to Use