AI Professionals Bootcamp | Week 1
2025-12-14
Tip
If you get stuck: write down the exact error, the command you ran, and the file you edited.
Warning
The point is skill-building. Using GenAI to do the work breaks the learning loop.
Goal: Set up your environment, use the shell confidently, and write your first Python scripts that read a CSV and produce a basic profile.
Bootcamp • SDAIA Academy
uvBy the end of today, you can:
uvYou will build a small app called “CSV Profiler” with two interfaces:
report.md and report.jsonInput: a CSV file
Output:
report.json → machine-readable profiling statsreport.md → human-readable reportYour code will handle:
Setup + Shell essentials + uv
uvls, python, git)Command
pwd → print working directoryls (mac/linux) or dir (Windows) → list filesTry it
pwdls / dircd <path> → change directorycd .. → go up one foldercd . → current folder (rarely useful)cd ~ → home foldercd - → previous folder (super useful)You’ll use these today to set up your project.
macOS/Linux
Windows PowerShell
Absolute path starts from the root.
/Users/<name>/...C:\Users\<name>\...Relative path starts from your current folder.
./data/sample.csv (inside current folder)../data/sample.csv (one level up)cd "My Files"Data ≠ data)~/bootcamp/cd ~bootcamp (use your file explorer if needed)cd bootcamppwd and ls / dirCheckpoint: your terminal shows you are inside bootcamp.
PATH (where your shell looks for commands)Try:
echo $PATH (mac/linux)echo $env:PATH (PowerShell)which python (mac/linux)where python (Windows)Interpretation:
.venv/ → you are in a virtual environmentDifferent projects need different packages.
uv: our tool for environments + installsToday we’ll use:
uv runpython and pip point to the envuv run ... still uses the envRecommended habit: use uv run for anything you want to be reproducible
From inside bootcamp/:
Expected result: a folder named .venv/
python points to .venv automaticallyuv run ... to guarantee the envTip
If you ever wonder “which python am I using?”, run which python / where python.
mac/linux:
Windows:
Check: your prompt usually changes.
csv, json)typer, streamlit)uv pip install ... downloads third-party packages into your project’s env so you can import them later.
Tip
If installation fails: copy the full error + your OS info and ask the instructor.
uv runCreate hello.py:
Run:
What is the main difference?
uv run hello.pypython hello.pyAnswer: uv run ensures the command runs inside the project environment.
hello.py to print your nameCheckpoint: you can explain what line the error points to.
pwd, ls/dir, cduv venv + uv run20 minutes
Python values, containers, operators
None, True, False0, -2, 1_000_000, 0x1f1.5, 1e6, -2.5e-3'hi', "hi", """multi"""| Type | Example | Mutable? | Typical use |
|---|---|---|---|
list |
[1, 2, 3] |
✅ | ordered items |
tuple |
(1, 2) |
❌ | fixed group |
set |
{1, 2} |
✅ | unique items |
dict |
{ "a": 1 } |
✅ | key → value |
You’ll often see something.do_this(...).
do_this is a method: a function that belongs to that value(...) mean “call the function”Example:
Use cases:
x in my_set)x becomes [1, 2, 3, 4] because x and y point to the same list.
+ - * / // % **< <= == != >= >and or notin, not inis, is not (usually with None)in vs == vs isx in container → membership (lists/strings/sets/dicts)x == y → value equalityx is y → same object in memory (use for None)Rule of thumb:
(...)*** / // %+ -== < > ...not → and → orWhen in doubt: add parentheses.
(...)*** / // %+ -< == >not, and, orWhen in doubt: add parentheses.
What do these evaluate to?
5 // 25 / 25 % 22 ** 3Answers: 2, 2.5, 1, 8
These are False:
None, 0, 0.0, "", [], {}, set()Most other things are True.
Warning
bool("False") is True because it’s a non-empty string.
Decide without running:
bool([])bool([0])int(1.9)float(3)Answers: False, True, 1, 3.0
Why we care:
csv.DictReader gives you dicts (column → value)You’ll often need the column names and to check if a key exists.
When you build reports, you’ll print and write lots of text.
Three common ways:
f-strings let you put values inside a string:
We’ll use this style a lot in reports.
Create a dictionary with:
"city""temp_c""is_weekend"Then print a sentence using an f-string.
Checkpoint: Your output includes all three values.
20 minutes
Control flow + strings/lists/dicts + files
if/elif/else and conditional expressionsfor and whiletry/exceptwith open(...)In Python, whitespace is part of the syntax.
: starts a blockComments start with #.
if in practicefor loops: your default loopPattern you’ll use for CSV: loop over rows, update counters.
A compact way to build a new list from a loop.
Use when it’s short and clear. Otherwise, use a normal for loop.
range, enumerate, zipwhile loops: when you don’t know “how many”continue → skip to next iterationbreak → stop the loopWhat prints?
Answer: 1 then 3
match: clean branching (Python 3.10+)Use this to catch “impossible states” early.
with open(...)When code gets longer, put pieces into functions.
Two key ideas:
def starts a functionreturn sends a value back to the callerWhile drafting, you might also see:
pass → “do nothing for now” (a placeholder)If you create a file math_tools.py:
You can use it from main.py in the same folder:
A folder can also be a package if it contains an __init__.py file:
Then you can import from it like:
In many projects, main.py can be:
python main.py)This pattern makes sure code runs only when executed as a script:
len(rows) → row countmin(numbers), max(numbers) → extremessum(numbers) / len(numbers) → meansorted(items) → orderingenumerate(items) → index + valueYou will build a report like:
Given this list of rows:
Write code that counts missing values per column.
Rule: treat empty string "" as missing.
with open(...)20 minutes
Build the project: CSV Profiler (Part 1)
Warning
Do not ask GenAI to write your solution code. Ask it to explain concepts or errors.
By the end of the day, you should have:
.venv/csv_profiler/report.json and report.mdFor Day 1, we’ll keep things simple so imports “just work”.
bootcamp/, create a folder csv-profiler/cd csv-profileruv venv -p 3.11Checkpoint: you have .venv/ inside csv-profiler/.
Create data/sample.csv with this content:
Checkpoint: you can open it in VS Code.
Create these files (match the minimal layout slide):
csv_profiler/__init__.pycsv_profiler/io.pycsv_profiler/profile.pycsv_profiler/render.pymain.py (entrypoint for today)main.pyPaste this first:
from csv_profiler.io import read_csv_rows
from csv_profiler.profile import basic_profile
from csv_profiler.render import write_json, write_markdown
def main():
rows = read_csv_rows("data/sample.csv")
report = basic_profile(rows)
write_json(report, "outputs/report.json")
write_markdown(report, "outputs/report.md")
print("Wrote outputs/report.json and outputs/report.md")
if __name__ == "__main__":
main()csv.DictReaderPython’s standard library has a csv module.
Notes:
""In csv_profiler/io.py implement:
Rules:
with open(..., newline="")csv.DictReader to parse rowsread_csv_rows (example)In csv_profiler/profile.py, implement:
Definition of missing today:
If there is at least one row:
Then loop rows and update counts.
basic_profile (day-1 version)def basic_profile(rows):
if not rows:
return {"rows": 0, "n_cols": 0, "columns": [], "missing": {}, "non_empty": {}}
columns = list(rows[0].keys())
missing = {}
non_empty = {}
for c in columns:
missing[c] = 0
non_empty[c] = 0
for row in rows:
for c in columns:
v = row[c].strip()
if v == "": # DictReader gives empty string for missing cells
missing[c] += 1
else:
non_empty[c] += 1
return {
"rows": len(rows), "n_cols": len(columns), "columns": columns,
"missing": missing, "non_empty": non_empty
}Goal: infer a simple type label per column:
number if all non-empty values can be parsed as floattext otherwisePseudo-steps:
float(value) in a try/except ValueErrortextExample helper:
In csv_profiler/render.py implement:
Requirements:
write_jsonImplement:
Markdown should include:
write_markdown (simple)import os
def write_markdown(report, path):
folder = os.path.dirname(path)
if folder:
os.makedirs(folder, exist_ok=True)
cols = report.get("columns", [])
missing = report.get("missing", {})
lines = []
lines.append("# CSV Profiling Report\n")
lines.append(f"- Rows: **{report.get('rows', 0)}**")
lines.append(f"- Columns: **{report.get('n_cols', 0)}**\n")
lines.append("## Missing Values\n")
lines.append("| column | missing |")
lines.append("|---|---:|")
for c in cols:
lines.append(f"| {c} | {missing.get(c, 0)} |")
with open(path, "w", encoding="utf-8") as f:
f.write("\n".join(lines) + "\n")From the project root:
Then open:
outputs/report.jsonoutputs/report.mdCheckpoint: both files exist and match the sample CSV.
print(rows[0]))Add one more section to the Markdown:
Bonus:
top_values list for text columns (most common values)In 1–2 sentences:
What was the most confusing part today: paths, environments, or Python control flow?
Due: before Day 2 starts
data/sample.csvreport.md to include a short “Notes” sectionDeliverable: a zip or folder with your csv-profiler/ project.
Tip
Tomorrow we’ll refactor into functions + modules and add better type inference.