> Source URL: /unit-3/project-paths/nate-m/nate-m-2026-04-18.guide
# Nate's Project Guide

**Project:** Portfolio Tracker
**Category:** Data Science
**Last updated:** April 18

---

> Note: This guide reflects the state of your project repo when it was written. It may not match your current code if you've worked on the project since.

## Where You Are

You have a working starting slice: `main.py` reads `data/portfolio.csv` and prints a clean table with ticker / shares / avg cost / industry. Good news — your CSV already has the columns you need (`current_price`, `sector`, `beta`, `dividend_yield`, `target_price`). You just aren't using most of them yet.

This week: clean up the spec, initialize `uv`, and move your analysis functions into a `portfolio.py` business-logic module.

---

## Project Structure

Your project splits into two kinds of code:

- **Business logic — you handwrite this.** Loading the portfolio, computing value, computing gain/loss, computing allocation, the concentration warning rule. These are the rules of your tracker.
- **View / CLI — agent-assisted is fine.** Pretty table printing, formatting numbers, column widths. Nothing financial happens here.

Target layout by Thursday:

```
final-project-NateMalkin/
├── main.py                 ← CLI driver (printing) — agent-assisted OK
├── portfolio.py            ← business logic — handwrite (yours to own)
├── pyproject.toml
└── data/
    └── portfolio.csv       ← data
```

Why the split? On demo day you'll be asked "how does your tracker decide what's over-concentrated?" The answer is a rule in `portfolio.py`. The rule is yours. The pretty-printed table is not.

**`portfolio.py` should not print anything.** It returns data. `main.py` does the printing.
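A minimal sketch of that split, using a hypothetical `position_value` helper (an illustration, not something already in your repo). The logic function returns a number, and only the driver code formats and prints it:

```python
# --- would live in portfolio.py: computes and returns, never prints ---
def position_value(holding):
    """Return the market value of one holding (hypothetical helper)."""
    return holding["shares"] * holding["current_price"]


# --- would live in main.py: formatting and printing only ---
holding = {"ticker": "AAPL", "shares": 10, "current_price": 175.0}
value = position_value(holding)
print(f"{holding['ticker']}: ${value:,.2f}")  # AAPL: $1,750.00
```

Because `position_value` returns its result instead of printing it, you can check it with an `assert` or reuse it inside another function.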

You already have an empty `portfolio_analysis.py`. Rename it to `portfolio.py` so the imports in this guide work as written.

---

## Phase 1: Clean Up the Spec

> **Handwrite this yourself.** Your spec is what you're building. No code in this phase.

### Objective

`project.spec.md` currently has headings mixed up with content. Make it readable.

### Instructions

- [ ] Open `project.spec.md`
- [ ] Make sure each section heading (`## Project Name`, `## Category`, etc.) has its content **below** the heading, not in it
- [ ] Fix the typos ("theexpected", "Poject")
- [ ] Confirm MVP features and stretch goals are clearly separated

### Sample Output

```markdown
## Project Name

Portfolio Tracker

## Category

Data Science

## Description

This project tracks your securities portfolio. It reports portfolio
value, allocation, gain/loss per security, and expected return
(CAPM). It is for people interested in investing.

## Features

**Must have (MVP):**
- Total portfolio value
- Portfolio allocation by sector (with warning if one sector is too large)
- Monetary gain/loss for each security

**Nice to have (stretch):**
- Expected return (CAPM)
- Best and worst performers
```

> **Optional — get help from your agent:**
>
> ```text
> Read my @project.spec.md. The headings and content are mixed up.
> Rewrite it so each section has its content directly below. Don't
> change my ideas — just the formatting and typos.
> ```

---

## Phase 2: Initialize `uv`

> **Agent-assisted is fine here.** `uv init` is identical for every project.

### Objective

Your project doesn't have a `pyproject.toml` yet. Create one with `uv init` so the project runs through `uv`.

### Instructions

- [ ] From your project root, run `uv init`
- [ ] You don't strictly need pandas (your CSV reader is pure Python), but you can `uv add pandas` now if you plan to use it later
- [ ] Confirm `uv run python main.py` still prints your portfolio table

### Hints

**If `uv init` complains because `main.py` already exists**, that's fine — it'll skip creating one and just create `pyproject.toml`.

> **Optional — get help from your agent:**
>
> Skip — `uv init` is one command.

---

## Phase 3: Move Loading into `portfolio.py`

> **Handwrite this yourself.** This is how your tracker knows what stocks exist and what they're worth.

### Objective

Move `load_portfolio` from `main.py` into `portfolio.py`. While you're at it, read the CSV columns you weren't using yet (`current_price`, `sector`). Delete the hardcoded `industry_map`.

### Instructions

- [ ] Create `portfolio.py` (renaming your empty `portfolio_analysis.py` is the easiest way)
- [ ] Move `load_portfolio` into it
- [ ] Update it to read `current_price` and `sector` from the CSV
- [ ] Delete the old `add_industries` function and `industry_map` dict
- [ ] In `main.py`, update to `from portfolio import load_portfolio`

### Sample CSV (yours)

```
ticker,shares,purchase_price,current_price,sector,beta,dividend_yield,target_price
AAPL,10,150,175,Technology,1.20,0.005,190
MSFT,5,320,350,Technology,0.95,0.007,375
```

### Hints

**Header-based reading (tolerates reordered columns and skips the `` ```csv `` fence lines in your file):**

```python
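def load_portfolio(filename):
    """Read the portfolio CSV and return a list of holding dicts."""
    portfolio = []
    with open(filename, "r") as file:
        header = None
        for line in file:
            line = line.strip()
            # Skip blank lines and the Markdown fence lines (```csv / ```)
            if not line or line.startswith("```"):
                continue
            parts = [p.strip() for p in line.split(",")]
            # The first real line holds the column names
            if header is None:
                header = parts
                continue
            # Pair each column name with this row's value
            row = dict(zip(header, parts))
            # CSV values are strings, so convert the numeric ones
            portfolio.append({
                "ticker": row["ticker"],
                "shares": int(row["shares"]),
                "avg_cost": float(row["purchase_price"]),
                "current_price": float(row["current_price"]),
                "sector": row["sector"],
            })
    return portfolio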
```

**Why `dict(zip(header, parts))`?** It pairs each column name with its value. So if the CSV order changes or a new column is added, your code still works — you just pick the fields you care about.
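A small illustration of the pairing, using made-up header and row lists rather than your real file. It shows that looking up a field by name gives the same answer even when the column order changes:

```python
header = ["ticker", "shares", "purchase_price"]
parts = ["AAPL", "10", "150"]

# zip pairs items by position; dict turns the pairs into name -> value
row = dict(zip(header, parts))
print(row)  # {'ticker': 'AAPL', 'shares': '10', 'purchase_price': '150'}

# Same data with the columns in a different order: lookups by name still work.
reordered_header = ["shares", "ticker", "purchase_price"]
reordered_parts = ["10", "AAPL", "150"]
reordered_row = dict(zip(reordered_header, reordered_parts))

print(reordered_row["ticker"])  # AAPL
```

Two things to notice: every value is still a string, which is why `load_portfolio` wraps the numeric fields in `int()` and `float()`. Also, `zip` stops at the shorter list, so a row with a missing field would be missing that key and raise a `KeyError` when you look it up.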

> **Optional — get help from your agent:**
>
> ```text
> Walk me through how dict(zip(header, parts)) works in load_portfolio.
> Don't change the code — I want to understand it before I put it in.
> ```

---

## Phase 4: Compute Total Value, Gain/Loss, and Allocation

> **Handwrite this yourself.** These functions ARE your project. Every line matters on demo day.

### Objective

Add three small functions to `portfolio.py`. Each does one thing, returns data, and doesn't print.

### Instructions

- [ ] Add `total_value(portfolio)` → sum of `shares * current_price`
- [ ] Add `gain_loss(portfolio)` → returns a list of dicts, one per stock (ticker, gain, percent)
- [ ] Add `allocation(portfolio)` → dict of `sector → percent of total value`
- [ ] In `main.py`, import and call them, then print the results

### Sample Output

```
PORTFOLIO
------------------------------------------------------------
Ticker    Shares    Avg Cost    Sector
AAPL      10        150.00      Technology
MSFT      5         320.00      Technology
JPM       8         140.00      Financials
JNJ       6         156.00      Healthcare
------------------------------------------------------------

Total value: $5,718.00

Gain / Loss:
  AAPL   +$250.00  (+16.7%)
  MSFT   +$150.00   (+9.4%)
  JPM    +$120.00  (+10.7%)
  JNJ     +$42.00   (+4.5%)

Allocation:
  Technology   61.2%
  Financials   21.7%
  Healthcare   17.1%
```

### Hints

**`total_value`:**

```python
def total_value(portfolio):
    return sum(s["shares"] * s["current_price"] for s in portfolio)
```

**`gain_loss`:**

```python
def gain_loss(portfolio):
    results = []
    for s in portfolio:
        gain = (s["current_price"] - s["avg_cost"]) * s["shares"]
        percent = (s["current_price"] - s["avg_cost"]) / s["avg_cost"] * 100
        results.append({
            "ticker": s["ticker"],
            "gain": gain,
            "percent": percent,
        })
    return results
```
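To sanity-check the two formulas by hand, here is the arithmetic for the AAPL row from the sample CSV (10 shares, bought at 150, now at 175). This is only a worked check, not extra code for your module:

```python
# AAPL row from the sample CSV: 10 shares, bought at 150, now at 175.
shares, avg_cost, current_price = 10, 150.0, 175.0

gain = (current_price - avg_cost) * shares             # total dollars gained
percent = (current_price - avg_cost) / avg_cost * 100  # return on cost, in %

print(f"gain = {gain:.2f}")        # gain = 250.00
print(f"percent = {percent:.1f}")  # percent = 16.7
```

These match the AAPL line in the sample output. Note that `percent` does not depend on `shares`: it is the per-share price change relative to what you paid.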

**`allocation`:**

```python
def allocation(portfolio):
    total = total_value(portfolio)
    by_sector = {}
    for s in portfolio:
        value = s["shares"] * s["current_price"]
        by_sector[s["sector"]] = by_sector.get(s["sector"], 0) + value
    return {sector: (v / total) * 100 for sector, v in by_sector.items()}
```

**Why `.get(key, 0)`?** The first time a sector is seen, it's not in the dict yet. `.get(key, 0)` returns `0` instead of raising a KeyError, so `0 + value` works cleanly.
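A short sketch of the accumulation pattern on its own, with hypothetical position values standing in for `shares * current_price`:

```python
# Hypothetical position values, already computed as shares * current_price.
positions = [
    ("Technology", 1750.0),
    ("Technology", 1750.0),
    ("Financials", 1240.0),
]

by_sector = {}
for sector, value in positions:
    # First time a sector appears, .get returns 0 instead of raising KeyError.
    by_sector[sector] = by_sector.get(sector, 0) + value

print(by_sector)  # {'Technology': 3500.0, 'Financials': 1240.0}

# Converting to percentages: the shares of the total should add up to 100.
total = sum(by_sector.values())
percents = {sector: v / total * 100 for sector, v in by_sector.items()}
print(round(sum(percents.values()), 6))  # 100.0
```

Writing `by_sector[sector] += value` instead would raise a `KeyError` on the first Technology row, because the key does not exist yet.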

Type these out — don't copy-paste. They're small enough to retype, and retyping is how they stick.

> **Optional — get help from your agent:**
>
> ```text
> Walk me through my total_value function's generator expression:
> sum(s["shares"] * s["current_price"] for s in portfolio). What's
> the difference between that and a list comprehension? Don't change
> my code.
> ```

---

## Phase 5: Concentration Warning

> **Handwrite this yourself.** The threshold and message are your product decisions.

### Objective

Your spec calls for a warning if too much of your portfolio is in one sector. Pick a threshold and add the check.

### Instructions

- [ ] In `portfolio.py`, add `check_concentration(alloc, threshold=40.0)` that returns a list of warning strings
- [ ] In `main.py`, loop through the result and print each warning

### Hints

```python
CONCENTRATION_THRESHOLD = 40.0

def check_concentration(alloc, threshold=CONCENTRATION_THRESHOLD):
    warnings = []
    for sector, pct in alloc.items():
        if pct > threshold:
            warnings.append(
                f"Warning: {sector} is {pct:.1f}% of your portfolio — consider diversifying."
            )
    return warnings
```

Then in `main()`:

```python
for w in check_concentration(allocation(portfolio)):
    print(w)
```
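One detail worth deciding on purpose is what happens exactly at the threshold. The sketch below uses a condensed version of the hint (shorter message, same `pct > threshold` rule) and hypothetical allocation numbers chosen to hit the boundary:

```python
def check_concentration(alloc, threshold=40.0):
    # Condensed from the hint above so this snippet runs on its own.
    return [
        f"Warning: {sector} is {pct:.1f}% of your portfolio"
        for sector, pct in alloc.items()
        if pct > threshold
    ]


# Hypothetical allocation percentages, chosen to hit the boundary case.
sample_alloc = {"Technology": 53.5, "Financials": 40.0, "Healthcare": 6.5}

print(check_concentration(sample_alloc))
# ['Warning: Technology is 53.5% of your portfolio']

print(check_concentration(sample_alloc, threshold=60.0))
# []
```

Because the comparison is a strict `>`, a sector sitting at exactly 40.0% does not trigger a warning. If you would rather warn at the threshold itself, `>=` is the change to make; either choice is a product decision you should be able to explain.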

> **Optional — get help from your agent:**
>
> Skip — this is about 10 lines.

---

## Phase 6: Pretty Printing in `main.py`

> **Agent-assisted is fine here.** Column widths, dollar formatting, right-alignment — library code. None of it is financial logic.

### Objective

Make the table and the reports look clean in the terminal. This is the one phase where the agent does most of the work.

### Instructions

- [ ] Tidy `display_portfolio` to use padding widths you like
- [ ] Add separate helpers like `display_gain_loss` and `display_allocation` that take the data (already computed in Phase 4) and print it

### Hints

**Number formatting with f-strings:**

```python
print(f"Total value: ${value:,.2f}")     # commas + 2 decimals
print(f"  {ticker:<6} {gain:>+10,.2f}  ({percent:>+5.1f}%)")
```

The `:<6` means left-align in 6 chars. `:>10` means right-align in 10. `+` shows the sign. `,` adds thousand separators.
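To see what each format spec produces, the snippet below prints the pieces separately with `repr` so the padding is visible. The sample output in Phase 4 shows gains as `+$250.00`, with the sign before the dollar symbol; a format spec alone cannot place `$` between the sign and the digits, so `format_signed_dollars` is a hypothetical helper sketching one way to get that look:

```python
ticker, gain, percent = "AAPL", 250.0, 16.666

print(repr(f"{ticker:<6}"))       # 'AAPL  '
print(repr(f"{gain:>+10,.2f}"))   # '   +250.00'
print(repr(f"{percent:>+5.1f}"))  # '+16.7'
print(f"${1234567.891:,.2f}")     # $1,234,567.89


def format_signed_dollars(amount):
    """Hypothetical helper: put the sign before the dollar symbol."""
    sign = "+" if amount >= 0 else "-"
    return f"{sign}${abs(amount):,.2f}"


print(format_signed_dollars(250.0))  # +$250.00
print(format_signed_dollars(-42.5))  # -$42.50
```

A helper like this belongs in `main.py`, since it only changes how a number looks and not how it is computed.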

> **Optional — get help from your agent:**
>
> ```text
> Here's my `display_gain_loss(results)` function. Help me make the
> columns align cleanly with f-string formatting. Don't touch the
> math functions in portfolio.py. Show me the before/after.
> ```

---

## Checkpoint 2 Readiness

By Thursday, April 23 at 3pm:

- [ ] `project.spec.md` is cleaned up and readable
- [ ] `pyproject.toml` exists
- [ ] `portfolio.py` exists with `load_portfolio`, `total_value`, `gain_loss`, `allocation`, `check_concentration`
- [ ] `portfolio.py` does **not** print anything
- [ ] Old hardcoded `industry_map` removed
- [ ] Concentration warning prints when a sector is > 40%
- [ ] Checkpoint 2 entry in `project.journal.md`
- [ ] Committed and pushed

## Helpful Resources

- [Checkpoint 2 Instructions](../../projects/final-project-checkpoint-2.project.md)
- [Lecture 1: The MVP](../../lectures/01-the-mvp/01-the-mvp.lecture.md)
- [Data Science Setup Guide](../../resources/data-science-setup.guide.md)

