Nate's Project Guide

Project: Portfolio Tracker
Category: Data Science
Last updated: April 18

Note: This guide reflects the state of your project repo as of the last update above. It may be out of date if you've pushed work since then.

Where You Are

You have a working starting slice: main.py reads data/portfolio.csv and prints a clean table with ticker / shares / avg cost / industry. Good news — your CSV already has the columns you need (current_price, sector, beta, dividend_yield, target_price). You just aren't using most of them yet.

This week: clean up the spec, initialize uv, and move your analysis functions into a portfolio.py business-logic module.

Project Structure

Your project splits into two kinds of code:

Business logic — you handwrite this. Loading the portfolio, computing value, computing gain/loss, computing allocation, the concentration warning rule. These are the rules of your tracker.
View / CLI — agent-assisted is fine. Pretty table printing, formatting numbers, column widths. Nothing financial happens here.

Target layout by Thursday:

final-project-NateMalkin/
├── main.py                 ← CLI driver (printing) — agent-assisted OK
├── portfolio.py            ← business logic — handwrite (yours to own)
├── pyproject.toml
└── data/
    └── portfolio.csv       ← data

Why the split? On demo day you'll be asked "how does your tracker decide what's over-concentrated?" The answer is a rule in portfolio.py. The rule is yours. The pretty-printed table is not.

portfolio.py should not print anything. It returns data. main.py does the printing.
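A minimal sketch of that split, assuming a hypothetical two-holding list in place of your CSV data. `format_total` is an illustrative name, not something from your repo:

```python
# portfolio.py side: compute and return, never print.
def total_value(portfolio):
    return sum(s["shares"] * s["current_price"] for s in portfolio)


# main.py side: take the returned data and decide how it looks.
def format_total(value):  # hypothetical helper name
    return f"Total value: ${value:,.2f}"


# Hypothetical holdings, standing in for what load_portfolio returns.
holdings = [
    {"ticker": "AAPL", "shares": 10, "current_price": 175.0},
    {"ticker": "MSFT", "shares": 5, "current_price": 350.0},
]
print(format_total(total_value(holdings)))
```

Because total_value only returns a number, you can check it without reading terminal output, and you can change the display later without touching the math.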

You already have an empty portfolio_analysis.py. Rename it to portfolio.py instead of creating a new file, so that from portfolio import ... works in main.py.

Phase 1: Clean Up the Spec

Handwrite this yourself. Your spec is what you're building. No code in this phase.

Objective

project.spec.md currently has headings mixed up with content. Make it readable.

Instructions

Open project.spec.md
Make sure each section heading (## Project Name, ## Category, etc.) has its content below the heading, not in it
Fix the typos ("theexpected", "Poject")
Confirm MVP features and stretch goals are clearly separated

Sample Output

## Project Name

Portfolio Tracker

## Category

Data Science

## Description

This project tracks your securities portfolio. It reports portfolio
value, allocation, gain/loss per security, and expected return
(CAPM). It is for people interested in investing.

## Features

**Must have (MVP):**
- Total portfolio value
- Portfolio allocation by sector (with warning if one sector is too large)
- Monetary gain/loss for each security

**Nice to have (stretch):**
- Expected return (CAPM)
- Best and worst performers

Optional — get help from your agent:

Read my @project.spec.md. The headings and content are mixed up.
Rewrite it so each section has its content directly below. Don't
change my ideas — just the formatting and typos.

Phase 2: Initialize `uv`

Agent-assisted is fine here. uv init is identical for every project.

Objective

Your project doesn't have a pyproject.toml yet. Create one so uv can manage the project and its dependencies.

Instructions

From your project root, run uv init
You don't strictly need pandas (your CSV reader is pure Python), but you can uv add pandas now if you plan to use it later
Confirm uv run python main.py still prints your portfolio table

Hints

If uv init complains because main.py already exists, that's fine — it'll skip creating one and just create pyproject.toml.

Optional — get help from your agent:

Skip — uv init is one command.

Phase 3: Move Loading into `portfolio.py`

Handwrite this yourself. This is how your tracker knows what stocks exist and what they're worth.

Objective

Move load_portfolio from main.py into portfolio.py. While you're at it, read the CSV columns you weren't using yet (current_price, sector). Delete the hardcoded industry_map.

Instructions

Create portfolio.py (rename the empty portfolio_analysis.py if you still have it)
Move load_portfolio into it
Update it to read current_price and sector from the CSV
Delete the old add_industries function and industry_map dict
In main.py, update to from portfolio import load_portfolio

Sample CSV (yours)

ticker,shares,purchase_price,current_price,sector,beta,dividend_yield,target_price
AAPL,10,150,175,Technology,1.20,0.005,190
MSFT,5,320,350,Technology,0.95,0.007,375

Hints

Header-based reading (flexible and handles the ```csv fence in your file):

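def load_portfolio(filename):
    """Read the portfolio CSV and return a list of dicts, one per holding."""
    portfolio = []
    with open(filename, "r", encoding="utf-8") as file:
        header = None
        for line in file:
            line = line.strip()
            # Skip blank lines and the ``` fence lines around the data
            if not line or line.startswith("```"):
                continue
            parts = [p.strip() for p in line.split(",")]
            if header is None:
                header = parts  # first real line holds the column names
                continue
            row = dict(zip(header, parts))  # column name -> value for this row
            # Keep only the fields the tracker uses, converted to numbers
            portfolio.append({
                "ticker": row["ticker"],
                "shares": int(row["shares"]),
                "avg_cost": float(row["purchase_price"]),
                "current_price": float(row["current_price"]),
                "sector": row["sector"],
            })
    return portfolio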

Why dict(zip(header, parts))? It pairs each column name with its value. So if the CSV order changes or a new column is added, your code still works — you just pick the fields you care about.
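To see the pairing in isolation, here is a small sketch that runs the same header logic on in-memory text instead of a file. `parse_rows` and the three-column sample are hypothetical, made up for illustration. The columns are deliberately in a different order than your CSV:

```python
import io

FENCE = "`" * 3  # the three-backtick marker, built this way to keep the example tidy


def parse_rows(lines):
    """Yield one dict per data row, keyed by the header names."""
    header = None
    for line in lines:
        line = line.strip()
        # Skip blank lines and fence lines, as load_portfolio does.
        if not line or line.startswith(FENCE):
            continue
        parts = [p.strip() for p in line.split(",")]
        if header is None:
            header = parts  # first real line names the columns
            continue
        yield dict(zip(header, parts))


# Hypothetical file contents: fenced, with columns in a different order.
sample = "\n".join([
    FENCE + "csv",
    "sector,ticker,current_price",
    "Technology,AAPL,175",
    FENCE,
])
rows = list(parse_rows(io.StringIO(sample)))
print(rows[0]["ticker"], rows[0]["current_price"])
```

Two things to notice: every value comes back as a string (hence the int(...) and float(...) calls in load_portfolio), and zip stops at the shorter sequence, so a row with missing trailing fields produces a dict without those keys and fails later with a KeyError.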

Optional — get help from your agent:

Walk me through how dict(zip(header, parts)) works in load_portfolio.
Don't change the code — I want to understand it before I put it in.

Phase 4: Compute Total Value, Gain/Loss, and Allocation

Handwrite this yourself. These functions ARE your project. Every line matters on demo day.

Objective

Three small functions in portfolio.py — each does one thing, returns data, doesn't print.

Instructions

Add total_value(portfolio) → sum of shares * current_price
Add gain_loss(portfolio) → returns a list of dicts, one per stock (ticker, gain, percent)
Add allocation(portfolio) → dict of sector → percent of total value
In main.py, import and call them, then print the results
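As a quick arithmetic check on those formulas, here is a sketch that works through the AAPL and MSFT rows from the Phase 3 sample CSV by hand. `position_stats` is a hypothetical helper used only for this check. It is not one of the three functions you are asked to write:

```python
def position_stats(shares, cost, price):
    """Return (value, gain, percent) for a single holding."""
    value = shares * price                 # what the position is worth now
    gain = (price - cost) * shares         # dollars gained or lost
    percent = (price - cost) / cost * 100  # gain relative to purchase price
    return value, gain, percent


# (ticker, shares, purchase_price, current_price) from the sample CSV
rows = [
    ("AAPL", 10, 150.0, 175.0),
    ("MSFT", 5, 320.0, 350.0),
]

for ticker, shares, cost, price in rows:
    value, gain, percent = position_stats(shares, cost, price)
    print(f"{ticker}: value={value:.2f} gain={gain:+.2f} percent={percent:+.1f}%")
```

The gains (+250.00 and +150.00) and percentages (+16.7% and +9.4%) agree with the AAPL and MSFT lines in the sample output.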

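Sample Output

Abbreviated: the three sectors listed sum to 87.3%, so the total and allocation reflect holdings beyond the four rows shown.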

PORTFOLIO
------------------------------------------------------------
Ticker    Shares    Avg Cost    Sector
AAPL      10        150.00      Technology
MSFT      5         320.00      Technology
JPM       8         140.00      Financials
JNJ       6         156.00      Healthcare
------------------------------------------------------------

Total value: $6,548.00

Gain / Loss:
  AAPL   +$250.00  (+16.7%)
  MSFT   +$150.00   (+9.4%)
  JPM    +$120.00  (+10.7%)
  JNJ     +$42.00   (+4.5%)

Allocation:
  Technology   53.5%
  Financials   18.9%
  Healthcare   14.9%

Hints

total_value:

def total_value(portfolio):
    return sum(s["shares"] * s["current_price"] for s in portfolio)

gain_loss:

def gain_loss(portfolio):
    results = []
    for s in portfolio:
        gain = (s["current_price"] - s["avg_cost"]) * s["shares"]
        percent = (s["current_price"] - s["avg_cost"]) / s["avg_cost"] * 100
        results.append({
            "ticker": s["ticker"],
            "gain": gain,
            "percent": percent,
        })
    return results

allocation:

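def allocation(portfolio):
    total = total_value(portfolio)
    if total == 0:
        return {}  # empty portfolio: nothing to allocate, and avoids dividing by zero
    by_sector = {}
    for s in portfolio:
        value = s["shares"] * s["current_price"]
        # Running total per sector; .get returns 0 the first time a sector appears
        by_sector[s["sector"]] = by_sector.get(s["sector"], 0) + value
    return {sector: (v / total) * 100 for sector, v in by_sector.items()}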

Why .get(key, 0)? The first time a sector is seen, it's not in the dict yet. .get(key, 0) returns 0 instead of raising a KeyError, so 0 + value works cleanly.
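The accumulation pattern on its own, as a small sketch. `sum_by_sector` and the position values are hypothetical, chosen only to show a sector appearing more than once:

```python
def sum_by_sector(positions):
    """Add up (sector, value) pairs into a dict of sector -> total value."""
    by_sector = {}
    for sector, value in positions:
        # First sighting: .get returns 0, so this becomes 0 + value.
        # Later sightings: .get returns the running total so far.
        by_sector[sector] = by_sector.get(sector, 0) + value
    return by_sector


# Hypothetical position values, for illustration only.
positions = [("Technology", 1750.0), ("Financials", 900.0), ("Technology", 1750.0)]
totals = sum_by_sector(positions)
print(totals)
```

Writing by_sector[sector] += value instead would raise a KeyError on the first Technology row, because there is no existing entry to add to.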

Type these out — don't copy-paste. They're small enough to retype, and retyping is how they stick.

Optional — get help from your agent:

Walk me through my total_value function's generator expression:
sum(s["shares"] * s["current_price"] for s in portfolio). What's
the difference between that and a list comprehension? Don't change
my code.

Phase 5: Concentration Warning

Handwrite this yourself. The threshold and message are your product decisions.

Objective

Your spec calls for a warning if too much of your portfolio is in one sector. Pick a threshold and add the check.

Instructions

In portfolio.py, add check_concentration(alloc, threshold=40.0) that returns a list of warning strings
In main.py, loop through the result and print each warning

Hints

CONCENTRATION_THRESHOLD = 40.0

def check_concentration(alloc, threshold=CONCENTRATION_THRESHOLD):
    warnings = []
    for sector, pct in alloc.items():
        if pct > threshold:
            warnings.append(
                f"Warning: {sector} is {pct:.1f}% of your portfolio — consider diversifying."
            )
    return warnings

Then in main():

for w in check_concentration(allocation(portfolio)):
    print(w)
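To see how the strict > comparison behaves at the boundary, here is a small sketch. `flagged_sectors` is a hypothetical, simplified stand-in for check_concentration that returns sector names instead of warning strings. The allocation figures are the ones from the Phase 4 sample output:

```python
CONCENTRATION_THRESHOLD = 40.0


def flagged_sectors(alloc, threshold=CONCENTRATION_THRESHOLD):
    """Return the sectors whose share of the portfolio exceeds the threshold."""
    return [sector for sector, pct in alloc.items() if pct > threshold]


sample_alloc = {"Technology": 53.5, "Financials": 18.9, "Healthcare": 14.9}
print(flagged_sectors(sample_alloc))

# A sector sitting exactly on the threshold is not flagged, because > is strict.
print(flagged_sectors({"Technology": 40.0}))

# Passing a different threshold changes the rule without editing the function.
print(flagged_sectors(sample_alloc, threshold=15.0))
```

Whether exactly 40% should warn (>= instead of >) is one of the product decisions this phase asks you to make, so be ready to explain the choice.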

Optional — get help from your agent:

Skip — this is about 10 lines.

Phase 6: Pretty Printing in `main.py`

Agent-assisted is fine here. Column widths, dollar formatting, and right-alignment are display concerns. None of it is financial logic.

Objective

Make the table and the reports look clean in the terminal. This is the one phase where the agent does most of the work.

Instructions

Tidy display_portfolio to use padding widths you like
Add separate helpers like display_gain_loss and display_allocation that take the data (already computed in Phase 4) and print it

Hints

Number formatting with f-strings:

print(f"Total value: ${value:,.2f}")     # commas + 2 decimals
print(f"  {ticker:<6} {gain:>+10,.2f}  ({percent:>+5.1f}%)")

The :<6 means left-align in 6 chars. :>10 means right-align in 10. + shows the sign. , adds thousand separators.
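The Phase 4 sample output puts the sign before the dollar sign (+$250.00). A format spec has no currency symbol, so one way to get that look is to build the money string first and then pad it. This is a sketch with hypothetical helper names (`format_money_signed`, `format_gain_line`). Your display_gain_loss may be organized differently:

```python
def format_money_signed(amount):
    """Format a dollar amount with an explicit sign in front of the $."""
    sign = "-" if amount < 0 else "+"
    return f"{sign}${abs(amount):,.2f}"  # commas + 2 decimals on the magnitude


def format_gain_line(ticker, gain, percent):
    """Build one aligned line of the gain/loss report."""
    money = format_money_signed(gain)
    # Ticker left-aligned in 6 chars, money right-aligned in 10, signed percent.
    return f"  {ticker:<6} {money:>10}  ({percent:+.1f}%)"


print(format_gain_line("AAPL", 250.0, 16.666))
print(format_gain_line("JNJ", 42.0, 4.487))
print(format_gain_line("XYZ", -1234.5, -12.3))  # hypothetical losing position
```

Right-aligning the finished string keeps the decimal points lined up even when amounts have different numbers of digits.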

Optional — get help from your agent:

Here's my `display_gain_loss(results)` function. Help me make the
columns align cleanly with f-string formatting. Don't touch the
math functions in portfolio.py. Show me the before/after.

Checkpoint 2 Readiness

By Thursday, April 23 at 3pm:

project.spec.md is cleaned up and readable
pyproject.toml exists
portfolio.py exists with load_portfolio, total_value, gain_loss, allocation, check_concentration
portfolio.py does not print anything
Old hardcoded industry_map removed
Concentration warning prints when a sector is > 40%
Checkpoint 2 entry in project.journal.md
Committed and pushed