Nate's Project Guide
Project: Portfolio Tracker
Category: Data Science
Last updated: April 18
Note: This guide reflects the state of your project repo when it was last updated. If you've worked on the project since then, it may not match your current code.
Where You Are
You have a working starting slice: main.py reads data/portfolio.csv and prints a clean table with ticker / shares / avg cost / industry. Good news — your CSV already has the columns you need (current_price, sector, beta, dividend_yield, target_price). You just aren't using most of them yet.
This week: clean up the spec, initialize uv, and move your analysis functions into a portfolio.py business-logic module.
Project Structure
Your project splits into two kinds of code:
- Business logic — you handwrite this. Loading the portfolio, computing value, computing gain/loss, computing allocation, the concentration warning rule. These are the rules of your tracker.
- View / CLI — agent-assisted is fine. Pretty table printing, formatting numbers, column widths. Nothing financial happens here.
Target layout by Thursday:
final-project-NateMalkin/
├── main.py ← CLI driver (printing) — agent-assisted OK
├── portfolio.py ← business logic — handwrite (yours to own)
├── pyproject.toml
└── data/
    └── portfolio.csv ← data
Why the split? On demo day you'll be asked "how does your tracker decide what's over-concentrated?" The answer is a rule in portfolio.py. The rule is yours. The pretty-printed table is not.
portfolio.py should not print anything. It returns data. main.py does the printing.
You already have an empty portfolio_analysis.py — rename it to portfolio.py (or just reuse it).
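A minimal sketch of the "returns data vs. prints" split, shown in a single file for brevity. The names summarize and show and the two-row sample are hypothetical; in your project the first function would live in portfolio.py and the second in main.py.

```python
# Sketch only: names and data are hypothetical, not taken from your repo.

def summarize(holdings):
    """Business-logic style: compute and return data, never print."""
    return {
        "count": len(holdings),
        "tickers": [h["ticker"] for h in holdings],
    }


def show(summary):
    """View style: the only place that prints."""
    print(f"{summary['count']} holdings: {', '.join(summary['tickers'])}")


holdings = [{"ticker": "AAPL"}, {"ticker": "MSFT"}]
summary = summarize(holdings)
show(summary)
```

Because summarize returns a plain dict, you can check it without reading terminal output, which is the practical payoff of keeping print calls out of portfolio.py.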
Phase 1: Clean Up the Spec
Handwrite this yourself. Your spec is what you're building. No code in this phase.
Objective
project.spec.md currently has headings mixed up with content. Make it readable.
Instructions
Sample Output
## Project Name
Portfolio Tracker
## Category
Data Science
## Description
This project tracks your securities portfolio. It reports portfolio
value, allocation, gain/loss per security, and expected return
(CAPM). It is for people interested in investing.
## Features
**Must have (MVP):**
- Total portfolio value
- Portfolio allocation by sector (with warning if one sector is too large)
- Monetary gain/loss for each security
**Nice to have (stretch):**
- Expected return (CAPM)
- Best and worst performers
Optional — get help from your agent:
Read my @project.spec.md. The headings and content are mixed up. Rewrite it so each section has its content directly below. Don't change my ideas — just the formatting and typos.
Phase 2: Initialize uv
Agent-assisted is fine here.
uv init is identical for every project.
Objective
Your project doesn't have a pyproject.toml yet.
Instructions
Hints
If uv init complains because main.py already exists, that's fine — it'll skip creating one and just create pyproject.toml.
Optional — get help from your agent:
Skip: uv init is one command.
Phase 3: Move Loading into portfolio.py
Handwrite this yourself. This is how your tracker knows what stocks exist and what they're worth.
Objective
Move load_portfolio from main.py into portfolio.py. While you're at it, read the CSV columns you weren't using yet (current_price, sector). Delete the hardcoded industry_map.
Instructions
Sample CSV (yours)
ticker,shares,purchase_price,current_price,sector,beta,dividend_yield,target_price
AAPL,10,150,175,Technology,1.20,0.005,190
MSFT,5,320,350,Technology,0.95,0.007,375
Hints
Header-based reading (flexible and handles the ```csv fence in your file):
def load_portfolio(filename):
    portfolio = []
    with open(filename, "r") as file:
        header = None
        for line in file:
            line = line.strip()
            if not line or line.startswith("```"):
                continue
            parts = [p.strip() for p in line.split(",")]
            if header is None:
                header = parts
                continue
            row = dict(zip(header, parts))
            portfolio.append({
                "ticker": row["ticker"],
                "shares": int(row["shares"]),
                "avg_cost": float(row["purchase_price"]),
                "current_price": float(row["current_price"]),
                "sector": row["sector"],
            })
    return portfolio
Why dict(zip(header, parts))? It pairs each column name with its value. So if the CSV order changes or a new column is added, your code still works — you just pick the fields you care about.
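To see the pairing in isolation, here is a small sketch using the first row of the sample CSV. The variable names and the reordered second header are illustrative assumptions, not code from your repo.

```python
# Header and one data row, already split on commas (values are still strings).
header = ["ticker", "shares", "purchase_price", "current_price", "sector"]
parts = ["AAPL", "10", "150", "175", "Technology"]

# zip pairs items by position; dict turns the pairs into name -> value lookups.
row = dict(zip(header, parts))
print(row["ticker"], row["current_price"])

# Same data with the columns in a different order: lookups by name still work.
header_reordered = ["sector", "ticker", "current_price"]
parts_reordered = ["Technology", "AAPL", "175"]
row_reordered = dict(zip(header_reordered, parts_reordered))
print(row_reordered["ticker"], row_reordered["current_price"])
```

Both print calls show AAPL 175. Note that the values are strings at this point, which is why load_portfolio wraps them in int(...) and float(...).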
Optional — get help from your agent:
Walk me through how dict(zip(header, parts)) works in load_portfolio. Don't change the code — I want to understand it before I put it in.
Phase 4: Compute Total Value, Gain/Loss, and Allocation
Handwrite this yourself. These functions ARE your project. Every line matters on demo day.
Objective
Three small functions in portfolio.py — each does one thing, returns data, doesn't print.
Instructions
Sample Output
PORTFOLIO
------------------------------------------------------------
Ticker    Shares    Avg Cost    Sector
AAPL          10      150.00    Technology
MSFT           5      320.00    Technology
JPM            8      140.00    Financials
JNJ            6      156.00    Healthcare
------------------------------------------------------------
Total value: $6,548.00
Gain / Loss:
  AAPL    +$250.00   (+16.7%)
  MSFT    +$150.00    (+9.4%)
  JPM     +$120.00   (+10.7%)
  JNJ      +$42.00    (+4.5%)
Allocation:
  Technology    53.5%
  Financials    18.9%
  Healthcare    14.9%
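As a sanity check, some of the sample figures can be reproduced by hand. This sketch uses only the AAPL and MSFT rows from the sample CSV and the $6,548.00 total from the sample output; the variable names are illustrative.

```python
# AAPL row from the sample CSV: 10 shares bought at 150, currently 175.
shares, purchase_price, current_price = 10, 150.0, 175.0

gain = (current_price - purchase_price) * shares                    # dollars gained
percent = (current_price - purchase_price) / purchase_price * 100   # percent return
aapl_value = shares * current_price                                 # position value

# MSFT row from the sample CSV: 5 shares, currently 350.
msft_value = 5 * 350.0

# Technology share of the portfolio, using the total from the sample output.
total = 6548.00
tech_pct = (aapl_value + msft_value) / total * 100

print(f"AAPL gain: {gain:+,.2f} ({percent:+.1f}%)")
print(f"Technology: {tech_pct:.1f}%")
```

This prints a gain of +250.00 (+16.7%) for AAPL and 53.5% for Technology, matching the sample output.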
Hints
total_value:
def total_value(portfolio):
    return sum(s["shares"] * s["current_price"] for s in portfolio)
gain_loss:
def gain_loss(portfolio):
    results = []
    for s in portfolio:
        gain = (s["current_price"] - s["avg_cost"]) * s["shares"]
        percent = (s["current_price"] - s["avg_cost"]) / s["avg_cost"] * 100
        results.append({
            "ticker": s["ticker"],
            "gain": gain,
            "percent": percent,
        })
    return results
allocation:
def allocation(portfolio):
    total = total_value(portfolio)
    by_sector = {}
    for s in portfolio:
        value = s["shares"] * s["current_price"]
        by_sector[s["sector"]] = by_sector.get(s["sector"], 0) + value
    return {sector: (v / total) * 100 for sector, v in by_sector.items()}
Why .get(key, 0)? The first time a sector is seen, it's not in the dict yet. .get(key, 0) returns 0 instead of raising a KeyError, so 0 + value works cleanly.
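A small sketch of that accumulation pattern on its own. The (sector, value) pairs are illustrative numbers, not read from your CSV.

```python
# Illustrative (sector, position value) pairs.
positions = [
    ("Technology", 1750.0),
    ("Technology", 1750.0),
    ("Financials", 1240.0),
]

by_sector = {}
for sector, value in positions:
    # First time a sector appears, .get returns the default 0, so this is 0 + value.
    # After that, .get returns the running total for that sector.
    by_sector[sector] = by_sector.get(sector, 0) + value

print(by_sector)

# For comparison, plain indexing on a missing key raises KeyError.
try:
    {}["Technology"]
except KeyError:
    print("KeyError without .get")
```

The first print shows {'Technology': 3500.0, 'Financials': 1240.0}: the two Technology positions were summed and Financials was created on first sight.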
Type these out — don't copy-paste. They're small enough to retype, and retyping is how they stick.
Optional — get help from your agent:
Walk me through my total_value function's generator expression: sum(s["shares"] * s["current_price"] for s in portfolio). What's the difference between that and a list comprehension? Don't change my code.
Phase 5: Concentration Warning
Handwrite this yourself. The threshold and message are your product decisions.
Objective
Your spec calls for a warning if too much of your portfolio is in one sector. Pick a threshold and add the check.
Instructions
Hints
CONCENTRATION_THRESHOLD = 40.0

def check_concentration(alloc, threshold=CONCENTRATION_THRESHOLD):
    warnings = []
    for sector, pct in alloc.items():
        if pct > threshold:
            warnings.append(
                f"Warning: {sector} is {pct:.1f}% of your portfolio — consider diversifying."
            )
    return warnings
Then in main():
for w in check_concentration(allocation(portfolio)):
    print(w)
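To see how the threshold behaves, here is a sketch that applies the same pct > threshold comparison to the allocation percentages from the Phase 4 sample output. flagged_sectors is a hypothetical helper that returns sector names instead of messages; it is not a replacement for your check_concentration.

```python
CONCENTRATION_THRESHOLD = 40.0  # same value as in the hint


def flagged_sectors(alloc, threshold=CONCENTRATION_THRESHOLD):
    """Hypothetical helper: names of sectors strictly above the threshold."""
    return [sector for sector, pct in alloc.items() if pct > threshold]


sample_alloc = {"Technology": 53.5, "Financials": 18.9, "Healthcare": 14.9}

print(flagged_sectors(sample_alloc))                  # ['Technology']
print(flagged_sectors({"Technology": 40.0}))          # [] because > is strict
print(flagged_sectors(sample_alloc, threshold=15.0))  # ['Technology', 'Financials']
```

The middle case is the one worth deciding on purpose: with >, a sector sitting at exactly the threshold does not trigger a warning. If you want it to, use >= instead.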
Optional — get help from your agent:
Skip — this is about 10 lines.
Phase 6: Pretty Printing in main.py
Agent-assisted is fine here. Column widths, dollar formatting, and right-alignment are all presentation details. None of it is financial logic.
Objective
Make the table and the reports look clean in the terminal. This is the one phase where the agent does most of the work.
Instructions
Hints
Number formatting with f-strings:
print(f"Total value: ${value:,.2f}") # commas + 2 decimals
print(f" {ticker:<6} {gain:>+10,.2f} ({percent:>+5.1f}%)")
In these format specs, :<6 left-aligns in 6 characters and :>10 right-aligns in 10. The + always shows the sign, the comma adds thousands separators, and .2f rounds to two decimal places.
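A runnable sketch of those format specs using numbers from the sample output. The signed_dollars helper is a hypothetical addition: the hint's f-string prints +250.00, while the sample output shows +$250.00, and putting the dollar sign after the sign takes one extra step.

```python
value = 6548.0
ticker, gain, percent = "AAPL", 250.0, 16.666

line_total = f"Total value: ${value:,.2f}"
line_gain = f"  {ticker:<6} {gain:>+10,.2f} ({percent:>+5.1f}%)"
print(line_total)
print(line_gain)


def signed_dollars(amount):
    """Hypothetical helper: format like +$250.00 or -$42.50."""
    sign = "-" if amount < 0 else "+"
    return f"{sign}${abs(amount):,.2f}"


# Build the string first, then pad it, so the sign and $ stay together.
print(f"  {ticker:<6} {signed_dollars(gain):>12} ({percent:>+5.1f}%)")
```

Formatting the amount into a string before applying the width is a design choice: a width applied directly to the number would pad between the dollar sign and the digits.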
Optional — get help from your agent:
Here's my `display_gain_loss(results)` function. Help me make the columns align cleanly with f-string formatting. Don't touch the math functions in portfolio.py. Show me the before/after.