Miranda's Project Guide

Project: PlainText Pal
Category: Web Development (Flask) + Anthropic
Last updated: April 18

Note: This guide reflects your project repo as of the last update. If you've committed work since then, some details here may be out of date.

Where You Are

Your MVP is mostly working. app.py routes are clean. pal.py has a real Anthropic integration with JSON parsing, fenced-block extraction, and fallbacks — the thoughtfulness in _extract_json_object (three parsing strategies) is impressive.

What's left: the readability stats your spec mentions, some edge-case polish, and a nicer results page. We'll also draw a clear line inside pal.py between your handwritten logic and the library call.

Project Structure

Your project splits into two kinds of code:

Business logic — you handwrite this. The JSON parser in pal.py (three fallback strategies — your design), the readability stats in stats.py (your metric choices: average sentence length, long-sentence count, complex-word count). This is what makes PlainText Pal different from raw Claude.
Library / view code — agent-assisted is fine. The Anthropic API call itself, Flask routes, HTML templates.

Target layout by Thursday:

final-project-MirandaMireles/
├── app.py                  ← Flask routes — agent-assisted OK
├── pal.py                  ← LLM call + JSON parser (mixed — see below)
├── stats.py                ← readability stats — handwrite (NEW, yours to own)
├── pyproject.toml
├── templates/              ← HTML — agent-assisted OK
└── static/

About pal.py — it's mixed on purpose. The call to Anthropic is library code. The prompt content and the JSON parsing are yours. A reasonable rule: if you need to look up Anthropic SDK docs to understand a line, that's library code. If you designed the line, that's business logic.

Why the split? From Lecture 1 (The MVP): on demo day the interesting questions are "how do you handle Claude returning malformed JSON?" and "what readability metrics do you show and why?" Those answers live in pal.py's parser and in stats.py.

stats.py should not import flask or anthropic. Pure text processing.
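
One way to check that rule mechanically is to parse the file and list what it imports. The sketch below is only an illustration: forbidden_imports and the FORBIDDEN set are hypothetical names, not something in your repo, and it inspects a source string rather than reading stats.py from disk.

```python
import ast

FORBIDDEN = {"flask", "anthropic"}  # packages stats.py should stay free of


def forbidden_imports(source, forbidden=FORBIDDEN):
    """Return the sorted forbidden top-level packages imported by `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # module is None for "from . import x"
        else:
            continue
        for name in names:
            top = name.split(".")[0]  # "anthropic.types" -> "anthropic"
            if top in forbidden:
                found.add(top)
    return sorted(found)


# Made-up file contents for the demo
clean = "import re\nimport textstat\n"
leaky = "import re\nfrom flask import request\n"

print(forbidden_imports(clean))
print(forbidden_imports(leaky))
```

To check the real file, you could pass in the text of stats.py (for example via pathlib's Path("stats.py").read_text()) and expect an empty list.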

Phase 1: Build `stats.py` (readability metrics)

Handwrite this yourself. Which metrics you show and how you compute long sentences and complex words — those are your product decisions. Readers will ask.

Objective

Your spec lists readability metrics (average sentence length, long sentence count, flagging complex words). Put them in a new stats.py file.

Instructions

Run uv add textstat
Create stats.py at the project root
Write get_readability_stats(text) that returns a dict with: flesch_ease, avg_sentence_length, long_sentences, complex_words
Write a helper _count_long_sentences(text, threshold) — textstat doesn't do this directly

Sample Output (what shows on the result page)

Readability Stats
  • Flesch Reading Ease: 62.3 (plain English)
  • Average Sentence Length: 18.4 words
  • Long sentences (>25 words): 3
  • Complex words: 12
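
The label in parentheses is something you would add yourself, since textstat.flesch_reading_ease only returns a number. Here is a minimal sketch of one way to do it, assuming the commonly cited Flesch bands (60 to 70 is usually described as plain English). The helper name flesch_label and the exact label wording are my assumptions, so change them to fit your product.

```python
def flesch_label(score):
    """Map a Flesch Reading Ease score to a short label (bands are approximate)."""
    bands = [
        (90, "very easy"),
        (80, "easy"),
        (70, "fairly easy"),
        (60, "plain English"),
        (50, "fairly difficult"),
        (30, "difficult"),
    ]
    # Bands are ordered from highest floor to lowest, so the first match wins
    for floor, label in bands:
        if score >= floor:
            return label
    return "very difficult"


print(f"Flesch Reading Ease: 62.3 ({flesch_label(62.3)})")
```

Whether this lives in stats.py or in the template is your call; keeping it in stats.py means you can explain the bands in one place.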

Hints

The textstat functions you need, plus the long-sentence helper:

# stats.py
import re
import textstat


def _count_long_sentences(text, threshold=25):
    """Count sentences with more than `threshold` words."""
    # Split on runs of ., !, or ? (empty trailing pieces have 0 words, so they never count)
    sentences = re.split(r"[.!?]+", text)
    return sum(1 for s in sentences if len(s.split()) > threshold)


def get_readability_stats(text):
    """Return the four readability metrics shown on the results page."""
    sentences = max(textstat.sentence_count(text), 1)   # avoid division by zero
    words = textstat.lexicon_count(text, removepunct=True)
    return {
        "flesch_ease": round(textstat.flesch_reading_ease(text), 1),
        "avg_sentence_length": round(words / sentences, 1),
        "long_sentences": _count_long_sentences(text),
        "complex_words": textstat.difficult_words(text),
    }

Pick the threshold yourself. I suggested 25 words for "long sentence", but 20 or 30 are defensible — pick one and be ready to explain why.
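
To see how much the threshold matters, it can help to run the same text through a few candidates. The sketch below reuses the regex split from the hint above on a made-up sample; count_long_sentences is a standalone copy for illustration, and the sample text is invented, not taken from your app.

```python
import re


def count_long_sentences(text, threshold):
    """Count sentences with more than `threshold` words (same split as the hint above)."""
    sentences = re.split(r"[.!?]+", text)
    return sum(1 for s in sentences if len(s.split()) > threshold)


# Made-up sample: three sentences of 22, 27, and 5 words
sample = (
    " ".join(["word"] * 22) + ". "
    + " ".join(["word"] * 27) + ". "
    + " ".join(["word"] * 5) + "."
)

for threshold in (20, 25, 30):
    print(threshold, count_long_sentences(sample, threshold))
```

On this sample the count drops from 2 to 1 to 0 as the threshold rises from 20 to 25 to 30. Trying the same loop on a few real paragraphs is a quick way to build the justification you'll give on demo day.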

Optional — get help from your agent:

Walk me through textstat.flesch_reading_ease — what does the
score mean? What range is "plain English"? I want to explain the
number to users.

Phase 2: Wire Stats Into the `/analyze` Route

Agent-assisted is fine here. Adding a variable to a Flask render_template call is library code.

Objective

Call get_readability_stats alongside get_suggestions in the route, and pass both results to the template.

Instructions

In app.py, import get_readability_stats from stats
In /analyze, compute stats after getting suggestions
Pass stats to results.html

Hints

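# Add to app.py; assumes app, request, render_template, and get_suggestions are already set up there
from stats import get_readability_stats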

@app.route("/analyze", methods=["POST"])
def analyze():
    text = request.form.get("text", "").strip()
    pal_results = get_suggestions(text)
    stats = get_readability_stats(text)
    return render_template("results.html",
                           text=text,
                           pal_results=pal_results,
                           stats=stats)

Optional — get help from your agent:

Skip — adding one function call to a route is trivial.

Phase 3: Stress-Test the JSON Parser

Handwrite this yourself. Your three parsing strategies need proof. Write tests that actually exercise them.

Objective

_extract_json_object in pal.py implements three parsing strategies. Confirm that each one actually works on real input.

Instructions

Create test_pal.py at the project root
Write 4 cases calling _extract_json_object directly with:
1. Valid JSON
2. JSON inside a markdown fence
3. JSON surrounded by prose
4. Pure prose (should return None)
Print each result, then run the file with uv run python test_pal.py
If any case is wrong, you've found a real bug in your parser

Sample Output

1. Valid JSON:       {'intent': 'x', 'suggestions': ['a']}
2. Fenced:           {'intent': 'y', 'suggestions': ['b']}
3. Embedded:         {'intent': 'z', 'suggestions': ['c']}
4. Pure prose:       None

Hints

Test file skeleton:

# test_pal.py
from pal import _extract_json_object

cases = [
    ("Valid JSON", '{"intent": "x", "suggestions": ["a"]}'),
    ("Fenced", '```json\n{"intent": "y", "suggestions": ["b"]}\n```'),
    ("Embedded", 'Here is my answer: {"intent": "z", "suggestions": ["c"]} Hope that helps!'),
    ("Pure prose", "I'm not sure what you want."),
]

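for number, (label, raw) in enumerate(cases, start=1):
    result = _extract_json_object(raw)
    heading = f"{label}:"
    # Number and pad the heading so the output lines up like the sample output
    print(f"{number}. {heading:<18}{result}")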
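
Once printing looks right, you may want the script to fail loudly when a case regresses. The sketch below shows that pattern with assertions. Because your real parser isn't reproduced in this guide, it uses stand_in_extract, a hypothetical stand-in I wrote so the example runs by itself; it is not your implementation and its three strategies may differ from yours. In your own file you would import _extract_json_object from pal and delete the stand-in.

```python
import json
import re

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example easy to paste


def stand_in_extract(raw):
    """Hypothetical stand-in for _extract_json_object, NOT the real parser."""
    candidates = [raw]                                  # 1. the whole string
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, raw, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1))              # 2. contents of a fenced block
    start, end = raw.find("{"), raw.rfind("}")
    if 0 <= start < end:
        candidates.append(raw[start:end + 1])           # 3. outermost braces
    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None


# (input, expected result) pairs mirroring the four cases in the instructions
expected = [
    ('{"intent": "x", "suggestions": ["a"]}',
     {"intent": "x", "suggestions": ["a"]}),
    (FENCE + 'json\n{"intent": "y", "suggestions": ["b"]}\n' + FENCE,
     {"intent": "y", "suggestions": ["b"]}),
    ('Here is my answer: {"intent": "z", "suggestions": ["c"]} Hope that helps!',
     {"intent": "z", "suggestions": ["c"]}),
    ("I'm not sure what you want.", None),
]

for raw, want in expected:
    got = stand_in_extract(raw)
    assert got == want, f"{raw!r}: expected {want!r}, got {got!r}"

print("all 4 cases pass")
```

The value of the assert version is that a silent change in the parser shows up as a traceback instead of a line of output you have to eyeball.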

Optional — get help from your agent:

My test_pal.py case [N] returned unexpected output. Walk me through
what's happening in _extract_json_object for that input. Don't fix
it yet — I want to understand the issue first.

Phase 4: Polish the Results Page

Agent-assisted is fine here. HTML + Bootstrap classes are pure view code.

Objective

Suggestions probably render as a raw list right now. Make the page feel like a real tool: numbered suggestions, prominent intent, clean layout for the stats.

Instructions

Add Bootstrap via CDN if not there already
Display detected intent prominently (header or badge)
Number the suggestions (<ol> gives you this for free)
Show the original text in a quoted/highlighted box
Add a "Readability Stats" section that displays the dict nicely

Hints

Bootstrap CDN:

<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">

Numbered suggestions:

<ol>
  {% for s in pal_results.suggestions %}
    <li>{{ s }}</li>
  {% endfor %}
</ol>

Stats section:

<div class="card mt-4">
  <div class="card-body">
    <h5>Readability Stats</h5>
    <ul>
      <li>Flesch Reading Ease: {{ stats.flesch_ease }}</li>
      <li>Average Sentence Length: {{ stats.avg_sentence_length }} words</li>
      <li>Long sentences (&gt;25 words): {{ stats.long_sentences }}</li>
      <li>Complex words: {{ stats.complex_words }}</li>
    </ul>
  </div>
</div>

Optional — get help from your agent:

Style results.html so detected intent shows as a Bootstrap badge,
suggestions are a numbered list, original text is in a quoted box,
and stats are in a card. Keep the HTML simple enough for me to edit.

Phase 5: Guard Against Empty Input

Handwrite this yourself. Deciding what's "too short to bother" is a product decision. Don't waste API tokens — and don't waste the user's time.

Objective

If someone hits "Analyze" with an empty textbox, skip the LLM call and send them back to the form with a message.

Instructions

In /analyze, after stripping the text, check if it's empty
If empty, use flash to show a message and redirect back to /
Show flash messages in home.html

Hints

app.py:

from flask import flash, redirect, url_for

app.secret_key = "dev"   # needed for flash messages; fine locally, use a random secret if you deploy

@app.route("/analyze", methods=["POST"])
def analyze():
    text = request.form.get("text", "").strip()
    if not text:
        flash("Please paste some text to analyze.")
        return redirect(url_for("home"))   # "home" = the name of your view function for /
    # ... rest of the handler

home.html:

{% with messages = get_flashed_messages() %}
  {% if messages %}
    {% for msg in messages %}
      <div class="alert alert-warning">{{ msg }}</div>
    {% endfor %}
  {% endif %}
{% endwith %}
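
If you decide that "too short to bother" means more than just empty, the decision can live in one small function that the route calls before touching the API. This is a sketch under assumptions: should_analyze and MIN_WORDS are hypothetical names, and the cutoff of 3 words is a placeholder for whatever you choose.

```python
MIN_WORDS = 3  # hypothetical cutoff; pick your own and be ready to defend it


def should_analyze(raw_text, min_words=MIN_WORDS):
    """Return True when the input is worth sending to the LLM."""
    text = raw_text.strip()
    # An empty or whitespace-only string splits into zero words
    return len(text.split()) >= min_words


print(should_analyze(""))
print(should_analyze("   \n  "))
print(should_analyze("Hi there"))
print(should_analyze("Please review this paragraph."))
```

Setting min_words to 1 gives you exactly the empty-input check from the instructions, so the same helper covers both choices.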

Also check your Anthropic API key — confirm it's loaded from an environment variable, not hardcoded in pal.py. (It's the shared class key, not urgent, but worth confirming.)
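
For reference, loading the key from the environment could look like the sketch below. It assumes the variable is called ANTHROPIC_API_KEY, which is the conventional name; your class setup may use a different one. load_api_key is a hypothetical helper, and the demo passes in a made-up dict so it never touches your real environment.

```python
import os

API_KEY_VAR = "ANTHROPIC_API_KEY"  # assumed name; adjust if your class uses another


def load_api_key(env=None):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(API_KEY_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{API_KEY_VAR} is not set; export it before running the app.")
    return key


# Demo with made-up values instead of the real environment
fake_env = {"ANTHROPIC_API_KEY": "sk-placeholder"}
print(load_api_key(fake_env) == "sk-placeholder")

try:
    load_api_key({})
except RuntimeError as err:
    print(err)
```

The quick check for your repo is simpler: search pal.py for a string that starts with "sk-". If you find one, move it into an environment variable.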

Optional — get help from your agent:

Skip — the flash pattern is about 10 lines total.

Checkpoint 2 Readiness

By Thursday April 23 at 3pm:

stats.py exists with get_readability_stats (no flask or anthropic imports)
Readability stats shown on results page
test_pal.py runs and confirms all 4 parsing cases
Results page numbers suggestions + shows intent prominently
Empty input returns a flash message, no API call
Checkpoint 2 entry in project.journal.md
Committed and pushed