CSV Files
CSV (comma-separated values) is a plain-text file format for table-like data. Every spreadsheet app can export to CSV, and most datasets you'll download for projects come as CSV files. Python has a built-in csv module that knows how to read and write them, so you rarely need to split lines yourself.
This guide covers the patterns you'll actually use: reading rows as dictionaries, reading rows as lists, filtering, and writing CSVs back out.
Example CSV
We'll use this file, students.csv, throughout the guide:
name,age,major
Alice,20,CS
Bob,21,Math
Carol,19,CS
Dan,22,Biology
The first line is the header — it names each column. Every line below is one row of data.
Reading CSVs with DictReader
csv.DictReader reads each row as a dictionary keyed by the column names from the header. This is usually what you want.
import csv
with open("students.csv", newline="") as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row["name"] + " is studying " + row["major"])
Output:
Alice is studying CS
Bob is studying Math
Carol is studying CS
Dan is studying Biology
Each row is a regular Python dict (since Python 3.8; versions 3.6 and 3.7 return an OrderedDict, which supports the same lookups):
{"name": "Alice", "age": "20", "major": "CS"}
Loading every row into a list
A DictReader is a one-pass iterator. If you want to loop over the data more than once (or return it from a function), convert it to a list:
import csv
with open("students.csv", newline="") as file:
    students = list(csv.DictReader(file))
print(len(students))
print(students[0]["name"])
Output:
4
Alice
Values are always strings
CSV files don't know about types — every value comes back as a string, even numbers.
import csv
with open("students.csv", newline="") as file:
    for row in csv.DictReader(file):
        age = int(row["age"])
        if age >= 21:
            print(row["name"] + " can vote, drive, and rent a car.")
Convert with int(...), float(...), etc. when you need to do math or comparisons.
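As a minimal sketch of converting on load, the example below turns the age column into integers once and then does arithmetic on it. The load_students helper and the SAMPLE string are assumptions made for illustration; io.StringIO stands in for students.csv so the code runs without a file on disk.

```python
import csv
import io

# Stand-in for students.csv so the sketch runs on its own.
SAMPLE = (
    "name,age,major\n"
    "Alice,20,CS\n"
    "Bob,21,Math\n"
    "Carol,19,CS\n"
    "Dan,22,Biology\n"
)


def load_students(file):
    """Hypothetical helper: read every row and convert age from str to int."""
    students = []
    for row in csv.DictReader(file):
        row["age"] = int(row["age"])  # "20" becomes 20
        students.append(row)
    return students


students = load_students(io.StringIO(SAMPLE))
average_age = sum(s["age"] for s in students) / len(students)
print(average_age)  # 20.5
```

Converting in one place means the rest of the program can treat age as a number without repeating int(...) at every comparison.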
Reading CSVs with csv.reader
If you don't care about column names, csv.reader returns each row as a list of strings.
import csv
with open("students.csv", newline="") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)
Output:
['name', 'age', 'major']
['Alice', '20', 'CS']
['Bob', '21', 'Math']
['Carol', '19', 'CS']
['Dan', '22', 'Biology']
Notice the header line shows up as just another row. Skip it with next():
import csv
with open("students.csv", newline="") as file:
    reader = csv.reader(file)
    next(reader)  # skip header
    for row in reader:
        name, age, major = row
        print(name, major)
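If you want to skip the header but still use it, one option is to keep the value that next() returns. The sketch below looks up a column's position by name instead of hard-coding it. The SAMPLE string and the age_index variable are assumptions for illustration, with io.StringIO standing in for students.csv.

```python
import csv
import io

# Stand-in for students.csv so the sketch runs on its own.
SAMPLE = (
    "name,age,major\n"
    "Alice,20,CS\n"
    "Bob,21,Math\n"
    "Carol,19,CS\n"
    "Dan,22,Biology\n"
)

reader = csv.reader(io.StringIO(SAMPLE))
header = next(reader)            # ['name', 'age', 'major']
age_index = header.index("age")  # position of the age column

# The header is already consumed, so only data rows remain.
ages = [int(row[age_index]) for row in reader]
print(ages)  # [20, 21, 19, 22]
```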
Filtering and Transforming Rows
Once you've loaded the rows into a list with list(csv.DictReader(file)), you have a plain list of dicts, so the usual list tools work.
Pick just the CS students:
import csv
with open("students.csv", newline="") as file:
    students = list(csv.DictReader(file))
cs_students = [s for s in students if s["major"] == "CS"]
print(cs_students)
Output:
[{'name': 'Alice', 'age': '20', 'major': 'CS'},
{'name': 'Carol', 'age': '19', 'major': 'CS'}]
Turn rows into a simpler shape:
names = [s["name"] for s in students]
print(names)
Output:
['Alice', 'Bob', 'Carol', 'Dan']
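Two more transformations in the same spirit, shown as a sketch: counting students per major and sorting by age. Using collections.Counter and sorted() here is one choice among several, and the SAMPLE string is an assumption standing in for students.csv.

```python
import csv
import io
from collections import Counter

# Stand-in for students.csv so the sketch runs on its own.
SAMPLE = (
    "name,age,major\n"
    "Alice,20,CS\n"
    "Bob,21,Math\n"
    "Carol,19,CS\n"
    "Dan,22,Biology\n"
)

students = list(csv.DictReader(io.StringIO(SAMPLE)))

# Count how many students are in each major.
major_counts = Counter(s["major"] for s in students)
print(major_counts["CS"])  # 2

# Sort by age; convert to int so the comparison is numeric, not alphabetical.
by_age = sorted(students, key=lambda s: int(s["age"]))
youngest_first = [s["name"] for s in by_age]
print(youngest_first)  # ['Carol', 'Alice', 'Bob', 'Dan']
```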
Writing CSVs with DictWriter
csv.DictWriter writes dicts back out to a file. You tell it the column names up front with fieldnames.
import csv
students = [
    {"name": "Alice", "age": 20, "major": "CS"},
    {"name": "Bob", "age": 21, "major": "Math"},
]
with open("output.csv", "w", newline="") as file:
    writer = csv.DictWriter(file, fieldnames=["name", "age", "major"])
    writer.writeheader()
    writer.writerows(students)
output.csv now contains:
name,age,major
Alice,20,CS
Bob,21,Math
Use writer.writerow(one_dict) to write a single row, or writer.writerows(list_of_dicts) to write many.
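Reading and writing combine naturally. The sketch below reads rows, keeps only the CS students, and writes them back out, reusing the reader's fieldnames so the columns stay in the same order. It uses io.StringIO for both input and output so it runs without touching the disk; with real files you would pass objects from open(...) instead, and the name cs_students.csv in the comment is hypothetical.

```python
import csv
import io

# Stand-in for students.csv so the sketch runs on its own.
SAMPLE = (
    "name,age,major\n"
    "Alice,20,CS\n"
    "Bob,21,Math\n"
    "Carol,19,CS\n"
    "Dan,22,Biology\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE))
cs_students = [row for row in reader if row["major"] == "CS"]

# With a real file: open("cs_students.csv", "w", newline="")  (hypothetical name)
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=reader.fieldnames)
writer.writeheader()
writer.writerows(cs_students)

written_lines = output.getvalue().splitlines()
print(written_lines)  # ['name,age,major', 'Alice,20,CS', 'Carol,19,CS']
```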
Writing CSVs with csv.writer
For plain lists of values, use csv.writer:
import csv
rows = [
    ["Alice", 20, "CS"],
    ["Bob", 21, "Math"],
]
with open("output.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["name", "age", "major"])
    writer.writerows(rows)
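One reason to prefer csv.writer over joining values with commas yourself is that it quotes values containing commas or quote characters. The sketch below illustrates this with the module's default settings; the note column and its values are made up for the example, and io.StringIO stands in for a real file.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "note"])
writer.writerow(["Alice", "likes CS, math"])  # contains a comma
writer.writerow(["Bob", 'says "hi"'])         # contains double quotes

text = buffer.getvalue()
print(text)

# Reading the text back recovers the original values unchanged.
round_trip = list(csv.reader(io.StringIO(text)))
print(round_trip[1])  # ['Alice', 'likes CS, math']
```

The writer wraps the tricky values in double quotes and doubles any embedded quote characters, and csv.reader undoes both steps when reading.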
Common Gotchas
Always pass newline="" when opening
with open("students.csv", newline="") as file:
    ...
Without it, files you write on Windows can end up with a blank line between rows, and newlines embedded inside quoted fields may not be read back correctly. It's harmless to always include it.
Values are strings
Convert them when you need numbers:
age = int(row["age"])
gpa = float(row["gpa"])
Column names are case-sensitive
row["Name"] is not the same as row["name"]. Match the header exactly, or you'll get a KeyError.
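If you don't control the file's header, one possible workaround is to normalize the column names before reading any rows. This is a sketch, not the only approach: it relies on DictReader.fieldnames being assignable, which is how CPython's csv module behaves, and the SAMPLE data with its capitalized, space-padded header is made up for illustration.

```python
import csv
import io

# Hypothetical file whose header uses capitals and stray spaces.
SAMPLE = "Name, Age ,Major\nAlice,20,CS\nBob,21,Math\n"

reader = csv.DictReader(io.StringIO(SAMPLE))

# Reading fieldnames consumes the header row; assigning a cleaned-up
# list changes the keys used for every row that follows.
reader.fieldnames = [name.strip().lower() for name in reader.fieldnames]

rows = list(reader)
print(rows[0]["name"])  # Alice
print(rows[1]["age"])   # 21
```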
File paths are relative to where you run the script
If your CSV lives in a data/ folder, open it as "data/students.csv". See file-io.guide for more on paths and file modes.
Don't forget import csv
The csv module is built into Python, but you still have to import it at the top of your file.
Related
- File I/O — reading and writing files in general, including CSV basics.