[Python] The Complete Guide to the csv Module! A Thorough Explanation of Common Pitfalls for Beginners in Reading and Writing CSVs
To run Python from the command prompt or PowerShell on your PC, you need to download and install Python.
If you haven't installed it yet, please refer to the article Setting Up Python and Development Environment to install Python.
"I want to handle CSV files in my programming, but I don't know where to start..."
Hello! I'm the admin of this site, and just a few months ago, I was a complete beginner just like you. Let me tell you from my experience of launching two websites on my own in a month and a half with the help of AI: handling CSV files is an unavoidable and incredibly useful skill in web development!
In fact, this CSV knowledge was a huge help to me when I was creating features to manage data from contact forms or to bulk upload product lists.
In this article, based on my real experiences (and many mistakes...), I'll explain how to freely read and write CSV files using Python's `csv` module in the simplest way possible. I'll avoid jargon as much as I can. By the time you finish reading this article, you'll be a CSV master!
Warm-up: Let's Prepare a Sample CSV File
First, let's get a CSV file to practice with. Copy the content below and save it to your PC as `members.csv`. The key is to save it with UTF-8 encoding. (When you save in Notepad, there is an "Encoding" option in the Save As dialog. If it offers both "UTF-8" and "UTF-8 with BOM", choose plain "UTF-8".)
This file represents a member list for a fictional website.
id,name,email
1,John Smith,john.smith@example.com
2,Emily Jones,emily.jones@example.com
3,Michael Brown,michael.brown@example.com
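If you'd rather not fight with Notepad's save dialog, here is a minimal sketch that creates the same file from Python. The helper name `write_sample` is made up for this example; it simply writes the four lines above as UTF-8 without a BOM into the current directory.

```python
from pathlib import Path

# The same sample data as above, with a newline after the last row.
SAMPLE = (
    "id,name,email\n"
    "1,John Smith,john.smith@example.com\n"
    "2,Emily Jones,emily.jones@example.com\n"
    "3,Michael Brown,michael.brown@example.com\n"
)


def write_sample(path="members.csv"):
    """Write the sample member list as UTF-8 (no BOM) and return its path."""
    target = Path(path)
    # newline="" stops Python from translating "\n", so the file is
    # byte-for-byte identical on every operating system.
    with target.open(mode="w", encoding="utf-8", newline="") as f:
        f.write(SAMPLE)
    return target


sample_path = write_sample()
print(f"Created {sample_path.resolve()}")
```

Ending the file with a newline is a deliberate choice: it keeps rows appended to the file later from being glued onto the last line.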
The Basics: Let's Read a CSV File (`csv.reader`)
The first thing to do is to read the contents of a CSV file with Python. Let's read the `members.csv` file we just created and display its contents in the console.
Alright, let's start with the code! Copy and paste this and run it. The easiest way is to create a Python file in the same directory as `members.csv` and run it.
import csv

# --- My Sticking Point #1: Always specify the character encoding! ---
# If you don't specify an encoding, Python may fall back to the system's
# locale encoding (cp932/Shift_JIS on Japanese Windows, for example),
# which can cause garbled text or a UnicodeDecodeError.
# Make `encoding='utf-8'` a habit. The csv docs also recommend newline=''.
with open('members.csv', mode='r', encoding='utf-8', newline='') as f:
    # Passing the file to csv.reader returns a reader object that parses the CSV
    reader = csv.reader(f)
    # You can process it line by line in a loop
    for row in reader:
        print(row)

# Execution Result:
# ['id', 'name', 'email']
# ['1', 'John Smith', 'john.smith@example.com']
# ['2', 'Emily Jones', 'emily.jones@example.com']
# ['3', 'Michael Brown', 'michael.brown@example.com']
How did it go? Did the contents of the CSV appear in your terminal?
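The key point to notice is that each row is retrieved as a list (the thing enclosed in `[]`). Data separated by commas, like `['1', 'John Smith', 'john.smith@example.com']`, becomes elements of the list. Also note that every element is a string, even the id `'1'`, and that the header row comes back as an ordinary row.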
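Here's a small sketch of what that means in practice: skipping the header and turning the id into a real number. To keep it runnable anywhere, it reads from an `io.StringIO` string that stands in for `members.csv`; the tuple layout is just my choice for the example.

```python
import csv
import io

# Stand-in for members.csv so the sketch runs without any file on disk.
sample = io.StringIO(
    "id,name,email\n"
    "1,John Smith,john.smith@example.com\n"
    "2,Emily Jones,emily.jones@example.com\n"
)

reader = csv.reader(sample)
header = next(reader)  # consume the header row so it is not treated as data

members = []
for row in reader:
    member_id = int(row[0])  # csv.reader yields strings; convert explicitly
    members.append((member_id, row[1], row[2]))

print(header)      # ['id', 'name', 'email']
print(members[0])  # (1, 'John Smith', 'john.smith@example.com')
```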
A Quick Explanation: What is `with open(...) as f:`?
This is the standard "best practice" for handling files. By writing it this way, Python automatically closes the file for you when you exit the `with` block, even if an error occurs partway through. Forgetting to close a file can lead to unexpected problems such as leaked file handles or written data that never reaches the disk, so always use `with`!
[My Failure Story #1] The Hell of Garbled Text and `encoding`
The first time I did this, especially on a Windows PC, my console was filled with meaningless symbols (garbled text). The cause was the character encoding. You have to tell the PC which language rule (encoding) the file is written in, or it can't read it correctly.
The solution is to add `encoding='utf-8'` to the `open()` function's arguments. `UTF-8` is the global standard for character encoding, and using it will solve most problems. It's essential knowledge in the web world, so make it a habit!
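One related trap worth knowing: some editors and spreadsheet tools save "UTF-8 with BOM", which puts an invisible marker at the very start of the file. The sketch below fakes such a file in a temporary folder (the file name and contents are made up for the demo) and compares Python's `utf-8` and `utf-8-sig` codecs.

```python
import csv
import os
import tempfile

# Simulate a file saved as "UTF-8 with BOM": three marker bytes, then the text.
bom_bytes = b"\xef\xbb\xbfid,name\r\n1,John Smith\r\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bom_members.csv")
    with open(path, "wb") as f:
        f.write(bom_bytes)

    # Plain utf-8 keeps the marker as part of the first header name.
    with open(path, encoding="utf-8", newline="") as f:
        header_plain = next(csv.reader(f))

    # utf-8-sig strips the marker if present (and also reads BOM-less files).
    with open(path, encoding="utf-8-sig", newline="") as f:
        header_sig = next(csv.reader(f))

print(header_plain)  # ['\ufeffid', 'name']
print(header_sig)    # ['id', 'name']
```

If the first column name ever looks right on screen but doesn't match `'id'` in your code, this is a likely suspect, and switching the encoding to `utf-8-sig` for reading is one way to handle it.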
The Basics Part 2: Let's Write to a New CSV File (`csv.writer`)
Next up is writing data. Let's try saving list data created in Python to a new CSV file. For example, imagine writing new member data to a file called `new_members.csv`.
import csv

# Data to write (a list of lists)
new_data = [
    ['4', 'David Wilson', 'david.wilson@example.com'],
    ['5', 'Sarah Taylor', 'sarah.taylor@example.com']
]

# --- My Sticking Point #2: Use newline='' for mysterious blank rows ---
# mode='w' for write mode. If the file doesn't exist, it will be created.
# (If it already exists, its contents are overwritten.)
# Without newline='', you'll get a mysterious blank row between each line on Windows!
with open('new_members.csv', mode='w', encoding='utf-8', newline='') as f:
    writer = csv.writer(f)
    # For writing a single row, use writerow
    writer.writerow(['id', 'name', 'email'])  # Header row
    # For writing multiple rows at once, use writerows
    writer.writerows(new_data)

print('new_members.csv has been created!')
When you run this code, a file named `new_members.csv` should be created in the same directory. Open it up. The data is written correctly, right?
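You might wonder why we bother with `csv.writer` instead of just joining values with commas. Here's a quick sketch of what happens when the data itself contains a comma or a double quote. The row is invented for the demo, and `io.StringIO` stands in for a file opened with `newline=''`.

```python
import csv
import io

rows = [
    ["id", "name", "note"],
    ["9", "Smith, John", 'Says "hi"'],  # made-up row with a comma and quotes
]

# io.StringIO stands in for a file opened with newline="".
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
# id,name,note
# 9,"Smith, John","Says ""hi"""

# Reading the text back recovers the original values, comma and quotes included.
round_trip = list(csv.reader(io.StringIO(csv_text, newline="")))
print(round_trip == rows)  # True
```

The writer wraps risky fields in double quotes and doubles any quotes inside them, and the reader undoes that automatically. A hand-rolled `','.join(...)` would have split "Smith, John" into two columns.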
[My Failure Story #2] Who Are You?! The Mystery of Blank Rows and `newline=''`
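This is another problem that cost me half a day. If you're on Windows, try running the code above after removing `newline=''`. How did it go? Wasn't your CSV file filled with annoying blank rows between every line? (On macOS and Linux you usually won't see the blank rows, but keep `newline=''` anyway so your code behaves the same everywhere.)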
This is a common trap caused by differences in how operating systems handle newline characters. I'll skip the detailed explanation (I asked an AI and was like, "I see, but I don't get it!"), but the solution is simple.
When writing to a CSV, always include `newline=''` in the `open()` function!
This is a super important practice that's even written in Python's official documentation for the `csv` module, which says that if `csvfile` is a file object, it should be opened with `newline=''`. Just remember it like a magic spell.
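For anyone who does want to peek behind the spell, here's a rough sketch of the mechanism. By default `csv.writer` ends every row with `\r\n` on its own. The second half is only a simulation (my assumption for illustration, not a real file write) of what Windows text mode does when `newline=''` is missing: it turns each `\n` into `\r\n` again.

```python
import csv
import io

# newline="" here means "store exactly what the writer emits".
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["id", "name"])
writer.writerow(["1", "John Smith"])

written = buffer.getvalue()
print(repr(written))  # 'id,name\r\n1,John Smith\r\n'

# Simulation only: without newline='', Windows text mode would translate
# every "\n" into "\r\n" on the way to disk.
translated = written.replace("\n", "\r\n")
print(repr(translated))  # 'id,name\r\r\n1,John Smith\r\r\n'
```

That doubled `\r\r\n` is what many programs display as an extra blank row after every line.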
Advanced Tip #1: Using Dictionaries Makes Your Code God-Level Readable (`DictReader` & `DictWriter`)
With the basic `reader`, you had to manage data by its index number, like `row[1]` or `row[2]`. When the number of columns increases, you end up wondering, "Wait, which index was the email address again...?"
That's where `DictReader` comes in! It lets you get each row of data as a "dictionary". This means you can access data by the header name, like `row['name']` or `row['email']`. Genius, right?
Reading: `DictReader`
import csv

with open('members.csv', mode='r', encoding='utf-8', newline='') as f:
    # Just use DictReader! Easy! The first row is used as the keys.
    reader = csv.DictReader(f)
    for row in reader:
        # You can access the name with the 'name' key and the email with the 'email' key!
        print(f"Name: {row['name']}, Email: {row['email']}")

# Execution Result:
# Name: John Smith, Email: john.smith@example.com
# Name: Emily Jones, Email: emily.jones@example.com
# Name: Michael Brown, Email: michael.brown@example.com
See? `row['name']` is overwhelmingly easier to understand than `row[1]`, isn't it? This way, when you look back at your code later, it's immediately obvious what it's doing.
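Because each row is a dictionary, it's easy to build other structures from it. Here's a minimal sketch that checks which column names `DictReader` picked up and then builds a lookup table keyed by email address. The `by_email` name is just for this example, and `io.StringIO` again stands in for the file.

```python
import csv
import io

sample = io.StringIO(
    "id,name,email\n"
    "1,John Smith,john.smith@example.com\n"
    "2,Emily Jones,emily.jones@example.com\n"
)

reader = csv.DictReader(sample)
# fieldnames is taken from the first row of the data.
print(reader.fieldnames)  # ['id', 'name', 'email']

# Build a lookup table: email address -> the whole row.
by_email = {row["email"]: row for row in reader}
print(by_email["emily.jones@example.com"]["name"])  # Emily Jones
```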
Writing: `DictWriter`
If you can read with a dictionary, you'll want to write with one too. Of course, there's `DictWriter`. If you prepare a list of dictionaries, you can write it to a CSV.
import csv

# Prepare the data as a list of dictionaries
dict_data = [
    {'id': '6', 'name': 'Chris Martinez', 'email': 'chris.martinez@example.com'},
    {'id': '7', 'name': 'Jessica Anderson', 'email': 'jessica.anderson@example.com'}
]

# The list of headers (important!)
fieldnames = ['id', 'name', 'email']

with open('dict_members.csv', mode='w', encoding='utf-8', newline='') as f:
    # Pass the file and the headers (fieldnames) to DictWriter
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    # Don't forget to write the header first!
    writer.writeheader()
    # Write the list of dictionaries all at once
    writer.writerows(dict_data)

print('dict_members.csv has been created!')
There are two things to be careful about when using `DictWriter`:
- When you initialize `csv.DictWriter()`, you need to pass it the `fieldnames` (the list of headers).
- Before writing the data, you need to write the header row with `writer.writeheader()`.
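If you forget `fieldnames`, Python raises a `TypeError` right away. If you forget `writeheader()`, the data is still written, but the file has no header row, so be careful! (Of course, I made that mistake.)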
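There's one more `DictWriter` surprise that fits here: dictionaries whose keys don't match `fieldnames` exactly. The sketch below uses made-up records (the `phone` key and the names are purely illustrative) to show the default behavior and the `extrasaction='ignore'` option.

```python
import csv
import io

fieldnames = ["id", "name", "email"]
# Hypothetical record with one key ("phone") that is not in fieldnames.
record = {"id": "9", "name": "Pat Lee", "email": "pat.lee@example.com", "phone": "555-0100"}

# Default behavior: an unexpected key raises ValueError.
strict_writer = csv.DictWriter(io.StringIO(newline=""), fieldnames=fieldnames)
try:
    strict_writer.writerow(record)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# extrasaction="ignore" silently drops unknown keys instead.
lenient_buffer = io.StringIO(newline="")
lenient_writer = csv.DictWriter(lenient_buffer, fieldnames=fieldnames, extrasaction="ignore")
lenient_writer.writeheader()
lenient_writer.writerow(record)
# A missing key is filled with restval, which defaults to an empty string.
lenient_writer.writerow({"id": "10", "name": "Sam Roe"})
lenient_output = lenient_buffer.getvalue()
print(lenient_output)
```

Whether to raise or ignore depends on your data. I'd lean toward keeping the default while developing, since the error points straight at a typo in a key name.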
Advanced Tip #2: Appending Data to an Existing CSV
"I want to add new member information to the member list every time a contact form is submitted." In that case, you open the file in "append mode."
It's super easy. Just change the `mode` in `open()` from `'w'` (write) to `'a'` (append).
import csv

# New member data to append
new_member = ['8', 'Ashley Thomas', 'ashley.thomas@example.com']

# Just change the mode to 'a' (append)!
# Since we're appending, don't forget newline='' and encoding='utf-8'.
with open('members.csv', mode='a', encoding='utf-8', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(new_member)

print('Appended data to members.csv!')
After running this code, open `members.csv` again. You should see Ashley Thomas's information added at the very bottom. When appending with `DictWriter`, be careful not to call `writeheader()`. Otherwise, a header will be added every time you append data.
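One way to handle the header question is to write it only when the file is brand new or empty. Here's a minimal sketch of that idea. The helper `append_member`, the file name `contacts.csv`, and the second member are all made up for the demo, which runs in a temporary folder so it doesn't touch your real files.

```python
import csv
import os
import tempfile

FIELDNAMES = ["id", "name", "email"]


def append_member(path, member):
    """Append one dict row; write the header only if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, mode="a", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        if needs_header:
            writer.writeheader()
        writer.writerow(member)


# Demo in a temporary folder so no real file is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "contacts.csv")
    append_member(demo_path, {"id": "8", "name": "Ashley Thomas", "email": "ashley.thomas@example.com"})
    append_member(demo_path, {"id": "9", "name": "Pat Lee", "email": "pat.lee@example.com"})

    with open(demo_path, encoding="utf-8", newline="") as f:
        lines = f.read().splitlines()

print(lines)
# ['id,name,email', '8,Ashley Thomas,ashley.thomas@example.com', '9,Pat Lee,pat.lee@example.com']
```

A caveat when appending to a file you created by hand: if its last line doesn't end with a newline, the appended row will be glued onto that line, so check the end of the file if the result looks odd.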
[Experience It Live] Let's Play Around with CSV in the Browser!
Okay, we've looked at a lot of code so far. But the best way to learn is to run it and tinker with it yourself.
So this time, I've prepared an interactive demo using "PyScript," a magical technology that lets you run Python in your browser, so you can manipulate CSV data in real time!
Copy the entire HTML code below and save it to your PC with a name like `csv_test.html`. Then, open that file in your web browser. Just like that, your browser will turn into a CSV editor!
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Python CSV Module Interactive Demo</title>
<!-- Loading PyScript -->
<link rel="stylesheet" href="https://pyscript.net/releases/2024.1.1/core.css" />
<script type="module" src="https://pyscript.net/releases/2024.1.1/core.js"></script>
<!-- Simple Styles -->
<style>
body { font-family: sans-serif; background-color: #202124; color: #e8eaed; padding: 2em; line-height: 1.6; }
h1, h2 { color: #8ab4f8; }
textarea, button {
border: 1px solid #5f6368;
background-color: #3c4043;
color: #e8eaed;
border-radius: 4px;
padding: 0.5em 1em;
}
textarea { width: 100%; height: 150px; margin-bottom: 1em; font-family: monospace; }
button { cursor: pointer; margin-right: 1em; }
button:hover { background-color: #5f6368; }
table { border-collapse: collapse; width: 100%; margin-top: 1em; }
th, td { border: 1px solid #5f6368; padding: 8px; text-align: left; }
th { background-color: #3c4043; }
pre { background-color: #1e1e1e; padding: 1em; border-radius: 4px; white-space: pre-wrap; }
</style>
</head>
<body>
<h1>Experience the Python CSV Module in Your Browser!</h1>
<p>Enter or edit the CSV data in the text area below, then press the "Load CSV and Display as Table" button.</p>
<textarea id="csv-input">Name,Age,City
John Smith,32,Tokyo
Emily Jones,28,Osaka
Michael Brown,45,Fukuoka</textarea>
<button py-click="read_csv">Load CSV and Display as Table</button>
<h2>Load Result</h2>
<div id="table-output"></div>
<py-script>
# The Python code below runs in your browser
import csv
import io  # A module to treat strings like files
from html import escape  # Escapes <, > and & so cell text can't break the table
from pyscript import document

def read_csv(*args, **kwargs):
    # Get the CSV data from the textarea
    csv_data = document.querySelector("#csv-input").value

    # Use io.StringIO to treat the string like a file.
    # This allows the csv module to be used just as if
    # it were reading from a file.
    csv_file = io.StringIO(csv_data)

    # Read with csv.reader
    reader = csv.reader(csv_file)

    # Generate an HTML table
    table_html = "<table>"
    try:
        # Process the first row (header)
        header = next(reader)
        table_html += "<thead><tr>"
        for col in header:
            table_html += f"<th>{escape(col)}</th>"
        table_html += "</tr></thead>"

        # Process the second row onwards (data)
        table_html += "<tbody>"
        for row in reader:
            table_html += "<tr>"
            for cell in row:
                table_html += f"<td>{escape(cell)}</td>"
            table_html += "</tr>"
        table_html += "</tbody>"
    except StopIteration:
        # Handling for when the data is empty
        table_html += "<tr><td>No data available</td></tr>"
    table_html += "</table>"

    # Display the result in the DIV element
    output_div = document.querySelector("#table-output")
    output_div.innerHTML = table_html
</py-script>
</body>
</html>
Feel free to change the content of the text area and press the button. You'll see the comma-separated text transform into a neat table. This is the power of the `csv` module! Please play around with it.
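The `io.StringIO` trick from the demo isn't tied to the browser. Here's a rough sketch of the same CSV-to-table idea as an ordinary Python function you can run in your terminal. The function name `csv_to_html_table` and the sample text are my own inventions for this example, and it passes cell text through `html.escape` so that characters like `<` and `&` display literally.

```python
import csv
import io
from html import escape


def csv_to_html_table(csv_text):
    """Turn CSV text into an HTML table string (first row becomes the header)."""
    rows = list(csv.reader(io.StringIO(csv_text, newline="")))
    if not rows:
        return "<table><tr><td>No data available</td></tr></table>"

    header, *body = rows
    parts = ["<table>", "<thead><tr>"]
    parts += [f"<th>{escape(col)}</th>" for col in header]
    parts.append("</tr></thead><tbody>")
    for row in body:
        cells = "".join(f"<td>{escape(cell)}</td>" for cell in row)
        parts.append(f"<tr>{cells}</tr>")
    parts.append("</tbody></table>")
    return "".join(parts)


table_html = csv_to_html_table("Name,City\nA & B,Tokyo\n")
print(table_html)
empty_html = csv_to_html_table("")
print(empty_html)
```

Keeping the conversion in a plain function like this also makes it easy to test without opening a browser.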
Conclusion: You've Now Mastered CSV!
Great job! It was a long journey, but you've now completely mastered everything from the basics to the advanced uses of CSV manipulation with Python. Let's quickly review the most important points we learned today.
- Reading: Use `csv.reader` for basics, and `csv.DictReader` if you want to handle data by column name.
- Writing: Use `csv.writer` for basics, and `csv.DictWriter` if you want to write from a dictionary.
- Appending: Set the `open()` mode to `'a'`.
- The 2 Unforgettable Magic Spells: Use `encoding='utf-8'` for reading/writing, and `newline=''` for writing (the official documentation recommends it for reading too).
If you keep these points in mind, you'll rarely have trouble handling CSVs in your work. You can analyze data downloaded from websites, or bulk register large amounts of data into a system. The possibilities are endless, depending on your ideas.
The trick to getting better at programming is to accumulate these "It worked!" moments, one by one. If this article has helped you in that journey, I couldn't be happier.
Next Steps
Another data format that's used just as much as CSV in web development is "JSON". It's essential knowledge for more advanced development, such as API integration. Master it in the next article to level up even further!
How to Handle JSON Data with the json Module