If you write and organize your blog posts in Notion and publish them with a static site generator like Hugo, you might want an easy way to export them as Markdown files. In this post, I’ll introduce my Notion Exporter Tool, a Python script that uses the Notion API to fetch pages tagged as blog posts, convert them to Markdown, and save them locally for use in Hugo. This automation streamlines my content management and makes it easier to publish Notion-written posts on platforms like GitHub Pages or with other static site generators. It also lets me centralize my content on a single platform, reducing the number of tools and the associated costs. The following sections explain the process flow and implementation details, along with key information about configuring and running the script.
Process Overview
Here’s what this tool does in simple terms: it takes blog posts you write in Notion and turns them into files that work with Hugo (a static site generator). The process is automatic and is meant to keep your posts looking the same on your website as they do in Notion, with the formatting and content as you wrote them.
Let’s examine each step of the script in detail.
Get Posts From Notion
The initial step of the script is to query the Notion database for new pages marked as “Post” that need to be exported. The script uses the configured database ID to retrieve all posts.
pages_to_post = notion_service.query_posts(Config.Notion.database_id)
if is_empty_or_no_results(pages_to_post):
    logging.error("Failed to query posts from Notion. The database ID might be invalid.")
    return
if "results" not in pages_to_post or len(pages_to_post["results"]) == 0:
    logging.info("No pages to post.")
    return
logging.info(f"Found {len(pages_to_post['results'])} pages to post.")
The query_posts function connects to the Notion API and searches for pages with a specific tag. Through Notion database filtering, it finds pages marked as blog posts, simplifying the identification of content ready for export.
def query_posts(self, database_id: str) -> dict:
    """
    Queries posts from a Notion database.
    Args:
        database_id (str): The ID of the Notion database.
    Returns:
        dict: The response from the Notion API.
    """
    if not database_id:
        logging.error("Missing database ID. Provide a valid ID and try again.")
        return None
    url = f"{self.api_url}/{self.api_version}/databases/{database_id}/query"
    payload = {
        "filter": {
            "property": "Tags",
            "multi_select": {
                "contains": "Post"
            }
        }
    }
    response = requests.post(url, headers=self.request_header, json=payload)
    if response.status_code == 200:
        return response.json()
    else:
        logging.error(f"Failed to retrieve database: {response.status_code}")
        return None
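The Notion query endpoint returns results in pages and signals further pages through the `has_more` and `next_cursor` fields, so a single request like the one above may not return every post in a large database. The following is a minimal sketch of how all pages could be collected. The names `query_all_posts` and `fetch_page` are hypothetical and not part of the original script; `fetch_page` stands in for one POST request.

```python
from typing import Callable, Optional


def query_all_posts(fetch_page: Callable[[Optional[str]], Optional[dict]]) -> list:
    """Collect results from every page of a paginated Notion query.

    fetch_page is a hypothetical callable: it receives the cursor to start
    from (None for the first request) and returns the decoded JSON response,
    or None on failure.
    """
    results = []
    cursor = None
    while True:
        response = fetch_page(cursor)
        if not response:
            break  # request failed; return what was collected so far
        results.extend(response.get("results", []))
        if not response.get("has_more"):
            break  # last page reached
        cursor = response.get("next_cursor")
        if cursor is None:
            break  # defensive: has_more was set but no cursor was given
    return results


# Stand-in for the real HTTP call, used here only to show the control flow.
_fake_pages = {
    None: {"results": [{"id": "a"}, {"id": "b"}], "has_more": True, "next_cursor": "c1"},
    "c1": {"results": [{"id": "c"}], "has_more": False, "next_cursor": None},
}

all_posts = query_all_posts(lambda cursor: _fake_pages[cursor])
print(len(all_posts))
```

In the real script, `fetch_page` would presumably send the same filter payload and add `"start_cursor": cursor` to it whenever the cursor is not None.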
Extracting the post title
Each page returned from Notion stores the title of the blog post in its Name property.
The code below extracts a page’s title from Notion’s properties structure. It uses nested dictionary access with the get() method for safe property retrieval. Here’s what each step does:
- Gets the properties dictionary from the page object, defaulting to empty if not found
- Retrieves the ‘Name’ property, which contains the title information
- Extracts the title list from the name property
- Reads the text content of the first title element, or skips the page if no title is found
properties = page.get('properties', {})
name = properties.get('Name', {})
title_list = name.get('title', [])
if title_list and 'text' in title_list[0] and 'content' in title_list[0]['text']:
    title = title_list[0]['text']['content']
    logging.info(f"Title found: {title}")
else:
    logging.info("Title not found.")
    continue
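A Notion title is a list of rich-text fragments, so a title that mixes plain and formatted text can arrive in several pieces, and reading only the first element would truncate it. Here is a small sketch that joins all fragments. The helper name `extract_title` is hypothetical, and it assumes each fragment carries the `plain_text` field that the Notion API includes on rich-text objects.

```python
def extract_title(page: dict, property_name: str = "Name") -> str:
    """Return the page title as one string, or "" when it is missing."""
    title_items = page.get("properties", {}).get(property_name, {}).get("title", [])
    parts = []
    for item in title_items:
        # Prefer plain_text; fall back to text.content for 'text' fragments.
        text = item.get("plain_text")
        if text is None:
            text = item.get("text", {}).get("content", "")
        parts.append(text)
    return "".join(parts).strip()


sample_page = {
    "properties": {
        "Name": {
            "title": [
                {"plain_text": "Exporting ", "text": {"content": "Exporting "}},
                {"plain_text": "Notion", "text": {"content": "Notion"}},
                {"text": {"content": " to Hugo"}},
            ]
        }
    }
}

print(extract_title(sample_page))
```

An empty return value can be treated the same way as the "Title not found" branch above.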
Retrieving content blocks
Using the Notion API, the script fetches all content blocks from the page, including text, images, and other media.
The get_page_blocks method retrieves all blocks (content elements) from a Notion page. It accepts a page ID, makes a GET request to the Notion API endpoint, and fetches the page’s block children. The method validates the page ID and checks the API response status. On success, it returns the JSON response with all page blocks; on failure, it logs an error and returns None.
def get_page_blocks(self, page_id: str) -> dict:
    """
    Retrieves the blocks of a Notion page.
    Args:
        page_id (str): The ID of the Notion page.
    Returns:
        dict: The response from the Notion API.
    """
    if not page_id:
        logging.error("Missing page ID. Provide a valid ID and try again.")
        return None
    url = f"{self.api_url}/{self.api_version}/blocks/{page_id}/children"
    response = requests.get(url, headers=self.request_header)
    if response.status_code == 200:
        return response.json()
    else:
        logging.error(f"Failed to retrieve page block children: {response.status_code}")
        return None
Creating local folder structure
When exporting a page in Markdown format, we need a folder structure that can store all the necessary files and is also compatible with the Hugo folder structure. Hugo offers a few ways to handle this; in our case, we will use the following folder structure:
post-name-encoded/
    images/
        image1.png
        image2.png
    index.md
Below is an example from a previous post.
To replicate the directory structure described above, the code performs the following operations to manage blog post folders:
- Creates a folder name by replacing spaces with hyphens in the post title
- Deletes any existing folder with the same name to ensure a clean slate
- Creates a new root folder for the post
- Creates a separate ‘images’ subfolder to store media files
This organization ensures each blog post has its own structured directory for content and media files, making it compatible with Hugo’s file handling system.
post_folder_name = title.replace(' ', '-')
file_service.delete_folder(post_folder_name)
file_service.create_post_root_folder(post_folder_name)
file_service.create_post_images_folder(post_folder_name)
The FileService class implements the three file system operations involved:
- delete_folder(): Recursively removes a folder and all its contents, ensuring a clean slate for new content
- create_post_root_folder(): Creates the main directory for a blog post, using the post’s title as the folder name
- create_post_images_folder(): Creates a subdirectory specifically for storing images associated with the blog post
These functions work together to set up the necessary folder structure for storing exported blog post content and its associated media files.
def delete_folder(self, folder_name: str) -> None:
    """
    Delete a folder and its contents.
    :param folder_name: str representing the folder name
    """
    folder_path = os.path.join(self.base_path, folder_name)
    if os.path.exists(folder_path) and os.path.isdir(folder_path):
        for root, dirs, files in os.walk(folder_path, topdown=False):
            for name in files:
                os.remove(os.path.join(root, name))
            for name in dirs:
                os.rmdir(os.path.join(root, name))
        os.rmdir(folder_path)

def create_post_root_folder(self, folder_name: str) -> str:
    """
    Create the root folder for a post.
    :param folder_name: str representing the folder name
    :return: str representing the path of the created folder
    """
    folder_path = os.path.join(self.base_path, folder_name)
    os.makedirs(folder_path, exist_ok=True)
    return folder_path

def create_post_images_folder(self, folder_name: str) -> str:
    """
    Create the images folder for a post.
    :param folder_name: str representing the folder name
    :return: str representing the path of the created folder
    """
    folder_path = os.path.join(self.base_path, folder_name, 'images')
    os.makedirs(folder_path, exist_ok=True)
    return folder_path
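The folder name is built by replacing spaces with hyphens, which leaves characters such as slashes, colons, or question marks in the name. These can be invalid in file paths or awkward in URLs. As a possible alternative, here is a sketch of a stricter slug function. The name `slugify` is hypothetical, and unlike the original approach it also lowercases the title and drops non-ASCII characters.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a post title into a lowercase, hyphen-separated folder name."""
    # Split accented letters into base letter + accent, then drop non-ASCII.
    normalized = unicodedata.normalize("NFKD", title)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of characters outside a-z and 0-9 into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    return slug.strip("-")


print(slugify("My First Post"))
print(slugify("Notion → Hugo: exporting posts!"))
```

A title made up entirely of symbols produces an empty string, so a caller would need a fallback name for that case.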
Converting content to Markdown
The script converts Notion’s block structure into clean, formatted Markdown that Hugo can easily process, handling image downloads and references along the way. Each block is dispatched to a dedicated converter based on its type.
def convert_to_markdown(self, notion_blocks: list, folder_name: str) -> str:
    """
    Convert a list of Notion blocks to standard Markdown.
    :param notion_blocks: list of dicts representing Notion blocks
    :param folder_name: str representing the folder name to save images
    :return: str containing the converted Markdown text
    """
    markdown = []
    for block in notion_blocks:
        if hasattr(block, 'get'):
            block_type = block.get('type')
            if block_type == 'text':
                markdown.append(self.convert_text_block(block))
            elif block_type == 'heading_1':
                markdown.append(self.convert_heading_block(block, 1))
            elif block_type == 'heading_2':
                markdown.append(self.convert_heading_block(block, 2))
            elif block_type == 'heading_3':
                markdown.append(self.convert_heading_block(block, 3))
            elif block_type == 'bulleted_list':
                markdown.append(self.convert_bulleted_list_block(block))
            elif block_type == 'numbered_list':
                markdown.append(self.convert_numbered_list_block(block))
            elif block_type == 'paragraph':
                markdown.append(self.convert_paragraph_block(block))
            elif block_type == 'image':
                markdown.append(self.convert_image_block(block, folder_name))
            elif block_type == 'bulleted_list_item':
                markdown.append(self.convert_bulleted_list_item_block(block))
            elif block_type == 'code':
                markdown.append(self.convert_code_block(block))
            elif block_type == 'callout':
                markdown.append(self.convert_callout_block(block))
    return '\n'.join(markdown)
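The individual converters are not shown here, so the following is only a sketch of what two of them might look like, written as standalone functions rather than methods. It assumes an API version in which each block stores its text under a `rich_text` list, with `plain_text`, `annotations`, and `href` on every fragment. The helper `rich_text_to_markdown` is a hypothetical name.

```python
def rich_text_to_markdown(rich_text: list) -> str:
    """Render a list of Notion rich-text fragments as inline Markdown."""
    parts = []
    for item in rich_text:
        text = item.get("plain_text", "")
        annotations = item.get("annotations", {})
        if annotations.get("code"):
            text = f"`{text}`"
        if annotations.get("bold"):
            text = f"**{text}**"
        if annotations.get("italic"):
            text = f"*{text}*"
        if item.get("href"):
            text = f"[{text}]({item['href']})"
        parts.append(text)
    return "".join(parts)


def convert_heading_block(block: dict, level: int) -> str:
    """Render a heading_1, heading_2 or heading_3 block."""
    rich_text = block.get(f"heading_{level}", {}).get("rich_text", [])
    return f"{'#' * level} {rich_text_to_markdown(rich_text)}\n"


def convert_paragraph_block(block: dict) -> str:
    """Render a paragraph block followed by a blank line."""
    rich_text = block.get("paragraph", {}).get("rich_text", [])
    return rich_text_to_markdown(rich_text) + "\n"


heading = {
    "type": "heading_2",
    "heading_2": {"rich_text": [{"plain_text": "Process Overview", "annotations": {}}]},
}
paragraph = {
    "type": "paragraph",
    "paragraph": {
        "rich_text": [
            {"plain_text": "Use ", "annotations": {}},
            {"plain_text": "Hugo", "annotations": {"bold": True}},
            {"plain_text": " with ", "annotations": {}},
            {"plain_text": "index.md", "annotations": {"code": True}},
        ]
    },
}

print(convert_heading_block(heading, 2))
print(convert_paragraph_block(paragraph))
```

Keeping the inline formatting in one shared helper means headings, paragraphs, and list items all render bold, italic, code, and links the same way.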
Downloading images
The script handles image downloads from Notion pages, ensuring all visual content is properly transferred to your Hugo site. Here’s how it works:
- Detects image blocks in the Notion content
- Downloads the image files to the local ‘images’ folder
- Gives each downloaded file a unique name by prefixing it with a running index
- Updates image references in the Markdown to point to the correct local paths
Here’s the code that manages image downloads:
def download_image(self, image_url: str, folder_name: str) -> str:
    """
    Download an image from a URL and save it to the images folder.
    :param image_url: str representing the URL of the image
    :param folder_name: str representing the folder name to save the image
    :return: str representing the filename of the downloaded image
    """
    response = requests.get(image_url)
    if response.status_code == 200:
        parsed_url = urlparse(image_url)
        original_filename = os.path.basename(parsed_url.path)
        self.image_index += 1
        new_filename = f"{self.image_index}_{original_filename}"
        full_path = os.path.join(self.base_path, folder_name, 'images', new_filename)
        with open(full_path, 'wb') as file:
            file.write(response.content)
        return new_filename
    else:
        logging.error(f"Failed to download image from {image_url}")
        raise Exception(f"Failed to download image from {image_url}")
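The image block converter is not shown, so here is a sketch of how an image block could be turned into a Markdown reference. It relies on the Notion block format, where an image is either of type `file` (hosted by Notion, with a temporary URL) or `external`, and the URL sits under the key of the same name. The function `image_block_to_markdown`, the injected `download` callable, and the relative `images/` path are assumptions made for illustration.

```python
from typing import Callable


def image_block_to_markdown(block: dict, download: Callable[[str], str]) -> str:
    """Download the image of a Notion image block and return its Markdown tag.

    download is a hypothetical callable that saves the file locally and
    returns the filename it was stored under.
    """
    image = block.get("image", {})
    source_type = image.get("type")  # "file" or "external"
    url = image.get(source_type, {}).get("url") if source_type else None
    if not url:
        return ""  # nothing to download for this block
    filename = download(url)
    caption = "".join(part.get("plain_text", "") for part in image.get("caption", []))
    return f"![{caption}](images/{filename})"


def fake_download(url: str) -> str:
    """Stand-in for the real download: derives a name without any network call."""
    return "1_" + url.rsplit("/", 1)[-1].split("?", 1)[0]


image_block = {
    "type": "image",
    "image": {
        "type": "file",
        "file": {"url": "https://example.com/files/diagram.png?X-Amz-Expires=3600"},
        "caption": [{"plain_text": "Export flow"}],
    },
}

print(image_block_to_markdown(image_block, fake_download))
```

Because URLs of Notion-hosted files expire after a short time, the download should happen soon after the blocks are fetched.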
File Saving
The converted content is saved as a Markdown file in the designated folder, ready for use in your static site generator.
The code below shows the create_post method that saves the converted Markdown content. Here’s how it works:
- Accepts two parameters: folder_name for the save location and an optional content parameter for the Markdown text
- Creates an index.md file in the specified folder
- Uses a context manager (with statement) to safely write the content to the file
- Returns the complete path of the new Markdown file
This method saves the content in a Hugo-compatible structure, with each post in its own directory containing an index.md file.
def create_post(self, folder_name: str, content: str = "") -> str:
    """
    Create a markdown file for a post.
    :param folder_name: str representing the folder name
    :param content: str representing the content to write to the file
    :return: str representing the path of the created file
    """
    file_path = os.path.join(self.base_path, folder_name, 'index.md')
    with open(file_path, 'w') as file:
        file.write(content)
    return file_path
Update Tags in Notion
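After a successful export, the script updates the original Notion page by setting its Tags property to “Exported”, helping you track which posts have been processed and are ready for publication. Note that the payload sends a multi_select list containing only “Exported”, so it replaces the page’s existing tags, including “Post”, rather than appending to them.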
def update_page_tags(self, page_id: str) -> bool:
    """
    Updates the tags of a Notion page.
    Args:
        page_id (str): The ID of the Notion page.
    Returns:
        bool: True if the update was successful, False otherwise.
    """
    if not page_id:
        logging.error("Missing page ID. Provide a valid ID and try again.")
        return False
    url = f"{self.api_url}/{self.api_version}/pages/{page_id}"
    payload = {
        "properties": {
            "Tags": {
                "multi_select": [
                    {"name": "Exported"}
                ]
            }
        }
    }
    response = requests.patch(url, headers=self.request_header, json=payload)
    if response.status_code == 200:
        return True
    else:
        logging.error(f"Failed to update page tags: {response.status_code}")
        return False
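The payload above contains only the “Exported” tag. If a page carries other tags that should survive the update, the payload could instead be built from the tags already on the page. The following sketch assumes that behaviour is wanted. `build_tags_payload` is a hypothetical helper, and dropping the “Post” tag is an assumption chosen so that the page is not picked up again by the next query.

```python
def build_tags_payload(page: dict, add: str = "Exported", remove: str = "Post") -> dict:
    """Build a PATCH payload that keeps existing tags, drops one, and adds one."""
    current = page.get("properties", {}).get("Tags", {}).get("multi_select", [])
    # Keep every named tag except the one to remove.
    names = [tag["name"] for tag in current if tag.get("name") and tag["name"] != remove]
    if add not in names:
        names.append(add)
    return {"properties": {"Tags": {"multi_select": [{"name": name} for name in names]}}}


sample_page = {
    "properties": {
        "Tags": {"multi_select": [{"name": "Post"}, {"name": "Python"}]}
    }
}

print(build_tags_payload(sample_page))
```

The resulting dictionary has the same shape as the payload passed to `requests.patch` above, so it could be sent in its place.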
Configuration
To use the script, you’ll need to set up a few configuration parameters in a .env file. The essential settings include your Notion API key, the database ID containing your blog posts, and the desired output directory for the exported Markdown files. You can also customize the tag names used to identify posts and their export status.
Here’s an example of a basic configuration file: