My workflow for uploading posts is a bit complicatated, let me explain that here.
I use hugo (official site) to generate my site. It requries me to put the text files and images in a certain structure so that it can convert them to the website code. Then I can upload the whole piece of code to the hosting service.
Each section of the website, like posts and gallery, are represented as a folder on my computer. If I need to have multiple language versions for the same article, I also need to put them inside the same folder.
- blog
- content
- gallery
- posts
- cardinal-ordinal
- index.en.md
- index.zh.md
- cardinal-ordinal
- content
I don’t add files directly inside the content folder. I first prepare the conten in Joplin and use code to generate the required content structure.
I mimic the same folder structure in Joplin using nested notebooks. Each section is a represented as a notebook in Joplin. Each article stored inside the section notebook is represented as a notebook too. It stores the notes storing the content for each language version.
- posts
- cardinal-ordinal
- 量數和序數
- Cardinals and ordinals
- cardinal-ordinal
When an note is ready to pubilsh, I add a language code tag on it. In this way, I won’t accidentally publish drafts.
Then I use python to generate the files and shell script to upload the website.
Now for the actual code to execute the procedure. Joppy (github) is the tool for me to output the files from Joplin. It implements the Joplin API in python.
Due to its database nature, the way how Joplin stores folders is not intuitive. The ’nested’ notebooks in Joplin are not actually nested. Unlike the common tree-like file structure, in Joplin, only the chlid notebooks store who their parent is, while the parent notebook has no access to its child. For instance, if I want to know which notebooks are the child of ‘posts’, I have to search through all the notebooks to check whether its parent is ‘post’. It seems that this is the ‘parent list’ data structure, where each child is pointing to its parent.
It would be very time consuming if I have to check through my few hundreds unrelated notes in Joplin. I only want to check the notebooks that represent my blog sections. However, due to Joplin’s database nature, the way how Joplin stores folders is not intuitive. Unlike the common tree-like file structure, in Joplin, only the chlid notebooks store who their parent is, while the parent notebook has no access to its child. For instance, if I want to know which notebooks are the child of ‘posts’, I have to search through all the notebooks to check whether its parent is ‘post’. It seems that this is the ‘parent list’ data structure, where each child is pointing to its parent.
Now I know how to filter the notebooks, all that’s left is the hassle of generating the actual markdown files. Let’s take a look at the final code:
import os
from joppy.client_api import ClientApi as Api
# Use Joppy API
api = Api(token="YOUR_API_TOKEN")
def output_posts_in_certain_notebook_id(all_notebooks, notebook_id):
category = api.get_notebook(notebook_id).title
for notebook_data in all_notebooks:
# Continue only when the notebook (article) is inside the correct parent notebook (section)
if notebook_data.parent_id != None and notebook_data.parent_id == notebook_id:
# Get all the notes (articles in various language versions) inside the notebooks
post_list = api.get_notes(notebook_data.id, fields='created_time,updated_time,body,order,title,id')
folder_name = notebook_data.title
print(notebook_data.title)
zh_posts = []
en_posts = []
tok_posts = []
for post_language_version in post_list.items:
title = post_language_version.title
'''
Skip the note if the note has no tags
'''
tag_list = get_tag_list(post_language_version.id)
if not tag_list:
continue
# Group the notes according to language
if 'zh' in tag_list:
language_code = 'zh'
zh_posts.append(post_language_version)
elif 'en' in tag_list:
language_code = 'en'
en_posts.append(post_language_version)
elif 'tok' in tag_list:
language_code = 'tok'
tok_posts.append(post_language_version)
else:
continue
combine_and_output_posts_in_same_language(category, folder_name, 'zh', zh_posts)
combine_and_output_posts_in_same_language(category, folder_name, 'en', en_posts)
combine_and_output_posts_in_same_language(category, folder_name, 'tok', tok_posts)
def combine_and_output_posts_in_same_language(category, folder_name, language_code, posts):
if posts:
'''
This part combines the notes if multiple notes exist.
It was for combining paragraphs in a long story but it has yet to
be useful.
'''
posts.sort(key=lambda x: x.order, reverse=True)
title = posts[0].title
tag_list = get_tag_list(posts[0].id)
tag_list.remove(language_code)
body = ''
created_time = None
updated_time = None
# Use the latest created/update time if multiple notes exist
for post in posts:
if (created_time is None) or (post.created_time < created_time):
created_time = post.created_time
if (updated_time is None) or (post.updated_time > updated_time):
updated_time = post.updated_time
body += post.body + '\n\n'
'''
Extract the note data for generating front matter for the markdown file
'''
date = created_time.isoformat()[:19] + '+08:00'
lastmod = updated_time.isoformat()[:19] + '+08:00'
home_dir = os.environ['HOME']
'''
If there's a image with a correct file name in target folder, use that as feature image
'''
feature_png_directory = home_dir + '/Blog/blog/evthronblog/content/' + category + '/' + folder_name + '/' + folder_name + '.' + 'png'
feature_jpg_directory = home_dir + '/Blog/blog/evthronblog/content/' + category + '/' + folder_name + '/' + folder_name + '.' + 'jpg'
if os.path.exists(feature_png_directory):
image = folder_name + '.png'
front_matter = generate_front_matter(title, date, lastmod=lastmod, tags=repr(tag_list), categories=category, image=image)
elif os.path.exists(feature_jpg_directory):
image = folder_name + '.jpg'
front_matter = generate_front_matter(title, date, lastmod=lastmod, tags=repr(tag_list), categories=category, image=image)
else:
front_matter = generate_front_matter(title, date, lastmod=lastmod, tags=repr(tag_list), categories=category)
'''
Output the generated markdown file to the target folder
'''
folder_directory = home_dir + '/Blog/blog/evthronblog/content/' + category + '/' + folder_name
if not os.path.exists(folder_directory):
os.makedirs(folder_directory)
path = folder_directory + '/index.' + language_code + '.md'
with open(path, 'w', encoding='utf-8') as fout:
fout.write(front_matter + '\n' + body)
print("done in", language_code)
else:
print("no post in", language_code)
def generate_front_matter(title, date, description = '', lastmod = '', image = '', categories = '', tags = '', slug = '', layout = '') -> str:
if lastmod == '':
lastmod = strftime("%Y-%m-%dT%H:%M:%S+08:00", localtime())
front_matter = '---\n'
front_matter += 'title: "' + title + '"\n'
front_matter += 'description: ' + description + '\n'
front_matter += 'date: ' + date + '\n'
front_matter += 'lastmod: ' + lastmod + '\n'
front_matter += 'image: ' + image + '\n'
front_matter += 'categories: ' + categories + '\n'
front_matter += 'tags: ' + tags + '\n'
front_matter += 'math: \nlicense: \nhidden: false\ncomments: true\n'
if slug:
front_matter += 'slug: "' + slug + '"\n'
if layout:
front_matter += 'layout: "' + layout + '"\n'
front_matter += '---\n'
return front_matter
def get_tag_list(note_id):
info = api.get_tags(note_id).items
tag_list = []
if info:
for tagdata in info:
tag = tagdata.title
tag_list.append(tag)
return tag_list
def main():
all_notebooks = api.get_all_notebooks()
post_notebook_id = 'PUT_NOTEBOOK_ID_HERE'
output_posts_in_certain_notebook_id(all_notebooks, post_notebook_id)
gallery_notebook_id = 'PUT_NOTEBOOK_ID_HERE'
output_posts_in_certain_notebook_id(all_notebooks, gallery_notebook_id)
if __name__ == '__main__':
main()
This workflow can be simpified if I ditch Joplin completely, but since seperating the writing place and the finished stuff has some benefits and I made lots of effort to finish the code, I don’t really want to change for now.