The Beauty of Notion
You can also view this report in a nicer format, with examples, in Notion itself.
Introduction
This guide is a culmination of 4 weeks of research I did after falling in love with Notion. As a software engineer, I'm always curious about how an app works. Notion's impressive functionalities and intuitive UX captivated me, so I decided to learn it from first principles and document it publicly.
This guide assumes you know Notion's functionalities and are curious about how they work. It also assumes you have a basic understanding of data structures.
Let's get started.
Blocks
The core foundation of Notion is built on the block model. If there's one thing to take away from this guide, it's how the block model works.
Text, headings, bullet list items, pages, embeds, a row in a database, an item in a Kanban board, etc. are all blocks. Blocks can contain other blocks, which form a hierarchy. Let's illustrate with examples.
Block attributes
- Each block is identified by a globally unique ID.
- Each block has a type. It determines what to render.
- Each block has a list of properties. They are data associated with the block.
- Each block has a list of content pointers to its child blocks.
- Each block has a list of format attributes. They determine how the block looks.
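The attributes above can be sketched as a small record type. This is a minimal illustration, not Notion's actual schema: the fields mirror the bullets, while the `Block` class, the field names, and the sample values are assumptions made for this example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical in-memory model of a block; not Notion's real schema."""
    type: str                                       # what to render, e.g. "text" or "page"
    properties: dict = field(default_factory=dict)  # data associated with the block
    content: list = field(default_factory=list)     # ordered IDs (pointers) of child blocks
    format: dict = field(default_factory=dict)      # how the block looks
    id: str = field(default_factory=lambda: str(uuid.uuid4()))  # globally unique ID


page = Block(type="page", properties={"title": "Reading list"})
item = Block(type="to_do", properties={"title": "Read the guide", "checked": "No"})
page.content.append(item.id)  # the page points at its child by ID, it does not embed it
```

Storing child IDs rather than nested objects is what makes the operations in the rest of this guide cheap: most of them only touch a list of pointers.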
Block operations
- To add a block, append a pointer to content.
- To insert a block below another block, insert a pointer after the other block's pointer.
- To duplicate a block, recursively duplicate its child blocks, then insert the duplicated block's pointer below the current block's pointer.
- To delete a block, remove the block's pointer from content.
- To move blocks, reorder the pointers in content.
- To move a block to another page, remove the pointer from the old parent's content and append it to the new parent's content.
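The operations above all come down to editing a parent's list of pointers. Here is a rough sketch, assuming a hypothetical in-memory store that maps block IDs to records; the store layout and the function names are made up for illustration.

```python
import copy
import uuid

# Hypothetical block store: block ID -> block record.
blocks = {
    "page": {"type": "page", "content": ["a", "b"]},
    "other": {"type": "page", "content": []},
    "a": {"type": "text", "content": ["a1"]},
    "a1": {"type": "text", "content": []},
    "b": {"type": "text", "content": []},
}


def insert_after(parent_id, sibling_id, block_id):
    """Insert block_id's pointer right after sibling_id's pointer."""
    content = blocks[parent_id]["content"]
    content.insert(content.index(sibling_id) + 1, block_id)


def remove(parent_id, block_id):
    """Delete a block from its parent by dropping the pointer."""
    blocks[parent_id]["content"].remove(block_id)


def move_to(old_parent_id, new_parent_id, block_id):
    """Move a block to another page: drop the old pointer, append a new one."""
    remove(old_parent_id, block_id)
    blocks[new_parent_id]["content"].append(block_id)


def duplicate(block_id):
    """Recursively copy a block and its children; return the copy's new ID."""
    clone = copy.deepcopy(blocks[block_id])
    clone["content"] = [duplicate(child) for child in blocks[block_id]["content"]]
    new_id = str(uuid.uuid4())
    blocks[new_id] = clone
    return new_id


copy_of_a = duplicate("a")
insert_after("page", "a", copy_of_a)  # page content: a, copy of a, b
move_to("page", "other", "b")         # b now lives under "other"
```

Note that the duplicate gets fresh IDs all the way down, so editing the copy never affects the original.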
Styled text
- The format attributes determine how the block looks.
- To change a block's color, update the block color attribute.
- Text color and background color use the same attribute, so you can't have both at the same time.
Rich text
- Strings in Notion allow rich text formatting.
- Each rich text string is represented by a list of segments.
- Each segment contains the text and a list of formatting.
- Each formatting contains the type and any associated information.
- For bold, italic, strikethrough, and code, only the type is stored.
- For link, the URL is stored.
- For style, the text or background color is stored.
- For date, people, and page mentions, a special character ‣ is used as a placeholder. This indicates that a mention will be displayed.
- For date mention, the date information is stored.
- For people mention, the user id is stored.
Block ID
- The ID can uniquely identify a block.
- To copy a page URL, just get the ID of the page block to form the URL https://notion.so/PAGE_ID.
- To copy a block URL, get the ID of the block and the closest parent page block to form the URL with an anchor: https://notion.so/PAGE_ID#BLOCK_ID.
- On page load, Notion will scroll to the block and highlight it.
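Building these URLs could look roughly like the sketch below. The block records with a `parent` pointer and the helper names are hypothetical; only the URL shapes come from the text above.

```python
# Hypothetical block records with a parent pointer.
blocks = {
    "page1": {"type": "page", "parent": None},
    "toggle1": {"type": "toggle", "parent": "page1"},
    "text1": {"type": "text", "parent": "toggle1"},
}


def closest_page(block_id):
    """Walk up the parents until a page block is found."""
    current = block_id
    while current is not None and blocks[current]["type"] != "page":
        current = blocks[current]["parent"]
    return current


def block_url(block_id):
    """Build a page URL, adding an anchor when the block is not itself a page."""
    page_id = closest_page(block_id)
    url = f"https://notion.so/{page_id}"
    return url if page_id == block_id else f"{url}#{block_id}"


print(block_url("page1"))  # https://notion.so/page1
print(block_url("text1"))  # https://notion.so/page1#text1
```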
Block types
- To update the text, just update the title attribute.
- To complete a to-do, just update the checked attribute.
- To turn a block into something else, just update the type attribute.
Block hierarchy
- To indent a block, remove the block's pointer, look for the previous sibling block, and append it as a child.
- To un-indent a block, remove the block's pointer and insert it below the old parent block.
- The beauty of Notion is that all structures follow this hierarchy, whether it's content in a page, indented text, a sub-bullet list item, or toggled content.
- This allows a child to be any type.
- The toggled content can be text, heading, page, or anything else since it's just another block.
- This also allows a seamless transition between types.
- If you turn the toggled list into a page, the toggled content will just become the page content.
- To display the sidebar page tree, just recursively get a list of blocks that are pages.
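Indenting and un-indenting can be sketched as two small pointer moves. This assumes a hypothetical store that maps each block ID to its ordered list of child IDs; the function names are made up for this example.

```python
# Hypothetical store: block ID -> ordered list of child IDs (the content pointers).
content = {
    "page": ["a", "b", "c"],
    "a": [],
    "b": [],
    "c": [],
}


def indent(parent_id, block_id):
    """Make block_id the last child of its previous sibling."""
    siblings = content[parent_id]
    index = siblings.index(block_id)
    if index == 0:
        return  # no previous sibling to nest under
    siblings.remove(block_id)
    content[siblings[index - 1]].append(block_id)


def unindent(grandparent_id, parent_id, block_id):
    """Move block_id out of parent_id and place it right below parent_id."""
    content[parent_id].remove(block_id)
    siblings = content[grandparent_id]
    siblings.insert(siblings.index(parent_id) + 1, block_id)


indent("page", "b")         # page: a, c   and   a: b
unindent("page", "a", "b")  # back to page: a, b, c
```

Neither function looks at the block's type, which is why the same two moves work for text, bullets, toggles, and pages alike.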
Block types & attributes
- Many of the format attributes are shared between types.
- This allows you to maintain styling even if the type changed.
- If a heading has an orange background, changing to a callout keeps the same background.
- If a callout has an icon, changing to a page keeps the same icon.
Media
- All media and embed blocks also use properties and format attributes.
- When you drag the image to resize, the block width and height get updated.
- When you replace the image with a new one, the source gets updated.
Embed
- Notion uses embed.ly for bookmarks and embeds.
- For link, the title, description, and thumbnail are fetched asynchronously.
- For embed, an iframe is added with the source URL.
Code
- Each format attribute is configurable via the menu dropdown.
- Syntax highlighting happens on the client side based on the language.
Table of contents
- A table of contents block is rendered by filtering for the header, sub-header, and sub-sub-header blocks on a page.
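That filter is simple enough to sketch in a few lines. The type names `header`, `sub_header`, and `sub_sub_header` and the record layout are assumptions based on the wording above.

```python
# Hypothetical page content, already resolved from pointers to records.
page_blocks = [
    {"type": "header", "title": "Blocks"},
    {"type": "text", "title": "Everything is a block."},
    {"type": "sub_header", "title": "Block attributes"},
    {"type": "sub_sub_header", "title": "IDs"},
    {"type": "header", "title": "Collections"},
]

HEADING_LEVELS = {"header": 0, "sub_header": 1, "sub_sub_header": 2}


def table_of_contents(blocks):
    """Keep only heading blocks and indent each entry by its level."""
    return [
        "  " * HEADING_LEVELS[block["type"]] + block["title"]
        for block in blocks
        if block["type"] in HEADING_LEVELS
    ]


for line in table_of_contents(page_blocks):
    print(line)
```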
Collections and Views
Collections and views give blocks superpowers. They allow you to create and see blocks in a structured way.
When you add a database, a collection is created. The chosen view is also created. A collection provides the schema for the blocks. A view defines how to query and display the blocks.
Let's again illustrate with examples. We will first look at the data structure of a collection. Then we will examine the data structure for a page in the collection (hint: it's just like any other page block). Lastly, we will take a look at the data structure for the five different types of views.
Collection
- Each collection is identified by a unique global ID.
- Each collection has a name.
- Each collection has a schema. It contains the definition of each property.
- The schema is unordered.
- Each schema has a title field, just like blocks.
- Each property definition besides the title uses a randomly-generated 4-character key.
- Each property definition has a name, a type, and any type-specific information.
- For a select/multi-select property, the definition contains a list of options with colors randomly chosen at creation.
- For a number property, the definition contains the number format.
- For a formula property, the definition contains a recursive formula composition in the order of precedence.
- Each sub-formula contains the arguments and the operator/function for the arguments.
- Collection does not contain information about views or blocks under the collection.
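Here is a rough sketch of what such a schema might look like, together with a recursive evaluator for the formula property. The property keys, field names, and formula node layout are all invented for illustration; only the overall shape (a title field, short random-looking keys, a nested formula) follows the description above.

```python
import operator

OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

# Hypothetical schema; the 4-character keys stand in for randomly generated ones.
schema = {
    "title": {"name": "Name", "type": "title"},
    "a1B2": {"name": "Price", "type": "number", "number_format": "dollar"},
    "c3D4": {"name": "Quantity", "type": "number", "number_format": "number"},
    "e5F6": {
        "name": "Total",
        "type": "formula",
        # Price * Quantity + 5, nested in order of precedence.
        "formula": {
            "operator": "+",
            "args": [
                {"operator": "*", "args": [{"property": "a1B2"}, {"property": "c3D4"}]},
                {"constant": 5},
            ],
        },
    },
}


def evaluate(node, values):
    """Evaluate a formula node against one page's property values."""
    if "constant" in node:
        return node["constant"]
    if "property" in node:
        return values[node["property"]]
    args = [evaluate(arg, values) for arg in node["args"]]
    return OPERATORS[node["operator"]](*args)


total = evaluate(schema["e5F6"]["formula"], {"a1B2": 3, "c3D4": 4})
print(total)  # 17
```

Because the multiplication is nested inside the addition, precedence is encoded in the tree itself and the evaluator never has to parse anything.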
Block in collection
- It is just a regular page block.
- This allows you to add content to the page, just like any other page.
- This also allows you to drag a block out of a collection. It will simply turn into a page, with all content preserved.
- The only difference is the parent table points to a collection. For a page under another page, the parent table points to a block.
- The properties map to the schema of the collection.
- The property keys are the same 4 randomly generated characters used in the schema. This is another way you can tell the block belongs to a collection.
- The property value is structured differently based on the type.
- For date or person property, it uses the same formatting as inline date or person mentions.
- For multi-select property, it is a comma-separated string.
- For checkbox property, it is "Yes" or "No".
- For file property, it is a link to the file.
- The beauty of storing the values as strings is that when you change the type of a property, the values can migrate gracefully. If you change a property from checkbox to text, the new property will just render the text "Yes" or "No".
- Formula property doesn't have values. It is calculated based on the dependencies.
- A property is only set if it contains a value.
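The graceful migration falls out of reading the same stored string through a different type. A minimal sketch, assuming a hypothetical page record and made-up property keys:

```python
# Hypothetical page block in a collection; keys and values are illustrative.
row = {
    "type": "page",
    "parent_table": "collection",
    "properties": {
        "title": "Write report",
        "x7Qp": "Yes",             # checkbox
        "m2Ks": "urgent,writing",  # multi-select
    },
}


def render(value, property_type):
    """Interpret a stored string according to the property's current type."""
    if value is None:
        return None  # unset properties are simply absent from the page
    if property_type == "checkbox":
        return value == "Yes"
    if property_type == "multi_select":
        return value.split(",")
    return value  # text and other types show the raw string


props = row["properties"]
print(render(props["x7Qp"], "checkbox"))      # True
print(render(props["x7Qp"], "text"))          # Yes
print(render(props["m2Ks"], "multi_select"))  # ['urgent', 'writing']
```

Changing the property type in the schema changes only the second argument; the stored value on each page is never rewritten.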
Table collection view
- It contains the type of table and a name.
- The table properties include information about the width and visibility of each property in the table.
- When you finish dragging the column separator, the width attribute will be updated.
- When you toggle the display status for each property, the visible attribute will be updated.
- You can manually sort pages in a view. The info is stored in the page sort attribute.
- If you choose to sort automatically, the manual sort will be discarded.
- If you choose to sort manually and an automatic sort exists, Notion will warn you that the automatic sort will be discarded.
- A view contains query information for how to query data under the collection. It consists of sort, filter, aggregation, and group by information.
- In a table view, you can aggregate any property across all pages in the collection.
- This translates to aggregations like COUNT(*) in SQL.
- For any property, you can aggregate by count all, count values, count unique values, count empty/not empty, percent empty/not empty.
- For number property, you can also aggregate by sum, average, median, min, max, range.
- For date property, you can also aggregate by earliest/latest date and range.
- All aggregation calculations happen on the client side.
- The query attribute is called query2, most likely because Notion migrated to a new version of the query schema.
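Since the aggregations run on the client, they can be pictured as plain functions over one column of values. The sketch below covers a subset of the aggregations listed above; the aggregation names and the use of `None` for an unset property are assumptions.

```python
import statistics


def aggregate(values, how):
    """Aggregate one property across pages; None stands for an unset property.

    Assumes `values` is non-empty for the percent and numeric aggregations.
    """
    present = [v for v in values if v is not None]
    if how == "count_all":
        return len(values)
    if how == "count_values":
        return len(present)
    if how == "count_unique":
        return len(set(present))
    if how == "count_empty":
        return len(values) - len(present)
    if how == "percent_empty":
        return 100 * (len(values) - len(present)) / len(values)
    if how == "sum":
        return sum(present)
    if how == "average":
        return statistics.mean(present)
    if how == "median":
        return statistics.median(present)
    if how == "range":
        return max(present) - min(present)
    raise ValueError(f"unknown aggregation: {how}")


prices = [10, None, 30, 10]
print(aggregate(prices, "count_unique"))   # 2
print(aggregate(prices, "percent_empty"))  # 25.0
print(aggregate(prices, "range"))          # 20
```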
Board collection view
- A board view contains a list of groups that can be reordered.
- Each group records which value of the group by property it represents (or none) and whether it is shown.
- A board view also contains a list of properties to display.
- The width attribute presumably doesn't apply to a board view.
- The group by attribute specifies which property is used to group all the pages in the collection.
- This translates to GROUP BY in SQL.
- You can drag an item in the board to a different column, and the appropriate property will be updated automatically. This is a great touch.
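Grouping and dragging can be sketched as follows. The `groups` list, the `hidden` flag, and the property key `stat` are hypothetical; the point is that a drag only rewrites one property on the page.

```python
def group_pages(pages, property_key, groups):
    """Bucket page IDs into the board's ordered, visible groups."""
    columns = {group["value"]: [] for group in groups if not group.get("hidden")}
    for page in pages:
        value = page["properties"].get(property_key)  # None means "no value"
        if value in columns:
            columns[value].append(page["id"])
    return columns


def move_card(page, property_key, new_value):
    """Dragging a card to another column just rewrites the grouped property."""
    if new_value is None:
        page["properties"].pop(property_key, None)
    else:
        page["properties"][property_key] = new_value


groups = [
    {"value": None},
    {"value": "Todo"},
    {"value": "Done"},
    {"value": "Archived", "hidden": True},
]
pages = [
    {"id": "p1", "properties": {"stat": "Todo"}},
    {"id": "p2", "properties": {"stat": "Done"}},
    {"id": "p3", "properties": {}},
]

board = group_pages(pages, "stat", groups)
# board == {None: ["p3"], "Todo": ["p1"], "Done": ["p2"]}
```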
Calendar collection view
- The calendar view is very straightforward data-wise.
- It only contains which calendar property to group by.
- If a date property doesn't exist when you create a calendar view, one will be added automatically and chosen as the group by property.
- All the rendering magic happens on the client side.
List collection view
- The list properties are similar to table properties for determining which property to show/hide.
- Each sorting rule contains the property and the direction.
- This translates to ORDER BY in SQL.
- Each filtering rule contains which property to filter by and how to filter (e.g. equal, less than, not checked).
- The filtering rules are combined together with ANDs and ORs.
- This translates to WHERE in SQL.
- When you create a new page when a view is selected, the corresponding filters are applied automatically to the new page. This is a great touch.
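A view's sort and filter rules can be read as a tiny query language. The sketch below evaluates a nested AND/OR filter and then applies the sort rules; the rule layout, the comparator names, and the property keys are assumptions rather than Notion's actual format.

```python
import operator

COMPARATORS = {"equals": operator.eq, "less_than": operator.lt, "greater_than": operator.gt}


def matches(page, rule):
    """Evaluate a filter tree: leaves test one property, branches combine with and/or."""
    if "filters" in rule:
        results = [matches(page, sub) for sub in rule["filters"]]
        return all(results) if rule["operator"] == "and" else any(results)
    value = page["properties"].get(rule["property"])
    return value is not None and COMPARATORS[rule["comparator"]](value, rule["value"])


def run_query(pages, filter_rule, sort_rules):
    """Filter (WHERE), then sort (ORDER BY); returns the matching page IDs."""
    rows = [page for page in pages if matches(page, filter_rule)]
    # Apply rules last to first: the sort is stable, so the first rule wins ties last.
    for rule in reversed(sort_rules):
        rows.sort(
            key=lambda page: page["properties"][rule["property"]],
            reverse=rule["direction"] == "descending",
        )
    return [page["id"] for page in rows]


pages = [
    {"id": "p1", "properties": {"status": "Todo", "priority": 2}},
    {"id": "p2", "properties": {"status": "Done", "priority": 1}},
    {"id": "p3", "properties": {"status": "Todo", "priority": 5}},
    {"id": "p4", "properties": {"status": "Todo", "priority": 1}},
]
todo_extremes = {
    "operator": "and",
    "filters": [
        {"property": "status", "comparator": "equals", "value": "Todo"},
        {
            "operator": "or",
            "filters": [
                {"property": "priority", "comparator": "less_than", "value": 2},
                {"property": "priority", "comparator": "greater_than", "value": 4},
            ],
        },
    ],
}
result = run_query(pages, todo_extremes, [{"property": "priority", "direction": "ascending"}])
print(result)  # ['p4', 'p3']
```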
Gallery collection view
- This is the data structure of a gallery collection view.
- Gallery cover is used to decide what image to use as the cover (e.g. a file property, the page cover, the page content).
- Gallery cover aspect and size are used to decide how the image cover is displayed.
- Gallery properties are used to decide what properties to show in the gallery card.
- The beauty of keeping the display status for each property separate is that when you switch between view types, the display status is preserved.
- For example, if you have two properties hidden in the gallery view, you change it to a list view and change it back to a gallery view, the display status is preserved.
Conclusion
Hopefully, this guide gave you a glimpse into how Notion works, why its data model is so elegant and flexible, and how some of the impressive functionalities and intuitive UX get built.
The flexibility does come with its downsides. Because you have so much freedom, you may end up spending more time optimizing your setup than doing actual work. Oftentimes, constraints are a good thing. Additionally, because there are so many things you can do to its data model, Notion relies on the client code not to break things. This is why releasing an API can be hard: such safeguards would need to be in place.
With that said, I have not been so excited about a piece of software since Slack came out. Let's see where Notion heads next.
Future Work
I've only scratched the surface of what Notion can do. The fact that there is plenty more to cover shows how powerful Notion is. Future additions:
- Database templates
- Template buttons
- Operations
- Columns
- Drag-and-drop
- Omnibar search
- Markdown support
- Relations
- Rollups
- Updates
- Page history
- Permission models
- Reminders
- Comments
- Workspaces
- Import
Feature Suggestions
In my opinion, there are two types of blocks that will make Notion even more powerful - a form block and a backlink block.
A form block can be used for surveys, applications, and polls. Notion already has the building blocks for this. Each form will connect to a collection, so the type of each field can be inferred automatically. A user can choose the order, visibility, and required-ness of each field. Submission turns into a page in the collection, with an additional Submitter person field. I can see many potential use cases - a survey for customer research, an application for job applicants, a poll for Q&A, etc.
A backlink block can be used to see a list of pages that link to the current page. This is powerful because you can view all knowledge related to the current page. Backlinks are why a lot of people switch to Roam. Technically, this requires storing backlinks to the current page, which I don't think Notion does right now. This can be achieved by creating a map from each page ID to a list of IDs of the blocks that link to that page. When a user links block A to page B, add block A's ID to page B's list. To render the backlinks for a page, simply grab the backlink block IDs and display each block's title as well as its closest parent page block for context.
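The backlink map described above can be sketched in a few lines. Everything here is hypothetical, including the block records and the function names; it only illustrates the proposed idea and says nothing about how Notion stores links.

```python
from collections import defaultdict

# Page ID -> IDs of the blocks that link to that page.
backlinks = defaultdict(list)

# Hypothetical block records: a title plus the closest parent page for context.
blocks = {
    "blockA": {"title": "See the roadmap", "page": "Notes"},
    "blockC": {"title": "Roadmap follow-ups", "page": "Meetings"},
}


def add_link(block_id, page_id):
    """Record that block_id links to page_id, ignoring duplicates."""
    if block_id not in backlinks[page_id]:
        backlinks[page_id].append(block_id)


def render_backlinks(page_id):
    """List each linking block's title with its parent page."""
    return [
        f'{blocks[block_id]["title"]} (in {blocks[block_id]["page"]})'
        for block_id in backlinks.get(page_id, [])
    ]


add_link("blockA", "pageB")
add_link("blockC", "pageB")
add_link("blockA", "pageB")  # duplicate, ignored

for line in render_backlinks("pageB"):
    print(line)
```

A real implementation would also need to remove entries when a link or its block is deleted, which this sketch leaves out.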