The schema defines the structure of documents in ProseKit. It specifies what types of nodes and marks exist, their attributes, and how they can be composed.
What is a Schema?
A schema is a ProseMirror Schema object that defines:
Nodes: The types of content blocks (paragraph, heading, image, etc.)
Marks: Formatting that can be applied to text (bold, italic, link, etc.)
Content Rules: What content is allowed where
Attributes: Data attached to nodes and marks
Serialization: How to convert to/from HTML/DOM
In ProseKit, schemas are built automatically from extensions.
Schema Construction
Schemas are constructed through the facet system:
1. Schema Spec Facet
Extensions contribute to the schema through schemaSpecFacet (packages/core/src/facets/schema-spec.ts):
import type { SchemaSpec } from '@prosekit/pm/model'

// For reference, SchemaSpec has roughly this shape
// (it is imported above, so it is not redeclared here):
//
// type SchemaSpec = {
//   nodes: OrderedMap<NodeSpec> | { [name: string]: NodeSpec }
//   marks?: OrderedMap<MarkSpec> | { [name: string]: MarkSpec }
//   topNode?: string
// }
2. Schema Facet
The schema facet creates the actual schema (packages/core/src/facets/schema.ts):
import { Schema, type SchemaSpec } from '@prosekit/pm/model'
import { defineFacet, type Facet } from './facet.ts'
import { rootFacet, type RootPayload } from './root.ts'

export const schemaFacet: Facet<SchemaSpec, RootPayload> = defineFacet({
  reducer: (specs) => {
    // At most one combined spec may reach this facet
    assert(specs.length <= 1)
    const spec = specs[0]
    const schema = spec ? new Schema(spec) : null
    return { schema }
  },
  parent: rootFacet,
  singleton: true,
})
The schema facet is a singleton: only one schema can exist per editor, because changing the schema requires recreating the entire editor.
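The two stages above can be modeled in a few lines of plain TypeScript. This is a minimal sketch rather than ProseKit's implementation: `PartialSchemaSpec`, `mergeSchemaSpecs`, and `buildSchema` are hypothetical names, and the "schema" is a plain object standing in for a ProseMirror Schema.

```typescript
// Simplified stand-ins for the real types; every name here is hypothetical.
type Spec = Record<string, unknown>

interface PartialSchemaSpec {
  nodes?: Record<string, Spec>
  marks?: Record<string, Spec>
  topNode?: string
}

// Stage 1: fold every extension's contribution into one combined spec.
function mergeSchemaSpecs(inputs: PartialSchemaSpec[]): PartialSchemaSpec {
  const merged: PartialSchemaSpec = { nodes: {}, marks: {} }
  for (const input of inputs) {
    merged.nodes = { ...merged.nodes, ...input.nodes }
    merged.marks = { ...merged.marks, ...input.marks }
    merged.topNode = input.topNode ?? merged.topNode
  }
  return merged
}

// Stage 2: a singleton reducer that accepts at most one combined spec.
function buildSchema(specs: PartialSchemaSpec[]): { schema: PartialSchemaSpec | null } {
  if (specs.length > 1) {
    throw new Error('Expected at most one schema spec')
  }
  const spec = specs[0]
  // A real implementation would construct `new Schema(spec)` here.
  return { schema: spec ?? null }
}

const combined = mergeSchemaSpecs([
  { nodes: { doc: { content: 'block+' } }, topNode: 'doc' },
  { nodes: { paragraph: { content: 'inline*', group: 'block' } } },
  { marks: { bold: {} } },
])
const { schema } = buildSchema([combined])
console.log(Object.keys(schema?.nodes ?? {}))
```

Keeping the second stage strict (zero or one input) is what makes the singleton rule visible: anything that produces two competing specs fails immediately instead of silently picking one.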
Node Specifications
Nodes are defined using defineNodeSpec() (packages/core/src/extensions/node-spec.ts):
export interface NodeSpecOptions<
  NodeName extends string = string,
  Attrs extends AnyAttrs = AnyAttrs,
> extends NodeSpec {
  /** The name of the node type */
  name: NodeName
  /** Whether this is the top-level node type */
  topNode?: boolean
  /** The attributes that nodes of this type get */
  attrs?: {
    [key in keyof Attrs]: AttrSpec<Attrs[key]>
  }
  // From ProseMirror NodeSpec:
  content?: string // Content expression
  marks?: string // Allowed marks
  group?: string // Node groups
  inline?: boolean // Inline vs block
  atom?: boolean // Atomic node
  selectable?: boolean // Can be selected
  draggable?: boolean // Can be dragged
  code?: boolean // Preserves whitespace
  defining?: boolean // Structure boundary
  isolating?: boolean // Isolates content
  parseDOM?: ParseRule[] // HTML parsing rules
  toDOM?: (node: Node) => DOMOutputSpec // DOM rendering
  // ... and more
}
Content Expressions
Content expressions define what content is allowed inside a node:
// Basic expressions
'text*' // Zero or more text nodes
'inline*' // Zero or more inline nodes
'block+' // One or more block nodes
'paragraph+' // One or more paragraphs
// Sequences
'heading paragraph+' // Heading followed by one or more paragraphs
// Alternatives
'(paragraph | heading)+' // One or more paragraphs or headings
// Groups
'block*' // Any node in the 'block' group
// Counts
'paragraph{2}' // Exactly two paragraphs
'paragraph{1,5}' // Between one and five paragraphs
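To show what these expressions mean in practice, the sketch below checks a list of child node names against an expression by translating it into a regular expression. It is an illustration of the semantics only: ProseMirror compiles content expressions into its own state machine, and `compileContent`, `matchesContent`, and the `Groups` type are hypothetical helpers, not library APIs.

```typescript
// Maps a group name to the node types that belong to it.
type Groups = Record<string, string[]>

// Translate a content expression into a regular expression over child names.
// Every child name is followed by a space so names can never partially match.
function compileContent(expr: string, groups: Groups = {}): RegExp {
  const tokens = expr.match(/[A-Za-z_]\w*|[()|*+?]|\{\d+(?:,\d*)?\}/g) ?? []
  const pattern = tokens
    .map((token) => {
      if (token === '(') return '(?:'
      if (/^[A-Za-z_]/.test(token)) {
        // A name is either a group (expanded to its members) or a node type.
        const names = groups[token] ?? [token]
        return `(?:${names.map((name) => `${name} `).join('|')})`
      }
      // ')', '|', and the quantifiers map directly onto regex syntax.
      return token
    })
    .join('')
  return new RegExp(`^${pattern}$`)
}

// Check whether a sequence of child node names satisfies the expression.
function matchesContent(expr: string, children: string[], groups: Groups = {}): boolean {
  const input = children.map((name) => `${name} `).join('')
  return compileContent(expr, groups).test(input)
}

const groups: Groups = { block: ['paragraph', 'heading'] }
console.log(matchesContent('heading paragraph+', ['heading', 'paragraph', 'paragraph']))
console.log(matchesContent('block+', [], groups))
```

The first call satisfies the sequence, while the second fails because `block+` requires at least one child. This is also why a `doc` node with content `block+` can never be empty.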
Example: Document Node
import { defineNodeSpec } from '@prosekit/core'

export function defineDoc() {
  return defineNodeSpec({
    name: 'doc',
    topNode: true, // This is the document root
    content: 'block+', // Contains one or more block nodes
  })
}
Example: Paragraph Node
import { defineNodeSpec } from '@prosekit/core'

export function defineParagraph() {
  return defineNodeSpec({
    name: 'paragraph',
    content: 'inline*', // Contains inline content
    group: 'block', // Is a block node
    parseDOM: [{ tag: 'p' }],
    toDOM() {
      return ['p', 0] // Render as <p> tag, 0 is content slot
    },
  })
}
Example: Image Node
import { defineNodeSpec } from '@prosekit/core'

export function defineImage() {
  return defineNodeSpec({
    name: 'image',
    inline: true, // Inline node
    atom: true, // Cannot be directly edited
    group: 'inline', // Is an inline node
    draggable: true, // Can be dragged
    attrs: {
      src: { default: '' },
      alt: { default: '' },
      title: { default: null },
    },
    parseDOM: [
      {
        tag: 'img[src]',
        getAttrs(dom) {
          // Tag rules receive an element; reject anything else
          if (typeof dom === 'string') return false
          return {
            src: dom.getAttribute('src'),
            alt: dom.getAttribute('alt'),
            title: dom.getAttribute('title'),
          }
        },
      },
    ],
    toDOM(node) {
      return ['img', node.attrs]
    },
  })
}
Mark Specifications
Marks are defined using defineMarkSpec() (packages/core/src/extensions/mark-spec.ts):
export interface MarkSpecOptions<
  MarkName extends string = string,
  Attrs extends AnyAttrs = AnyAttrs,
> extends MarkSpec {
  /** The name of the mark type */
  name: MarkName
  /** The attributes that marks of this type get */
  attrs?: { [K in keyof Attrs]: AttrSpec<Attrs[K]> }
  // From ProseMirror MarkSpec:
  inclusive?: boolean // Mark extends when typing
  excludes?: string // Marks that can't coexist
  group?: string // Mark groups
  spanning?: boolean // Can span nodes
  parseDOM?: ParseRule[] // HTML parsing rules
  toDOM?: (mark: Mark, inline: boolean) => DOMOutputSpec
  // ... and more
}
Example: Bold Mark
import { defineMarkSpec } from '@prosekit/core'

export function defineBold() {
  return defineMarkSpec({
    name: 'bold',
    parseDOM: [
      { tag: 'strong' },
      { tag: 'b' },
      {
        style: 'font-weight',
        // Accept "bold", "bolder", or a numeric weight of 500 and above;
        // returning false rejects the rule, null accepts it without attributes
        getAttrs: (value) => /^(bold(er)?|[5-9]\d{2,})$/.test(value as string) && null,
      },
    ],
    toDOM() {
      return ['strong', 0] // Render as <strong> tag
    },
  })
}
Example: Link Mark
import { defineMarkSpec } from '@prosekit/core'

export function defineLink() {
  return defineMarkSpec({
    name: 'link',
    attrs: {
      href: { default: '' },
      title: { default: null },
    },
    inclusive: false, // Doesn't extend when typing
    parseDOM: [
      {
        tag: 'a[href]',
        getAttrs(dom) {
          if (typeof dom === 'string') return false
          return {
            href: dom.getAttribute('href'),
            title: dom.getAttribute('title'),
          }
        },
      },
    ],
    toDOM(mark) {
      return ['a', mark.attrs, 0]
    },
  })
}
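The `toDOM` functions in the node and mark examples return DOM output specs: an array holding a tag name, an optional attribute object, and children, where the number `0` marks the slot that receives the node's content. The sketch below renders such a spec to an HTML string to make the format tangible. It is a simplified model, not ProseMirror's renderer: `OutputSpec`, `renderToHtml`, and `escapeHtml` are hypothetical names, and only the array form of the spec is covered.

```typescript
// A reduced model of a DOM output spec: [tag, attrs?, ...children].
type Attrs = Record<string, unknown>
type OutputSpec = [string, ...(Attrs | string | 0 | OutputSpec)[]]

// Elements that never have content or a closing tag.
const VOID_TAGS = new Set(['img', 'br', 'hr'])

function escapeHtml(text: string): string {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/"/g, '&quot;')
}

// `content` is the already-rendered HTML of the node's children.
function renderToHtml(spec: OutputSpec, content: string): string {
  const [tag, ...rest] = spec
  let attrText = ''
  let children = rest
  const first = rest[0]
  // An optional attribute object may directly follow the tag name.
  if (typeof first === 'object' && !Array.isArray(first)) {
    for (const [name, value] of Object.entries(first as Attrs)) {
      // Attributes whose value is null or undefined are skipped.
      if (value != null) attrText += ` ${name}="${escapeHtml(String(value))}"`
    }
    children = rest.slice(1)
  }
  if (VOID_TAGS.has(tag)) return `<${tag}${attrText}>`
  const inner = children
    .map((child) => {
      if (child === 0) return content // the hole: node content goes here
      if (typeof child === 'string') return escapeHtml(child)
      if (Array.isArray(child)) return renderToHtml(child as OutputSpec, content)
      return ''
    })
    .join('')
  return `<${tag}${attrText}>${inner}</${tag}>`
}

console.log(renderToHtml(['p', 0], 'Hello'))
// → <p>Hello</p>
console.log(renderToHtml(['a', { href: 'https://example.com', title: null }, 0], 'Docs'))
// → <a href="https://example.com">Docs</a>
```

Note how the `title: null` default from the link example never reaches the output, which is why `null` is a convenient default for optional attributes.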
Node and Mark Attributes
Attributes store data on nodes and marks:
Defining Attributes
import { defineNodeSpec } from '@prosekit/core'

const heading = defineNodeSpec({
  name: 'heading',
  content: 'inline*',
  group: 'block',
  attrs: {
    level: {
      default: 1,
      // ProseMirror expects validate to throw on invalid values;
      // a boolean return value would be ignored
      validate: (value) => {
        if (typeof value !== 'number' || value < 1 || value > 6) {
          throw new RangeError(`Invalid heading level: ${String(value)}`)
        }
      },
    },
  },
  parseDOM: [
    { tag: 'h1', attrs: { level: 1 } },
    { tag: 'h2', attrs: { level: 2 } },
    { tag: 'h3', attrs: { level: 3 } },
    { tag: 'h4', attrs: { level: 4 } },
    { tag: 'h5', attrs: { level: 5 } },
    { tag: 'h6', attrs: { level: 6 } },
  ],
  toDOM(node) {
    return ['h' + node.attrs.level, 0]
  },
})
Adding Attributes to Existing Types
Use defineNodeAttr() or defineMarkAttr() to add attributes:
import { defineNodeAttr } from '@prosekit/core'

const paragraphId = defineNodeAttr({
  type: 'paragraph',
  attr: 'id',
  default: null,
  parseDOM: (node) => node.getAttribute('id'),
  toDOM: (value) => (value ? ['id', value] : null),
})
Attribute extensions must be defined after the base node/mark. They modify existing schema entries rather than creating new ones.
Splittable Attributes
Node attributes can be marked as splittable to persist when splitting:
import { defineNodeAttr } from '@prosekit/core'

const paragraphAlign = defineNodeAttr({
  type: 'paragraph',
  attr: 'align',
  default: 'left',
  splittable: true, // Preserves alignment when splitting paragraph
  parseDOM: (node) => node.style.textAlign || 'left',
  toDOM: (value) => ['style', `text-align: ${value}`],
})
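The effect of `splittable` can be pictured with a small model of what happens to attributes when a node is split. This sketch only illustrates the behavior described above and assumes that non-splittable attributes fall back to their defaults on the new node; `AttrDefinition` and `attrsAfterSplit` are hypothetical names, not ProseKit APIs.

```typescript
interface AttrDefinition {
  default: unknown
  splittable?: boolean
}

// Compute the attributes of the new node created by a split: splittable
// attributes are copied from the original node, the rest reset to defaults.
function attrsAfterSplit(
  definitions: Record<string, AttrDefinition>,
  original: Record<string, unknown>,
): Record<string, unknown> {
  const result: Record<string, unknown> = {}
  for (const [name, definition] of Object.entries(definitions)) {
    result[name] = definition.splittable ? original[name] : definition.default
  }
  return result
}

const paragraphAttrs: Record<string, AttrDefinition> = {
  id: { default: null },
  align: { default: 'left', splittable: true },
}

// Splitting a centered paragraph with id "intro":
const next = attrsAfterSplit(paragraphAttrs, { id: 'intro', align: 'center' })
console.log(next)
// → { id: null, align: 'center' }
```

An identifier such as `id` is a typical attribute to leave non-splittable, since two paragraphs sharing the same id is rarely what you want.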
Schema Merging
When multiple extensions define the same node or mark, their specs are merged:
import { defineNodeAttr, defineNodeSpec, union } from '@prosekit/core'

const base = defineNodeSpec({
  name: 'paragraph',
  content: 'inline*',
  group: 'block',
})

const withId = defineNodeAttr({
  type: 'paragraph',
  attr: 'id',
  default: null,
})

const withAlign = defineNodeAttr({
  type: 'paragraph',
  attr: 'align',
  default: 'left',
})

// Resulting schema has paragraph with both id and align attributes
const extension = union(base, withId, withAlign)
Merge Process
From packages/core/src/extensions/node-spec.ts:142-197:
1. Collect Specs: All node/mark specs are collected
2. Merge Specs: Specs with the same name are merged using mergeSpecs()
3. Add Attributes: Attribute extensions modify existing specs
4. Wrap DOM Methods: toDOM and parseDOM are wrapped to handle new attributes
5. Build Schema: Final merged specs create the schema
The order matters: later extensions override earlier ones. Use the priority system to control precedence.
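The merge steps listed above can be sketched in plain TypeScript. This is a simplified model under stated assumptions, not the code in node-spec.ts: `NodeSpecInput`, `NodeAttrInput`, and `buildNodeSpecs` are hypothetical names, `toDOM` takes a plain attribute object, and `parseDOM` wrapping is left out for brevity.

```typescript
type DomAttrs = Record<string, string>

interface NodeSpecInput {
  name: string
  content?: string
  group?: string
  attrs?: Record<string, { default: unknown }>
  toDOM?: (attrs: Record<string, unknown>) => [string, DomAttrs, 0]
}

interface NodeAttrInput {
  type: string
  attr: string
  default: unknown
  toDOM?: (value: unknown) => [key: string, value: string] | null
}

type MergedSpec = Omit<NodeSpecInput, 'name'>

function buildNodeSpecs(
  specs: NodeSpecInput[],
  attrInputs: NodeAttrInput[],
): Record<string, MergedSpec> {
  const merged: Record<string, MergedSpec> = {}

  // Steps 1-2: collect the specs and merge those that share a name.
  // Later specs override earlier ones field by field.
  for (const { name, ...spec } of specs) {
    const previous = merged[name]
    merged[name] = { ...previous, ...spec, attrs: { ...previous?.attrs, ...spec.attrs } }
  }

  for (const input of attrInputs) {
    const spec = merged[input.type]
    if (!spec) throw new Error(`Node type "${input.type}" is not defined`)

    // Step 3: register the attribute on the existing spec.
    spec.attrs = { ...spec.attrs, [input.attr]: { default: input.default } }

    // Step 4: wrap toDOM so the new attribute is also written to the DOM.
    const baseToDOM = spec.toDOM
    const attrToDOM = input.toDOM
    if (baseToDOM && attrToDOM) {
      spec.toDOM = (attrs) => {
        const [tag, domAttrs, hole] = baseToDOM(attrs)
        const pair = attrToDOM(attrs[input.attr])
        return [tag, pair ? { ...domAttrs, [pair[0]]: pair[1] } : domAttrs, hole]
      }
    }
  }

  // Step 5: the merged specs would now be handed to the schema constructor.
  return merged
}

const result = buildNodeSpecs(
  [{ name: 'paragraph', content: 'inline*', group: 'block', toDOM: () => ['p', {}, 0] }],
  [
    {
      type: 'paragraph',
      attr: 'id',
      default: null,
      toDOM: (value) => (value ? ['id', String(value)] : null),
    },
    { type: 'paragraph', attr: 'align', default: 'left' },
  ],
)
console.log(Object.keys(result['paragraph']?.attrs ?? {}))
```

The explicit error for a missing node type mirrors the rule that an attribute extension modifies an existing schema entry and cannot create one on its own.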
Node Groups
Node groups allow content expressions to reference multiple node types:
// Define nodes in groups
const paragraph = defineNodeSpec({
  name: 'paragraph',
  group: 'block',
  // ...
})

const heading = defineNodeSpec({
  name: 'heading',
  group: 'block',
  // ...
})

// Reference the group in content expressions
const doc = defineNodeSpec({
  name: 'doc',
  topNode: true,
  content: 'block+', // Allows any node in 'block' group
})
Common groups:
block: Block-level nodes (paragraph, heading, etc.)
inline: Inline nodes (text, image, etc.)
list: List nodes (bullet list, ordered list, etc.)
Schema Validation
ProseMirror validates documents against the schema:
const doc = editor.schema.node('doc', null, [
  editor.schema.node('paragraph', null, [
    editor.schema.text('Hello'),
  ]),
])

// Validate the document
doc.check() // Throws if invalid
Validation checks:
Content matches content expressions
Attributes have valid values
Marks are allowed on the nodes they’re applied to
Structure is internally consistent
Invalid documents can cause editor crashes. Always validate user-provided content before setting it.
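One way to follow this advice is to parse and check untrusted JSON inside a single guarded helper. The sketch below is an assumption-laden example rather than a ProseKit API: `parseUntrustedDoc`, `SchemaLike`, and `CheckableNode` are hypothetical names, and the two interfaces only mirror the shapes of ProseMirror's `Schema.nodeFromJSON` and `Node.check` so that the block has no dependencies.

```typescript
// Minimal structural types so the sketch runs without dependencies.
interface CheckableNode {
  check(): void
}

interface SchemaLike<N extends CheckableNode> {
  nodeFromJSON(json: unknown): N
}

type ParseResult<N> = { ok: true; doc: N } | { ok: false; error: string }

// Parse untrusted JSON and validate it before it gets near the editor.
function parseUntrustedDoc<N extends CheckableNode>(
  schema: SchemaLike<N>,
  json: unknown,
): ParseResult<N> {
  try {
    const doc = schema.nodeFromJSON(json) // throws on unknown node types
    doc.check() // throws if content or attributes violate the schema
    return { ok: true, doc }
  } catch (error) {
    return { ok: false, error: error instanceof Error ? error.message : String(error) }
  }
}
```

With a real editor, this could be called as `parseUntrustedDoc(editor.schema, untrustedJson)`, setting the document only when `ok` is true and falling back to an empty document or an error message otherwise.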
Working with Schemas
Accessing the Schema
const editor = createEditor({ extension })

// Access schema from editor
const schema = editor.schema

// Access node types
const paragraphType = schema.nodes.paragraph
const headingType = schema.nodes.heading

// Access mark types
const boldType = schema.marks.bold
const linkType = schema.marks.link
Creating Nodes
const schema = editor.schema

// Create a paragraph
const para = schema.node('paragraph', null, [
  schema.text('Hello world'),
])

// Create a heading with attributes
const heading = schema.node('heading', { level: 1 }, [
  schema.text('Title'),
])

// Or use node actions
const para2 = editor.nodes.paragraph('Hello world')
const heading2 = editor.nodes.heading({ level: 1 }, 'Title')
Creating Marks
const schema = editor.schema

// Create a bold mark
const bold = schema.mark('bold')

// Create a link mark with attributes
const link = schema.mark('link', { href: 'https://example.com' })

// Apply to text
const text = schema.text('Hello', [bold, link])

// Or use mark actions
const [text2] = editor.marks.bold('Hello')
Schema Immutability
Schemas are immutable after editor creation. You cannot:
Add new node or mark types
Remove existing types
Change content expressions
Modify attributes
To change the schema, you must create a new editor:
// ❌ Bad - can't add nodes dynamically
const editor = createEditor({ extension: basicExtension })
editor.use(defineTable()) // Error! Schema cannot be changed

// ✅ Good - include all nodes upfront
const fullExtension = union(
  basicExtension,
  defineTable(),
)
const editor = createEditor({ extension: fullExtension })
Common Schema Patterns
Minimal Document Schema
import { union, defineDoc, defineText, defineParagraph } from '@prosekit/core'

const extension = union(
  defineDoc(), // Root node
  defineText(), // Text node
  defineParagraph(), // At least one block type
)
Rich Text Schema
import { union } from '@prosekit/core'
// The define* functions below are assumed to be imported from the
// ProseKit extension modules that provide them.

const extension = union(
  // Structure
  defineDoc(),
  defineText(),
  defineParagraph(),
  defineHeading(),
  defineBlockquote(),
  defineCodeBlock(),
  defineHorizontalRule(),
  // Lists
  defineBulletList(),
  defineOrderedList(),
  defineListItem(),
  // Inline
  defineImage(),
  defineHardBreak(),
  // Marks
  defineBold(),
  defineItalic(),
  defineCode(),
  defineLink(),
  defineStrike(),
  defineUnderline(),
)
Nested Content Schema
const table = defineNodeSpec({
  name: 'table',
  content: 'tableRow+',
  group: 'block',
  // ...
})

const tableRow = defineNodeSpec({
  name: 'tableRow',
  content: 'tableCell+',
  // ...
})

const tableCell = defineNodeSpec({
  name: 'tableCell',
  content: 'block+', // Cells contain block content
  // ...
})
Best Practices
Always define doc and text: These are required for any schema
Use content expressions carefully: Invalid expressions can break the editor
Group related nodes: Use groups to simplify content expressions
Validate attributes: Use the validate option to prevent invalid values
Test schema changes: Schema changes can break existing documents
Document your schema: Clearly document custom nodes and their purpose
Next Steps
Architecture: Understand ProseKit’s overall architecture
Extensions: Learn more about the extension system