mirror of https://gitlab.com/curben/blog
post: "Override smartypants in marked.js renderer"
parent 4f69beea5f
commit b889e3d13c
@ -0,0 +1,130 @
---
title: Override smartypants in marked.js renderer
excerpt: marked is a Markdown renderer
date: 2020-08-30
tags:
- javascript
---

``` js
const marked = require('marked')
const { escape } = require('marked/src/helpers')
const { Tokenizer: MarkedTokenizer } = marked

class Tokenizer extends MarkedTokenizer {
  // Override smartypants
  inlineText (src, inRawBlock) {
    const { options, rules } = this
    const { smartypants: smartypantsCfg } = options

    // https://github.com/markedjs/marked/blob/b6773fca412c339e0cedd56b63f9fa1583cfd372/src/Lexer.js#L8-L24
    const smartypants = str => {
      return str
        // em-dashes
        .replace(/---/g, '\u2014')
        // en-dashes
        .replace(/--/g, '\u2013')
        // opening singles
        .replace(/(^|[-\u2014/([{"\s])'/g, '$1\u2018')
        // closing singles & apostrophes
        .replace(/'/g, '\u2019')
        // opening doubles
        .replace(/(^|[-\u2014/([{\u2018\s])"/g, '$1\u201c')
        // closing doubles
        .replace(/"/g, '\u201d')
        // ellipses
        .replace(/\.{3}/g, '\u2026')
    }

    // https://github.com/markedjs/marked/blob/b6773fca412c339e0cedd56b63f9fa1583cfd372/src/Tokenizer.js#L643-L658
    const cap = rules.inline.text.exec(src)
    if (cap) {
      let text
      if (inRawBlock) {
        text = cap[0]
      } else {
        text = escape(smartypantsCfg ? smartypants(cap[0]) : cap[0])
      }
      return {
        type: 'text',
        raw: cap[0],
        text
      }
    }
  }
}

marked.setOptions({
  smartypants: true
})

const tokenizer = new Tokenizer()

marked('input', { tokenizer })
```

A year ago, a user requested an option to override the behaviour of marked's smartypants; in particular, the user wondered whether it was possible to replace `"` with `«»` instead of `“”`. Another Markdown renderer, markdown-it (used by hexo-renderer-markdown-it), also offers a smartypants feature, and you can easily customise the quote substitution using its `quotes:` option. But marked doesn't offer that option, and since I was not familiar with the marked API, I couldn't implement the user's request.

Recently, after working on [hexojs/hexo-renderer-marked#159](https://github.com/hexojs/hexo-renderer-marked/pull/159), I became (slightly) more familiar with [marked](https://marked.js.org/), particularly with overriding its rendering methods. I noticed that the [`inlineText`](https://marked.js.org/#/USING_PRO.md#inline-level-tokenizer-methods) tokenizer method takes a smartypants function as one of its arguments:

> - inlineText(_string_ src, _bool_ inRawBlock, _function_ smartypants)

So it seemed possible to bring your own smartypants function. Indeed, after some trial and error (there was no clear example), I finally figured it out and added a new `quotes:` option to hexo-renderer-marked ([hexojs/hexo-renderer-marked#161](https://github.com/hexojs/hexo-renderer-marked/pull/161)). I have put the sample code at the beginning of this post. If you are already using marked, that code should be quite easy to understand and you just need to modify the `smartypants()` function. Otherwise, here is my explanation.

``` js
const { escape } = require('marked/src/helpers')
```

marked uses this function to escape unsafe content related to HTML tags (e.g. `<` to [`&lt;`](https://github.com/markedjs/marked/blob/b6773fca412c339e0cedd56b63f9fa1583cfd372/src/helpers.js#L10)). I initially wanted to use hexo-util's [`escapeHTML()`](https://github.com/hexojs/hexo-util#escapehtmlstr), since the two seem to serve a similar purpose and `escapeHTML()` escapes more potentially unsafe characters. But then I noticed that its regex search pattern is slightly different, so I kept marked's `escape()` to avoid any undesired rendering change.
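
To make the role of `escape()` more concrete, here is a simplified stand-in that replaces the five characters commonly escaped in HTML text. It is only an approximation written for illustration: `escapeHtmlSketch` is a made-up name, and marked's own `escape()` (linked above) differs in details, for instance in how it treats ampersands that already start an entity, so the two are not interchangeable.

```js
// Simplified illustration only; this is NOT marked's escape().
const replacements = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
}

// Replace each unsafe character with its HTML entity.
const escapeHtmlSketch = str => str.replace(/[&<>"']/g, ch => replacements[ch])

console.log(escapeHtmlSketch('<kbd>Ctrl</kbd> & "C"'))

// Curly quotes are not in the table, so text that has already been through
// smartypants keeps its typographic quotes.
console.log(escapeHtmlSketch('\u201cquoted\u201d'))
```

This also hints at why the sample code calls `smartypants()` before `escape()`: once a straight quote has been escaped to an entity, the quote rules would no longer find it.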

``` js
// https://github.com/markedjs/marked/blob/b6773fca412c339e0cedd56b63f9fa1583cfd372/src/Lexer.js#L8-L24
const smartypants = str => {
  return str
    // em-dashes
    .replace(/---/g, '\u2014')
    // en-dashes
    .replace(/--/g, '\u2013')
    // opening singles
    .replace(/(^|[-\u2014/([{"\s])'/g, '$1\u2018')
    // closing singles & apostrophes
    .replace(/'/g, '\u2019')
    // opening doubles
    .replace(/(^|[-\u2014/([{\u2018\s])"/g, '$1\u201c')
    // closing doubles
    .replace(/"/g, '\u201d')
    // ellipses
    .replace(/\.{3}/g, '\u2026')
}
```

This is the smartypants function as implemented by marked; just comment out any `.replace()` line that you don't want. Note the order of the replacements, because you may need to comment out related ones too: if you remove the em-dash replacement but retain the en-dash one, any triple dash "---" will become an en-dash plus a dash, "–-". It's also possible to add _more_ substitutions, such as turning "=>" into "⇒".
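
The ordering issue and the extra substitution can be tried in isolation. The snippet below is a small sketch that uses only the dash rules from the function above; the three function names are invented for this example.

```js
// Both dash rules, in marked's order: em-dashes first, then en-dashes.
const withEmDash = str => str
  .replace(/---/g, '\u2014')
  .replace(/--/g, '\u2013')

// Em-dash rule removed: the en-dash rule now consumes the first two dashes
// of "---" and leaves the third one behind.
const withoutEmDash = str => str
  .replace(/--/g, '\u2013')

// An extra, non-standard substitution: "=>" becomes a rightwards double arrow.
const withArrow = str => withEmDash(str)
  .replace(/=>/g, '\u21d2')

console.log(withEmDash('a --- b -- c'))
console.log(withoutEmDash('a --- b'))
console.log(withArrow('a => b'))
```

The second call prints an en-dash followed by a plain dash, which is the artefact described above. In the sample code `smartypants()` runs before `escape()`, so at the point where the arrow rule applies the `>` has not yet been turned into an entity.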

``` js
if (inRawBlock) {
  text = cap[0]
}
```

`inRawBlock` is true whenever marked encounters a (safe) raw HTML element such as `<kbd>lorem ipsum</kbd>` in the Markdown content; in that case, the text needs no escaping and is kept as is.

``` js
return {
  type: 'text',
  raw: cap[0],
  text
}
```

This is what I initially struggled the most to understand: I didn't know which `type:` I should return. At first, I thought the type should be the method's own name (`inlineText`), since that was what the `codespan` [example](https://marked.js.org/#/USING_PRO.md#tokenizer) showed, but that didn't work (it didn't make sense anyway, since the function shouldn't need to identify itself).

It turned out that the type should be the name of one of the [inline renderer](https://marked.js.org/#/USING_PRO.md#inline-level-renderer-methods) methods; in this case, it should be `text`.
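
One way to picture why the `type` matters is to think of it as the key that decides which renderer method handles the token. The toy dispatcher below is purely illustrative: `toyRenderer` and `renderInline` are invented names, and marked's actual parser is more involved than a single lookup.

```js
// Toy stand-ins for a few inline renderer methods (not marked's Renderer).
const toyRenderer = {
  text: text => text,
  codespan: text => `<code>${text}</code>`,
  strong: text => `<strong>${text}</strong>`
}

// Pick the renderer method named by token.type and pass it the token's text.
const renderInline = token => {
  const known = Object.prototype.hasOwnProperty.call(toyRenderer, token.type)
  if (!known) {
    throw new Error(`No renderer method for token type "${token.type}"`)
  }
  return toyRenderer[token.type](token.text)
}

// A token typed 'text' is handled by the text method.
console.log(renderInline({ type: 'text', raw: 'hi', text: 'hi' }))

// A token typed after the tokenizer method itself has nothing to handle it.
try {
  renderInline({ type: 'inlineText', raw: 'hi', text: 'hi' })
} catch (err) {
  console.log(err.message)
}
```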

``` js
marked.setOptions({
  smartypants: true
})
```

This option is available as the `this.options.smartypants` property inside the method.