Hardcore spell checking in Emacs


This article is not an introduction to the basics of Emacs spell checking. It assumes a deep knowledge of Emacs Lisp and Fly Spell.

For the basics, read my article "What's the best spell check setup in emacs".

This article introduces new techniques to make Fly Spell more powerful and faster.

The CLI programs aspell and hunspell can only parse plain text. They know nothing about programming language syntax.

Fly Spell feeds the output of the CLI program into its own Lisp predicate, stored in the variable flyspell-generic-check-word-predicate, whose default value is nil.

When (flyspell-mode 1) is executed, the per-mode predicate is assigned to flyspell-generic-check-word-predicate.

For example, run (get major-mode 'flyspell-mode-predicate) to get the predicate of the current major mode, or (get 'web-mode 'flyspell-mode-predicate) to get the predicate of web-mode.

The predicate is a simple function that takes no parameters. Here is my predicate for web-mode,

(defun my-web-mode-flyspell-verify ()
  "Fly Spell predicate of `web-mode`."
  (let* ((font-face-at-point (get-text-property (- (point) 1) 'face))
         rlt)
    ;; If rlt is t, the word at point is POSSIBLY a typo, continue checking.
    (setq rlt t)
    ;; if rlt is nil, the word at point is definitely NOT a typo.
    ;; (setq rlt nil)
    rlt))
;; Attach my predicate to `web-mode`
(put 'web-mode 'flyspell-mode-predicate 'my-web-mode-flyspell-verify)

If you read the code of flyspell-prog-mode, you will find that it sets flyspell-generic-check-word-predicate to its own predicate, flyspell-generic-progmode-verify,

(defvar flyspell-prog-text-faces
  '(font-lock-string-face font-lock-comment-face font-lock-doc-face)
  "Faces corresponding to text in programming-mode buffers.")

(defun flyspell-generic-progmode-verify ()
  "Used for `flyspell-generic-check-word-predicate' in programming modes."
  (unless (eql (point) (point-min))
    ;; (point) is next char after the word. Must check one char before.
    (let ((f (get-text-property (1- (point)) 'face)))
      (memq f flyspell-prog-text-faces))))

As you can see, flyspell-generic-progmode-verify is very simple. If the word at point is not inside a comment or string, the predicate returns nil, which means the word is not a typo.

So in theory I can write my own predicate by following the example of flyspell-generic-progmode-verify.
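As a minimal sketch of such a custom predicate, the code below follows the same shape as flyspell-generic-progmode-verify but also handles the case where the face property is a list of faces rather than a single symbol. The names my-flyspell-text-faces, my-flyspell-face-checkable-p and my-generic-flyspell-verify are hypothetical and exist only for illustration.

```elisp
(require 'cl-lib)

(defvar my-flyspell-text-faces
  '(font-lock-string-face font-lock-comment-face font-lock-doc-face)
  "Hypothetical list of faces whose text should be spell checked.")

(defun my-flyspell-face-checkable-p (face)
  "Return non-nil if FACE is listed in `my-flyspell-text-faces'.
FACE is the value of a `face' text property: nil, a symbol, or a list."
  (cl-some (lambda (f) (memq f my-flyspell-text-faces))
           (if (listp face) face (list face))))

(defun my-generic-flyspell-verify ()
  "Return non-nil if the word before point is POSSIBLY a typo.
Return nil if the word is outside comments and strings."
  (and (> (point) (point-min))
       ;; (point) is the char after the word, so look one char back.
       (my-flyspell-face-checkable-p
        (get-text-property (1- (point)) 'face))))
```

To try it in one buffer, assign it with (setq-local flyspell-generic-check-word-predicate #'my-generic-flyspell-verify).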

But in reality it's not as simple as it seems. The predicate is written in Lisp, so it's slow. If it contains too much code, the Fly Spell process might block other actions in Emacs, and Emacs could become unresponsive while you edit text.

The solution is not to start the Fly Spell process too frequently.

flyspell-mode starts checking whenever text in the current buffer is modified.

My solution is not to turn on flyspell-mode. Instead, I manage spell checking myself using APIs from flyspell.

I only spell check when the user saves the current buffer, and the interval between two spell checks should not be less than 5 minutes. The spell check is done by calling the API flyspell-buffer.
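The save-triggered, throttled check could be sketched roughly as follows. This is only an illustration of the idea, not the actual wucuo code; all my-spell-check-* names are hypothetical, and the 5 minute interval is taken from the text above.

```elisp
(require 'flyspell)

(defvar my-spell-check-interval (* 5 60)
  "Hypothetical minimum number of seconds between two spell checks.")

(defvar-local my-spell-check-last-time nil
  "Time of the last spell check in this buffer as float seconds, or nil.")

(defun my-spell-check-due-p (now)
  "Return non-nil if a spell check is due at time NOW (float seconds)."
  (or (null my-spell-check-last-time)
      (>= (- now my-spell-check-last-time) my-spell-check-interval)))

(defun my-spell-check-on-save ()
  "Spell check the current buffer unless it was checked recently."
  (let ((now (float-time)))
    (when (my-spell-check-due-p now)
      (setq my-spell-check-last-time now)
      (flyspell-buffer))))

(defun my-spell-check-setup ()
  "Check spelling on save in the current buffer, without `flyspell-mode'."
  ;; The last argument t makes the hook buffer-local.
  (add-hook 'after-save-hook #'my-spell-check-on-save nil t))
```

Calling my-spell-check-setup from a major mode hook would enable the behavior for buffers of that mode only.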

Checking the whole buffer is still slow. Instead, we can check only the text region in the current window by calling flyspell-region. The API window-total-height returns the height of the current window in lines. So I can use the code below to get the region to check,

(let* (beg end (orig-pos (point)))
  (save-excursion
    (forward-line (- (window-total-height)))
    (setq beg (line-beginning-position))
    (goto-char orig-pos)
    (forward-line (window-total-height))
    (setq end (line-end-position)))
  (flyspell-region beg end))

I also need to respect the predicate embedded in the major mode in my own generic predicate. Since the per-mode predicate has already checked the font face, I should skip the font face check in the generic predicate if a per-mode predicate exists.

The above algorithms are implemented in wucuo. Here is how to use it,
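One way to picture this delegation is the sketch below. It is an assumption about the general shape, not the wucuo implementation: my-generic-check-word-predicate and my-generic-predicate-setup are hypothetical names, and the stock flyspell-generic-progmode-verify stands in for the font face check.

```elisp
(require 'flyspell)

(defun my-generic-check-word-predicate ()
  "Hypothetical generic predicate that defers to the per-mode predicate."
  (let ((mode-predicate (get major-mode 'flyspell-mode-predicate)))
    (if (functionp mode-predicate)
        ;; The per-mode predicate already inspects the font face,
        ;; so the generic face check is skipped.
        (funcall mode-predicate)
      ;; No per-mode predicate: fall back to the stock face check.
      (flyspell-generic-progmode-verify))))

(defun my-generic-predicate-setup ()
  "Install the generic predicate in the current buffer."
  (setq-local flyspell-generic-check-word-predicate
              #'my-generic-check-word-predicate))
```

Because flyspell-mode is not turned on in this setup, the predicate has to be installed explicitly, for example by calling my-generic-predicate-setup from a major mode hook.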

(defun prog-mode-hook-setup ()
  ;; (setq wucuo-flyspell-start-mode "lite")
  ;; (setq wucuo-flyspell-start-mode "ultra")
  (wucuo-start t))
(add-hook 'prog-mode-hook 'prog-mode-hook-setup)

If wucuo-flyspell-start-mode is "full" (the default value), flyspell-mode is enabled. In this case, wucuo is just an advanced version of flyspell-prog-mode.

If wucuo-flyspell-start-mode is "lite", flyspell-buffer is used and checking is done when the user saves the current buffer.

If wucuo-flyspell-start-mode is "ultra", flyspell-region is used and checking is done when the user saves the current buffer.
