Hardcore spell checking in Emacs

This article is not introduction of Emacs spell checking basics. It requires deep knowledge of Emacs Lisp and Fly Spell.

You could read my article What's the best spell check setup in emacs for basic knowledge.

This article introduces new techniques to make Fly Spell more powerful and faster.

The CLI programs aspell and hunspell can only parse plain text. They don't know any programming language syntax.

Fly Spell feeds the output of CLI program into its own Lisp predicate named flyspell-generic-check-word-predicate whose default value is nil.

When executing (flyspell-mode 1), the per mode predicate is assigned to flyspell-generic-check-word-predicate.

For example, you can run (get major-mode 'flyspell-mode-predicate) to get predicate of current major mode, (get 'web-mode 'flyspell-mode-predicate) to get predicate of web-mode.

The predicate is a simple function without parameter. Here is my predicate for web-mode,

(defun my-web-mode-flyspell-verify ()
  "Fly Spell predicate of `web-mode`."
  (let* ((font-face-at-point (get-text-property (- (point) 1) 'face))
         rlt)
    ;; If rlt is t, the word at point is POSSIBLY a typo, continue checking.
    (setq rlt t)
    ;; if rlt is nil, the word at point is definitely NOT a typo.
    ;; (setq rlt nil)
    rlt))
;; Attach my predicate to `web-mode`
(put 'web-mode 'flyspell-mode-predicate 'my-web-mode-flyspell-verify)

If you read code of flyspell-prog-mode, you will find it set flyspell-generic-check-word-predicate to its own predicate flyspell-generic-progmode-verify,

(defvar flyspell-prog-text-faces
  '(font-lock-string-face font-lock-comment-face font-lock-doc-face)
  "Faces corresponding to text in programming-mode buffers.")

(defun flyspell-generic-progmode-verify ()
  "Used for `flyspell-generic-check-word-predicate' in programming modes."
  (unless (eql (point) (point-min))
    ;; (point) is next char after the word. Must check one char before.
    (let ((f (get-text-property (1- (point)) 'face)))
      (memq f flyspell-prog-text-faces))))

As you can see, flyspell-generic-progmode-verify is very simple. If the word at point is not inside comment or string, the predicate returns nil which means the word is not a typo.

So in theory I can write my own predicate by following flyspell-generic-progmode-verify.

But in reality it's not as simple as it seems. The predicate is written in Lisp so it's slow. If it contains too much code, Fly Spell process might block other actions in Emacs. Emacs could be un-responsive when editing text.

The solution is not to start Fly Spell process too frequently.

The flyspell-mode starts checking when text in current buffer is modified.

My solution is not to turn on flyspell-mode. Instead, I manage the spell checking by myself using APIs from flyspell.

I only spell check when user saving current buffer. The interval between spell check should not be less than 5 minutes. Spell check is done by calling API flyspell-buffer

Checking the whole buffer is still slow. Instead, we can check the text region in current window by calling flyspell-region instead. The api window-total-height returns the height of current Windows. So I can use below code to get the region to check,

(let* (beg end (orig-pos (point)))
  (save-excursion
    (forward-line (- (window-total-height)))
    (setq beg (line-beginning-position))
    (goto-char orig-pos)
    (forward-line (window-total-height))
    (setq end (line-end-position)))
  (flyspell-region beg end))

I also need respect the predicate embedded in the major mode in my own generic predicate. Since per mode predicate has already checked the font face, I should skip the font face check in generic predicate if per mode predicate exists.

Above algorithms are implemented in wucuo.

Usage,

(add-hook 'prog-mode-hook 'wucuo-start)
(add-hook 'text-mode-hook 'wucuo-start)

If wucuo-flyspell-start-mode is "fast" (default value), flyspell-region is used, visible region is checked when user saves current file.

If wucuo-flyspell-start-mode is "normal", flyspell-buffer is used, current buffer is checked when user saves current file.

Audio recording on Linux

  • Run sudo alsamixer and turn off mic to reduce the noise
  • Run alsamixer to double check pulse setup
  • Make sure correct device is selected in audacity
  • Restart audacity and test

My alsamixer setup, alsamixer-nq8.png

Use Magit API to rebase to closest branch

My workflow in Git is,

  • Create a new feature branch based on main branch
  • Add some small commits into feature branch
  • Rebase feature branch interactively

The final rebase step happens a lot.

So I could use Magit api magit-rebase-interactive to speed up it.

The key is to analyze output of git log --decorate --oneline to find the main branch commit.

Code,

(defun my-git-extract-based (target)
  "Extract based version from TARGET."
  (replace-regexp-in-string "^tag: +"
                            ""
                            (car (nreverse (split-string target ", +")))))

(defun my-git-rebase-interactive (&optional user-select-branch)
  "Rebase interactively on the closest branch or tag in git log output.
If USER-SELECT-BRANCH is not nil, rebase on the tag or branch selected by user."
  (interactive "P")
  (let* ((log-output (shell-command-to-string "git --no-pager log --decorate --oneline -n 1024"))
         (lines (split-string log-output "\n"))
         (targets (delq nil
                        (mapcar (lambda (e)
                                  (when (and (string-match "^[a-z0-9]+ (\\([^()]+\\)) " e)
                                             (not (string-match "^[a-z0-9]+ (HEAD " e)))
                                    (match-string 1 e))) lines)))
         based)
    (cond
     ((or (not targets) (eq (length targets) 0))
      (message "No tag or branch is found to base on."))
     ((or (not user-select-branch)) (eq (length targets) 1)
      ;; select the closest/only tag or branch
      (setq based (my-git-extract-based (nth 0 targets))))
     (t
      ;; select the one tag or branch
      (setq based (my-git-extract-based (completing-read "Select based: " targets)))))

    ;; start git rebase
    (when based
      (magit-rebase-interactive based nil))))

Screencast:

magit-rebase-api.gif

Make Emacs faster than Vim in "git mergetool"

My article Emacs is the best merge tool for Git explains how to combine git mergetool with ediff-mode in Emacs.

Harrison McCullough suggested the work flow can be faster if emacs is replaced with emacsclient.

I did some research and found a perfect solution. It's even faster than Vim.

Initial solution

Please note emacsclient is only use for resolving conflicts.

Step 1, start emacs server by running emacs -Q --daemon --eval "(setq startup-now t)" -l "/home/my-username/.emacs.d/init.el" --eval "(progn (require 'server) (server-start))" in shell.

Step 2, insert below code into ~/.emacs.d/init.el (see the comment why this advice is required):

(defadvice server-save-buffers-kill-terminal (after server-save-buffers-kill-terminal-after-hack activate)
  ;; kill all buffers, so new ediff panel is re-created and `ediff-startup-hook-setup' is called again
  ;; besides, remove the buffers whose binding files are alredy merged in `buffer-list'
  (mapc 'kill-buffer (buffer-list)))

Step 3, insert below code into ~/.gitconfig:

[mergetool.ediff]
cmd = emacsclient -nw --eval \"(progn (setq ediff-quit-hook 'kill-emacs) (if (file-readable-p \\\"$BASE\\\") (ediff-merge-files-with-ancestor \\\"$LOCAL\\\" \\\"$REMOTE\\\" \\\"$BASE\\\" nil \\\"$MERGED\\\") (ediff-merge-files \\\"$LOCAL\\\" \\\"$REMOTE\\\" nil \\\"$MERGED\\\")))\"

My real world solution

It's similar to initial solution. But some scripts are created for automation.

Step 1, read Using Emacs as a Server in the manual and create ~/.config/systemd/user/emacs.service for Systemd:

[Unit]
Description=Emacs text editor
Documentation=info:emacs man:emacs(1) https://gnu.org/software/emacs/

[Service]
Type=forking
ExecStart=emacs -Q --daemon --eval "(setq startup-now t)" -l "/home/my-username/.emacs.d/init.el" --eval "(progn (require 'server) (server-start))" 
ExecStop=emacsclient --eval "(kill-emacs)"
Environment=SSH_AUTH_SOCK=%t/keyring/ssh
Restart=on-failure

[Install]
WantedBy=default.target

Step 2, set up in ~/.gitconfig:

[mergetool.emacs]
    cmd = ediff.sh "$LOCAL" "$REMOTE" "$BASE" "$MERGED"
[mergetool.emacsclient]
    cmd = MYEMACSCLIENT=emacsclient ediff.sh "$LOCAL" "$REMOTE" "$BASE" "$MERGED"

Step 3, create ediff.sh:

#!/bin/sh
[ -z "$MYEMACSCLIENT" ] && MYEMACSCLIENT="emacs"
# emacsclient won't work in git mergetool
# $1=$LOCAL $2=$REMOTE $3=$BASE $4=$MERGED
if [ "$MYEMACSCLIENT" = "emacs" ]; then
    $MYEMACSCLIENT -nw -Q --eval "(setq startup-now t)" -l "$HOME/.emacs.d/init.el" --eval "(progn (setq ediff-quit-hook 'kill-emacs) (if (file-readable-p \"$3\") (ediff-merge-files-with-ancestor \"$1\" \"$2\" \"$3\" nil \"$4\") (ediff-merge-files \"$1\" \"$2\" nil \"$4\")))"
else
    $MYEMACSCLIENT -nw --eval "(progn (setq ediff-quit-hook 'kill-emacs) (if (file-readable-p \"$3\") (ediff-merge-files-with-ancestor \"$1\" \"$2\" \"$3\" nil \"$4\") (ediff-merge-files \"$1\" \"$2\" nil \"$4\")))"
fi

Step 4, run git mergetool -t emacsclient to resolve conflicts.

My init-ediff.el in emacs.d.

Thoughts on "Native shell completion in Emacs"

Native shell completion in Emacs by Troy Hinckley is must read for completion in shell-mode.

One problem is my ~/.bashrc executes /etc/bash_completion,

if [ -f /etc/bash_completion ]; then
    . /etc/bash_completion
fi

Unfortunately /etc/bash_completion makes complete -p output some lines the Emacs function bash-completion-tokenize can't analyze.

Here is output of complete -p at my PC,

...
complete -F _known_hosts mtr
complete -o default -o nospace -W 'homepc
192.168.1.104
github.com
gitlab.com' scp
complete -o default -f -X '!*.dvi' dvipdf
...

The line gitlab.com' scp will crash bash-completion-tokenize. Obviously, one line complete -o default -o nospace -W 'homepc 192.168.1.104 github.com gitlab.com' scp is wrongly split into multiple lines by complete -p.

In shell-mode, completion functions might call bash-completion-tokenize. If bash-completion-tokenize crashes, the completion in shell-mode won't work.

Besides, if company-mode provides auto-completion UI, it's better to place the backend company-files before company-native-complete. It's because the backend company-files displays the full file path in candidates. So users can complete the whole path in one shot.

My setup code for the packages Troy Hinckley suggested,

;; Enable auto-completion in `shell'.
(with-eval-after-load 'shell
  (native-complete-setup-bash))

;; `bash-completion-tokenize' can handle garbage output of "complete -p"
(defadvice bash-completion-tokenize (around bash-completion-tokenize-hack activate)
  (let* ((args (ad-get-args 0))
         (beg (nth 0 args))
         (end (nth 1 args)))
    ;; original code extracts tokens from output of "complete -p" line by line
    (cond
     ((not (string-match-p "^complete " (buffer-substring beg end)))
      ;; filter out some wierd lines
      (setq ad-return-value nil))
     (t
      ad-do-it))))

(defun shell-mode-hook-setup ()
  "Set up `shell-mode'."
  ;; hook `completion-at-point', optional
  (add-hook 'completion-at-point-functions #'native-complete-at-point nil t)
  (setq-local company-backends '((company-files company-native-complete)))
  ;; `company-native-complete' is better than `completion-at-point'
  (local-set-key (kbd "TAB") 'company-complete))
(add-hook 'shell-mode-hook 'shell-mode-hook-setup)

Screenshot,

shell-complete-path-nq8.png

shell-complete-param-nq8.png

How to speed up lsp-mode

Here is my setup,

(with-eval-after-load 'lsp-mode
  ;; enable log only for debug
  (setq lsp-log-io nil)

  ;; use `evil-matchit' instead
  (setq lsp-enable-folding nil)

  ;; no real time syntax check
  (setq lsp-diagnostic-package :none)

  ;; handle yasnippet by myself
  (setq lsp-enable-snippet nil)

  ;; use `company-ctags' only.
  ;; Please note `company-lsp' is automatically enabled if installed
  (setq lsp-enable-completion-at-point nil)

  ;; turn off for better performance
  (setq lsp-enable-symbol-highlighting nil)

  ;; use ffip instead
  (setq lsp-enable-links nil)

  ;; auto restart lsp
  (setq lsp-restart 'auto-restart)

  ;; @see https://github.com/emacs-lsp/lsp-mode/pull/1498 and code related to auto configure.
  ;; Require clients could be slow.
  ;; I only load `lsp-clients' because it includes the js client which I'm interested
  (setq lsp-client-packages '(lsp-clients))

  ;; don't scan 3rd party javascript libraries
  (push "[/\\\\][^/\\\\]*\\.\\(json\\|html\\|jade\\)$" lsp-file-watch-ignored) ; json

  ;; don't ping LSP lanaguage server too frequently
  (defvar lsp-on-touch-time 0)
  (defadvice lsp-on-change (around lsp-on-change-hack activate)
    ;; don't run `lsp-on-change' too frequently
    (when (> (- (float-time (current-time))
                lsp-on-touch-time) 30) ;; 30 seconds
      (setq lsp-on-touch-time (float-time (current-time)))
      ad-do-it)))

(defun my-connect-lsp (&optional no-reconnect)
  "Connect lsp server.  If NO-RECONNECT is t, don't shutdown existing lsp connection."
  (interactive "P")
  (when (and (not no-reconnect)
             (fboundp 'lsp-disconnect))
    (lsp-disconnect))
  (when (and buffer-file-name
             (not (member (file-name-extension buffer-file-name)
                          '("json"))))
    (unless (and (boundp 'lsp-mode) lsp-mode)
      (if (derived-mode-p 'js2-mode) (setq-local lsp-enable-imenu nil))
      (lsp-deferred))))

To enable lsp for the major mode XXX-mode needs only one line,

(add-hook 'XXX-mode-hook #'my-connect-lsp)

You also need install three packages,

Explanation,

Ctags is used to generate tags file for company-ctags and counsel-etags. GNU Find is required for find-file-in-project.

These three packages are faster and can replace the corresponding functionalities in lsp-mode.

I don't need any lint tools from lsp-mode because the lint tool is already included in our build script. I can see the syntax error from terminal.

I advice the lsp-on-change in order to notify the language server less frequently.

js2-mode has its own javascript parser extract imenu items. So I don't need javascript language server's parser to send back imenu items.

By default lsp-client-packages contains many clients, but I only code in javascript which is included in lsp-clients.

Here is code quoted from lsp-mode,

;;;###autoload
(defun lsp (&optional arg)
  ;; ...
  (when (and lsp-auto-configure)
    (seq-do (lambda (package) (require package nil t))
            lsp-client-packages))
  ;; ...
)

I have done some profiling by insert (profiler-report-cpu) at the end of lsp (the bottlenecks is highlighted).

lsp-mode-bottleneck-nq8.png

The language server I used can read jsconfig.json in project root. I can specify the directories to exclude in it.

Yin and Yang in Emacs

As a Chinese, I studied Tao Te Ching since childhood. So I believe Tao (the way) exist in Emacs. Tao is basically Yin and Yang who lives in harmony in Emacs.

I can't say Yin is good and Yang is evil, or vice versa. All I can do is to find the way to make Yin and Yang co-exist.

For example, a few days ago I published Effective "git blame" in Emacs which introduced my package vc-msg.

It became one of my most popular reddit post because its unique feature partial line blame,

vc-msg-good.png

vc-msg-bad.png

I noticed some comments compared my package with Magit. Those comments were very educational and I did learn a few useful tricks.

My point is, vc-msg and Magit could collaborate without any problem, like Yin and Yang lives harmony. If you find any conflict between vc-msg and Magit, just let me know. I will fix it.

I totally understand there are many Magit lovers in Emacs community. So I make vs-msg v1.0.2 to support Magit. You can use partial line blame in vc-msg but calling Magit command to open the commit.

It's only one line setup,

(setq vc-msg-git-show-commit-function 'magit-show-commit)

vc-msg-and-magit.png

I tested in magit-blame-mode and found no issue.

I'm sure vc-msg should work in other major modes or minor modes. There are also two callback functions vc-msg-get-current-file-function and vc-msg-get-line-num-function which users can customize.

Effective "git blame" in Emacs

I published Emacs package vc-msg. It uses git-blame to show commit information of current line.

In the new version, it can display the correct commit information of current line.

For example, the line 6 at https://github.com/redguardtoo/test-git-blame/blob/master/hello.js is changed by three commits.

Select the partial of line 6 and run vc-msg-show, the correct commit is displayed.

Screenshots,

vc-msg-neutral.png

vc-msg-good.png

vc-msg-bad.png

Emacs is the best merge tool for Git

CREATED: <2019-11-13 Wed>

UPDATED: <2020-04-10 Fri> if you use my solution, you can replace emacs with emacsclient. So it's even faster than Vim.

I used to regard vimdiff as the best merge tool for Git because it's simply fast.

Here is the demo how I use vimdiff to resolve conflicts from https://github.com/redguardtoo/test-git-mergetool.

vimdiff-as-git-merge-tool.gif

Please note in the screencast I use Git built in command mergetool. It will automatically open conflicted file one by one using vim. In other software, the developer need manually select and open the conflicted file.

The only issue is Vim is not as powerful as Emacs.

Resolving conflicts is NOT only picking up a diff hunk from remote/local buffer. I often need place my hunk into merged buffer first, then I go to remote buffer and copy some snippet into merged buffer. So there are lots of sub-window operations.

In Emacs, I use Ace-window and Winum to move focus between sub-windows. I also use API window-configuration-to-register and jump-to-register to save/load windows layout. Besides, Ediff is a beast to handle diff and patch.

So I give one example to prove why Emacs should be a better merge tool in theory. If you are good at both Vim and Emacs, you know it's the truth.

Now let's talk the real world problem. And I will show you a perfect solution soon.

The problem is, I never use Emacs to resolve merge conflicts for two reasons:

  • First, My Emacs configuration uses too many packages. It starts up slowly. As you can see from vimdiff demo, git mergetool restarts the editor many times. So the editor should be lightweight.
  • Second, the UI of ediff is not right. UI of Vimdiff is much better. All operations should be completed freely in any sub-window instead of ediff control panel only.

Luckily, Emacs gives me the full freedom to solve the problem. The final result is beyond my expectation.

Here is the complete solution.

This technique is only useful for git mergetool because git will open and close the text editor Emacs many times.

Insert below code into ~/.gitconfig,

[mergetool.ediff]
# use git mergetool ediff to resolve conflicts
cmd = emacs -nw -Q --eval \"(setq startup-now t)\" -l \"~/.emacs.d/init.el\" --eval \"(progn (setq ediff-quit-hook 'kill-emacs) (if (file-readable-p \\\"$BASE\\\") (ediff-merge-files-with-ancestor \\\"$LOCAL\\\" \\\"$REMOTE\\\" \\\"$BASE\\\" nil \\\"$MERGED\\\") (ediff-merge-files \\\"$LOCAL\\\" \\\"$REMOTE\\\" nil \\\"$MERGED\\\")))\"

In above code, option -Q equals -q --no-site-file --no-splash. Actually, only -q is critical. -q means "Do not load an init file". A global emacs lisp flag startup-now is defined before loading ~/.emacs.d/init.el.

Then in ~/.emacs.d/init.el, I need only add one line,

(when (not (boundp 'startup-now))
  ;; heavy weight configuration happens here
  )

When startup-now is defined, all the heavyweight configuration should be off. Considering in this scenario, we are using Emacs only as merge tool, 99% configuration could be turned off. For example, set up for any programming language is not required. Flyspell and flycheck should be off. Yasnippet is also useless.

I only need focus on essential operations related to text/file/window.

Evil should be used. At the beginning of this article, I said "I love vimdiff because it's fast". It's impossible to be more efficient without Evil.

Any patch/diff utilities should be included too. counsel/swiper/ivy is also must have because I can use counsel-git to find file and counsel-git-grep to grep text.

Native Emacs API is enough to save/load windows layout.

Packages dependent on ediff (Magit?) could also benefit from optimization of ediff.

The optimization is simple. Do everything in merged buffer.

First I move focus into merged buffer when Emacs starts up,

This set up happens in ediff-startup-hook,

(defun ediff-startup-hook-setup ()
  ;; hide control panel if it's current buffer
  (when (string-match-p (setq my-ediff-panel-name (buffer-name))
                        "\*Ediff Control Panel.*\*")
    ;; move to the first difference
    (ediff-next-difference)
    ;; move to the merged buffer window
    (winum-select-window-by-number 3)
    ;; save the windows layout
    (window-configuration-to-register ?a)))

(add-hook 'ediff-startup-hook 'ediff-startup-hook-setup)

Please note I use winum-select-window-by-number from winum move focus to merged buffer. You can use any other third party package or native API select-window instead.

Saving initial windows layout into register a is achieved by (window-configuration-to-register ?a) in ediff-startup-hook. (jump-to-register ?a) restores the saved layout.

Then we need make sure ediff commands can be used out of ediff's panel. Currently ediff command can only be triggered inside of its panel.

The trick is "move focus into ediff panel temporarily to execute its commands, then move focus back to original window".

So I designed a macro my-ediff-command to do this,

(defmacro my-ediff-command (cmd &optional no-arg)
  `(lambda (&optional arg)
     (interactive "P")
     (let* ((w (get-buffer-window)))
       ;; go to panel window
       (select-window (get-buffer-window my-ediff-panel-name))
       ;; execute ediff command, ignore any error
       (condition-case e
           (if ,no-arg (funcall ,cmd) (funcall ,cmd arg))
         (error
          (message "%s" (error-message-string e))))
       ;; back to original window
       (select-window w))))

Usage is simple,

(global-set-key (kbd "C-c C-y") (my-ediff-command 'ediff-next-difference))

Here is the list of essential ediff commands,

  • ediff-next-difference
  • ediff-previous-difference
  • ediff-restore-diff-in-merge-buffer
  • ediff-revert-buffers-then-recompute-diffs
  • ediff-copy-A-to-C
  • ediff-copy-A-to-C
  • ediff-copy-both-to-C

You can use Hyra or General.el to assign key bindings.

The definition of ediff-copy-both-to-C,

;; @see https://stackoverflow.com/a/29757750/245363
(defun ediff-copy-both-to-C (&optional arg)
  "Copy code from both A and B to C."
  (interactive)
  (ediff-copy-diff ediff-current-difference nil 'C nil
                   (concat
                    (ediff-get-region-contents ediff-current-difference 'A ediff-control-buffer)
                    (ediff-get-region-contents ediff-current-difference 'B ediff-control-buffer))))

Here is my ~/.gitconfig and my ediff set up in real world.

Please note the techniques introduced here can be used with other VCS (subversion, perforce …).

Demo on using Emacs to resolve merge conflicts,

emacs-as-git-merge-tool.gif

counsel-etags 1.9.0 is out

Counsel-etags is fast, energy-saving, and powerful code navigation solution.

This version can list tags in current buffer.

You can simply run M-x counsel-etags-list-tag-in-current-file.

Or set up imenu before M-x imenu or M-x helm-imenu or M-x counsel-imenu,

(setq imenu-create-index-function 'counsel-etags-imenu-default-create-index-function)

screenshot:

counsel-etags-imenu.png