Chinese Search in Electron

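This article is a quick demonstration of how to implement Chinese search in Electron using js-search and nodejieba (jieba).

It is fast and works in real time: the index lives in memory and tokenization runs in a native addon, so results show up almost as soon as you type.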

tech        version
electron    30.0.6
nodejieba   2.6.0
js-search   2.0.1

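It also walks you through a series of problems you may run into when using an npm mirror and nodejieba from mainland China:

  1. The nodejieba package on npmmirror is missing or cannot be downloaded
  2. nodejieba is unmaintained and does not build on Windows 11 with Visual Studio 2022
  3. nodejieba does not support TypeScript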

Add dependencies

npm i js-search
npm i nodejieba@2.6.0 --save-optional --ignore-scripts

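Why install nodejieba this way? nodejieba is written in C++, and its community is no longer active. Its install script, which compiles the addon, fails. We need to skip that script and compile it ourselves.

You need to install Visual Studio 2022 and select the "Desktop development with C++" workload.

Alternatively, use the following PowerShell commands to install only the required components: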

Invoke-WebRequest -Uri 'https://aka.ms/vs/17/release/vs_BuildTools.exe' -OutFile "$env:TEMP\vs_BuildTools.exe"

& "$env:TEMP\vs_BuildTools.exe" --passive --wait --add Microsoft.VisualStudio.Workload.VCTools --includeRecommended

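Fixing nodejieba

nodejieba does not compile under the C++17 standard, but the fix is simple.

Before compiling it, you only need to replace nodejieba's copy of StringUtil.hpp with the latest StringUtil.hpp from github.com/yanyiwu/limonp (saved locally as StringUtil_latest.hpp).

Here is a sample script.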

const fs = require('fs');
const path = require('path');

// The project root is the parent of the folder that contains this script.
const projectDir = path.dirname(path.resolve(__dirname));

// Save the latest StringUtil.hpp from limonp somewhere locally, e.g. SOME_FOLDER/StringUtil_latest.hpp
const patchFile = path.resolve(projectDir, 'SOME_FOLDER', 'StringUtil_latest.hpp');

const dest = path.resolve(projectDir, 'node_modules', 'nodejieba', 'deps', 'limonp', 'StringUtil.hpp');

// First install nodejieba with `npm install nodejieba@2.6.0 --ignore-scripts`
// https://github.com/yanyiwu/limonp/issues/34
fs.copyFile(patchFile, dest, (err) => {
  if (err) {
    // console.error returns undefined, so it cannot be chained with && to reach process.exit
    console.error(err);
    process.exit(1);
  }
});

limonp-StringUtil.hpp

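You can also submit this fix to the nodejieba repository. I hope open-source projects from China can be seen through to the end and find successors to carry them on.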

Modify package.json

We still want nodejieba to be picked up by electron-rebuild when packaging.

"scripts": {
    "preinstall": "npm i nodejieba@2.6.0 --save-optional --ignore-scripts",
    "build:plugin": "electron-rebuild -f",

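electron-rebuild takes care of what you would otherwise do by hand with node-gyp: it rebuilds native modules against the Node.js version bundled with your Electron.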

electron-rebuild

Write a utility class for nodejieba

Copy nodejieba's dictionary files

Assuming you use Electron Builder, the following configuration copies node_modules/nodejieba/dict/ to the root of the installation directory.

"build": {
    "extraFiles": [
      {
        "from": "node_modules/nodejieba/dict/",
        "to": "dict/"
      }
    ],

Do not change any line of the code below.

A utility for loading a local Node addon

import fs from "fs";
import path from "path";
import * as process from "process";
import bindings from "bindings";
// eslint-disable-next-line import/no-extraneous-dependencies
import logger from "_main/logger";
import nconsole from "_rutils/nconsole";
import { dev } from '_utils/node-env';

function loadAddon(pluginName: string) {
  logger.log("preloading plugin");
  let moduleRoot = path.dirname(process.execPath);
  let tries = [["module_root", "bindings"]];
  if (dev) {
    moduleRoot = process.cwd();
    tries = [["module_root", "build", "bindings"]];
    if (!fs.existsSync(path.join(moduleRoot, "build", pluginName + ".node"))) {
      tries = [["module_root", "bindings"]];
    }
  }
  logger.log("using tries: " + JSON.stringify(tries));
  let nodeAddon;
  try {
    nodeAddon = bindings({
      bindings: pluginName,
      module_root: moduleRoot,
      try: tries,
    });
  } catch (e) {
    logger.error(e);
  }
  return nodeAddon;
}

export default loadAddon;

Loading the nodejieba addon

import path from "path";
import loadAddon from './load_node_addon';

// "fastx" is the addon name used here; it must match the name of the .node file your build produces.
const jbAddon = loadAddon("fastx");

// In development the dictionaries are read from node_modules; otherwise from the working directory.
let dictDirRoot = process.cwd();
if (process.env.NODE_ENV === 'development') {
  dictDirRoot = path.resolve(process.cwd(), 'node_modules', 'nodejieba');
}

let isDictLoaded = false;

const defaultDict = {
  dict: `${dictDirRoot}/dict/jieba.dict.utf8`,
  hmmDict: `${dictDirRoot}/dict/hmm_model.utf8`,
  userDict: `${dictDirRoot}/dict/user.dict.utf8`,
  idfDict: `${dictDirRoot}/dict/idf.utf8`,
  stopWordDict: `${dictDirRoot}/dict/stop_words.utf8`,
};

interface LoadOptions {
  dict?: string;
  hmmDict?: string;
  userDict?: string;
  idfDict?: string;
  stopWordDict?: string;
}

export const load = (dictJson?: LoadOptions) => {
  const finalDictJson = {
    ...defaultDict,
    ...dictJson,
  };
  isDictLoaded = true;
  return jbAddon.load(
    finalDictJson.dict,
    finalDictJson.hmmDict,
    finalDictJson.userDict,
    finalDictJson.idfDict,
    finalDictJson.stopWordDict,
  );
};

export const DEFAULT_DICT = defaultDict.dict;
export const DEFAULT_HMM_DICT = defaultDict.hmmDict;
export const DEFAULT_USER_DICT = defaultDict.userDict;
export const DEFAULT_IDF_DICT = defaultDict.idfDict;
export const DEFAULT_STOP_WORD_DICT = defaultDict.stopWordDict;

export interface TagResult {
  word: string;
  tag: string;
}

export interface ExtractResult {
  word: string;
  weight: number;
}

// Load the default dictionaries on first use, then forward the call to the addon.
const mustLoadDict = (f: any, ...args: any[]):any => {
  if (!isDictLoaded) {
    load();
  }
  return f.apply(undefined, [...args]);
};

export const cut = (content: string, strict: boolean): string[] => mustLoadDict(jbAddon.cut, content, strict);
export const cutAll = (content: string): string[] => mustLoadDict(jbAddon.cutAll, content);
export const cutForSearch = (content: string, strict: boolean): string[] => mustLoadDict(jbAddon.cutForSearch, content, strict);
export const cutHMM = (content: string): string[] => mustLoadDict(jbAddon.cutHMM, content);
export const cutSmall = (content: string, limit: number): string[] => mustLoadDict(jbAddon.cutSmall, content, limit);
export const extract = (content: string, threshold: number): ExtractResult[] => mustLoadDict(jbAddon.extract, content, threshold);
export const textRankExtract = (content: string, threshold: number): ExtractResult[] => mustLoadDict(jbAddon.textRankExtract, content, threshold);
export const insertWord = (word: string): boolean => mustLoadDict(jbAddon.insertWord, word);
export const tag = (content: string): TagResult[] => mustLoadDict(jbAddon.tag, content);

export default {
  load,
  cut,
  cutAll,
  cutForSearch,
  cutHMM,
  cutSmall,
  extract,
  textRankExtract,
  insertWord,
  tag,
  DEFAULT_DICT,
  DEFAULT_HMM_DICT,
  DEFAULT_USER_DICT,
  DEFAULT_IDF_DICT,
  DEFAULT_STOP_WORD_DICT,
};

You should expose this utility to the renderer process through window; the renderer can then call these methods, for example window.myAddons.cutForSearch.
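One possible way to do the exposing is from a preload script via Electron's contextBridge. The sketch below is an assumption, not the original project's code: `exposeAddons`, `Bridge` and `SegmenterApi` are hypothetical names, and `Bridge` only mirrors the `exposeInMainWorld(apiKey, api)` method of contextBridge so the snippet stays self-contained. It also assumes the preload script is allowed to load Node modules (for example with `sandbox: false`), since a native addon cannot be required from a sandboxed preload.

```typescript
// Minimal shape of Electron's contextBridge that this sketch relies on.
interface Bridge {
  exposeInMainWorld(apiKey: string, api: unknown): void;
}

// The subset of the nodejieba utility that the renderer needs (hypothetical name).
interface SegmenterApi {
  cutForSearch(content: string, strict: boolean): string[];
}

// Expose only plain wrapper functions under window.myAddons,
// instead of handing the whole addon object to the renderer.
function exposeAddons(bridge: Bridge, segmenter: SegmenterApi): void {
  bridge.exposeInMainWorld("myAddons", {
    cutForSearch: (content: string, strict: boolean): string[] =>
      segmenter.cutForSearch(content, strict),
  });
}
```

In a real preload script you would pass `contextBridge` imported from `electron` and the utility module defined earlier, along the lines of `exposeAddons(contextBridge, jieba)`.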

Combining js-search and nodejieba

Suppose you want to search objects like this.

export interface Product {
  [key: string]: any;

  productCode: string;
  name: string;
  namePinyin: string;
  nameEnglish: string;
}
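js-search accepts any object with a `tokenize(text)` method as its tokenizer, which is the hook that lets a Chinese segmenter plug in. As a minimal sketch, assuming the segmenter may return whitespace-only or empty tokens, the wrapper below drops them before they reach the index. `makeJiebaTokenizer`, `Tokenizer` and `CutFn` are hypothetical names introduced for illustration.

```typescript
// The shape js-search expects from a custom tokenizer.
interface Tokenizer {
  tokenize(text: string): string[];
}

// Signature of a segmenter function such as cutForSearch (hypothetical alias).
type CutFn = (text: string, strict: boolean) => string[];

// Wrap a segmenter so that empty and whitespace-only tokens are filtered out.
function makeJiebaTokenizer(cutForSearch: CutFn): Tokenizer {
  return {
    tokenize(text: string): string[] {
      return cutForSearch(text, true).filter((token) => token.trim().length > 0);
    },
  };
}
```

With the exposed addon this could be used as `jsSearch.tokenizer = makeJiebaTokenizer(window.myAddons.cutForSearch)`. Keeping the segmenter as a parameter also makes it easy to swap in a stub when the native addon is unavailable.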

In your search component, write:

import React, { useEffect } from 'react';
import * as JsSearch from 'js-search';
import { Search } from 'js-search';

const [search, setSearch] = React.useState<string>("");
const jsSearchGames = React.useRef<Search>();
const [omnisearch_games, setOmnisearchGames] = React.useState<any[]>([]);
const [omnisearch_loading, setOmnisearchLoading] = React.useState(false);

// ... 

// When the page loads, build the search index and its data
useEffect(() => {
  const buildJsSearch = (uidField: string, documents: any[], ...index: string[]) => {
    const jsSearch = new JsSearch.Search(uidField);
    jsSearch.tokenizer = {
      tokenize: (text) => {
        const r = window.myAddons.cutForSearch(text, true); // cutForSearch is the method from the utility above
        return r;
      },
    };
    index.forEach((i) => jsSearch.addIndex(i));
    jsSearch.addDocuments(documents);
    return jsSearch;
  };

  // p is the list of Product documents, defined in the elided part of the component
  jsSearchGames.current = buildJsSearch('productCode', p, 'productCode', 'name', 'namePinyin', 'nameEnglish');
}, []);

// Start searching once characters are typed into the search box
useEffect((): (() => void) | void => {
  if (!search) {
    return;
  }
  const q = search.trim();
  if (!q) {
    return;
  }
  setOmnisearchGames([]);
  setOmnisearchLoading(true);
  // Cancel the previous search if its 200 ms delay has not elapsed yet
  if (currentSearchId.current) {
    clearTimeout(currentSearchId.current);
  }
  const doSearch = async () => new Promise<searchResult>((resolve, reject) => {
    // Start searching only after a 200 ms delay
    currentSearchId.current = setTimeout(() => {
      const result = {
        sitemap: match_sitemap(q),
        games: jsSearchGames.current?.search(q) as Product[],
        gamesPrecisely: jsSearchGamesPrecisely.current?.search(q) as Product[],
        orders: jsSearchOrders.current?.search(q) as Order[],
        news: jsSearchNews.current?.search(q) as NotificationType[],
        tags: jsSearchTags.current?.search(q) as Tags[],
      };
      resolve(result);
    }, 200);
  });
  doSearch().then((d) => {
    setOmnisearchGames(d.games.filter((p) => p.type !== Constants.API_TYPE_PRODUCT && p.type !== Constants.API_TYPE_GAMEBOX_APP));
    if (d.games.length === 0 && q.length >= 2 && q.indexOf("'") < 0) {
      // Drop pending "save record" requests whose key is a prefix of the current query
      Object.keys(requests_in_flght.current).forEach((k) => {
        if (q.indexOf(k) === 0) {
          clearTimeout(requests_in_flght.current[k]);
          delete requests_in_flght.current[k];
        }
      });
      // Truncate q to at most 32 characters before saving it
      requests_in_flght.current[q] = setTimeout(() => {
        post("/saveRecord", {
          searchString: q.substring(0, 32),
        }).catch(() => {
        });
      }, 5000);
    }
  })
    .catch(openError)
    .finally(() => setOmnisearchLoading(false));
}, [search]);


return (
  <div className="OmniSearch-container">
    {inputElement()}
    {(search_focus || omniMouseOver || null) && search && (
      <aside className="OmniSearch-results-container">
        {(omnisearch_loading || null) && <div className="loading">Loading</div>}
        {((!omnisearch_loading && omnisearch_result_count === 0) || null) && (
          <div className="no-results">
            No results
          </div>
        )}
        {(omnisearch_games.length || null) && (
          <div className="results">
            <h3>Games</h3>
            {omnisearch_games.map((e) => (
              <div className="result" key={e.productCode}>
                <Link to={`/productDetail/${e.type}/${e.productCode}`}>{e.name}</Link>
              </div>
            ))}
          </div>
        )}
      </aside>
    )}
  </div>
);
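The component above delays each search by 200 ms and cancels the previous timer when the query changes. That pattern can be pulled out into a small helper, sketched here under the assumption that you want it independent of React. `createDebouncer`, `Timers` and `Debouncer` are hypothetical names, not part of the original component, and the timer functions are passed in so the helper works with either browser or Node timers.

```typescript
// Timer functions are injected; H is whatever handle type they use.
interface Timers<H> {
  set: (fn: () => void, ms: number) => H;
  clear: (handle: H) => void;
}

interface Debouncer<T> {
  schedule(query: T): void;
  cancel(): void;
}

// Run `run(query)` only after `delayMs` has passed without a newer query.
function createDebouncer<T, H>(
  run: (query: T) => void,
  delayMs: number,
  timers: Timers<H>,
): Debouncer<T> {
  let pending: H | undefined;

  const cancel = (): void => {
    if (pending !== undefined) {
      timers.clear(pending);
      pending = undefined;
    }
  };

  const schedule = (query: T): void => {
    cancel(); // a newer query replaces the one still waiting
    pending = timers.set(() => {
      pending = undefined;
      run(query);
    }, delayMs);
  };

  return { schedule, cancel };
}
```

In the component this might be created once with `{ set: setTimeout, clear: clearTimeout }` and a 200 ms delay, with `cancel` returned from the effect as its cleanup so that no search fires after the component unmounts.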

Done

That's it. Following this approach, you can build a search experience like the one shown below.