dictionary appendix to MMSEG


appendix to the mmseg Chinese word segmentation tool
http://sites.google.com/site/sekewei/home/download/dictionary.zip
---
readme.txt

filelist:
getdic.c source to retrieve dictionary in use
getdic.exe binary to retrieve dictionary in use
gcc -o getdic -I../src getdic.c ../src/lexicon.c

mmsegmnt.c source to segment a chunk in simple mode
getchunk.c source to get a chunk for segmenting
mmseg.exe binary to segment words
gcc -o mmseg -I. mmseg.c search.c
segment.c b5char.c iofiles.c
lexicon.c message.c
..\dictionary\getchunk.c
..\dictionary\mmsegmnt.c
mmseg inp.txt out.txt ../lexicon/ complex verbose
usage: mmseg in out lexicon [complex|simple*] [verbose|standard|quiet*]
default options are simple and quiet

mklex1.c source to build dictionary tables CHR?.TXT
mklex1.exe binary to build dictionary tables CHR?.TXT
gcc -o mklex1 -I. ..\dictionary\mklex1.c
mklex1 TSAIWORD.TXT

mklex2.c source to build dictionary tables CHR?.LEX
mklex2.exe binary to build dictionary tables CHR?.LEX
gcc -o mklex2 -I. ..\dictionary\mklex2.c
mklex2 .

mklex3.c source to build dictionary tables CHR?.INX
mklex3.exe binary to build dictionary tables CHR?.INX
gcc -o mklex3 -I. ..\dictionary\mklex3.c
mklex3 .

mmsegwords_raw.txt retrieved dictionary in use (137546 items)
mmsegwords.txt revised dictionary (137555 items)
mmsegwords_sort.txt revised dictionary sorted (137555 items)
TSAIWORD.TXT separate sorted dictionary by Tsai (137425 items)
diff.txt difference between mmsegwords_sort.txt
and TSAIWORD.TXT

沒有留言:

Building a Lightweight Streamlit Client for Local Ollama LLM Interaction

Ollama 提供端點串接服務,可由程式管理及使用本地大語言模型(LLM, Large Language Model)。 以下程式碼展示如何以 Streamlit 套件,建立一個輕量級的網頁介面,供呼叫 本地端安裝的 Ollama 大語言模型。 Ollama 預設的服務...

總網頁瀏覽量