📄 stem_swedish.c
/* stem_swedish.c: Swedish stemming algorithm.
 *
 * ----START-LICENCE----
 * Copyright 1999,2000,2001 BrightStation PLC
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License as
 * published by the Free Software Foundation; either version 2 of the
 * License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307
 * USA
 * -----END-LICENCE-----
 */

/*
 * This file has been changed from its original source.
 * Modifications by Mikael Ylikoski, 2002.
 * - Make it handle UTF-8 encoding.
 */

#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#include "stem_swedish.h"

struct swedish_stemmer_ {
    char *p;
    int p_size;
    int k;
    int k0;
    int j;
    int pos;
};

/* To set up the swedish_stemming process:

       void * z = setup_swedish_stemmer();

   to use it:

       const char * p = swedish_stem(z, q, len);

   The word to be stemmed is in byte address q offsets i0 to i1
   inclusive (i.e. from q[i0] to q[i1]). The stemmed result is the
   C string at address p.

   To close down the stemming process:

       closedown_swedish_stemmer(z);
*/

/* The main part of the stemming algorithm starts here.

   z->p is a buffer holding a word to be stemmed. The letters are in
   z->p[0], z->p[1] ... ending at z->p[z->k]. z->k is readjusted
   downwards as the stemming progresses. Zero termination is not in fact
   used in the algorithm.

   Note that only lower case sequences are stemmed. Forcing to lower case
   should be done before stem(...) is called.

   We will write p, k etc in place of z->p, z->k in the comments.
*/

/*
 * cons(z, i) is true <=> p[i] is a consonant.
 */
static int
cons(swedish_stemmer *z, int i) {
    /* Cast to unsigned char: where plain char is signed, a byte >= 0x80
       is negative and would never match a case label such as 0xc3. */
    switch ((unsigned char) z->p[i]) {
    case 'a': case 'e': case 'i': case 'o': case 'u': case 'y':
        return 0;
    case 0xc3:
        switch ((unsigned char) z->p[i + 1]) {
        case 0xa5: //
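The listing stops inside cons(), so the following is only a minimal standalone sketch of the kind of test that function appears to perform, not the original code. It assumes that å, ä and ö count as vowels alongside a, e, i, o, u and y; the truncated source shows only the 0xa5 (å) branch, so the other two are an assumption. The name is_consonant_utf8 is hypothetical. In UTF-8 all three letters share the lead byte 0xc3, followed by 0xa5 (å), 0xa4 (ä) or 0xb6 (ö).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of stem_swedish.c: returns 1 if the
 * character starting at byte offset i of the NUL-terminated UTF-8
 * string p is a consonant, 0 if it is a vowel.
 * Assumption: å, ä and ö are vowels, as are a, e, i, o, u, y. */
static int
is_consonant_utf8(const char *p, int i) {
    assert(p != NULL);

    /* Compare bytes as unsigned char: plain char may be signed, in
     * which case the byte 0xc3 reads as a negative value and never
     * equals the constant 0xc3. */
    const unsigned char *s = (const unsigned char *) p;

    switch (s[i]) {
    case 'a': case 'e': case 'i': case 'o': case 'u': case 'y':
        return 0;
    case 0xc3:              /* lead byte shared by å, ä and ö */
        switch (s[i + 1]) {
        case 0xa5:          /* å, U+00E5 */
        case 0xa4:          /* ä, U+00E4 */
        case 0xb6:          /* ö, U+00F6 */
            return 0;
        default:            /* any other 0xc3 sequence, e.g. é */
            return 1;
        }
    default:
        return 1;
    }
}
```

The offset i is expected to point at the first byte of a character. In this sketch, pointing it at a continuation byte simply reports a consonant.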