⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 crf1m_learn.c

📁 CRFsuite is a very fast implmentation of the Conditional Random Fields (CRF) algorithm. It handles t
💻 C
📖 第 1 页 / 共 3 页
字号:
/* *      Linear-chain CRF training. * * Copyright (c) 2007-2009, Naoaki Okazaki * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: *     * Redistributions of source code must retain the above copyright *       notice, this list of conditions and the following disclaimer. *     * Redistributions in binary form must reproduce the above copyright *       notice, this list of conditions and the following disclaimer in the *       documentation and/or other materials provided with the distribution. *     * Neither the names of the authors nor the names of its contributors *       may be used to endorse or promote products derived from this *       software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. *//* $Id: crf1m_learn.c 159 2009-03-17 01:50:30Z naoaki $ */#ifdef    HAVE_CONFIG_H#include <config.h>#endif/*HAVE_CONFIG_H*/#include <os.h>#include <float.h>#include <math.h>#include <stdio.h>#include <stdlib.h>#include <string.h>#include <limits.h>#include <time.h>#include <crf.h>#include "params.h"#include "mt19937ar.h"#include "logging.h"#include "crf1m.h"#define    FEATURE(trainer, k) \    (&(trainer)->features[(k)])#define    ATTRIBUTE(trainer, a) \    (&(trainer)->attributes[(a)])#define    TRANSITION_FROM(trainer, i) \    (&(trainer)->forward_trans[(i)])#define    TRANSITION_TO(trainer, j) \    (&(trainer)->backward_trans[(j)])#define    TRANSITION_BOS(trainer) \    (&(trainer)->bos_trans)#define    TRANSITION_EOS(trainer) \    (&(trainer)->eos_trans)void crf1ml_set_labels(crf1ml_t* trainer, const crf_sequence_t* seq){    int t;    crf1m_context_t* ctx = trainer->ctx;    const crf_item_t* item = NULL;    const int T = seq->num_items;    ctx->num_items = T;    for (t = 0;t < T;++t) {        item = &seq->items[t];        ctx->labels[t] = item->label;    }}voidcrf1ml_state_score(    crf1ml_t* trainer,    const crf_sequence_t* seq,    const floatval_t* w,    const int K,    floatval_t dummy    ){    int a, i, l, t, r, fid;    floatval_t scale, *state = NULL;    crf1m_context_t* ctx = trainer->ctx;    const crf_item_t* item = NULL;    const crf1ml_feature_t* f = NULL;    const feature_refs_t* attr = NULL;    const int T = seq->num_items;    const int L = trainer->num_labels;    /* Loop over the items in the sequence. */    for (t = 0;t < T;++t) {        item = &seq->items[t];        state = STATE_SCORE_AT(ctx, t);        /* Initialize the state scores at position #t as zero. */        for (i = 0;i < L;++i) {            state[i] = 0;        }        /* Loop over the contents (attributes) attached to the item. */        for (i = 0;i < item->num_contents;++i) {            /* Access the list of state features associated with the attribute. */            a = item->contents[i].aid;            attr = ATTRIBUTE(trainer, a);            /* A scale usually represents the atrribute frequency in the item. */            scale = item->contents[i].scale * dummy;            /* Loop over the state features associated with the attribute. */            for (r = 0;r < attr->num_features;++r) {                /* The state feature #(attr->fids[r]), which is represented by                   the attribute #a, outputs the label #(f->dst). */                fid = attr->fids[r];                f = FEATURE(trainer, fid);                l = f->dst;                state[l] += w[fid] * scale;            }        }    }}void crf1ml_transition_score(    crf1ml_t* trainer,    const floatval_t* w,    const int K,    floatval_t dummy    ){    int i, j, r, fid;    floatval_t *trans = NULL;    crf1m_context_t* ctx = trainer->ctx;    const crf1ml_feature_t* f = NULL;    const feature_refs_t* edge = NULL;    const int L = trainer->num_labels;    /* Initialize all transition scores as zero. */    for (i = 0;i <= L;++i) {        trans = TRANS_SCORE_FROM(ctx, i);        for (j = 0;j <= L;++j) {            trans[j] = 0. * dummy;        }    }    /* Compute transition scores from BOS to labels. */    trans = TRANS_SCORE_FROM(ctx, L);    edge = TRANSITION_BOS(trainer);    for (r = 0;r < edge->num_features;++r) {        /* Transition feature from BOS to #(f->dst). */        fid = edge->fids[r];        f = FEATURE(trainer, fid);        trans[f->dst] = w[fid] * dummy;    }    /* Compute transition scores between two labels. */    for (i = 0;i < L;++i) {        trans = TRANS_SCORE_FROM(ctx, i);        edge = TRANSITION_FROM(trainer, i);        for (r = 0;r < edge->num_features;++r) {            /* Transition feature from #i to #(f->dst). */            fid = edge->fids[r];            f = FEATURE(trainer, fid);            trans[f->dst] = w[fid] * dummy;        }            }    /* Compute transition scores from labels to EOS. */    edge = TRANSITION_EOS(trainer);    for (r = 0;r < edge->num_features;++r) {        /* Transition feature from #(f->src) to EOS. */        fid = edge->fids[r];        f = FEATURE(trainer, fid);        trans = TRANS_SCORE_FROM(ctx, f->src);        trans[L] = w[fid] * dummy;    }    }#define OEXP    1void crf1ml_enum_features(crf1ml_t* trainer, const crf_sequence_t* seq, update_feature_t func){    int a, c, i, j, t, r, fid;    floatval_t coeff, scale, *prob = trainer->prob;    crf1m_context_t* ctx = trainer->ctx;    const floatval_t lognorm = ctx->log_norm;    const floatval_t *fwd = NULL, *bwd = NULL, *state = NULL, *edge = NULL;    crf1ml_feature_t* f = NULL;    const feature_refs_t *attr = NULL, *trans = NULL;    const crf_item_t* item = NULL;    const int T = seq->num_items;    const int L = trainer->num_labels;    /*        This code prioritizes the computation speed over the simplicity:        it reduces the number of repeated calculations at the expense of        its beauty. This routine calculates model expectations of features        in four steps:        (i)   state features at position #0 and BOS transition features        (ii)  state features at position #T-1 and EOS transition features        (iii) state features at position #t (0 < t < T-1)        (iv)  transition features.        Notice that the partition factor (norm) is computed as:            norm = 1 / (C[0] * ... * C[T]).     */    /*        (i) Calculate the probabilities at (0, i):            p(0,i) = fwd[0][i] * bwd[0][i] / norm                   = (fwd'[0][i] / C[0]) * (bwd'[0][i] / (C[0] * ... C[T-1])) * (C[0] * ... * C[T])                   = (C[T] / C[0]) * fwd'[0][i] * bwd'[0][i]        to compute expectations of transition features from BOS        and state features at position #0.     */    fwd = FORWARD_SCORE_AT(ctx, 0);    bwd = BACKWARD_SCORE_AT(ctx, 0);    coeff = ctx->scale_factor[T] / ctx->scale_factor[0];    for (i = 0;i < L;++i) {        prob[i] = fwd[i] * bwd[i] * coeff;    }    /* Compute expectations for transition features from BOS. */    trans = TRANSITION_BOS(trainer);    for (r = 0;r < trans->num_features;++r) {        fid = trans->fids[r];        f = FEATURE(trainer, fid);        func(f, fid, prob[f->dst], 1., trainer, seq, 0);    }    /* Compute expectations for state features at position #0. */    item = &seq->items[0];    for (c = 0;c < item->num_contents;++c) {        /* Access the attribute. */        scale = item->contents[c].scale;        a = item->contents[c].aid;        attr = ATTRIBUTE(trainer, a);        /* Loop over state features for the attribute. */        for (r = 0;r < attr->num_features;++r) {            /* Reuse the probability prob[f->dst]. */            fid = attr->fids[r];            f = FEATURE(trainer, fid);            func(f, fid, prob[f->dst], scale, trainer, seq, 0);        }    }    /*        (ii) Calculate the probabilities at (T-1, i):            p(T-1,i) = fwd[T-1][i] bwd[T-1][i] / norm                     = (C[T] / C[T-1]) * fwd'[T-1][i] * bwd'[T-1][i]        to compute expectations of transition features to EOS        and state features at position #(T-1).     */    fwd = FORWARD_SCORE_AT(ctx, T-1);    bwd = BACKWARD_SCORE_AT(ctx, T-1);    coeff = ctx->scale_factor[T] / ctx->scale_factor[T-1];    for (i = 0;i < L;++i) {        prob[i] = fwd[i] * bwd[i] * coeff;    }    /* Compute expectations for transition features to EOS. */    trans = TRANSITION_EOS(trainer);    for (r = 0;r < trans->num_features;++r) {        fid = trans->fids[r];        f = FEATURE(trainer, fid);        func(f, fid, prob[f->src], 1., trainer, seq, T-1);    }    /* Compute expectations for state features at position #(T-1). */    item = &seq->items[T-1];    for (c = 0;c < item->num_contents;++c) {        /* Access the attribute. */        scale = item->contents[c].scale;        a = item->contents[c].aid;        attr = ATTRIBUTE(trainer, a);        /* Loop over state features for the attribute. */        for (r = 0;r < attr->num_features;++r) {            /* Reuse the probability prob[f->dst]. */            fid = attr->fids[r];            f = FEATURE(trainer, fid);            func(f, fid, prob[f->dst], scale, trainer, seq, T-1);        }    }    /*        (iii) For 0 < t < T-1, calculate the probabilities at (t, i):            p(t,i) = fwd[t][i] * bwd[t][i] / norm                   = (C[T] / C[t]) * fwd'[t][i] * bwd'[t][i]        to compute expectations of state features at position #t.     */    for (t = 1;t < T-1;++t) {        fwd = FORWARD_SCORE_AT(ctx, t);        bwd = BACKWARD_SCORE_AT(ctx, t);        coeff = ctx->scale_factor[T] / ctx->scale_factor[t];        /* Initialize the probabilities as -1, which means 'unfilled'.  */        for (i = 0;i < L;++i) prob[i] = -1;        /* Compute expectations for state features at position #t. */        item = &seq->items[t];        for (c = 0;c < item->num_contents;++c) {            /* Access the attribute. */            scale = item->contents[c].scale;            a = item->contents[c].aid;            attr = ATTRIBUTE(trainer, a);            /* Loop over state features for the attribute. */            for (r = 0;r < attr->num_features;++r) {                fid = attr->fids[r];                f = FEATURE(trainer, fid);                i = f->dst;                /* Reuse the probability prob[i] if it has been calculated. */                if (prob[i] == -1) {                    /* Unfilled: calculate the probability. */                    prob[i] = fwd[i] * bwd[i] * coeff;                }                func(f, fid, prob[i], scale, trainer, seq, t);            }        }

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -