A fast linear-chain conditional random field (CRF) tool

Coded by Jie Tang, Nov. 29, 2006


Index

  1. A fast linear-chain conditional random field (CRF) tool: KEG_CRF [download ZIP or RAR].

This file is a manual describing how to use the CRF training and test tools.

 

The KEG_CRF download package contains the following files:

CRF_train.exe

CRF_apply.exe

run.bat

A running example: train.txt and test.txt

Readme.txt

You can also download an all-in-one archive in ZIP or RAR format.

 

The tool was written by Jie Tang (jery.tang@gmail.com),

Knowledge Engineering Group, Tsinghua University, China.

 

This software is available for non-commercial use only. It must not
be modified or distributed without prior permission of the author.
The author is not responsible for any consequences arising from the
use of this software.

 

1) Training:

usage: CRF_train [options] datafile modelfile

 

Arguments:

datafile:  training data

modelfile: file in which to store the trained model

Options:

-?          show help

-d string   dictionary file (default: dict.txt)

-i int      maximal number of L-BFGS iterations (default: 60)

-x int      number of threads (default: 2)

-b bool     whether to use bigram edge features, i.e. y(i-1, i, X); false disables edge features, true enables them (default: true)

-j bool     whether to use bigram state features; false uses only unigram state features (e.g. y_i, x_i), true additionally uses bigram state features (e.g. y_{i-1}, y_i, x_i) (default: false)

-t string   test file name; if set, the program applies the model being trained to the test file after each iteration

-o int      penalty (regularization) method: 1 for the L1 norm, 2 for the L2 norm (default: 2)

 

For other options, please run "CRF_train -?" to get help.
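To make the -b and -j options concrete, here is a minimal Python sketch (not part of KEG_CRF; the tuple format is purely illustrative) that enumerates the three feature kinds for a toy labeled sequence:

```python
def enumerate_features(tags, obs, use_edge=True, use_bigram_state=False):
    """Enumerate CRF feature tuples for one labeled sequence.

    Unigram state features pair (y_i, x_i); edge features pair
    (y_{i-1}, y_i); bigram state features pair (y_{i-1}, y_i, x_i).
    """
    feats = []
    for i, (y, x) in enumerate(zip(tags, obs)):
        feats.append(("U", y, x))                     # unigram state: (y_i, x_i)
        if i > 0:
            if use_edge:
                feats.append(("E", tags[i - 1], y))   # edge: (y_{i-1}, y_i)
            if use_bigram_state:
                feats.append(("B", tags[i - 1], y, x))  # bigram state: (y_{i-1}, y_i, x_i)
    return feats

feats = enumerate_features(["a", "other", "b"], ["w1", "w2", "w3"],
                           use_edge=True, use_bigram_state=True)
```

With both options enabled, a 3-token sequence yields 3 unigram state features plus 2 edge and 2 bigram state features.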

 

2) Test:

usage: CRF_apply [options] testfile outputfile

 

Arguments:

testfile: test data

outputfile: the tagging result

 

For options, please run "CRF_apply -?" to get help.

 

Format of Training and Testing data

<data> = a list of <sequence>

<sequence> = a list of <observation>

<observation> = <tag> + <context predicate>

<context predicate> = a list of strings   # each string is a feature

<tag> = a string

 

Please refer to ./example/train.txt and ./example/test.txt for sample training and test data.
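As a rough sketch of how a file in this format might be read (assuming whitespace-separated tokens with the tag as the first token of each line, and blank lines separating sequences; check ./example/train.txt for the exact layout):

```python
def read_sequences(text):
    """Parse CRF data: one observation per line (a tag followed by
    its context predicates), with a blank line between sequences."""
    sequences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:            # blank line ends the current sequence
            if current:
                sequences.append(current)
                current = []
            continue
        tokens = line.split()
        tag, predicates = tokens[0], tokens[1:]
        current.append((tag, predicates))
    if current:                 # flush the last sequence
        sequences.append(current)
    return sequences

seqs = read_sequences("a w1 cap\nother w2\n\nb w3 digit\n")
```

Here the toy input contains two sequences: one with two observations and one with a single observation.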

 

 

Output in the command window during training:

 

Step 1. Parsing parameter...

       Running with 3 threads

Step 2. Reading training data.........elapsed: 32 seconds

       Reading test data.........elapsed: 7 seconds

Step 3. Starting training the model...

       1) Building dictionary...Total 745357 context predicates

       2) Performing feature extraction...Total 116267 features generated

       3) Pruning feature set...108409 context predicates after pruning

       4) Creating state feature caching for each observation.........

       5) Creating edge feature caching...

       6) Building model...

Iteration: 1

Log-likelihood=-673469   L2 ||grad||=456493   L2 ||λ||=0

(new_logliklihood - old)/old = 673469

compute_logli_gradient: 5422 ms          LBFGS function: 16 ms

Iteration elapsed: 5 seconds

 

Iteration: 2

Log-likelihood=-270812   L2 ||grad||=302588   L2 ||λ||=1

(new_logliklihood - old)/old = 0.597885

compute_logli_gradient: 5687 ms          LBFGS function: 16 ms

Iteration elapsed: 6 seconds

 

When you run the command "CRF_train -t test.txt train.txt model.txt", you will see output like the following in the command window:

 

Step 1. Parsing parameter...

       Running with 3 threads

Step 2. Reading training data.........elapsed: 32 seconds

       Reading test data.........elapsed: 7 seconds

Step 3. Starting training the model...

       1) Building dictionary...Total 745357 context predicates

       2) Performing feature extraction...Total 116267 features generated

       3) Pruning feature set...108409 context predicates after pruning

       4) Creating state feature caching for each observation.........

       5) Creating edge feature caching...

       6) Building model...

Iteration: 1

Log-likelihood=-673469   L2 ||grad||=456493   L2 ||λ||=0

(new_logliklihood - old)/old = 673469

compute_logli_gradient: 5422 ms          LBFGS function: 16 ms

Iteration elapsed: 5 seconds

 

          correct    model   manual   prec.     rec.   F1-Measure
a               0        0      182    0.00     0.00         0.00
b               0        0      604    0.00     0.00         0.00
other       51980    58059    51980   89.53   100.00        94.48
overall         0        0      786    0.00     0.00         0.00

 

Iteration: 2

Log-likelihood=-270812   L2 ||grad||=302588   L2 ||λ||=1

(new_logliklihood - old)/old = 0.597885

compute_logli_gradient: 5687 ms          LBFGS function: 16 ms

Iteration elapsed: 6 seconds

 

          correct    model   manual   prec.     rec.   F1-Measure
a               0        0      182    0.00     0.00         0.00
b               0        0      604    0.00     0.00         0.00
other       51980    58059    51980   89.53   100.00        94.48
overall         0        0      786    0.00     0.00         0.00

 

Iteration: 3

Log-likelihood=-132609   L2 ||grad||=43142   L2 ||λ||=3.19889

(new_logliklihood - old)/old = 0.510328

compute_logli_gradient: 5688 ms          LBFGS function: 15 ms

Iteration elapsed: 6 seconds

 

When the option "-t" is set, the program performs a test pass during training and outputs an evaluation result for each tag, which is useful for cross validation.

In the evaluation result, the header line contains:

"correct": the number of correct predictions

"model": the number of tags found by the trained model

"manual": the gold-standard number of tags

"prec.": precision, prec. = correct/model

"rec.": recall, rec. = correct/manual

"F1-Measure": F1-score, F1-Measure = 2*(prec.*rec.)/(prec.+rec.)

All the values above are computed per tag; each tag corresponds to one line of the table.

The last line, "overall", gives the results over all tags.
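These formulas can be checked against the "other" row of the sample table (correct = 51980, model = 58059, manual = 51980). A standalone Python sketch, not part of the tool:

```python
def evaluate(correct, model, manual):
    """Precision, recall, and F1 (all in percent) from raw counts."""
    prec = 100.0 * correct / model if model else 0.0
    rec = 100.0 * correct / manual if manual else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

prec, rec, f1 = evaluate(51980, 58059, 51980)
# Rounded to two decimals, this reproduces the sample row: 89.53, 100.00, 94.48
```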

 

 

For any questions, please feel free to contact jery.tang@gmail.com or ylm@keg.cs.tsinghua.edu.cn.

 

Thank you for your interest in our software.