A fast linear-chain conditional random field (CRF) tool

Coded by Jie Tang, Nov. 29, 2006


Index

  1. A fast linear-chain conditional random field (CRF) tool: KEG_CRF [download ZIP or RAR].

This file is a manual describing how to use the CRF training and test tools.

 

The KEG_CRF download package contains the following files:

CRF_train.exe

CRF_apply.exe

run.bat

A running example: train.txt and test.txt

Readme.txt

You can also download an all-in-one archive in ZIP or RAR format.

 

The tool was written by Jie Tang (jery.tang@gmail.com),

Knowledge Engineering Group, Tsinghua University, China.

 

This software is available for non-commercial use only. It must not
be modified or distributed without prior permission of the author.
The author is not responsible for any consequences arising from the
use of this software.

 

1) Training:

usage: CRF_train [options] datafile modelfile

 

Arguments:

datafile:  training data

modelfile: file in which to store the trained model

Options:

-?          show help

-d string   dictionary file (default: dict.txt)

-i int      maximal number of L-BFGS iterations (default: 60)

-x int      number of threads (default: 2)

-b bool     whether to use bigram edge features, i.e. y(i-1, i, X); false disables edge features, true enables them (default: true)

-j bool     whether to use bigram state features; false uses only unigram state features (e.g. y_i, x_i), true additionally uses bigram state features (e.g. y_{i-1}, y_i, x_i) (default: false)

-t string   test file name; if set, the program applies the model being trained to the test file after each iteration

-o int      penalty (regularization) method: 1 for the L1 norm, 2 for the L2 norm (default: 2)

 

For other options, please run "CRF_train -?" to get help.
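To make the -b and -j options concrete, here is a minimal Python sketch (not part of KEG_CRF; the tuple format is purely illustrative) that enumerates the three feature kinds for a toy labeled sequence:

```python
def enumerate_features(tags, obs, use_edge=True, use_bigram_state=False):
    """Enumerate CRF feature tuples for one labeled sequence.

    Unigram state features pair (y_i, x_i); edge features pair
    (y_{i-1}, y_i); bigram state features pair (y_{i-1}, y_i, x_i).
    """
    feats = []
    for i, (y, x) in enumerate(zip(tags, obs)):
        feats.append(("U", y, x))                     # unigram state: (y_i, x_i)
        if i > 0:
            if use_edge:
                feats.append(("E", tags[i - 1], y))   # edge: (y_{i-1}, y_i)
            if use_bigram_state:
                feats.append(("B", tags[i - 1], y, x))  # bigram state: (y_{i-1}, y_i, x_i)
    return feats

feats = enumerate_features(["a", "other", "b"], ["w1", "w2", "w3"],
                           use_edge=True, use_bigram_state=True)
```

With both options enabled, a 3-token sequence yields 3 unigram state features plus 2 edge and 2 bigram state features.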

 

2) Test:

usage: CRF_apply [options] testfile outputfile

 

Arguments:

testfile: test data

outputfile: the tagging result

 

For options, please run "CRF_apply -?" to get help.

 

Format of Training and Testing data

<data> = a list of <sequence>

<sequence> = a list of <observation>

<observation> = <tag> + <context predicate>

<context predicate> = a list of strings   # each string is a feature

<tag> = a string

 

Please refer to ./example/train.txt and ./example/test.txt for sample training and test data.
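As a rough sketch of how a file in this format might be read (assuming whitespace-separated tokens with the tag as the first token of each line, and blank lines separating sequences; check ./example/train.txt for the exact layout):

```python
def read_sequences(text):
    """Parse CRF data: one observation per line (a tag followed by
    its context predicates), with a blank line between sequences."""
    sequences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:            # blank line ends the current sequence
            if current:
                sequences.append(current)
                current = []
            continue
        tokens = line.split()
        tag, predicates = tokens[0], tokens[1:]
        current.append((tag, predicates))
    if current:                 # flush the last sequence
        sequences.append(current)
    return sequences

seqs = read_sequences("a w1 cap\nother w2\n\nb w3 digit\n")
```

Here the toy input contains two sequences: one with two observations and one with a single observation.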

 

 

Output in the command window during training:

 

Step 1. Parsing parameter...

       Running with 3 threads

Step 2. Reading training data.........elapsed: 32 seconds

       Reading test data.........elapsed: 7 seconds

Step 3. Starting training the model...

       1) Building dictionary...Total 745357 context predicates

       2) Performing feature extraction...Total 116267 features generated

       3) Pruning feature set...108409 context predicates after pruning

       4) Creating state feature caching for each observation.........

       5) Creating edge feature caching...

       6) Building model...

Iteration: 1

Log-likelihood=-673469   L2 ||grad||=456493   L2 ||λ||=0

(new_logliklihood - old)/old = 673469

compute_logli_gradient: 5422 ms          LBFGS function: 16 ms

Iteration elapsed: 5 seconds

 

Iteration: 2

Log-likelihood=-270812   L2 ||grad||=302588   L2 ||λ||=1

(new_logliklihood - old)/old = 0.597885

compute_logli_gradient: 5687 ms          LBFGS function: 16 ms

Iteration elapsed: 6 seconds

 

When you run the command "CRF_train -t test.txt train.txt model.txt", you will see output like the following in the command window:

 

Step 1. Parsing parameter...

       Running with 3 threads

Step 2. Reading training data.........elapsed: 32 seconds

       Reading test data.........elapsed: 7 seconds

Step 3. Starting training the model...

       1) Building dictionary...Total 745357 context predicates

       2) Performing feature extraction...Total 116267 features generated

       3) Pruning feature set...108409 context predicates after pruning

       4) Creating state feature caching for each observation.........

       5) Creating edge feature caching...

       6) Building model...

Iteration: 1

Log-likelihood=-673469   L2 ||grad||=456493   L2 ||λ||=0

(new_logliklihood - old)/old = 673469

compute_logli_gradient: 5422 ms          LBFGS function: 16 ms

Iteration elapsed: 5 seconds

 

          correct    model   manual   prec.     rec.   F1-Measure
a               0        0      182    0.00     0.00         0.00
b               0        0      604    0.00     0.00         0.00
other       51980    58059    51980   89.53   100.00        94.48
overall         0        0      786    0.00     0.00         0.00

 

Iteration: 2

Log-likelihood=-270812   L2 ||grad||=302588   L2 ||λ||=1

(new_logliklihood - old)/old = 0.597885

compute_logli_gradient: 5687 ms          LBFGS function: 16 ms

Iteration elapsed: 6 seconds

 

          correct    model   manual   prec.     rec.   F1-Measure
a               0        0      182    0.00     0.00         0.00
b               0        0      604    0.00     0.00         0.00
other       51980    58059    51980   89.53   100.00        94.48
overall         0        0      786    0.00     0.00         0.00

 

Iteration: 3

Log-likelihood=-132609   L2 ||grad||=43142   L2 ||λ||=3.19889

(new_logliklihood - old)/old = 0.510328

compute_logli_gradient: 5688 ms          LBFGS function: 15 ms

Iteration elapsed: 6 seconds

 

When the option "-t" is set, the program performs a test pass during training and outputs an evaluation result for each tag, which is useful for cross validation.

In the evaluation result, the header line contains:

"correct": the number of correct predictions

"model": the number of tags found by the trained model

"manual": the gold-standard number of tags

"prec.": precision, prec. = correct/model

"rec.": recall, rec. = correct/manual

"F1-Measure": F1-score, F1-Measure = 2*(prec.*rec.)/(prec.+rec.)

All the values above are computed per tag; each tag corresponds to one line of the table.

The last line, "overall", gives the results over all tags.
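These formulas can be checked against the "other" row of the sample table (correct = 51980, model = 58059, manual = 51980). A standalone Python sketch, not part of the tool:

```python
def evaluate(correct, model, manual):
    """Precision, recall, and F1 (all in percent) from raw counts."""
    prec = 100.0 * correct / model if model else 0.0
    rec = 100.0 * correct / manual if manual else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

prec, rec, f1 = evaluate(51980, 58059, 51980)
# Rounded to two decimals, this reproduces the sample row: 89.53, 100.00, 94.48
```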

 

 

For any questions, please feel free to contact jery.tang@gmail.com or ylm@keg.cs.tsinghua.edu.cn.

 

Thank you for your interest in our software.