SyntheticDataGenerator lit options

The options for generating transactions can be displayed by typing:

SyntheticDataGenerator lit

Available Options
Name          Value     Description                             Default value
-fname        filename  Output base file name                   No default
-tlen         double    Average transaction length              10
-nitems       integer   Item count                              100000
-randseed     integer   Master random seed (must be <= 0)       0*
-lit.npats    integer   Large item set pattern count            10000
-lit.patlen   double    Large item set average pattern length   4
-lit.corr     double    Large item set correlation (-corr)      0.25
-lit.conf     double    Large item set confidence (-conf)       0.75
-ntrans       integer   Transaction count                       1000000

* A randseed of zero causes a random seed to be generated automatically.

The transaction generator produces three files:

filename.config         The parameters used to generate transactions
filename.patterns       The large item set patterns
filename.transactions   The transactions

At a minimum, an output file name must be specified, e.g.

SyntheticDataGenerator lit -fname transactions.txt
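
When several data sets are needed, the command line can be assembled from a script. The following is a minimal Python sketch, not part of the tool itself. It assumes the executable is on the PATH under the name SyntheticDataGenerator, that every option is passed as a "-name value" pair as in the table above, and that the three output files are named by appending the extension to the base name. The helper names build_command and expected_outputs are hypothetical, and the option values in the example are arbitrary.

```python
import shutil
import subprocess

# Assumption: the generator is installed under this name and is on the PATH.
GENERATOR = "SyntheticDataGenerator"


def build_command(fname, **options):
    """Return the argument list for a `lit` run.

    Keyword names map to option names from the table above. An underscore
    stands in for the dot, so lit_npats becomes -lit.npats.
    """
    args = [GENERATOR, "lit", "-fname", fname]
    for name, value in options.items():
        args += ["-" + name.replace("_", "."), str(value)]
    return args


def expected_outputs(fname):
    """Names of the three output files, assuming base name + extension."""
    return [f"{fname}.{ext}" for ext in ("config", "patterns", "transactions")]


if __name__ == "__main__":
    cmd = build_command("sample", ntrans=10000, nitems=1000, tlen=8, lit_npats=500)
    print(" ".join(cmd))
    # Only invoke the tool when it is actually installed.
    if shutil.which(GENERATOR):
        subprocess.run(cmd, check=True)
        print(expected_outputs("sample"))
```

Any option left out of the call keeps the default value listed in the table.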

Last edited Feb 9, 2011 at 9:16 AM by arthur_pitman, version 11


quangpd Sep 22, 2012 at 9:54 AM 
Dear sir,
Thanks for this tool. How can I generate a binary transaction data set with it? I tried the SyntheticDataGenerator lit syntax, but it generates a transaction file with duplicate items on the same line/row/transaction.