赞
踩
在看MySQL文章的时候偶然发现生成数据的工具,此处直接将软件作者的文档贴了过来,说明了使用方式及下载地址
Many times in my job I need to generate random data for a specific table in order to reproduce an issue.
After writing many random generators for every table, I decided to write a random data generator, able to get the table structure and generate random data for it.
Plase take into consideration that this is the first version and it doesn’t support all field types yet!
NOTICE
This is an early stage project.
Field type | Generated values |
---|---|
tinyint | 0 ~ 0xFF |
smallint | 0 ~ 0XFFFF |
mediumint | 0 ~ 0xFFFFFF |
int - integer | 0 ~ 0xFFFFFFFF |
bigint | 0 ~ 0xFFFFFFFFFFFFFFFF |
float | 0 ~ 1e8 |
decimal(m,n) | 0 ~ 10^(m-n) |
double | 0 ~ 1000 |
char(n) | up to n random chars |
varchar(n) | up to n random chars |
date | NOW() - 1 year ~ NOW() |
datetime | NOW() - 1 year ~ NOW() |
timestamp | NOW() - 1 year ~ NOW() |
time | 00:00:00 ~ 23:59:59 |
year | Current year - 1 ~ current year |
tinyblob | up to 100 chars random paragraph |
tinytext | up to 100 chars random paragraph |
blob | up to 100 chars random paragraph |
text | up to 100 chars random paragraph |
mediumblob | up to 100 chars random paragraph |
mediumtext | up to 100 chars random paragraph |
longblob | up to 100 chars random paragraph |
longtext | up to 100 chars random paragraph |
varbinary | up to 100 chars random paragraph |
enum | A random item from the valid items list |
set | A random item from the valid items list |
The program can detect if a field accepts NULLs and if it does, it will generate NULLs ramdomly (~ 10 % of the values).
mysql_random_data_load <database> <table> <number of rows> [options...]
Option | Description |
---|---|
–bulk-size | Number of rows per INSERT statement (Default: 1000) |
–debug | Show some debug information |
–fk-samples-factor | Percentage used to get random samples for foreign keys fields. Default 0.3 |
–host | Host name/ip |
–max-fk-samples | Maximum number of samples for fields having foreign keys constarints. Default: 100 |
–max-retries | Maximum number of rows to retry in case of errors. See duplicated keys. Deafult: 100 |
–no-progressbar | Skip showing the progress bar. Default: false |
–password | Password |
–port | Port number |
Print queries to the standard output instead of inserting them into the db | |
–user | Username |
–version | Show version and exit |
If a field has Foreign Keys constraints, random-data-load
will get up to --max-fk-samples
random samples from the referenced tables in order to insert valid values for the field.
The number of samples to get follows this rules:
1. Get the aproximate number of rows in the referenced table using the rows
field in:
EXPLAIN SELECT COUNT(*) FROM <referenced schema>.<referenced table>
1.1 If the number of rows is less than max-fk-samples
, all rows are retrieved from the referenced table using this query:
SELECT <referenced field> FROM <referenced schema>.<referenced table>
1.2 If the number of rows is greater than max-fk-samples
, samples are retrieved from the referenced table using this query:
SELECT <referenced field> FROM <referenced schema>.<referenced table> WHERE RAND() <= <fk-samples-factor> LIMIT <max-fk-samples>
CREATE DATABASE IF NOT EXISTS test; CREATE TABLE `test`.`t3` ( `id` int(11) NOT NULL AUTO_INCREMENT, `tcol01` tinyint(4) DEFAULT NULL, `tcol02` smallint(6) DEFAULT NULL, `tcol03` mediumint(9) DEFAULT NULL, `tcol04` int(11) DEFAULT NULL, `tcol05` bigint(20) DEFAULT NULL, `tcol06` float DEFAULT NULL, `tcol07` double DEFAULT NULL, `tcol08` decimal(10,2) DEFAULT NULL, `tcol09` date DEFAULT NULL, `tcol10` datetime DEFAULT NULL, `tcol11` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `tcol12` time DEFAULT NULL, `tcol13` year(4) DEFAULT NULL, `tcol14` varchar(100) DEFAULT NULL, `tcol15` char(2) DEFAULT NULL, `tcol16` blob, `tcol17` text, `tcol18` mediumtext, `tcol19` mediumblob, `tcol20` longblob, `tcol21` longtext, `tcol22` mediumtext, `tcol23` varchar(3) DEFAULT NULL, `tcol24` varbinary(10) DEFAULT NULL, `tcol25` enum('a','b','c') DEFAULT NULL, `tcol26` set('red','green','blue') DEFAULT NULL, `tcol27` float(5,3) DEFAULT NULL, `tcol28` double(4,2) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB;
To generate 100K random rows, just run:
mysql_random_data_load test t3 100000 --user=root --password=root
mysql> select * from t3 limit 1\G *************************** 1. row *************************** id: 1 tcol01: 10 tcol02: 173 tcol03: 1700 tcol04: 13498 tcol05: 33239373 tcol06: 44846.4 tcol07: 5300.23 tcol08: 11360967.75 tcol09: 2017-09-04 tcol10: 2016-11-02 23:11:25 tcol11: 2017-03-03 08:11:40 tcol12: 03:19:39 tcol13: 2017 tcol14: repellat maxime nostrum provident maiores ut quo voluptas. tcol15: Th tcol16: Walter tcol17: quo repellat accusamus quidem odi tcol18: esse laboriosam nobis libero aut dolores e tcol19: Carlos Willia tcol20: et nostrum iusto ipsa sunt recusa tcol21: a accusantium laboriosam voluptas facilis. tcol22: laudantium quo unde molestiae consequatur magnam. tcol23: Pet tcol24: Richard tcol25: c tcol26: green tcol27: 47.430 tcol28: 6.12 1 row in set (0.00 sec)
效果良好
There are binaries available for each version for Linux and Darwin. You can find compiled binaries for each version in the releases tab:
https://github.com/Percona-Lab/mysql_random_data_load/releases
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。