437 Data Sets
Name
Data Types
Default Task
Attribute Types
# Instances
# Attributes
Year
Abalone
Multivariate
Classification
Categorical, Integer, Real
4177
8
1995
Adult
Categorical, Integer
48842
14
1996
Annealing
798
38
Anonymous Microsoft Web Data
Recommender-Systems
Categorical
37711
294
1998
Arrhythmia
452
279
Artificial Characters
6000
7
1992
Audiology (Original)
226
1987
Audiology (Standardized)
69
Auto MPG
Regression
Categorical, Real
398
1993
Automobile
205
26
Badges
Univariate, Text
1
1994
Balance Scale
625
4
Balloons
16
Breast Cancer
286
9
1988
Breast Cancer Wisconsin (Original)
Integer
699
10
Breast Cancer Wisconsin (Prognostic)
Classification, Regression
Real
198
34
Breast Cancer Wisconsin (Diagnostic)
569
32
Pittsburgh Bridges
108
13
1990
Car Evaluation
1728
6
1997
Census Income
Chess (King-Rook vs. King-Knight)
Multivariate, Data-Generator
22
Chess (King-Rook vs. King-Pawn)
3196
36
1989
Chess (King-Rook vs. King)
28056
Chess (Domain Theories)
Domain-Theory
Bach Chorales
Univariate, Time-Series
100
Connect-4
Multivariate, Spatial
67557
42
Credit Approval
690
15
Japanese Credit Screening
Multivariate, Domain-Theory
Categorical, Real, Integer
125
Computer Hardware
209
Contraceptive Method Choice
1473
Covertype
581012
54
Cylinder Bands
512
39
Dermatology
366
33
Diabetes
Multivariate, Time-Series
20
DGP2 - The Second Data Generation Program
Data-Generator
Document Understanding
EBL Domain Theories
Echocardiogram
132
12
Ecoli
336
Flags
194
30
Function Finding
Function-Learning
352
Glass Identification
214
Haberman's Survival
306
3
1999
Hayes-Roth
160
5
Heart Disease
303
75
Hepatitis
155
19
Horse Colic
368
27
ICU
Image Segmentation
2310
Internet Advertisements
3279
1558
Ionosphere
Integer, Real
351
Iris
150
ISOLET
7797
617
Kinship
Relational
Relational-Learning
104
Labor Relations
57
LED Display Domain
Lenses
24
Letter Recognition
20000
1991
Liver Disorders
345
Logic Theorist
Lung Cancer
56
Lymphography
148
18
Mechanical Analysis
Meta-data
528
Mobile Robots
Molecular Biology (Promoter Gene Sequences)
Sequential, Domain-Theory
106
58
Molecular Biology (Protein Secondary Structure)
Sequential
128
Molecular Biology (Splice-junction Gene Sequences)
3190
61
MONK's Problems
432
Moral Reasoner
202
Multiple Features
2000
649
Mushroom
8124
Musk (Version 1)
476
168
Musk (Version 2)
6598
Nursery
12960
Othello Domain Theory
Page Blocks Classification
5473
Optical Recognition of Handwritten Digits
5620
64
Pen-Based Recognition of Handwritten Digits
10992
Post-Operative Patient
90
Primary Tumor
339
17
Prodigy
Qualitative Structure Activity Relationships
Quadruped Mammals
72
Servo
167
Shuttle Landing Control
Solar Flare
1389
Soybean (Large)
307
35
Soybean (Small)
47
Challenger USA Space Shuttle O-Ring
23
Low Resolution Spectrometer
531
102
Spambase
4601
SPECT Heart
267
2001
SPECTF Heart
44
Sponge
Clustering
76
45
Statlog Project
Student Loan Relational
1000
Teaching Assistant Evaluation
151
Tic-Tac-Toe Endgame
958
Thyroid Disease
7200
21
Trains
University
285
Congressional Voting Records
435
Water Treatment Plant
527
Waveform Database Generator (Version 1)
5000
Waveform Database Generator (Version 2)
40
Wine
178
Yeast
1484
Zoo
101
Undocumented
Twenty Newsgroups
Text
Australian Sign Language signs
6650
Australian Sign Language signs (High Quality)
2565
2002
US Census Data (1990)
2458285
68
Census-Income (KDD)
299285
Coil 1999 Competition Data
340
Corel Image Features
68040
89
E. Coli Genes
EEG Database
122
El Nino
Spatio-temporal
178080
Entree Chicago Recommendation Data
Transactional, Sequential
50672
CMU Face Images
Image
640
Insurance Company Benchmark (COIL 2000)
Regression, Description
9000
86
Internet Usage Data
10104
IPUMS Census Database
256932
Japanese Vowels
KDD Cup 1998 Data
191779
481
KDD Cup 1999 Data
4000000
M. Tuberculosis Genes
Movie
Multivariate, Relational
10000
MSNBC.com Anonymous Web Data
989818
NSF Research Award Abstracts 1990-2003
129000
2003
Pioneer-1 Mobile Robot Data
Pseudo Periodic Synthetic Time Series
100000
Reuters-21578 Text Categorization Collection
21578
Robot Execution Failures
463
Synthetic Control Chart Time Series
Time-Series
Classification, Clustering
600
Syskill and Webert Web Page Ratings
Multivariate, Text
332
UNIX User Data
Text, Sequential
Volcanoes on Venus - JARtool experiment
Statlog (Australian Credit Approval)
Statlog (German Credit Data)
Statlog (Heart)
270
Statlog (Landsat Satellite)
6435
Statlog (Image Segmentation)
Statlog (Shuttle)
58000
Statlog (Vehicle Silhouettes)
946
Connectionist Bench (Nettalk Corpus)
20008
Connectionist Bench (Sonar, Mines vs. Rocks)
208
60
Connectionist Bench (Vowel Recognition - Deterding Data)
Economic Sanctions
Protein Data
Cloud
1024
CalIt2 Building People Counts
10080
2006
Dodgers Loop Sensor
50400
Poker Hand
1025010
11
2007
MAGIC Gamma Telescope
19020
UJI Pen Characters
Multivariate, Sequential
1364
Mammographic Mass
961
Forest Fires
517
2008
Reuters Transcribed Subset
200
Bag of Words
8000000
Concrete Compressive Strength
1030
Hill-Valley
606
Arcene
900
Dexter
2600
Dorothea
1950
Gisette
13500
Madelon
4400
500
Ozone Level Detection
Multivariate, Sequential, Time-Series
2536
73
Abscisic Acid Signaling Network
Causal-Discovery
300
43
Parkinsons
197
Character Trajectories
2858
Blood Transfusion Service Center
748
UJI Pen Characters (Version 2)
11640
2009
Semeion Handwritten Digit
1593
256
SECOM
Classification, Causal-Discovery
1567
591
Plants
22632
70
Libras Movement
360
91
Concrete Slump Test
103
Communities and Crime
Acute Inflammations
120
Wine Quality
4898
URL Reputation
2396130
3231961
p53 Mutants
16772
5409
2010
Parkinsons Telemonitoring
5875
Demospongiae
503
Opinosis Opinion / Review
51
Breast Tissue
Cardiotocography
2126
Wall-Following Robot Navigation Data
5456
Spoken Arabic Digit
8800
Localization Data for Person Activity
Univariate, Sequential, Time-Series
164860
AutoUniv
Steel Plates Faults
1941
MiniBooNE particle identification
130065
50
YearPredictionMSD
515345
2011
PEMS-SF
440
138672
OpinRank Review Dataset
Relative location of CT slices on axial axis
53500
386
Online Handwritten Assamese Characters Dataset
8235
PubChem Bioassay Data
Record Linkage Comparison Patterns
5749132
Communities and Crime Unnormalized
2215
147
Vertebral Column
310
EMG Physical Action Data Set
Vicon Physical Action Data Set
3000
Amazon Commerce reviews set
Multivariate, Text, Domain-Theory
1500
Amazon Access Samples
Time-Series, Domain-Theory
Regression, Clustering, Causal-Discovery
30000
Reuter_50_50
2500
Farm Ads
4143
54877
DBWorld e-mails
4702
KEGG Metabolic Relation Network (Directed)
Multivariate, Univariate, Text
Classification, Regression, Clustering
53414
KEGG Metabolic Reaction Network (Undirected)
65554
29
Bank Marketing
45211
2012
YouTube Comedy Slam Preference Data
1138562
Gas Sensor Array Drift Dataset
13910
ILPD (Indian Liver Patient Dataset)
583
OPPORTUNITY Activity Recognition
2551
242
Nomao
Univariate
34465
SMS Spam Collection
5574
Skin Segmentation
245057
Planning Relax
182
PAMAP2 Physical Activity Monitoring
3850505
52
Restaurant & consumer data
138
CNAE-9
1080
857
Individual household electric power consumption
Regression, Clustering
2075259
seeds
210
Northix
115
QtyT40I10D100K
3960456
Legal Case Reports
Human Activity Recognition Using Smartphones
10299
561
One-hundred plant species leaves data set
1600
Energy efficiency
768
Yacht Hydrodynamics
308
2013
Fertility
Daphnet Freezing of Gait
237
3D Road Network (North Jutland, Denmark)
Sequential, Text
434874
ISTANBUL STOCK EXCHANGE
Multivariate, Univariate, Time-Series
536
Buzz in social media
Time-Series, Multivariate
Regression, Classification
140000
77
First-order theorem proving
6118
Wearable Computing: Classification of Body Postures and Movements (PUC-Rio)
165632
Gas sensor arrays in open sampling settings
18000
1950000
Climate Model Simulation Crashes
540
MicroMass
931
1300
QSAR biodegradation
1055
41
BLOGGER
Daily and Sports Activities
9120
5625
User Knowledge Modeling
403
Reuters RCV1 RCV2 Multilingual, Multiview Text Categorization Test collection
111740
NYSK
Multivariate, Sequential, Text
10421
Turkiye Student Evaluation
5820
User Knowledge Modeling Data (Students' Knowledge Levels on DC Electrical Machines)
EEG Eye State
14980
Physicochemical Properties of Protein Tertiary Structure
45730
seismic-bumps
2584
banknote authentication
1372
USPTO Algorithm Challenge, run by NASA-Harvard Tournament Lab and TopCoder Problem: Pat
YouTube Multiview Video Games Dataset
120000
1000000
Gas Sensor Array Drift Dataset at Different Concentrations
Classification, Regression, Clustering, Causa
129
Activities of Daily Living (ADLs) Recognition Using Binary Sensors
2747
SkillCraft1 Master Table Dataset
3395
Weight Lifting Exercises monitored with Inertial Measurement Units
39242
152
SML2010
Multivariate, Sequential, Time-Series, Text
4137
2014
Bike Sharing Dataset
17389
Predict keywords activities in a online social media
Thoracic Surgery Data
470
EMG dataset in Lower Limb
SUSY
5000000
HIGGS
11000000
28
Qualitative_Bankruptcy
250
LSVT Voice Rehabilitation
126
309
Dataset for ADL Recognition with Wrist-worn Accelerometer
Wilt
4889
User Identification From Walking Activity
Activity Recognition from Single Chest-Mounted Accelerometer
Leaf
Dresses_Attribute_Sales
501
Tamilnadu Electricity Board Hourly Readings
45781
Airfoil Self-Noise
1503
Wholesale customers
Twitter Data set for Arabic Sentiment Analysis
2
Combined Cycle Power Plant
9568
Urban Land Cover
Diabetes 130-US hospitals for years 1999-2008
55
Bach Choral Harmony
5665
StoneFlakes
Classification, Clustering, Causal-Discovery
79
Tennis Major Tournament Match Statistics
127
Parkinson Speech Dataset with Multiple Types of Sound Recordings
1040
Gesture Phase Segmentation
9900
Perfume Data
Univariate, Domain-Theory
560
BlogFeedback
60021
281
REALDISP Activity Recognition Dataset
1419
Newspaper and magazine images segmentation dataset
AAAI 2014 Accepted Papers
399
Gas sensor array under flow modulation
120432
Gas sensor array exposed to turbulent gas mixtures
180
150000
UJIIndoorLoc
21048
529
Sentence Classification
Dow Jones Index
750
sEMG for Basic Hand movements
AAAI 2013 Accepted Papers
Geographical Original of Music
1059
Condition Based Maintenance of Naval Propulsion Plants
11934
Grammatical Facial Expressions
27965
NoisyOffice
216
2015
MHEALTH Dataset
Student Performance
ElectricityLoadDiagrams20112014
370
140256
Gas sensor array under dynamic gas mixtures
4178504
microblogPCU
Multivariate, Univariate, Sequential, Text
221579
Firm-Teacher_Clave-Direction_Classification
10800
Dataset for Sensorless Drive Diagnosis
58509
49
TV News Channel Commercial Detection Dataset
129685
Phishing Websites
2456
Greenhouse Gas Observing Network
2921
5232
Diabetic Retinopathy Debrecen Data Set
1151
HIV-1 protease cleavage
6590
Sentiment Labelled Sentences
Online News Popularity
39797
Forest type mapping
326
wiki4HE
913
53
Online Video Characteristics and Transcoding Time Dataset
168286
Chronic_Kidney_Disease
400
25
Machine Learning based ZZAlpha Ltd. Stock Recommendations 2012-2014
Sequential, Time-Series
314080
0
Folio
637
Taxi Service Trajectory - Prediction Challenge, ECML PKDD 2015
Multivariate, Sequential, Time-Series, Domain-Theory
Clustering, Causal-Discovery
1710671
Cuff-Less Blood Pressure Estimation
12000
Smartphone-Based Recognition of Human Activities and Postural Transitions
10929
Mice Protein Expression
82
UJIIndoorLoc-Mag
40000
Heterogeneity Activity Recognition
43930257
Educational Process Mining (EPM): A Learning Analytics Data Set
230318
HEPMASS
10500000
2016
Indoor User Movement Prediction from RSS data
13197
Open University Learning Analytics dataset
default of credit card clients
Mesothelioma’s disease data set
324
Online Retail
541909
SIFT10M
11164866
GPS Trajectories
163
Detect Malicious Executable (AntiVirus)
373
513
Occupancy Detection
20560
Improved Spiral Test Using Digitized Graphics Tablet for Monitoring Parkinson’s Disease
News Aggregator
422937
Air Quality
9358
Twin gas sensor arrays
Multivariate, Time-Series, Domain-Theory
480000
Gas sensors for home activity monitoring
919438
Facebook Comment Volume Dataset
40949
Smartphone Dataset for Human Activity Recognition (HAR) in Ambient Assisted Living (AAL)
5744
Polish companies bankruptcy data
10503
Activity Recognition system based on Multisensor data fusion (AReM)
42240
Dota2 Games Results
102944
116
Facebook metrics
UbiqLog (smartphone lifelogging)
9782222
NIPS Conference Papers 1987-2015
11463
5812
HTRU2
17898
2017
Drug consumption (quantified)
1885
Appliances energy prediction
19735
Miskolc IIS Hybrid IPS
1540
67
KDC-4007 dataset Collection
4007
Geo-Magnetic field and WLAN dataset for indoor localisation from wristband and smartphone
153540
DrivFace
6400
Website Phishing
1353
YouTube Spam Collection
1956
Beijing PM2.5 Data
43824
Cargo 2000 Freight Tracking and Tracing
3942
98
Cervical cancer (Risk Factors)
858
Quality Assessment of Digital Colposcopies
287
KASANDR
17764280
2158859
FMA: A Dataset For Music Analysis
106574
518
Air quality
Epileptic Seizure Recognition
11500
179
Devanagari Handwritten Character Dataset
92000
Stock portfolio performance
315
MoCap Hand Postures
78095
Early biomarkers of Parkinson’s disease based on natural connected speech
130
65
Data for Software Engineering Teamwork Assessment in Education Setting
74
PM2.5 Data of Five Chinese Cities
52854
Parkinson Disease Spiral Drawings Using Digitized Graphics Tablet
Sales_Transactions_Dataset_Weekly
811
Las Vegas Strip
504
Eco-hotel
401
MEU-Mobile KSD
2856
71
Crowdsourced Mapping
10546
gene expression cancer RNA-Seq
801
20531
Hybrid Indoor Positioning Dataset from WiFi RSSI, Bluetooth and magnetometer
chestnut – LARVIC
1451
Burst Header Packet (BHP) flooding attack on Optical Burst Switching (OBS) Network
1075
Motion Capture Hand Postures
Anuran Calls (MFCCs)
7195
TTC-3600: Benchmark dataset for Turkish text categorization
3600
4814
Gastrointestinal Lesions in Regular Colonoscopy
698
Daily Demand Forecasting Orders
Paper Reviews
405
extension of Z-Alizadeh Sani dataset
59
Z-Alizadeh Sani
Dynamic Features of VirusShare Executables
107888
482
IDA2016Challenge
76000
171
DSRC Vehicle Communications
Mturk User-Perceived Clusters over Images
Character Font Images
745000
411
DeliciousMIL: A Data Set for Multi-Label Multi-Instance Learning with Instance Labels
12234
8519
Autistic Spectrum Disorder Screening Data for Children
292
Autistic Spectrum Disorder Screening Data for Adolescent
APS Failure at Scania Trucks
60000
Wireless Indoor Localization
HCC Survival
165
CSM (Conventional and Social Media Movies) Dataset 2014 and 2015
217
University of Tehran Question Dataset 2016 (UTQD.2016)
1175
Autism Screening Adult
704
Activity recognition with healthy older people using a batteryless wearable sensor
75128
Immunotherapy Dataset
2018
Cryotherapy Dataset
OCT data & Color Fundus Images of Left & Right Eyes
Discrete Tone Image Dataset
News Popularity in Multiple Social Media Platforms
Multivariate, Time-Series, Text
93239
Ultrasonic flowmeter diagnostics
173
ICMLA 2014 Accepted Papers Data Set
105
BLE RSSI Dataset for Indoor localization and Navigation
6611
Container Crane Controller Data Set
Residential Building Data Set
372
Health News in Twitter
25000
chipseq
4960
SGEMM GPU kernel performance
241600
Repeat Consumption Matrices
130000
21000
detection_of_IoT_botnet_attacks_N_BaIoT
Absenteeism at work
740
SCADI
206
Condition monitoring of hydraulic systems
2205
43680
Carbon Nanotubes
10721
Optical Interconnection Network
Sports articles for objectivity analysis
Breast Cancer Coimbra
GNFUV Unmanned Surface Vehicles Sensor Data
1672
Dishonest Internet users Dataset
322
Victorian Era Authorship Attribution
93600