The following online resources provide many useful datasets that can serve as benchmarks for machine learning tasks (regression, classification, clustering, etc.). Please note that some entries may appear more than once in the list.
Widely Used Ones:
- The MNIST Database for Handwritten Digit Recognition http://yann.lecun.com/exdb/mnist/
- LIBSVM http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#usps
- UCI Machine Learning Repository http://archive.ics.uci.edu/ml/index.html
- Delve Datasets http://www.cs.toronto.edu/~delve/data/summaryTable.html
- StatLib-Datasets Archive http://lib.stat.cmu.edu/datasets/
- Feature Selection Datasets http://featureselection.asu.edu/datasets.php
- Virtual Library of Simulation Functions http://www.sfu.ca/~ssurjano/index.html
- KEEL http://www.keel.es/
- Indian Script Character Databases [Link]
- Economic Research: https://fred.stlouisfed.org/series/DEXUSEU
Miscellaneous Ones:
- Cambridge Handwritten Word Images http://documents.cfar.umd.edu/resources/database/handwriting.database.html
- CEDAR Handwritten Database http://www.cedar.buffalo.edu/Databases/
- Signature Verification Competition 2004 Database http://www.cs.ust.hk/svc2004/download.html/
- MCYT Online and Offline Signature Database http://atvs.ii.uam.es/bbdd_EN.html
- Caltech Signature Database http://www.vision.caltech.edu/mariomu/research.html
- Offline GPDS Signature Database http://www.gpds.ulpgc.es/download/
- HIT-MW Chinese Signature Database http://hitmwdb.googlepages.com/writeridentification
- LIBSVM Data: Classification, Regression, and Multi-label [Link]
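
Datasets on the LIBSVM page are typically distributed as plain text in a sparse format, one sample per line: a label (or several comma-separated labels for multi-label data) followed by index:value pairs. The snippet below is a minimal sketch of reading one such line, not the official LIBSVM reader. The function name parse_libsvm_line and the sample line are made up for illustration, and the code assumes every line starts with at least one numeric label.

```python
def parse_libsvm_line(line):
    """Parse one line of the sparse 'label index:value ...' text format.

    Hypothetical helper for illustration; assumes the line begins with
    one numeric label, or several labels separated by commas.
    """
    parts = line.split()
    if not parts:
        raise ValueError("empty line")

    # Multi-label files list several labels separated by commas.
    labels = [float(token) for token in parts[0].split(",")]

    # Remaining tokens are 'index:value' pairs; missing indices are zeros.
    features = {}
    for item in parts[1:]:
        index, value = item.split(":", 1)
        features[int(index)] = float(value)
    return labels, features


# Made-up sample line, not taken from any real dataset.
sample = "+1 3:0.5 7:1 12:-2.25"
labels, features = parse_libsvm_line(sample)
print(labels, features)
```

Storing the features in a dictionary keeps the data sparse; convert to a dense array only once the largest feature index in the file is known.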