see also AImatrix
Collect the key repositories of interesting, accurate, large and easy-to-exploit datasets.
PS: individual specialized datasets should not be listed here, only repositories.
|CKAN||Handling datasets as packages and having its own "client" datapkg*||Open Knowledge Foundation (irc)||860 registered packages||live|
|The Map of Data by Sindice||?||DERI||?||?|
|Linked Data Sets as RDF Dumps||from the ESW Wiki||W3C||?||?|
|Public Datasets||Dedicated to EC2 usage||Amazon AWS||?||?|
|Free Redistributable Rich Data Sets||not always easy to use because of scarce data or "old" formats||InfoChimps.org||?||?|
|data sets||"for people with large data sets"||theinfo.org||?||?|
|datasets from ManyEyes||?||IBM||?||?|
|Data.gov||restricted to the USA, no mention of DARPA data (as of June 2009)||US government||?||?|
|data.gov.uk||restricted to the UK||?||?||?|
|Community Open Data Tables||prepared for Yahoo's YQL||Yahoo/community-driven||?||?|
|Datasets from Programming For Peace||Datasets specialized in political conflicts||Multiple research groups||<10||?|
|NYTimes Linked Open Data||SKOS File + API||New York Times||?||live|
|Concept Web||relying on wiki structure||WikiProfessional||1 Million||?|
|LinkedData.org||provides a (non-working...) RSS feed||administered by Tom Heath||?||?|
|ScraperWiki||wiki with scrapers to configure||?||?||?|
Most (if not all?) are tied to one specific set of well-formatted datasets, so first check whether the data you want are there or are easy to convert.