Tuesday, December 1, 2009

Get Started with Machine Learning in R

A Beautiful WWW put together a great set of resources for getting started with machine learning in R. First, they recommend the previously mentioned free book, The Elements of Statistical Learning. Then there's a link to a list of dozens of machine learning and statistical learning packages for R.

Next, you'll need data. Hundreds of free, real-world datasets are available at the UCI machine learning repository. Each dataset, such as this breast cancer dataset from Wisconsin, has its own page giving a summary, links to publications of major findings, and detailed descriptions of the variables in the data. If you want to simulate genetic data instead, check out our software, genomeSIMLA, which can simulate gene-gene interactions in case-control and family-based GWAS-sized datasets with realistic patterns of linkage disequilibrium. If you're interested, check out the genomeSIMLA paper.

Finally, if time is not an issue, consider taking MIT's OpenCourseWare Machine Learning course. Alternatively, check out the lectures by Stanford Engineering professor Andrew Ng, all of which are available on YouTube. Here's the first lecture.


For more, check out the link below.

A Beautiful WWW: Guide to Getting Started in Machine Learning

Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.