Massive Computational Experiments, Painlessly
Hatef Monajemi
David Donoho
Goal: Our goal is to reduce the burden of managing massive computational experiments while conducting them in a reproducible way. To this end, we are developing the open-source project ClusterJob, which handles massive computations and makes it painless to track, harvest, and analyze millions of computational jobs. More information is available in our paper https://web.stanford.edu/~vcs/papers/osbg-MDS2016.pdf
or at http://clusterjob.org
We are also teaching a course, STATS 285, in which students learn state-of-the-art techniques and tools for painless massive computing. http://explorecourses.stanford.edu/search?view=catalog&filter-coursestatus-Active=on&page=0&catalog=&q=STATS285
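To give a flavor of the "harvest and analyze" step mentioned above, here is a minimal Python sketch that gathers per-job result files into a single table for analysis. The directory layout, file names, and the harvest_results helper are illustrative assumptions, not ClusterJob's actual interface.

# Hypothetical sketch: gather per-job results written by many cluster jobs
# into one table for analysis. The layout and file names are assumptions,
# not ClusterJob's actual interface.
import csv
import json
from pathlib import Path

def harvest_results(results_dir):
    """Collect one JSON result file per job into a list of records."""
    records = []
    for result_file in sorted(Path(results_dir).glob("job_*/result.json")):
        with open(result_file) as f:
            record = json.load(f)            # e.g. {"job_id": 17, "accuracy": 0.93, ...}
        record["source"] = str(result_file)  # keep provenance for reproducibility
        records.append(record)
    return records

def write_table(records, out_csv):
    """Write the harvested records to a CSV file for downstream analysis."""
    if not records:
        return
    fields = sorted({key for record in records for key in record})
    with open(out_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(records)

if __name__ == "__main__":
    write_table(harvest_results("results"), "harvested.csv")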
Project log
Watch Ali Zaidi as he describes distributed tools on Azure for data scientists.
…
Watch Riccardo Murri from the University of Zurich address the challenges of taking scientific computing to the cloud:
Lecture 08 video:
…
Watch Greg Kurtzer (CEO of Sylabs.io) give an in-depth explanation of container technologies, in particular Singularity.
…
Lecture 6: “Some reflections about data science” by John Chambers
Full lecture video:
…
Part 1) XYZ Studies, a paradigm for research in data science:
Part 2) Science in the cloud:
…
Watch Lecture 3, in which Hatef Monajemi discusses automation in data science and the necessity of Experiment Management Systems (EMS).
…
Watch Mark Piercy from the Stanford Research Computing Center (SRCC) talk about cluster computing basics and the features of Sherlock, Stanford's medium-risk research cluster.
…
Full Video of Lecture 01:
…
Link to the video of this lecture: https://www.youtube.com/watch?v=gAIXyT71Ja8&feature=youtu.be
Abstract:
With the increasing computational demands of ambitious data science studies and the scarcity of computational resources (e.g., GPUs) on university campuses, researchers are inescapably forced to adopt cloud-based solutions for their computing needs.
In this lecture, we lay out the foundation of a new computing model in which researchers build their own ephemeral personal clusters on the cloud, conduct their experiments, and destroy their clusters when they are no longer needed. This is a departure from the traditional research computing model, in which many researchers share an in-house HPC cluster (e.g., Sherlock) governed by a fixed set of policies. The new model integrates building personal clusters seamlessly with experiment design, job management, data harvesting, and data analysis.
Deep learning (DL) research is a prime example that requires massive computational resources, and in particular access to many GPUs. In this lecture, we will review deep learning and explain the computations involved in a DL experiment. We will then show how to run these experiments at scale on Google Cloud with push-button simplicity: with one push of a button, the researcher builds her own personal cluster, and with another she fires up thousands of jobs on her cloud cluster.
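As a rough illustration of the ephemeral-cluster workflow described above, here is a minimal Python sketch: create a personal cluster, submit one deep learning job per hyperparameter setting, then tear the cluster down. The cluster name, container image, training script, and the specific gcloud/kubectl commands are illustrative assumptions, not the exact tooling used in the lecture; the sketch only prints the commands unless dry_run is disabled.

# Illustrative sketch of the ephemeral "personal cluster" workflow:
# build a cluster, fire one job per hyperparameter setting, destroy the cluster.
# Names, image, and training script below are assumptions for illustration.
import itertools
import subprocess

CLUSTER = "my-ephemeral-cluster"   # hypothetical cluster name
ZONE = "us-central1-a"             # hypothetical zone

def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print("$", " ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)

def create_cluster():
    run(["gcloud", "container", "clusters", "create", CLUSTER,
         "--zone", ZONE, "--num-nodes", "4"])

def submit_jobs():
    learning_rates = [0.1, 0.01, 0.001]
    batch_sizes = [64, 128, 256]
    for i, (lr, bs) in enumerate(itertools.product(learning_rates, batch_sizes)):
        # One job per configuration; train_dl.py is a hypothetical training script.
        run(["kubectl", "run", f"dl-job-{i}", "--image", "my-dl-image",
             "--restart", "Never", "--",
             "python", "train_dl.py", "--lr", str(lr), "--batch-size", str(bs)])

def destroy_cluster():
    run(["gcloud", "container", "clusters", "delete", CLUSTER,
         "--zone", ZONE, "--quiet"])

if __name__ == "__main__":
    create_cluster()
    submit_jobs()
    destroy_cluster()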
…
Venue: Thornt110
Time: 3:00 PM on Monday, Nov 27
Title: Push-button Deep Learning on the Cloud
…
Riccardo Murri gave an exceptional lecture in Stats285. Check out the video here:
…
Lecture 03: Occupy the Cloud (Eric Jonas):
Lecture 04: Reproducibility in Computational Science (Victoria Stodden):
…
You may now find the slides for Lecture 02 at:
Stay tuned for the video!
…
The slides of the first lecture are now available on the Stats285 website:
…
Follow course updates on https://twitter.com/stats285 starting next week.
…
Great guest speakers this quarter for Stats285. It is going to be a fun quarter at Stanford: https://stats285.github.io
…