Credits: 1 Offered: Fall
This course provides an introduction to computer systems and scientific computing environments to enable effective use of computational and data resources. The course assumes no prior computing experience and is broken into 3 component modules. These are:
1. UNIX/Linux fundamentals with a focus on operating systems (file systems, navigation, communication, multi-user environments, permissions, file sharing, UNIX shells, POSIX architecture), beginning and intermediate shell scripting, and Linux environment applications commonly encountered in scientific computing (e.g., awk, sed).
2. Computer system architectures and applications in scientific computing. Topics include the history of scientific computing, HPC architecture and application design (von Neumann architecture, parallel processing, shared and distributed memory, vector processing, MIMD/SIMD, accelerator computing, parallel numerical libraries), HPC batch processing systems (e.g., scheduling), and finally modern distributed data-parallel approaches (e.g., Hadoop and its ecosystem, Spark, MapReduce as a paradigm and implementation).
3. Introduction to scientific programming in Python 3, with relevant comparison and contrast to other important languages commonly encountered in scientific computing (e.g., Perl, R, C/C++). Topics include variables, operators, data structures, control flow, decisions, file I/O, exception handling, and modern Python libraries encountered in HPC, scientific computing, and data science (e.g., scipy, numpy, pandas, scikit-learn).
Emphasis will be placed on real-world practicality by motivating study with examples and tasks relevant to bioinformatics, structural biology, imaging, and data science. The student will develop both a solid conceptual foundation and experience solving real problems by the end of the class.
Credits: 1 Offered: Fall
This course provides an introduction to computer systems and scientific computing environments to enable effective use of computational and data resources. The course assumes no prior computing experience and is broken into 3 component modules. These are:
1. UNIX/Linux fundamentals with a focus on operating systems (file systems, navigation, communication, multi-user environments, permissions, file sharing, UNIX shells, POSIX architecture), beginning and intermediate shell scripting, and Linux environment applications commonly encountered in scientific computing (e.g., awk, sed).
2. Computer system architectures and applications in scientific computing. Topics include the history of scientific computing, HPC architecture and application design (von Neumann architecture, parallel processing, shared and distributed memory, vector processing, MIMD/SIMD, accelerator computing, parallel numerical libraries), HPC batch processing systems (e.g., scheduling), and finally modern distributed data-parallel approaches (e.g., Hadoop and its ecosystem, Spark, MapReduce as a paradigm and implementation).
3. Introduction to scientific programming in Python 3, with relevant comparison and contrast to other important languages commonly encountered in scientific computing (e.g., Perl, R, C/C++). Topics include variables, operators, data structures, control flow, decisions, file I/O, exception handling, and modern Python libraries encountered in HPC, scientific computing, and data science (e.g., scipy, numpy, pandas, scikit-learn).
Emphasis will be placed on real-world practicality by motivating study with examples and tasks relevant to bioinformatics, structural biology, imaging, and data science. The student will develop both a solid conceptual foundation and experience solving real problems by the end of the class.
Credits: 1.5 Offered: Fall
This course provides an introduction to computer systems and scientific computing environments to enable effective use of computational and data resources. The course assumes no prior computing experience and is broken into 3 component modules. These are:
1. UNIX/Linux fundamentals with a focus on operating systems (file systems, navigation, communication, multi-user environments, permissions, file sharing, UNIX shells, POSIX architecture), beginning and intermediate shell scripting, and Linux environment applications commonly encountered in scientific computing (e.g., awk, sed).
2. Computer system architectures and applications in scientific computing. Topics include the history of scientific computing, HPC architecture and application design (von Neumann architecture, parallel processing, shared and distributed memory, vector processing, MIMD/SIMD, accelerator computing, parallel numerical libraries), HPC batch processing systems (e.g., scheduling), and finally modern distributed data-parallel approaches (e.g., Hadoop and its ecosystem, Spark, MapReduce as a paradigm and implementation).
3. Introduction to scientific programming in Python 3, with relevant comparison and contrast to other important languages commonly encountered in scientific computing (e.g., Perl, R, C/C++). Topics include variables, operators, data structures, control flow, decisions, file I/O, exception handling, and modern Python libraries encountered in HPC, scientific computing, and data science (e.g., scipy, numpy, pandas, scikit-learn).
Emphasis will be placed on real-world practicality by motivating study with examples and tasks relevant to bioinformatics, structural biology, imaging, and data science. The student will develop both a solid conceptual foundation and experience solving real problems by the end of the class.
Credits: 3 Offered: Fall
This course is a computer-science-intensive survey of algorithms, that is, computational methods used to solve appropriately defined problems, and of their implementation on modern scientific computing hardware. Core to any modern discussion of algorithms is competency in one or more object-oriented programming languages, together with a deep dive into data structures, without which a discussion of practical algorithm implementation is not useful. We complete the course with a survey of mathematical optimization techniques that are typically not encountered in an ordinary course on algorithms, but which form the mathematical basis for many problems in computational biology, biochemistry, genomics, and data science. In this course, we use Python 3 as the core programming tool. The class is structured as 1.5 hours of lecture each week with a 1.5-hour lab component, for 12 total weeks. The course can be logically divided into 3 modular topics, with the bulk of the time spent on fundamental algorithms and data structures; however, each module builds on the previous one, and the course should therefore be taken as a whole.
Credits: 0 Offered: Fall
This course is open to 2nd Year MS BDS students who are working full-time to complete their capstone in the Fall semester.
Credits: 3-9 Offered: Fall
This course represents the culmination of the Master in Biomedical Data Science (MBDS) curriculum. In a semester-long, active learning project, students will work with a mentor to devise a potential solution to a contemporary problem in biomedical data science. The process of researching current unsolved problems, outlining potential solutions, and writing a final report will require students to integrate and synthesize concepts learned in the program’s core coursework, thus demonstrating that trainees have mastered and can apply pertinent ideas and approaches. The course is 9 credits, and students will complete intensive, full-time research under the direct guidance of a mentor.
Prerequisites: Students will take this course after having completed the full sequence of core courses for the MBDS program. This requires them to have developed significant programming experience, at minimum at the scripting level, with demonstrated productivity in one or more programming languages. To develop the expertise necessary for a strong capstone project, students in the program will have taken the following courses:
BDS1005-1007 (all modules) Computer Systems
BDS2005 Introduction to Algorithms
BDS3002 Machine Learning for Biomedical Data Science