Credits: 1 Offered: Fall
This course provides an introduction to computer systems and scientific computing environments to enable effective use of computational and data resources. The course assumes no prior computing experience and is broken into 3 component modules. These are:
1. UNIX/Linux fundamentals with a focus on operating systems (file systems, navigation, communication, multi-user environments, permissions, file sharing, UNIX shells, POSIX architecture), beginning and intermediate shell scripting, and Linux environment applications commonly encountered in scientific computing (e.g., awk, sed).
2. Computer system architectures and applications in scientific computing, topics including the history of scientific computing, HPC architecture and application design (Von Neumann architecture, parallel processing, shared and distributed memory, vector processing, MIMD/SIMD, accelerator computing, parallel numerical libraries), HPC batch processing systems (e.g., scheduling) and finally modern distributed data-parallel approaches (e.g., Hadoop-style and ecosystem, Spark, MapReduce as a paradigm and implementation).
3. Introduction to scientific programming in Python 3, with relevant comparison/contrast to other important languages commonly encountered in scientific computing (e.g., Perl, R, C/C++). Topics include variables, operators, data structures, control flow, decisions, file I/O, exception handling, and modern Python libraries encountered in HPC, scientific computing, and data science (e.g., scipy, numpy, pandas, scikit-learn).
Emphasis will be placed on real-world practicality by motivating study with examples and tasks relevant to bioinformatics, structural biology, imaging, and data science. The student will develop both a solid conceptual foundation and experience solving real problems by the end of the class.
Credits: 1 Offered: Fall
This course provides an introduction to computer systems and scientific computing environments to enable effective use of computational and data resources. The course assumes no prior computing experience and is broken into 3 component modules. These are: 1. UNIX/Linux fundamentals with a focus on operating systems (file systems, navigation, communication, multi-user environments, permissions, file sharing, UNIX shells, POSIX architecture), beginning and intermediate shell scripting, and Linux environment applications commonly encountered in scientific computing (e.g., awk, sed). 2. Computer system architectures and applications in scientific computing, topics including the history of scientific computing, HPC architecture and application design (Von Neumann architecture, parallel processing, shared and distributed memory, vector processing, MIMD/SIMD, accelerator computing, parallel numerical libraries), HPC batch processing systems (e.g., scheduling) and finally modern distributed data-parallel approaches (e.g., Hadoop-style and ecosystem, Spark, MapReduce as a paradigm and implementation). 3. Introduction to scientific programming in Python 3, with relevant comparison/contrast to other important languages commonly encountered in scientific computing (e.g., Perl, R, C/C++). Topics include variables, operators, data structures, control flow, decisions, file I/O, exception handling, and modern Python libraries encountered in HPC, scientific computing, and data science (e.g., scipy, numpy, pandas, scikit-learn). Emphasis will be placed on real-world practicality by motivating study with examples and tasks relevant to bioinformatics, structural biology, imaging, and data science. The student will develop both a solid conceptual foundation and experience solving real problems by the end of the class.
Credits: 1.5 Offered: Fall
This course provides an introduction to computer systems and scientific computing environments to enable effective use of computational and data resources. The course assumes no prior computing experience and is broken into 3 component modules. These are: 1. UNIX/Linux fundamentals with a focus on operating systems (file systems, navigation, communication, multi-user environments, permissions, file sharing, UNIX shells, POSIX architecture), beginning and intermediate shell scripting, and Linux environment applications commonly encountered in scientific computing (e.g., awk, sed). 2. Computer system architectures and applications in scientific computing, topics including the history of scientific computing, HPC architecture and application design (Von Neumann architecture, parallel processing, shared and distributed memory, vector processing, MIMD/SIMD, accelerator computing, parallel numerical libraries), HPC batch processing systems (e.g., scheduling) and finally modern distributed data-parallel approaches (e.g., Hadoop-style and ecosystem, Spark, MapReduce as a paradigm and implementation). 3. Introduction to scientific programming in Python 3, with relevant comparison/contrast to other important languages commonly encountered in scientific computing (e.g., Perl, R, C/C++). Topics include variables, operators, data structures, control flow, decisions, file I/O, exception handling, and modern Python libraries encountered in HPC, scientific computing, and data science (e.g., scipy, numpy, pandas, scikit-learn). Emphasis will be placed on real-world practicality by motivating study with examples and tasks relevant to bioinformatics, structural biology, imaging, and data science. The student will develop both a solid conceptual foundation and experience solving real problems by the end of the class.
Credits: 3 Offered: Fall
This course is a computer-science intensive program intended as a survey of algorithms - that is, computational methods used to solve appropriately defined problems, and their implementation on modern scientific computing hardware. Core to any modern discussion of algorithms is competency in one or more object-oriented programming languages, in addition to a deep dive into data structures, without which the discussion of practical algorithm implementation is not useful. We complete the course with a survey of mathematical optimization techniques typically not encountered in an ordinary course on algorithms, but which form the mathematical basis for many problems in computational biology, biochemistry, genomics, and data science. In this course, we use Python 3 as the core programming tool. The class is structured as 1.5 hours of lecture each week with a 1.5 hour lab component, for 12 total weeks. The course can be logically broken up into 3 modular topics, with the bulk of the time discussing fundamental algorithms and data structures; however, each module builds on the previous and therefore the course should be taken as a whole.
Credits: 0 Offered: Fall
This course is open to 2nd Year MS BDS students who are working full-time to complete their capstone in the Fall semester.
Credits: 3-9 Offered: Fall
This course represents the culmination of the Master in Biomedical Data Science (MBDS) Curriculum. In a semester-long, active learning project, students will work with a mentor to devise a potential solution to a contemporary problem in biomedical data science. The process of researching current unsolved problems, outlining potential solutions, and writing a final report will require students to integrate and synthesize concepts learned in the program’s core coursework, thus providing a demonstration that trainees have mastered and can apply pertinent ideas and approaches. The course is 9 credits where students will complete intensive, full-time research under the direct guidance of a mentor. Pre-requisites: Students will take this course after having completed the full sequence of core courses for the MBDS program. This will require them to have developed significant, minimum scripting-level, programming experience with demonstrated productivity in one or more programming languages. To develop the expertise necessary for a strong capstone project, students in the program will have taken the following courses:
BDS1005-1007 (all modules) Computer Systems
BDS2005 Introduction to Algorithms
BDS3002 Machine Learning for Biomedical Data Science
Credits: 2 Offered: Spring
Software plays a vital and increasingly significant role in all aspects of biomedical research, translation of successful research findings, and patient care. How is this software created? What best practices should biomedical software professionals follow to design, create, and deploy such software? Many of these practices are widely used by software engineers. How should biomedical computing adapt them to address our unique challenges? We teach software engineering best practices that will enable students to efficiently and consistently design and create quality biomedical software. We focus on a comprehensive set of practical, well-regarded methods and tools that students can apply immediately. These include requirements analysis, modular and object-oriented design, complexity hiding, coding standards, software reuse, version control, unit and regression testing, and logging and debugging. We employ both traditional classroom and experiential pedagogy. In addition to completing simple programming assignments, all students must be working on a biomedical software project. Each student’s project provides a context for exploring the ideas and practicing the skills taught in the classroom. Students in the MS in Biomedical Informatics Program must take this course concurrently with the program’s required Capstone Project course. Other students must identify or create a suitable project in which they participate.
Credits: 3 Offered: Spring
This course is a computer-science intensive program intended as a survey of algorithms - that is, computational methods used to solve appropriately defined problems, and their implementation on modern scientific computing hardware. Core to any modern discussion of algorithms is competency in one or more object-oriented programming languages, in addition to a deep dive into data structures, without which the discussion of practical algorithm implementation is not useful. We complete the course with a survey of mathematical optimization techniques typically not encountered in an ordinary course on algorithms, but which form the mathematical basis for many problems in computational biology, biochemistry, genomics, and data science. In this course, we use Python 3 as the core programming tool. The class is structured as 1.5 hours of lecture each week with a 1.5 hour lab component, for 12 total weeks. The course can be logically broken up into 3 modular topics, with the bulk of the time discussing fundamental algorithms and data structures; however, each module builds on the previous and therefore the course should be taken as a whole.
Credits: 3 Offered: Spring
Data are becoming ever more important for reliably understanding complex processes and for generating actionable hypotheses about ways to productively perturb these processes. Biology and medicine have been witnessing a data revolution driven by rapid progress in a variety of biotechnologies and an increased emphasis on personalized medicine. This course is designed to train students, staff and faculty in commonly used methods to organize, mine and learn from data sets, especially those that are complex and large (big data). These methods include basic data concepts, classification, clustering, network inference and analysis, and outlier/anomaly detection. Note that the teaching of these methods will focus on the abstract and mathematical concepts involved in learning from general data sets. Relatable examples of the application of these methods to biomedical data sets will also be provided. Students, working in teams, will also be expected to conceive a relevant project at the beginning of the course and present their approach and results at the end. The overall goal of this course is to teach the attendees how to apply the methods above to complex biomedical data sets to extract actionable knowledge that may not be obtainable from other methods.
Credits: 3-9 Offered: Spring
This course represents the culmination of the Master in Biomedical Data Science (MBDS) Curriculum. In a semester-long, active learning project, students will work with a mentor to devise a potential solution to a contemporary problem in biomedical data science. The process of researching current unsolved problems, outlining potential solutions, and writing a final report will require students to integrate and synthesize concepts learned in the program’s core coursework, thus providing a demonstration that trainees have mastered and can apply pertinent ideas and approaches. The course is 9 credits where students will complete intensive, full-time research under the direct guidance of a mentor. Pre-requisites: Students will take this course after having completed the full sequence of core courses for the MBDS program. This will require them to have developed significant, minimum scripting-level, programming experience with demonstrated productivity in one or more programming languages. To develop the expertise necessary for a strong capstone project, students in the program will have taken the following courses:
BDS1005-1007 (all modules) Computer Systems
BDS2005 Introduction to Algorithms
BDS3002 Machine Learning for Biomedical Data Science