cibrclusters

Differences

This shows you the differences between two versions of the page.

cibrclusters [2025/07/05 21:09] – created steveludtke
cibrclusters [2025/07/05 21:32] (current) steveludtke
Line 1: Line 1:
-===== Practical Introduction to Programming for Scientists ====
-==== Spring 2018 ====
-=== Mondays & Fridays, 9am - 10:30  N315 ===
+====== CIBR Cluster Co-op ======
  
-For several reasons I use this site rather than Blackboard, this Wiki page will host all class material, including:
+**WARNING:** The existing CIBR clusters are now >10 years old, and we no longer know how long the room housing them will remain operational due to BCM cuts. If you have material on the CIBR Clusters, we strongly suggest moving it to local storage at your earliest convenience. Note also that [[https://www.bcm.edu/academic-centers/dan-l-duncan-comprehensive-cancer-center/research/cancer-center-shared-resources/biostatistics-and-informatics/biomedical-informatics-group/high-performance-computer-cluster|BISR]] operates a fee-for-service cluster which is more actively maintained and developed.
  
-  * Lecture notes 
-  * Recordings of most lectures 
-  * Homework assignments 
- 
----- 
-||Lecture ||Notes ||Video ||Homework || Other || 
-||1 - Introduction ||[[http://blake.bcm.edu/dl/EMAN2/lecture_1.pdf]]\\ [[http://blake.bcm.edu/dl/EMAN2/notebook_1.ipynb]] ||[[http://blake.grid.bcm.edu/dl/intro_programming_18/Lecture1.mp4|Lecture Video]]|| 1. Take this survey: https://goo.gl/forms/ys63VOoY5aVEeLo42 \\ 2. Install Anaconda 5 Python 3.6 on your laptop\\ 3. Follow the test procedure on the last page of the lecture notes || || 
-||2 - Loops and Conditionals, Team Learning ||[[http://blake.bcm.edu/dl/EMAN2/lecture_2.pdf]]\\ [[http://blake.bcm.edu/dl/EMAN2/group_2.pdf]]\\ [[http://blake.bcm.edu/dl/EMAN2/notebook_2.ipynb]] ||Sorry, video failed|| Team learning due by midnight Monday, Jan 8\\ Homework due by midnight Thursday, Jan 11 || || 
-||3 - Writing programs ||[[http://blake.bcm.edu/dl/EMAN2/lecture_3.pdf]]\\ [[http://blake.bcm.edu/dl/EMAN2/notebook_3.ipynb]] ||[[http://blake.grid.bcm.edu/dl/intro_programming_18/Lecture3.mp4|Lecture Video]]|| No class Monday (holiday). Homework due by midnight Thursday.|| || 
-||4 - Standard Libraries, Biopython ||[[http://blake.bcm.edu/dl/EMAN2/lecture_4.pdf]]\\ [[http://blake.bcm.edu/dl/EMAN2/notebook_4.ipynb]] ||[[http://blake.grid.bcm.edu/dl/intro_programming_18/Lecture4.mp4|Lecture Video]]|| Homework due by Thursday at Midnight|| || 
-||5 - Numerical Computing and Plotting, Team Learning ||[[http://blake.bcm.edu/dl/EMAN2/lecture_5.pdf]]\\ [[http://blake.bcm.edu/dl/EMAN2/team_5.pdf]]\\ [[http://blake.bcm.edu/dl/EMAN2/notebook_5.ipynb]] ||[[http://blake.grid.bcm.edu/dl/intro_programming_18/Lecture5.mp4|Lecture Video]]|| Team Learning due by Midnight Today (monday) ||File for team learning: [[http://blake.bcm.edu/dl/EMAN2/curve.txt]] || 
-||6 - Complex Data representations|| [[http://blake.bcm.edu/dl/EMAN2/lecture_6.pdf]] || Unfortunately, the new version of my screen recorder crashed my computer (again) and lost the recording. This lecture from last year was very similar, but not identical. The missed homework review doesn't matter since the homework was different. \\ [[http://blake.grid.bcm.edu/dl/intro_programming_17/Lecture6.mp4| 2017 Lecture Video]] || || || 
-||7 - Web Scraping, File I/O, Command Line ||[[http://blake.bcm.edu/dl/EMAN2/lecture_7.pdf]]\\ [[http://blake.bcm.edu/dl/EMAN2/team_7.pdf]]\\ [[http://blake.bcm.edu/dl/EMAN2/notebook_7.ipynb]] ||[[http://blake.grid.bcm.edu/dl/intro_programming_18/Lecture7.mp4|Lecture Video]]|| Team Learning due by Midnight Today (monday) || || 
-||8 - Image Processing ||[[http://blake.bcm.edu/dl/EMAN2/lecture_8.pdf]] ||[[http://blake.grid.bcm.edu/dl/intro_programming_18/Lecture8.mp4|Lecture Video]]|| Homework due Thursday at Midnight as usual || || 
-||9 - Regular Expressions ||[[http://blake.bcm.edu/dl/EMAN2/lecture_9.pdf]]\\ [[http://blake.bcm.edu/dl/EMAN2/team_9.pdf]] ||[[http://blake.grid.bcm.edu/dl/intro_programming_18/Lecture9.mp4|Lecture Video]]|| Team Learning due by Midnight Today (monday) || [[http://blake.bcm.edu/dl/EMAN2/ecoli.k12.txt]] || 
-||10 - The Art of Programming, Data compression ||[[http://blake.bcm.edu/dl/EMAN2/lecture_10.pdf]] ||[[http://blake.grid.bcm.edu/dl/intro_programming_18/Lecture10.mp4|Lecture Video]]|| || || 
-||11 - Making, Interfacing ||[[http://blake.bcm.edu/dl/EMAN2/lecture_11.pdf]]\\ [[http://blake.bcm.edu/dl/EMAN2/team_11a.pdf]]\\ [[http://blake.bcm.edu/dl/EMAN2/team_11b.pdf]] ||[[http://blake.grid.bcm.edu/dl/intro_programming_18/Lecture11.mp4|Lecture Video]]|| Nothing to turn in for team learning || || 
-||12 - Graphical User Interfaces ||[[http://blake.bcm.edu/dl/EMAN2/lecture_12.pdf]] ||[[http://blake.grid.bcm.edu/dl/intro_programming_18/Lecture12.mp4|Lecture Video]]|| || || 
-||13 - Databases, Daemons and the Internet ||[[http://blake.bcm.edu/dl/EMAN2/lecture_13.pdf]] ||[[http://blake.grid.bcm.edu/dl/intro_programming_18/Lecture13.mp4|Lecture Video]]|| || || 
 ---- ----
  
-===== Important Instructions for Class Projects =====
+**NOTE:** The CIBR cluster Co-op strives to maximize CPU/cost, and BCM no longer provides any staff support for cluster operations. Dr. Ludtke's group is handling issues as a community service. We will still provide cluster access to users meeting the requirements (faculty CIBR membership), but software configuration within your account and running jobs is your responsibility. If you need more inclusive service (costs $), please see [[https://www.bcm.edu/academic-centers/dan-l-duncan-comprehensive-cancer-center/research/cancer-center-shared-resources/biostatistics-and-informatics/biomedical-informatics-group/high-performance-computer-cluster|BISR]].
  
-If you believe you will need an exception to something below, please ask by Feb 23 (the final lecture !)
+==== Overview ====
+[[https://www.bcm.edu/research/centers/computational-integrative-biomedical-research|CIBR]] and a number of specific faculty have sponsored the purchase of several clusters managed for shared use by faculty. CIBR faculty may request moderate allocations for free, as a benefit of CIBR membership (which is free). Faculty who contributed funds directly to the purchase have larger guaranteed allocations.
  
-For your class presentation, your first slide should have:
-  * Your name
-  * Your department/program
-  * Descriptive title of your project
-  * You may wish to say something very briefly about your previous programming experience
-  * We will be on a tight schedule, so I will have to enforce the 10 minute timeline pretty rigorously
-  * Test your laptop with the projector in this room BEFORE Feb 26! You may use the network-based connection if you like, but note that it is laggy.
+The clusters currently in operation are:
+||**Name**||**Cores**||**Storage**||**Purchased**||**Free CIBR Access**||**Paid Access**||**Notes**||
+||[[CIBRClusters:Sphere|sphere.grid.bcm.edu]]||960||60 TB + 80 TB||2015||25,000 CPU-hr/qtr||Ludtke, Guan, Waterland|| ||
+||[[CIBRClusters:Prism|prism.grid.bcm.edu]]||704||63 TB + 88 TB||2013||25,000 CPU-hr/qtr||Wensel, Ludtke, Guan|| ||
+||[[CIBRClusters:Store|store]]|| - ||350 TB RAID6x2||2013||upon request|| - ||For inactive data which will still be needed on the clusters. This storage array is available upon request to users of any of the clusters, but is directly accessible only from the clusters to discourage people from using it for general lab backup purposes. It is a RAID6 volume, which gives it good reliability, but every 4-5 years we experience some sort of failure. At one time this storage was backed up in addition to being a RAID6 array, but that is no longer true. If the RAID suffers a critical failure, all data could be lost.||
  
-**Please follow these instructions exactly:**
+==== Requesting Access ====
+Free cluster accounts must be requested directly by a PI (tenure-track faculty) who must be an established member of CIBR, and such accounts will charge against that PI's allocation. Contact lmayran@bcm.edu (Larry Mayran) about CIBR membership. We may ask to discuss your intended use for the cluster before granting accounts, simply to ensure that it will not interfere with paid cluster users or violate policy. Please note that cluster security does not meet HIPAA requirements, so no identified patient data may be stored or processed on the clusters.
  
-  * Your class project MUST be submitted by 11:59 PM on Sat, Feb 24. No revisions will be accepted after this time. You can use Sunday to prepare your oral presentation
-  * Your submission should consist of:
-    * one or more .py files (should have sufficient comments to figure out how they work)
-    * any necessary additional files to demonstrate that the program works (email me to discuss in advance if the files are >20 MB)
-    * A PDF file with a brief description of your program, what inputs the program takes, what outputs the program produces, and what it is supposed to do.
-    * The final item in the PDF should be a command-line to use in running the program, and any necessary instructions to demonstrate that it works.
-  * Combine all files into a .zip file named:  Familyname_Givenname_project_2016.zip
-  * Email sludtke42@gmail.com with the subject "Class project submission", and attach the .zip file. If the zip file is too large, feel free to use Box, Dropbox or Google Drive to send the attachment
+To request an account, email sludtke@bcm.edu with:
+  * Full name
+  * Requested username
+  * Position (student, postdoc, etc.)
+  * Cell Phone Number (for emergencies and initial password)
+  * An email address which must be checked at least daily.
  
-===== Important notes =====
+==== Cluster Usage ====
+Each cluster has independent user accounts, as different faculty contributed different amounts to each. Please click on one of the links above for detailed policies and procedures for each cluster.
  
-  * A laptop computer is required to take this class, and it should be brought to class for in-class team learning exercises
-  * Homework is generally found on the last page of the lecture handout, and completed homework should be emailed to both TAs and myself
-  * If you miss a lecture review the material on this page at least 1 day before the next lecture, as there may be an assignment!
-  * Course TA's are Michael Bell <jmbell@bcm.edu> and Muyuan Chen <Muyuan.Chen@bcm.edu>. They both sit in N420
-  * I don't have formal office hours, but can be found in my office, N420.01 most mornings. The later in the day the busier I get. Email at any time <sludtke@bcm.edu>
-  * This class uses Python 3.6 via a free distribution called Anaconda available for Linux, Macs and Windows. By using a common environment, it is easier to deal with the differences between Python on different platforms. The default Python available on Linux and Mac is still Python 2.X. Please do not try to make-do with some other Python installation on your machine. If you are knowledgeable enough to know of other distributions, you are also knowledgeable enough to install Anaconda side-by-side with your other tools.
+Individual users are expected to be familiar with the use of Linux clusters in general, and if not, to obtain assistance from their acquaintances, rather than learning by trial and error. Occasionally CIBR offers a workshop to help familiarize people with cluster computing, which will be advertised via normal CIBR mechanisms.
  
-Anaconda is available here: [[https://www.anaconda.com/download]] (you want the Python 3.6 version of Anaconda 5)
+All of the clusters run some version of CentOS Linux. **We do not provide any commercial software**, but most standard open-source tools are installed, and users are welcome to install commercial or free software within their own accounts. BCM's site license for Matlab is usable on the cluster, but must be installed/licensed in the user's account, and there are some setup issues. Contact Larry Mayran about this for details. We may also be able to install other open-source software system-wide; if you have such a need, please ask.
  
-===== Textbook =====
-I started writing an introduction to programming book some years ago, and while I haven't gotten around to finishing it, some other classes have found it a useful supplement to the class lectures, particularly for people just starting with programming. For that reason I'm making the current (very incomplete) draft of the book available to you:
+Details on some of the open-source software we have made available for users are on the [[CIBRClusters:cibrclusters_Software|Software page]].
  
-  * [[http://blake.bcm.edu/dl/EMAN2/Intro_Programming_01_17.ibooks]] - Multimedia version of book (iPad/Mac iBooks only)
-  * [[http://blake.bcm.edu/dl/EMAN2/Intro_Programming_01_17.pdf]] - PDF version of book, missing some material
+There is a CIBR Cluster Google Group/Mailing List (https://groups.google.com/forum/#!forum/cibrcluster) which is used for announcements of outages, problems, policy changes, etc. We strongly recommend all CIBR Cluster users join.
  
-===== Class Project Overview =====
+==== In manuscripts, grant proposals, presentations, and other publications ====
+Please acknowledge the assistance from the CIBR Center by including a statement similar to the following:
+"We gratefully acknowledge the assistance and computing resources provided by the CIBR Center for Computational and Integrative Biomedical Research of Baylor College of Medicine in the completion of this work."
-The class project will count for 1/2 of your grade in the class, and will be scored on both your presentation and the program itself. You will likely have only ~5 minutes to present your projects when the time comes, but that shouldn't limit their complexity or your ambitions. It is a good idea to select a project which is somewhat ambitious but has some fallback positions in case you don't succeed in everything you had planned to do. Your initial project plan will not be a factor in your final grade. If the program meets the criteria below, even if it's very different than your original aim, you will still receive full credit.
-
-Each person will, over the course of the term, write a program, and briefly present it at the end of the term. The sole requirements for the program are:
- 1. It must do something useful not easily completed with existing freely available tools
- 2. Not be completely trivial. The complexity of your project is expected to correspond somewhat to your level of past programming experience.
-
-==== Examples of past class projects ====
-  * Analysis of DNA capture targets that failed during sequencing
-  * Calculating the probabilities of different discrete distributions
-  * A Candidate Gene Searcher
-  * Calculating dN/dS automatically from pairs of orthologs by pipelining clustal and paml
-  * Pubmed search tool
-  * Identify evolutionarily conserved water molecules in structure
-  * 96-well reader and calculator
-  * PCR Annealing Temperature Calculator
-  * One click identifier for PDF files
-  * Scraping and processing microarray data from the lab webpage
  
 ---- ----
-
-This class attracts people with widely varying backgrounds and skill levels. Since the course is supposed to be accessible to people with little to no programming experience, the bar for achieving an acceptable grade (B) in the class is set fairly low. If you make a reasonable attempt at all of the homework assignments, even if not completely successful, and complete a class project of some sort, you can expect to get at least a B in the class. This does not mean you can slip through without making an effort at all. Particularly if you have no programming experience, the class WILL take a significant effort on your part. Those who don't make a reasonable attempt at virtually every assignment may not achieve a B. Turning in something incomplete is better than turning in nothing at all.
-
-**Auditors are welcome**, but if possible (all students, and some others) please formally audit the class, rather than just showing up. 1) this means you have at least a small commitment to actually attend and 2) if you don't formally audit, the GS has no record of your interest and they may give me a very small room to teach in next time (not that we need N315).
+CIBR hosted a Mini-Workshop on cluster computing in 2018:
+The slides can be downloaded here: [[http://blake.bcm.edu/dl/EMAN2/workshop2018.pdf]]
+Archived video of the presentation is here: [[http://blake.grid.bcm.edu/dl/CIBR/cluster_computing.mp4|Cluster Computing]]
  
cibrclusters.1751749798.txt.gz · Last modified: by steveludtke