General programming
Markdown
Markdown is a lightweight markup language for creating formatted text using a plain-text editor. It is widely used for blogging and is also what we use for our own lab manual. You can use Markdown for many other things, such as creating slides and other types of presentation material (example).
To do list
- Go through this guide that will introduce you to Markdown. If you want to practice your Markdown skills, consider writing a post or webpage for this manual!
GitHub
GitHub is a developer platform that allows developers to create, store, manage and share their code. It uses Git software, providing the distributed version control of Git plus access control, bug tracking, software feature requests, task management, continuous integration, and wikis for every project. It is commonly used to host open-source software development.
Our team relies heavily on GitHub, and all our software projects are hosted on the CIS Lab GitHub page, including the group manual. The expectation is that, over time, you will become a proficient GitHub user.
To do list
Open an account on GitHub.
Ask Stefano to give you access to the CIS Lab GitHub page.
Take the online training provided on the GitHub website. We recommend proceeding as follows:
- Start with Introduction to GitHub
- Take all modules on First day on GitHub
- Take all modules on First week on GitHub (Code with Codespaces and Code with Copilot are optional)
Introduction to Python and R
In your daily work you will certainly make use of a general-purpose language, such as MATLAB, Python, Julia, or R (with the caveat that R is geared more toward statistical analysis and data visualization). In our lab, the two most commonly used languages are Python and R. Being familiar with one (or both) of them is therefore important.
To do list
Install Python on your computer. The easiest way to do so is to install Anaconda, an open-source distribution of the Python and R programming languages for data science that aims to simplify package management and deployment. Anaconda, together with its interface Anaconda Navigator, allows you to easily manage programming languages and packages.
Read the Anaconda documentation.
Familiarize yourself with Python. There are hundreds of guides available online; our suggestion is to use this Python tutorial.
Note that R can also be installed as stand-alone software (i.e., independent of Anaconda). If you plan to install it that way, simply visit the R Project for Statistical Computing website. We also recommend installing RStudio, an integrated development environment for R.
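Once Python is installed, a quick sanity check can confirm which interpreter is active and whether the packages you expect are available. The snippet below is a minimal sketch, not an official lab script: the package list and the helper name `check_packages` are hypothetical and only serve as an illustration.

```python
import importlib.util
import platform
import sys

# Hypothetical list of packages to look for; replace with the ones you need.
PACKAGES = ["numpy", "pandas", "matplotlib"]


def check_packages(names):
    """Return a dict mapping each top-level package name to True if it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Show which interpreter is running, e.g. the one shipped with Anaconda.
    print(f"Python {platform.python_version()} at {sys.executable}")
    for name, found in check_packages(PACKAGES).items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

If the interpreter path does not point to your Anaconda installation, you are probably running a different Python than the one you just installed.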
Python coding standard
When working with others on a long-term collaborative project, adopting a consistent coding standard is key. This practice not only improves the readability and maintainability of the codebase, saving you from future headaches, but also develops industry-valued skills.
To do list
- Familiarize yourself with the Google Python Style Guide. While the choice of a specific style guide is subjective, the Google Python Style Guide offers a comprehensive set of common rules along with clear explanations and examples.
- Streamline your coding workflow by configuring your favourite IDE to automatically format your code with a tool such as Black.
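To give a feel for what such a standard looks like in practice, here is a small sketch of a function written along the lines of the Google Python Style Guide: a module docstring, type annotations, and a docstring with `Args`, `Returns`, and `Raises` sections. The function `mean_flow` and its parameter are hypothetical and not taken from any lab project.

```python
"""Example module illustrating a few Google Python Style Guide conventions."""

from collections.abc import Sequence


def mean_flow(flows: Sequence[float]) -> float:
    """Computes the arithmetic mean of a series of flow values.

    Args:
        flows: Flow observations, all expressed in the same unit.

    Returns:
        The arithmetic mean of the values.

    Raises:
        ValueError: If `flows` is empty.
    """
    if not flows:
        raise ValueError("flows must contain at least one value")
    return sum(flows) / len(flows)
```

For example, `mean_flow([1.0, 2.0, 3.0])` returns `2.0`. A formatter such as Black takes care of layout (quotes, spacing, line breaks), while docstrings and naming remain up to you.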
Linux for research
GNU/Linux is the operating system behind most of our research group’s computing clusters, so mastering the command line is required for running computational experiments. For a first dive into Linux, check our blog post here. For a deeper dive, take a short course on Bash scripting.
To do list
Read our tutorial.
Take a Bash scripting tutorial.
Cluster basics
The Cornell University Center for Advanced Computing (CAC) provides several computing resources. As part of the EWRS concentration, we have access to Hopper, a cluster of 22 compute nodes (c0001-c0022) with dual 20-core Intel Xeon Gold 5218R CPUs @ 2.1 GHz and 192 GB of RAM. This is likely the first cluster you will use.
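To put these figures in perspective, the short sketch below works out the node names and core counts implied by the description. It assumes that the CPU and RAM figures are per node and counts physical cores only; both are our reading of the specification, not an official CAC statement.

```python
# Figures taken from the cluster description above.
NUM_NODES = 22
SOCKETS_PER_NODE = 2    # "dual" CPUs
CORES_PER_SOCKET = 20
RAM_PER_NODE_GB = 192   # assumption: the 192 GB figure is per node

# Node names follow the c0001-c0022 pattern.
node_names = [f"c{i:04d}" for i in range(1, NUM_NODES + 1)]

cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
total_cores = NUM_NODES * cores_per_node
ram_per_core_gb = RAM_PER_NODE_GB / cores_per_node  # valid only under the per-node assumption

print(node_names[0], node_names[-1])  # c0001 c0022
print(cores_per_node, total_cores)    # 40 880
```

Under these assumptions, a job that asks for a full node gets 40 cores, and the whole cluster offers 880 cores.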
To do list (Getting started with Hopper)
To use Hopper, submit the request form to CAC. Also email Professor Vivek Srikrishnan to ask for his approval of the request.
While waiting for the approval, read and understand this guide to get started with Hopper.
Large-scale computing
… to be completed
To do list
…
… To be completed
Google Earth Engine
Another resource you may use is Google Earth Engine … to be completed
To do list
…
… To be completed
Software documentation
Software documentation provides information about a software program for everyone involved in its creation, deployment and use. To be completed
To do list
Open an account on Read the Docs.
… To be completed