Winter Institute in Data Science and Big Data
Comparative Computing
Le Bao
Massive Data Institute, Georgetown University
January 7, 2022
Plan
Python
Basics
Data structures
Application: working with web data
Comparative computing
Python, R, Shell
Polyglot programming and computing tools
Computing environment
Containers and cloud computing
Operating systems and system dependencies
Docker
Cloud computing with Code Ocean
Data Science Toolbox
Some of the most widely used data science tools:
Apache Spark
BigML
D3.js
MATLAB
Excel
tidyverse
Tableau
Jupyter
ggplot2
Matplotlib
NLTK
Scikit-learn
TensorFlow
Weka
…
Calculating Path to Jupyter Using Excel