## Georgia Tech’s Online Master in Computer Science

Earlier this year, Udacity, Georgia Tech, and AT&T announced a ~$7000 Online Master of Science in Computer Science (OMSCS) graduate program. OMSCS courses will be offered on Udacity’s platform and are specifically designed to be delivered online. In comparison, similar programs at Stanford and UIUC cost >$10000 and are not at all designed for interactive online delivery.

I applied to the pilot program back in October. I was recommended by Mark Conger, DeLiang Wang, and Ken Supowit (other really great teachers/mentors also offered, but there was not enough time to get them all in; I am thankful to all).

I got in! I am in the initial group of 401 students, selected from among the 2359 who applied. We will be starting in January and I am immensely excited!

My goals are:

1. I want to become a very strong engineer specializing in machine learning
3. I want to be part of a revolution in education.

The first term of course offerings includes: Advanced Operating Systems, Networks, Software Development, Machine Learning, AI for Robotics. Due to life and work constraints, I plan to start off with only one class. I am leaning toward Machine Learning right now because it has been a while since I studied it and I need a good review!

Though not strictly an autodidactic affair, I will be blogging about my endeavors here. I will probably post my code on Github: https://github.com/shashir.

## Brainstorming ideas for online science education (+ MOOC lore)

Note: I am a MOOC enthusiast and have no affiliation with any MOOC platform.

In late 2011, three novel online computer science education platforms arose: ai-class taught by Profs. Sebastian Thrun and Peter Norvig, ml-class taught by Prof. Andrew Ng, and db-class taught by Prof. Jennifer Widom. They were novel because unlike previous iterations of “open source” classrooms (including MIT OCWArs Digita University, Khan Academy, etc.), these platforms were built to be interactive for large online audiences. The first project ai-class became KnowLabs/Udacity and the remaining two became Coursera. Soon after, the term MOOC (massive open online course) caught on and we are undergoing something like a revolution in this world.

In late 2011/early 2012, having experienced the first generation pilots of Udacity and Coursera as a student, I brainstormed some ideas into a “pitch.” None of these ideas are new and surely many organizations were thinking similarly. However, not all of these ideas have been completely realized, IMHO. That’s partly why I am posting a heavily edited version of that “pitch” of brainstorming ideas here.

Note that I wrote the following document in early 2012 when KnowLabs and Coursera were only starting to take off (edX and friends were nonexistent). Some bits may seem trivial or outdated.

## Introduction

Below are some ideas to pursue in open source online science education. I will generally focus on online dissemination of computer science knowledge, but nearly every discipline can work.

## Learning modules as nodes of a graph

Suppose all the learning modules were nodes of a massive directed acyclic graph. The most fundamental topics of any given subject are the starting nodes and the student follows a “path” (not strictly in a graph sense) into more advanced topics. Each edge module A—module B of this graph denotes “A is a suggested prerequisite of B.” There may be alternative “paths” (i.e. alternative sets of prerequisites) leading into the same module. This facilitates more student-driven learning options. The starting nodes can include topics as simple as “Finite state machines”, “What is binary?”, or “Introduction to circuits” and the more advanced nodes can be at the cutting edge of research in machine learning or robotics or any other area. I imagine that large subgraphs of this DAG can represent traditional college courses: Computer Architecture, Introduction to Algorithms, Complexity Theory, Computability Theory, Neural Networks, Graph Theory, Databases, Entrepreneurship, etc. There will certainly be overlap, as all subjects overlap, but completing all the learning modules in any of these subgraphs should be equivalent to completing a CS/EE XXX course at a university.

## Short, sweet, modern, and mobile

Lessons are short (average 5 mins viewing time — akin to Youtube shorts), interactive, and mobile. The original ai-class, ml-class, and db-class (Fall 2011) broke down the “lectures” into short videos interspersed with quizzes. There is potential for even greater interactivity. Perhaps the modules are part video, part coding, part technical reading, and part game. There already exist platforms which implement simplified code interpreters and REPLs: db-class featured a SQL interpreter; CodeYear has an interactive console for teaching JavaScript. Perhaps a simulated breadboard should be an essential component of a basic online electronics class? Due to the ubiquity of mobile devices, expect business executives, housewives/househusbands, athletes, kids in schools, or practically anyone wanting to consume bite-sized lessons on their mobile devices at any place and at any time.

## Content growth

Lessons are populated by a collaborative social network like a wiki. Currently, most online education projects recruit academics to produce the content while the students have no direct impact. Assuming that teaching is the best form of learning, perhaps students can be encouraged to help create the learning modules. This provides a way to constantly evolve the content and expand it much faster than is possibly by a small group of experts. I think Wikibooks is a semi-successful attempt at this, but I think that they are very limited by their platform (which encourages no interactivity with the reader).

## Content selection and philosophy of engaging students

(Section omitted.)

## Modern Sanskrit Transcription

I was sifting through some of my archives and noticed a fun article I had written about common methods for Sanskrit transcription. The information is already dated now, but it might still be interesting to some!

http://shashir.autodidactus.org/shashir_umich/sanskrit_transcription.html

This used to be located at: http://www-personal.umich.edu/~shashir/sanskrit_transcription.html

## Bachmann-Landau Asymptotic Notation

(This article looks technical, although it is really not. I overuse mathematical symbols if only to exercise a newly found way to display $\TeX$ within HTML. Other people may prefer to discuss asymptotic notation in a much more concrete way.)

According to Donald Knuth, asymptotic notation was invented by Paul Bachmann and popularized by Edmund Landau (both were mathematicians). It is a commonly used method for studying how an algorithm scales with input sizes or input complexity. We will now get some general definitions out of the way.

Since we will be speaking of sets of functions, we will use the set theory notation for these (given with respect to domains and ranges). Suppose we have sets A and B. We define $B^A$ to be the set of all functions having A as their domain and B as their range. This means that $f \in B^A \iff f:A \to B$.

Definition. Consider a real function $f \in {\bf R}^{\bf R}$. We define the “Big O of eff” as: $O(f) = \{ g \in {\bf R}^{\bf R} : (\exists x_0 \in {\bf R} (\exists c \in {\bf R}_{>0} (\forall x>x_0(|g(x)| \le c|f(x)|)))) \}$.

Definition. Consider a real function $f \in {\bf R}^{\bf R}$. We define the “Big Omega of eff” as: $\Omega(f) = \{ g \in {\bf R}^{\bf R} : (\exists x_0 \in {\bf R} (\exists c \in {\bf R}_{>0} (\forall x>x_0(c|f(x)| \le |g(x)|)))) \}$.
Definition. Consider a real function $f \in {\bf R}^{\bf R}$. We define the “Big Theta of eff” as: $\Theta(f) = O(f) \cap \Omega(f)$.

(CLR prefers Big Theta, but DPV prefers Big O.)

In a way, Big O, Omega, and Theta can be seen as maps from ${\bf R}^{\bf R}$ to the power set $\cal{P}({\bf R^R})$. There also exist the stronger “Small O” and “Small Omega” sets, but we will not be concerned with these.

We defined Big O, Omega, and Theta to be subsets of the set of real functions ${\bf R}^{\bf R}$. However, we may be concerned with functions other than real functions (perhaps even functions in multiple variables). We can opt for a more general definition of Big O, Omega, and Theta as long as our domain and range have linear orderings. In fact, we are often concerned with sets of functions whose domain is the naturals and whose range is the positive reals, i.e. ${\bf R}_+^{\bf N}$. Note that functions with different domains and ranges may not be order-comparable in the way we defined Big O, Omega, and Theta. So we must be careful to maintain the same domain and range for all the functions that we are interested in comparing.

If Big O were defined as a binary relation on the set ${\bf R}^{\bf R}$, then it would be reflexive, transitive, and antisymmetric (i.e. a partial ordering). Similarly, Big Omega as a binary relation would also be reflexive, transitive, and antisymmetric. Moreover, Big O and Omega are inverse relations since $f \in O(g) \iff g \in \Omega (f)$. If Big Theta were defined as a binary relation, then it would be reflexive, symmetric, and transitive. Therefore, Big Theta forms an equivalence class of functions. Even though both Big O amd Omega relations have the possibility of strict set inclusion (e.g $O(1) \subset O(x) \subset O(x^2) \subset O(x^3) \subset \cdots$, this is not possible with Big Theta. Therefore, we have $\Theta(f) \subseteq \Theta(g) \iff \Theta(f) = \Theta(g)$.

We are generally very lax when dealing with asymptotic notation. For example, it is (more) common to write $O(f(x))$ instead of $O(f)$. It is also more common to write the function’s analytical expression within the Big Letter’s parentheses, e.g. if $f(x)=x^2+2$, then Big Omega of eff may be written as $\Omega (x^2+2)$, which simplifies to $\Omega (x^2)$. This abuse of notation is dangerous because there is no way to distinguish a function symbol and an independent variable symbol (just consider taking the Big O of a function such as $x(t)=e^t$). So general conventions restrict independent variables to the letters x, n, and t. If we see one of these letters within the parentheses of Big O, Omega, or Theta then we are dealing with the analytical expression for a function and not the function itself.

## References

Cormen, T., Leiserson, C., and Rivest, R.  Introduction to Algorithms.

“Big O notation.” Wikipedia http://en.wikipedia.org/wiki/Big_O_notation.

## The study of the French language

A few years ago, I had resolved to study French to the point where I may comprehend a popular movie or a novel. I have forgotten why I pursued French over all the other languages in the world (it might have had something to do with a girl). Whatever the original reason, as of recently, French has become a regular hobby for me and I already feel rewarded. In this article, I will explain the stages I had undergone so far and my current sense of learning French.

## Naive Stage

Since the faculty for language is biologically “native” to humans (see Universal Grammar), I felt that I had only to provide my brain with the grammatical paradigms and vocabulary for it to automatically put together syntactical French sentences. This would certainly be possible if my faculty for language had also evolved an interface through which to download all the grammatical rules as nice logical English sentences and thenceforth apply them whenever I chose to speak in French. However, if the intention of communication is to transfer semantics between people, then this method is hopeless for several reasons.

• Interface. I don’t yet know a “quick” way to train my brain to pick up grammatical rules and to apply them naturally. The only method that has worked so far has required a lot of time and a lot of un-coerced “pattern-grokking” by my mind.
• Inefficiency. Memory seems to evaporates with disuse. I have forgotten most of my vocabulary from this stage of my French learning. It becomes extremely difficult to consciously remember all the grammatical rules every time.
• Grammarians. Are. Often. Dumb. There is a fundamental difference between grammarians and linguists… the latter describe language… the former coerce a subjectively chosen set of ancient rules which don’t always conform to popular usage.
• Common usage. Even perfectly natural and inclusive grammars will contain more unusual sentences than the commonly familiar ones. It would be nice to construct far-fetched sentences, but contrived semantics would be obscured by the unfamiliar usage.

## Semi-enlightened Stage

When the methods of the Naive Stage failed, a new system of French language learning had to be devised. Formal and Informal Speech and Writing had to be mastered by constant exposure and usage. Eh. Obvious enough. To this end, I am just exposing myself to as many French audio programs as possible to become accustomed to “naturally” speak French impromptu. I suppose no one audio program will provide a complete variety of sentence construction. Thus, I am exposing myself to multiple programs in order to gain enough practice with all the types of sentences (especially the most common kinds, which I am proportionately more exposed to). Memorization of grammatical rules and vocabulary are still important, but not the primary concern. The following resources are coming in handy at this stage.

Speech.

Michel Thomas Method French For Beginners by Michel Thomas et al. This is an audio series featuring the host Michel Thomas with two students, one male and one female. It is very easy to follow despite Thomas’s severe accent and the blunders of the female student (we learn from her mistakes). Covers very basic pesent-tense and future-tense scenarios. I finished the “Beginners” series, and will be starting on “Advanced” soon.

Pimsleur French I, II, III. This is a very long audio series compared to Michel Thomas’s. However, it provides plenty more practice for sentence formation in multiple varieties. For now, I consider this my staple for French speech practice.

I haven’t yet experienced Alan Moys’s program or the Barron’s program, but I suppose I will look into those soon.

Written.

Schaum’s Outline of French Grammar by Mary Coffman. This text seems very thorough and really makes sense of some of the unexplained patterns from the audio series. I often consult this as a guide. I should some day just read through the entire thing to make sure that I am familiar with all the major topics.

Harry Potter a L’ecole Des Sorciers by J. K. Rowling. My favorite children’s fantasy novel in French! Good practice, but it is quite difficult to work through despite knowing the storyline beforehand.

## The study of Data Structures and Algorithms

I have become interested in computer science some years ago and I am now concerned with educating myself on the fundamentals of “Data Structures and Algorithms” as taught in the courses of the same name at many universities. I never actually enrolled in such a course and so my curriculum is somewhat impromptu.

In a college course, from the student’s point of view, every new class is a surprise known most clearly only to the instructor. In a self-taught regimen, “surprises” are pre-arranged. This article is a syllabus for such a study. I am basing it on the advice of instructors, friends, and MIT’s OpenCourseWare (courses 6.006 and 6.046).

Textbooks:

by Cormen, Leiserson, and Rivest (henceforth “CLR”). This book is thick, ugly, and believed to be essential by many computer scientists.

by Dasgupta, Papadimitrou, and Vazirani (henceforth “DPV”). This book is light, written very cleanly, and seems completely incompatible with CLR. The pre-print is available for free online.

Topics (unordered list of main topics):

• Graphs and paths
• Binary search trees
• Divide and conquer algorithms
• Greedy algorithms
• Sorting
• Searching
• Hashing
• Dynamic programming
• NP-completeness
• Linear programming
• Quantum algorithms

Curriculum:

I will be alternating the chapters between CLR (associated with the MIT syllabus) and DPV. Article posts are made by topic rather than chapter. This way, revisited topics can be added to with new insights gathered from different resources.

All articles, especially this one are in constant revision until I am satisfied.