Learning About Big O Notation as a Self-Taught Programmer

I am a self-taught programmer.

Out of sheer passion and weaponized ADHD hyper-focus, I have been able to learn a lot about programming and the web app space.

My career has mostly been in web design and marketing: taking designs and translating them into HTML, CSS, and JavaScript, then making them functional (and maintainable) with PHP. I built on that and learned about web application security, database design, and systems administration. I became “full stack” in the sense that I can design a website or web app, build the application (in the early days from scratch, later with frameworks like CodeIgniter and CakePHP, today with Rails), design performant database schemas, architect the infrastructure, build the CI/CD pipelines, deploy, maintain, monitor, manage, etc.

I have been doing that for 20 years. Designer, Developer, and Systems Admin (wearing a DevOps cape).

I can recognize good code. I have a sense of when code feels right. I refactor. I write tests. I am getting better at designing features in code. For the last 5 years I have been deeply immersed in the world of Ruby and Ruby on Rails, with a goal to become a world-class software engineer.

But I recognize the weakness in my game. The weakest part of my stack is foundational. It’s a little painful to admit that, I will be honest.

I never learned data structures and algorithms.

For me to step up to that next level, I strongly believe I need to develop these foundational skills. Being able to recognize why a bit of code may be performing poorly (it’s O(N^2)!), or to recognize patterns that suggest which algorithm or data structure is best suited, means far less brute force and more sophistication in my work.

I am aware that Big O notation is a thing. Last month I couldn’t have told you what it meant, though. I also use arrays all the time, but I couldn’t tell you anything about how they work behind the scenes, or how to implement one in Go (for example).

This is the year I strengthen the foundation!

I am reading the excellent book “A Common-Sense Guide to Data Structures and Algorithms.” The first part of the book is an introduction to Big O notation. I am going to try to explain it, as I understand it, after reading the first 8 chapters.

Big O notation is a way to describe how the number of steps an algorithm takes grows relative to N data elements, conventionally in the worst-case scenario.
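
To make that concrete, here is a minimal sketch of a linear search in PHP (my own example, not from the book). In the worst case the target is the last element, or missing entirely, and we touch all N elements, so it is O(N):

function linear_search(array $array, $target) {
  // Scan each element in order until we find the target.
  foreach ($array as $index => $value) {
    if ($value === $target) {
      return $index; // best case: found on the very first step
    }
  }
  return -1; // worst case: all N elements checked, target absent
}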

We think about Big O in terms of categories. The notation used is always the worst-performing category identified. Meaning, if your algorithm contains both O(N) and O(N^2) parts, it is represented as O(N^2). Big O also ignores constants: if your algorithm’s steps could be expressed as N + (N-1) + (N-2)…, that sum works out to roughly N^2 / 2, and since the 1/2 is just a constant factor we write it as O(N^2).
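
That N + (N-1) + (N-2)… pattern shows up whenever you compare every unique pair in an array. A sketch of what I mean (my own illustration, the variable names are hypothetical): the inner loop starts at $i + 1, so each pass does one fewer comparison than the last, roughly N^2 / 2 comparisons in total.

$count = count($array);
for ($i = 0; $i < $count; $i++) {
  // Inner loop shrinks each pass: (N-1) + (N-2) + … + 1 comparisons.
  for ($j = $i + 1; $j < $count; $j++) {
    // e.g., checking whether $array[$i] equals $array[$j]
  }
}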

The categories I know of so far:

  • O(1) - constant time
  • O(log N) - logarithmic time (see the sketch after this list)
  • O(N) - linear time
  • O(N^2) - quadratic time
  • O(2^N) - exponential time
  • O(N!) - factorial time
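
To make O(log N) concrete: each step cuts the remaining data in half, so a million elements take only about 20 steps. Binary search is the classic example. A minimal sketch, assuming a sorted array (my own code, not the book’s):

function binary_search(array $sorted, $target) {
  $low = 0;
  $high = count($sorted) - 1;
  while ($low <= $high) {
    // Check the midpoint, then discard the half that can't hold the target.
    $mid = intdiv($low + $high, 2);
    if ($sorted[$mid] === $target) {
      return $mid;
    } elseif ($sorted[$mid] < $target) {
      $low = $mid + 1;
    } else {
      $high = $mid - 1;
    }
  }
  return -1; // not found after ~log2(N) halvings
}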

What is funny is that in my career I’ve come across code like this (and written code like this plenty, too), and I learned pretty quickly that if you have nested iterations over the same array, you are likely going to have performance issues. You just learn not to do that. Now I have a name for it, and I can understand not just that it’s bad, but why it is bad. It’s a signal to find a more performant algorithm.

// N outer passes, each doing N inner steps: N * N = N^2 steps in total.
for ($i = 0; $i < count($array); $i++) {
  for ($j = 0; $j < count($array); $j++) {
    // e.g., comparing $array[$i] to $array[$j]
  }
}

What I can say now is that this is a quadratic time algorithm, O(N^2). As the number of elements increases, the number of steps grows exponentially (okay, technically quadratically - I had to look that up). A data set of 5 means 25 steps, a data set of 20 means 400 steps, and 80 elements would be 6,400 steps!
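
The fix usually means reaching for a different data structure. A hypothetical sketch (mine, assuming the nested loops were checking for duplicates and the values are strings or integers): one pass with an associative array does the same job in O(N), because each hash lookup and insert is effectively O(1).

function has_duplicates(array $array) {
  $seen = [];
  foreach ($array as $value) {
    if (isset($seen[$value])) {
      return true; // this value appeared earlier: a duplicate
    }
    $seen[$value] = true; // remember the value as an array key
  }
  return false;
}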

I am very much enjoying the book and looking forward to getting deeper into data structures. The next step is to put it into practice. I think LeetCode is likely the way :D
