Scaling up Software

“Don’t believe anything you read on the net. Except this. Well, including this, I suppose.”
–Douglas Adams

In this module, we’ll discuss the reality of long-term, larger-scale software. We’ll examine how real-world software, which must release, meet the ever-changing needs of customers, and evolve and adapt over time, differs from software written in introductory programming classes. Finally, we’ll address how this forces us to change our approach when building software.


Contents


Writing Code vs Solving Problems

If you’re taking this course at the University of Virginia, you’ve likely been learning how to write code for a year or more. This means you are capable of writing simple scripts, and even small multi-class programs, especially in Java. Even if you aren’t at the University of Virginia and found this or had this website used by your instructor, you should still have some of these skills. These valuable skills are the bedrock on which to grow your skills.

However, the skill by itself is just one set of tools. A master craftsperson must know how to use all kinds of tools, and must constantly learn the usage of new tools. But knowing how to use a single set of tools in specific, well-defined situations, doesn’t make one a master. Being a master of one’s craft comes from the ability to combine these tools to create great works. It comes from being able to determine which tool is best for a specific job, and the creativity to design and plan your great works. A master craftsperson works efficiently and accurately, yes, but they also produces work that they take pride in.

In this class, I want to do more than code. I want to you to become a craftsperson, one who builds high quality work that will last; work that you can be proud of. You won’t attain mastery in one semester, nor will you become a great craftsperson during a 4-year college education in computer science. That will take decades of work, experience, failures, successes, and continued learning. But the goal of this course is to give you enough tools, techniques, and goals to start you on your journey.


Large scale systems

“‘Space,’ it says, ‘is big. Really big. You just won’t believe how vastly, hugely, mindbogglingly big it is. I mean, you may think it’s a long way down the road to the chemist’s, but that’s just peanuts to space.’”
Hitchhiker’s Guide to that Galaxy, Douglas Adams

Your classroom experience in coding so far likely has not prepared you for writing enterprise systems. The word enterprise means “a project or undertaking, typically one that is difficult or requires effort.” When we talk about enterprise software, we are talking about larger software systems. These systems are built and maintained by teams of programmers over a long period of time, and must respond to changing customer needs or environmental changes.

(A quick note that in the software community “enterprise” sometimes gets used a pejorative for older software or over-designed software. Such software does exist, and it has a well-earned reputation for becoming unwieldy and resistant to change. My key point in the previous paragraph is that commercial software systems are simply far larger than anything that can be feasibly built in an academic setting).

Unfortunately, software problems can’t be solved by just adding more people. Famously, Brook’s Law states that “adding manpower to a late software project makes it later.” Fred Brooks, the namesake of the law, notes that it takes time for even skilled programmers to become productive in a software system. That “ramp-up” time likely means an experienced programmer in the software system is spending their time on training their inexperienced colleagues. This is time the experienced programmer otherwise could have spent implementing features, refactoring code to improve understandability, etc.

Communication overhead is also a serious consideration; as the size of a team increases, the number of lines of communication increase quadratically. For instance, a team of 5 people only has 10 lines of communication, whereas a team of 10 people as 45 lines of communication! Simply doubling the size of the team more than quadruples the lines of communication. Remember, increasing quadratically is why we don’t bother with selection and insertion sort in most cases, because the growth rate is so bad.

This problem isn’t unique to software! Engineers of all stripes have to deal with novel problems emerging as the size of a project increases!


Ad hoc solutions

Imagine a scenario: You are hiking with several small children. You come across a small creek in your path. You need to get children over the creek without their feet getting wet. What do you do?

A picture of Bassett Creek in Plymouth, Minnesota, via Bassett Creek Watershed

Well, thinking quickly, you may see a log nearby, drag it over, and lay it across the creek, having kids walk over. After the kids are over, you may leave the log and walk away, never thinking of crossing that creek again.

This is an ad hoc solution: that is, a one-time solution to the problem. You had a short-term need, so you made a quick and easy solution to solve your immediate problem. You don’t care about maintaining the solution, you don’t care about other people being able to use the solution, you don’t have a long term plan in place if the log rots, etc.

The thing is, most engineering problems aren’t solved this way. But this is likely how you’ve been doing most homework so far! You build one assignment at a time, sometimes the day before it is due, you keep writing code until it works, and then as soon as it works you submit it and move on to something else. You may never even look at that code again! Obviously consumer software doesn’t work this way.


Increasing problem scale

Now imagine that instead of a small creek, you are crossing a large river. Instead of small children, you are transporting cars. And instead of crossing the river once, you need these cars to be able to go back and forth across this bridge for several decades. This means not just building the bridge, but establishing the process of how you are going to build the bridge, what is the budget of this bridge, how is this bridge going to be maintained, how are you going to monitor this bridge for faults, how will you ensure the bridge doesn’t fail when a heavy truck drives over, how will you ensure the bridge meets federal and local regulations, etc.

“Dragging over an even bigger log” isn’t a solution.

The key takeaway is that small scale ad hoc problem-solving is fundamentally different from building larger, sustainable, systematic solutions to problems. Building one-off throw away assignments in earlier programming classes does not teach you the critical skills needed for industry, where you will build software that needs to meet complex and diverse requirements and must be maintained for years or even decades after launch.

Large-scale software isn’t built by one programmer working alone for one afternoon, or cramming at midnight before the due date, testing their code against a teacher provider auto-grader. In industrial software, long-term projects built incrementally one step at a time over months or years, and maintained long after release, are the norm.


Goals of this course

In this course, we will start preparing you for writing and evolving larger software systems. You will practice incremental development techniques alongside other programmers, building more complex software than you have worked with previously.

In your future classes, you can expect to use these skills! You will have to be ready to answer several questions that will come up during your project (I include the module group most related to that question)!

  • “How are we going to share code and break up work?” - (Collaboration)
  • “How will we know if we’re on the right track?” - (Testing)
  • “How can we avoid our code becoming an incomprehensible, entangled mess?” - (Refactoring)
  • “How can be we ensure our software will be receptive to change?” - (Design)
  • “What are some established techniques we can use to ensure maintainability?” - Patterns
  • “How will our users interact with our software?” - GUI
  • “Where is our data coming from, and where is it going?” - Data

While this course teaches a grab-bag of different skills, these were not selected at random. At the end of this course, you should be ready to jump in to a larger project, and be able to build the system incrementally over time with your project teammates.

“I’d take the awe of understanding over the awe of ignorance any day.”
The Salmon of Doubt, Douglas Adams


Next submodule: