Staples of human-centric software engineering

One of the leitmotifs of my software engineering career has been trying to focus people's attention on the fact that producing software is as much a humanistic discipline, if not more, than a scientific one. In my attempts to convey this message, I've tried to sum up and condense a set of principles that have guided my approach to software engineering, and that I've been sharing with my most trusted colleagues and dearest friends over the course of the last 10+ years.

Note that most examples will focus on specific aspects of software crafting, such as coding or infrastructure architecture, but the principles are meant to be applied to the whole software development lifecycle, from inception to maintenance. The idea behind this whole post is to shed light on the holistic nature of professional software engineering, and on how it's not just about focusing on vertical silos of technical virtuosity.

Testing: Unveiling the primary goal

Coupling:

Before we dive into the main role of testing, let's first grasp the concept of coupling. In software development, coupling refers to the degree of interdependence between different modules, classes, or components of a system. Tight coupling occurs when two or more elements are strongly intertwined, making it difficult to modify or replace one without affecting others. On the other hand, decoupling promotes loose relations, enabling independent development, testing, and evolution of individual components of a complex domain, without concepts leaking from one context to another.

When tests are burdened by the need to navigate a web of dependencies, they become difficult to write and maintain. This is a sign of high coupling, which should not be fought by "fixing" our tests and coming up with complex mocking strategies, but by fixing the way we architect our software, properly defining boundaries and contracts between its layers.

The hidden power of tests:

While testing is commonly associated with verifying the correctness of software, its true power lies in its ability to decouple domains. Although discussing a single technical practice within the realm of IT is not my preference, let us direct our attention to a renowned coding pattern to exemplify this notion.

By adopting test-driven development (TDD) or writing tests alongside code, developers are compelled to break down complex functionalities into smaller, testable units. These units, or modules, can be developed independently, reducing their interdependencies and fostering loose coupling.

Decoupling and design principles:

Decoupling code through testing aligns with several essential design principles, such as the Single Responsibility Principle (SRP) and the Open-Closed Principle (OCP). SRP encourages each module to have a single responsibility, enabling easier comprehension and modification. By isolating units for testing, developers are prompted to identify and separate concerns, reinforcing the SRP. Similarly, OCP emphasizes extending functionality through new code rather than modifying existing code. Decoupled code allows for easier extension without impacting existing components.
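
To make the OCP side of this tangible, here is a minimal sketch (the EligibilityRule abstraction and all names below are hypothetical, not a reference to any specific library): new eligibility rules are added as new code, while the evaluation logic never needs to be modified.

interface User { id: string; registrationDate: Date }
interface Promotion { id: string; expirationDate: Date }

// Each rule is an independent, individually testable unit.
type EligibilityRule = (user: User, promotion: Promotion) => boolean;

const registeredBeforeExpiration: EligibilityRule = (user, promotion) =>
  user.registrationDate < promotion.expirationDate;

// Extending behavior means appending a rule here, not editing existing code.
const rules: EligibilityRule[] = [registeredBeforeExpiration];

export function isEligible(user: User, promotion: Promotion): boolean {
  return rules.every((rule) => rule(user, promotion));
}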

Real world example:

But let's see how this applies to the real world.

Let's say we have a piece of code that checks if a user is entitled to a promotion. The code is tightly coupled to the database, as it needs to fetch the user and promotion data from it to perform the check. This makes it difficult to test the code in isolation, as we need to mock the database to verify the behavior.

import { db } from '@infra/db';

export async function isUserEntitledToPromotion(
  userId: User['id'],
  promotionId: Promotion['id']
): Promise<boolean> {
  const user = await db.users.find(userId);
  const promotion = await db.promotions.find(promotionId);

  return user.registrationDate < promotion.expirationDate;
}

This is a smell, as the function is doing too much. It's not only checking if the user is entitled to a promotion, but also fetching the user and promotion data from the database, directly depending on the infrastructure layer. Instead of "fixing the test" by trying hard to mock the database, we should fix the code. The function should only check if the user is entitled to a promotion, nothing more.

It could be rewritten as:

export function isUserEntitledToPromotion(
  user: Pick<User, 'registrationDate'>,
  promotion: Pick<Promotion, 'expirationDate'>
): boolean {
  return user.registrationDate < promotion.expirationDate;
}

In this way, we can test the function in isolation, without the need to mock the database.
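
For instance, a test now boils down to a plain function call with in-memory data (a sketch assuming a Jest/Vitest-style runner with test and expect globals, not the project's actual test suite):

test('a user registered before the promotion expires is entitled', () => {
  const user = { registrationDate: new Date('2022-01-01') };
  const promotion = { expirationDate: new Date('2024-01-01') };

  expect(isUserEntitledToPromotion(user, promotion)).toBe(true);
});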

This not only makes the code more testable, but also more maintainable, as it's easier to understand and modify. The piece of business logic is now decoupled from infrastructural concerns. This is a good example of the Single Responsibility Principle (SRP), which states that a module should have only one reason to change. In this case, the module is the isUserEntitledToPromotion function, and the reason to change is a change in the eligibility criteria for users to be entitled to a promotion, not updates to our database client or schema.

Code read/write ratio

Despite its sometimes obscure and sci-fi-looking syntax, never forget that code (well, at least the code you're writing) is made for humans. If humans weren't part of the equation when implementing new programming languages, then we'd all be writing bytecode by hand, spending countless hours (see the section on feasibility boundaries) to implement even the simplest logic.

Visual example

function jsSumInt(array, n) {
  var s = 0;
  for (var i = 0; i < n; i++) {
    s += array[i];
  }
  return s;
}
(func $stackAlloc (;19;) (export "stackAlloc") (param $var0 i32) (result i32)
  (local $var1 i32)
  block (result i32)
    global.get $global6
    local.set $var1
    global.get $global6
    local.get $var0
    i32.add
    global.set $global6
    global.get $global6
    i32.const 15
    i32.add
    i32.const -16
    i32.and
    global.set $global6
    local.get $var1
  end
)
(func $stackSave (;20;) (export "stackSave") (result i32)
  global.get $global6
)
(func $stackRestore (;21;) (export "stackRestore") (param $var0 i32)
  local.get $var0
  global.set $global6
)
(func $establishStackSpace (;22;) (export "establishStackSpace") (param $var0 i32) (param $var1 i32)
  block
    local.get $var0
    global.set $global6
    local.get $var1
    global.set $global7
  end
)
(func $setThrew (;23;) (export "setThrew") (param $var0 i32) (param $var1 i32)
  global.get $global8
  i32.eqz
  if
    local.get $var0
    global.set $global8
    local.get $var1
    global.set $global9
  end
)
(func $setTempRet0 (;24;) (export "setTempRet0") (param $var0 i32)
  local.get $var0
  global.set $global10
)
(func $getTempRet0 (;25;) (export "getTempRet0") (result i32)
  global.get $global10
)
(func $_sumInt (;26;) (export "_sumInt") (param $var0 i32) (param $var1 i32) (result i32)
  (local $var2 i32) (local $var3 i32)
  block (result i32)
    local.get $var1
    i32.const 0
    i32.gt_s
    if
      i32.const 0
      local.set $var3
      i32.const 0
      local.set $var2
      loop $label0
        local.get $var0
        local.get $var3
        i32.const 2
        i32.shl
        i32.add
        i32.load
        local.get $var2
        i32.add
        local.set $var2
        local.get $var3
        i32.const 1
        i32.add
        local.tee $var3
        local.get $var1
        i32.ne
        br_if $label0
      end $label0
    else
      i32.const 0
      local.set $var2
    end
    local.get $var2
  end
)
(func $_emscripten_get_global_libc (;27;) (export "_emscripten_get_global_libc") (result i32)
  i32.const 6896
)
(func $func28 (param $var0 i32) (result i32)
  (local $var1 i32) (local $var2 i32)
  block (result i32)
    global.get $global6
    local.set $var1
    global.get $global6
    i32.const 16
    i32.add
    global.set $global6
    local.get $var1
    local.tee $var2
    local.get $var0
    i32.load offset=60
    call $func35
    i32.store
    i32.const 6
    local.get $var2
    call $env.___syscall6
    call $func31
    local.set $var0
    local.get $var1
    global.set $global6
    local.get $var0
  end
)
(func $func29 (param $var0 i32) (param $var1 i32) (param $var2 i32) (result i32)
  (local $var3 i32) (local $var4 i32) (local $var5 i32) (local $var6 i32) (local $var7 i32) (local $var8 i32) (local $var9 i32) (local $var10 i32) (local $var11 i32) (local $var12 i32) (local $var13 i32)
  block (result i32)
    global.get $global6
    local.set $var5
    global.get $global6
    i32.const 48
    i32.add
    global.set $global6
    local.get $var5
    i32.const 16
    i32.add
    local.set $var6
    local.get $var5
    i32.const 32
    i32.add
    local.tee $var3
    local.get $var0
    i32.const 28
    i32.add
    local.tee $var10
    i32.load
    local.tee $var4
    i32.store
    local.get $var3
    local.get $var0
    i32.const 20
    i32.add
    local.tee $var11
    i32.load
    local.get $var4
    i32.sub
    local.tee $var4
    i32.store offset=4
    local.get $var3
    local.get $var1
    i32.store offset=8
    local.get $var3
    local.get $var2
    i32.store offset=12
    local.get $var5
    local.tee $var1
    local.get $var0
    i32.const 60
    i32.add
    local.tee $var12
    i32.load
    i32.store
    local.get $var1
    local.get $var3
    i32.store offset=4
    local.get $var1
    i32.const 2
    i32.store offset=8
    block $label2
      block $label0
        local.get $var4
        local.get $var2
        i32.add
        local.tee $var4
        i32.const 146
        local.get $var1
        call $env.___syscall146
        call $func31
        local.tee $var7
        i32.eq
        br_if $label0
        i32.const 2
        local.set $var8
        local.get $var3
        local.set $var1
        local.get $var7
        local.set $var3
        loop $label1
          local.get $var3
          i32.const 0
          i32.ge_s
          if
            local.get $var1
            i32.const 8
            i32.add
            local.get $var1
            local.get $var3
            local.get $var1
            i32.load offset=4
            local.tee $var9
            i32.gt_u
            local.tee $var7
            select
            local.tee $var1
            local.get $var1
            i32.load
            local.get $var3
            local.get $var9
            i32.const 0
            local.get $var7
            select
            i32.sub
            local.tee $var9
            i32.add
            i32.store
            local.get $var1
            i32.const 4
            i32.add
            local.tee $var13
            local.get $var13
            i32.load
            local.get $var9
            i32.sub
            i32.store
            local.get $var6
            local.get $var12
            i32.load
            i32.store
            local.get $var6
            local.get $var1
            i32.store offset=4
            local.get $var6
            local.get $var7
            i32.const 31
            i32.shl
            i32.const 31
            i32.shr_s
            local.get $var8
            i32.add
            local.tee $var8
            i32.store offset=8
            local.get $var4
            local.get $var3
            i32.sub
            local.tee $var4
            i32.const 146
            local.get $var6
            call $env.___syscall146
            call $func31
            local.tee $var3
            i32.eq
            br_if $label0
            br $label1
          end
        end $label1
        local.get $var0
        i32.const 0
        i32.store offset=16
        local.get $var10
        i32.const 0
        i32.store
        local.get $var11
        i32.const 0
        i32.store
        local.get $var0
        local.get $var0
        i32.load
        i32.const 32
        i32.or
        i32.store
        local.get $var8
        i32.const 2
        i32.eq
        if (result i32)
          i32.const 0
        else
          local.get $var2
          local.get $var1
          i32.load offset=4
          i32.sub
        end
        local.set $var2
        br $label2
      end $label0
      local.get $var0
      local.get $var0
      i32.load offset=44
      local.tee $var1
      local.get $var0
      i32.load offset=48
      i32.add
      i32.store offset=16
      local.get $var10
      local.get $var1
      i32.store
      local.get $var11
      local.get $var1
      i32.store
    end $label2
    local.get $var5
    global.set $global6
    local.get $var2
  end
)
        
// Continues for ~50k lines...
        
The above two code snippets do the same thing: they sum all the values of an array of numbers and return the result. The first snippet is written in JavaScript, while the second is its WebAssembly text-format counterpart. The WebAssembly version vastly outperformed the JavaScript one in benchmarks, but it's also much harder to read and understand.

Code is for humans

I think we can all agree that code is a gateway language for humans to declare their intentions (or at least hopes) to machines, and as such it's meant to be read, modified and, obviously, written by humans. This means our code should always strive for readability: elegant conciseness when no abstractions are needed, and abstraction when the technical details put a burden on the reader's ability to get into the flow of a code block's description of our system's logic.

As Robert C. Martin wrote in his Clean Code book:

“Indeed, the ratio of time spent reading versus writing is well over 10 to 1. We are constantly reading old code as part of the effort to write new code. ...[Therefore,] making it easy to read makes it easier to write.”

What does this mean? It means that while writing code is a one-off exercise in technical prowess, often performed by an individual alone in a relatively small amount of time, reading it is something that happens many times across a longer timespan, often triggering thoughts and discussions amongst multiple people. As such a conduit of meaning, code should lend itself to being interpreted with the minimum amount of friction possible.

Programming conversations

Picking up Wittgenstein's idea of language, we can say that language is a mind-altering substance that acts during conversation. Conversation creates feedback loops of meaning, laying down the foundations for shared understanding and a common knowledge context, which is the only window we have into the exploration of the complex systems we need to build and maintain.

Code is a conversation between the author and the reader, and as such it should be written in a way that makes it easy to understand.

Heisenberg principle for micro-optimisations

The Heisenberg uncertainty principle, named after the physicist Werner Heisenberg, states that certain pairs of properties of a particle, such as position and momentum, cannot both be measured with absolute precision. The same applies to the micro-optimisations we often get lost in when trying to squeeze the last tiny bit of performance out of a line of code.

How many times have you spent countless hours (or even days) writing benchmarks for microscopic portions of your code, only to be left perplexed by their inexplicable performance? How many times have you tried to fight that sense of staring into a darker-than-black void by performing all kinds of measurements on your runtime behaviour, only to find that the more you measure your system, the less you understand why it behaves the way it does?

As we attempt to measure and understand one aspect of performance, we often encounter the Heisenberg-esque dilemma: the more closely we focus on a specific performance metric, the less visibility we have into other contributing factors.

Analogous to the observer effect in quantum mechanics, where the act of observing a particle influences its behavior, the act of monitoring and measuring software performance can affect the system under scrutiny. Not only that: a heavily instrumented system is also more prone to being impacted by under-the-hood variables and automatic micro-optimisations, to the point where its behaviour is no longer reasonably understandable by a human being.

Navigating the performance-precision trade-off

To effectively navigate the performance-precision trade-off, we must strike a delicate balance. It's essential to capture enough data to gain insights into the system's behavior without imposing excessive monitoring overhead. This calls for employing lightweight instrumentation techniques, carefully selecting appropriate performance metrics, and considering the broader context in which the system operates.
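
As one hedged example of what "lightweight" can mean in practice (the timed helper below is a sketch, not a library API), sampling only a small, tunable fraction of calls keeps the observer's footprint low:

import { performance } from 'node:perf_hooks';

// Measure only a small fraction of calls, so the act of observing
// doesn't itself dominate the behaviour we're trying to understand.
function timed<T>(
  label: string,
  fn: () => T,
  report: (label: string, ms: number) => void,
  sampleRate = 0.01
): T {
  if (Math.random() > sampleRate) return fn(); // unobserved path: no timing overhead
  const start = performance.now();
  try {
    return fn();
  } finally {
    report(label, performance.now() - start);
  }
}

// Usage sketch: log durations for roughly 1% of invocations.
const total = timed(
  'sum',
  () => [1, 2, 3].reduce((acc, n) => acc + n, 0),
  (label, ms) => console.log(`${label}: ${ms.toFixed(3)}ms`)
);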

Optimizing software performance within the bounds of uncertainty

Much like the uncertainty principle does not render quantum mechanics useless, the Heisenberg-esque uncertainty in software performance debugging does not imply the futility of optimization efforts. Instead, it emphasizes the importance of adopting a holistic approach that considers multiple factors and trade-offs. By analyzing performance data, identifying bottlenecks, and iteratively refining the system, we can make informed decisions to improve software performance.

Exploring feasibility boundaries: Time as the ultimate constraint

In the fast-paced world of software development, the concept of feasibility boundaries sheds light on the crucial interplay between time and the very existence of outcomes. As developers, we must recognize that our time is finite and that the duration required to complete tasks directly impacts their projection onto the plane of reality. Just as a sequence of still pictures only becomes continuous motion when displayed at a sufficient frame rate, the feasibility of a task is directly proportional to the time allocated to its completion. By embracing this reality, we gain valuable insights into prioritization, efficiency, and the importance of delivering within realistic timeframes.

I can't even recount how many times I've witnessed a friend or colleague do some repetitive task over and over in a suboptimal way, just because they didn't have the willpower to learn a better way to do it. What they didn't take into account is that issues, of any kind and size, stack up and compound over time, playing a pivotal role in the feasibility of an operation. Copying one line of data from one spreadsheet to another is not a big deal, but doing it thousands of times is probably not even doable in a day's worth of work: at, say, 20 seconds per row, 5,000 rows amount to roughly 28 hours of uninterrupted copying. That's ironically what we are paid to do as software engineers: automate repetitive tasks, and make their execution time fast enough to build upon them to create more complex systems.

The risk of excessive time

In the face of ambitious goals or complex challenges, there is a natural inclination to allocate more time for their completion. However, it's important to recognize the inherent risk associated with lengthy endeavours. The longer a task takes, the greater the chance of encountering unforeseen obstacles, changing priorities, or shifting requirements. By acknowledging the potential pitfalls of extended timelines, we can assess the trade-offs and make informed decisions that preserve project feasibility.

Feasibility boundaries remind us of the finite nature of our time and the significant impact it has on the success of our endeavors. Becoming able to get stuff done in less time is not just a personal vanity metric: it can end up playing a pivotal role in the feasibility (or not) of the stuff we're working on, as long as we keep ourselves aware of the fact that time plays a role, and that we can only execute tasks in timeframes that allow our execution to deliver significant results.

Introducing boring pull requests

One metric that can help us gauge the mental manageability of our codebase is the concept of "boring pull requests". A boring pull request is one that requires minimal mental effort for the developer to write, and for the reviewer(s) to review and merge. It implies that the code changes are well-organized, concise, and align with established coding standards.

Boring pull requests indicate that the codebase is maintainable, with well-separated concerns, modular components, and clean abstractions. When pull requests are consistently boring, it signifies that the codebase is in good shape, making it easier for developers to understand and contribute effectively.

In the realm of software engineering, the presence of boring pull requests serves as a testament to the codebase's robustness and comprehensibility. Such pull requests can be regarded as tangible evidence of a development ecosystem that fosters clarity, simplicity, and adherence to best practices. As software projects evolve and grow in complexity, the ability to maintain a consistent stream of boring pull requests becomes increasingly crucial, as it directly impacts the productivity and satisfaction of developers.

Modulating feedback loops frequency: signal-to-noise ratio management

In the realm of software engineering, effective communication plays a pivotal role in the success of projects. Whether it's development, architecture, testing, or team collaboration, clear and timely feedback is essential for achieving desired outcomes. However, just as signal aliasing can cause the noise level to rise when the sampling frequency is too low, miscomprehensions and unwanted communication noise can arise when the frequency of feedback loops is not appropriately modulated. In this section, we explore the concept of modulating feedback loops frequency and its significance in avoiding misunderstandings and improving overall communication in software engineering endeavors.

Signal aliasing and information loss

To understand the importance of modulating feedback loops, let's first examine the concept of aliasing.

In signal processing, aliasing occurs when a signal is sampled at a rate lower than twice its highest frequency component (the Nyquist rate), leading to distorted and misleading representations of the original signal. Just think of filming a helicopter's rotor with a camera that has a low frame rate: the rotor will appear to spin in a direction and at a speed that are not correlated with the actual ones. Similarly, in software engineering, if feedback loops occur too infrequently or are poorly timed, information can be lost or misconstrued, leading to a breakdown in understanding and potential miscommunication.
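
To make the analogy concrete, here is a small sketch (plain arithmetic, no real DSP library): a 9 Hz sine wave sampled at only 10 samples per second, below its 18 Hz Nyquist rate, produces readings identical to those of a sign-flipped 1 Hz wave, so the undersampled observer reconstructs a coherent but wrong signal.

// A 9 Hz sine sampled at 10 Hz aliases to 1 Hz: the sparse samples
// tell a plausible but wrong story, just like sparse feedback does.
const sampleRateHz = 10;
const signalHz = 9;

for (let n = 0; n < sampleRateHz; n++) {
  const t = n / sampleRateHz;
  const sampled = Math.sin(2 * Math.PI * signalHz * t);
  const alias = -Math.sin(2 * Math.PI * 1 * t); // what the samples look like
  console.log(`t=${t.toFixed(1)}s  sampled=${sampled.toFixed(3)}  1Hz alias=${alias.toFixed(3)}`);
}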

One aspect of modulating feedback loops frequency involves carefully fine-tuning the cadence of feedback between communicating systems, or even people. Insufficient feedback results in anemic information: information that is isolated, stripped of contextual meaning, or lacking clarification on requirements. Conversely, an overwhelming amount of feedback can lead to information being improperly processed, causing confusion and misunderstandings. The frequency of feedback loops should therefore be modulated to ensure that the right amount of information is provided at the right time, avoiding information loss and misrepresentation.

Furthermore, the timing of feedback is crucial. Feedback that is delayed or provided too late in the development process may result in costly rework or missed opportunities for improvement. Conversely, immediate and timely feedback enables swift course correction, prevents the propagation of errors, and promotes an iterative development approach. By ensuring that feedback loops occur at appropriate intervals throughout the project, teams can effectively address challenges and enhance overall efficiency.

In the development phase, regular feedback sessions allow developers to validate their progress, receive guidance on best practices, and ensure alignment with project goals. Architecture reviews provide an opportunity to assess the viability, scalability, and maintainability of a system, fostering discussions that lead to improved design decisions. During testing, prompt feedback helps identify defects and provides a mechanism for continuous quality assurance.

Effective team communication heavily relies on well-modulated feedback loops. Regular stand-up meetings, sprint retrospectives, and peer code reviews create opportunities for team members to share their insights, discuss challenges, and provide constructive feedback. Transparent and frequent communication channels promote a collaborative environment, reduce misunderstandings, and enhance the overall cohesion and productivity of the team.

In conclusion, modulating feedback loops frequency is a critical aspect of effective communication in software engineering endeavors. By finding the right balance and timing, teams can avoid miscomprehensions and unwanted communication noise, similar to how audio aliasing can be mitigated by higher sampling rates. Striking a balance between too little and too much feedback, ensuring timely delivery of feedback, and incorporating it into various aspects of software engineering promote clarity, understanding, and collaboration. By embracing a well-modulated feedback approach, software engineering teams can foster a culture of effective communication, leading to successful project outcomes and continuous improvement.

Identity as a Flux: fluid domain modeling

In a world where the line between the real and the virtual continues to blur, a surprising and provocative philosophy can be applied to the software architecture world. Enter Jean Baudrillard's idea of hyperreality, where reality is not merely concealed, but reality and simulation cannot be distinguished.

In the context of modern software architecture, the concept has become particularly resonant in the domain of identity modeling. It's an age-old challenge in system design: how to represent actors in a way that both reflects their multifaceted real-world identity and accommodates the fluidity of modern online interactions. Furthermore, by accepting contradictory information and providing mechanisms to manage these complexities, we acknowledge that identities can be inherently contradictory, and that this contradiction is a natural part of their existence and of the different facets they show across domains and contexts.

Identity: No longer a static concept

Traditionally, identity within systems is rigid, defined by a set of static attributes such as name, address, and date of birth. However, in our hyperconnected world, identity is fluid, reflecting diverse personas, relationships, and temporal states.

Much like Baudrillard's hyperreality, where simulations are not just mimicking reality but becoming a reality in themselves, our digital identities are not mere reflections of who we are but become a fundamental part of our existence.

Flux architecture: Embracing complexity

To represent this complex reality, the traditional methods of modeling identity fall short. Embracing Baudrillard's thinking, we must consider identity as a flux, something that's constantly changing, and full of contradictions.

The flux architecture paradigm focuses on managing changes in state over time. In the domain of identity modeling, this means recognizing that identity attributes can be transient, multifaceted, and even contradictory.

Implementing the Baudrillardian approach

In practical terms, this might mean (a short code sketch follows the list):

  • Allowing for multiple personas: Recognizing that a single agent may wear many hats, depending on the context (e.g.: a simple "user" may be a customer, a vendor, and an employee).
  • Time-sensitive attributes: Recognizing that certain identity attributes may change over time, and making informed decisions about how to decouple descriptors from the core identity they represent in a given context.
  • Contradictory information management: Accepting that contradictory information can coexist and providing the system's users the tools to navigate and manage these complexities.
  • Contextual identity: Recognizing that identity is not a static concept, but rather a fluid one that changes depending on the context.
  • Modular identity: Recognizing that identity is not a monolithic concept, but rather a modular one that can be broken down into smaller, more manageable pieces.
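
Here is a hedged sketch of what such a model could look like; every name in it (Facet, FluidIdentity, resolve) is hypothetical, meant only to make the ideas above tangible:

// Identity as a flux: a stable core plus context-scoped, time-stamped facets,
// so contradictory attributes can coexist as long as they live in different contexts.
interface Facet<T> {
  context: string;   // e.g. "billing", "support", "marketplace"
  value: T;
  recordedAt: Date;  // attributes are transient; keep their history
}

interface FluidIdentity {
  id: string;            // the only truly stable attribute
  personas: string[];    // customer, vendor, employee...
  displayName: Facet<string>[];
}

// Resolving an attribute is always a contextual question, never an absolute one.
function resolve<T>(facets: Facet<T>[], context: string): T | undefined {
  return facets
    .filter((f) => f.context === context)
    .sort((a, b) => b.recordedAt.getTime() - a.recordedAt.getTime())[0]?.value;
}

// The same identity can carry different, even contradictory, names per context.
const actor: FluidIdentity = {
  id: 'u-1',
  personas: ['customer', 'vendor'],
  displayName: [
    { context: 'billing', value: 'ACME S.r.l.', recordedAt: new Date('2022-03-01') },
    { context: 'marketplace', value: 'acme_shop', recordedAt: new Date('2023-01-15') },
  ],
};

const billingName = resolve(actor.displayName, 'billing'); // 'ACME S.r.l.'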

The flux architecture as a paradigm shift

The contemporary challenge of modeling identity in a world that is ever-changing and multifaceted requires an innovative approach. By looking at Baudrillard's complex theories, we can gain insights into how to approach identity not as a static entity but as a constantly changing flux.

This approach not only makes our systems more robust and flexible but also aligns them more closely with the multifaceted and often contradictory nature of human existence in the digital age. It's a reminder that even in the rigid, binary world of software, we must make room for the complexity and dynamism of the real world.

Final words

In this article we've explored a few principles which help us reason about and continuously re-frame our approach to software engineering, hopefully keeping humans at the center of our attention, and not only the machines we're building for them. As a matter of fact, we can observe how a few of these principles are actually a re-framing of well-known software engineering principles, but with a human-centric twist:

  • Testing helps us write code in parallel, not only parallel code
  • Feasibility boundaries guide us in optimizing our own execution time in delivering technical results, not the execution time of our code
  • We strive for boring pull requests to dumb down code for human consumption, not for machines' ease of execution
  • Being aware of Heisenberg's principle, we don't force humans into microscopically narrow scopes of thought; we elevate our perspective to see the system as a whole
  • Modulating feedback loops frequency helps us keep humans in the loop, not only the machines
  • Identity as a flux helps us model complex identities, not dumb them down into machine-representable attributes

With all these principles in mind, we can now start to see how they all fit together, and how they can be used to glue together a set of practices that help us write better software. If we're crafting technology whose purpose is not to solely exist and function in a vacuum, but to be used by humans, and to serve them in making their lives better, then the way we approach building it should also be holistic and human-centric, not machine-centric.

Publication date: 15/05/2023