Feeling Isolated

Chris Oldwood from The OldWood Thing

By and large I think I’ve been fairly lucky with my time as a contract programmer. Virtually all the teams I’ve worked in and systems I’ve worked on have been pretty decent. None of them are going to change the world but they’ve been enjoyable, which is probably why I’ve ended up working on them for a decent length of time [1].

I can only say “virtually all” because one contract sadly fell way short of the mark. Although I was technically part of a team it only really felt that way from a managerial perspective, even though we shared a codebase. I felt somewhat isolated both physically and mentally. Aside from the morning stand-up I could easily have gone the rest of the day without speaking to my teammates if I had chosen to do so.

Physical Isolation

I started the contract on a separate floor from the rest of my team with a couple of other recent joiners [2]. We were the only people on that floor with the air conditioning on full blast so we had to wear our coats in the afternoon to stay warm. None of the rest of my team had an office pass that could access the floor either, should they want to talk face-to-face while getting us up to speed.

Even when they moved us onto the same floor a month later we were still on the opposite side of the room. In the next desk shuffle I got to swap colleagues although they were working on an entirely separate area of the system with a totally different bunch of people so we had little need to collaborate per-se, only to make small talk. Also the two desks next to me only seemed to be used for a game of Tower of Hanoi by the office movers given how the occupants came and went.

Even my “customer”, at least, the one I knew about, because they were paying for the project, was situated in a different country and spoke a different language. Although their English was way better than any knowledge I have of a second language I quickly discovered why most communication was via email or IM instead of vocally.

Project Isolation

Being an enterprise scale organisation the work was all about projects, and who was sponsoring how many “resources”. Nowhere was this more apparent than the Scrum Board with its project-oriented swim-lanes. Each swim-lane had the names of the team members assigned to that project, and as the stand-up proceeded it walked down the board a project at a time with each member of the sub-team providing an update.

It was fairly apparent right from the moment I started, just by reading the body language of the team members, that there was often little real interest in what the rest of the team was doing. Those that did, cut across projects to some degree because they tended to nurse the build system, deployments and monitoring. A couple of team members never attended our stand-up because they already attended a different one that encompassed their project.

To be fair some of the apathy at the stand-up was almost certainly down to its excessive length. And with little reason for attending except to provide a status update for the managers it’s no surprise those mostly on the periphery zoned out. Sometimes the only common goal of the team seemed to be to not break the system.

Code Isolation

During my short stint I effectively had one feature to work on. There were a couple of other minor tweaks to begin with but ultimately my project was one feature (nay, user story) and it took 5 months to deliver. That one feature involved making a change in an area of the codebase that nobody else knew except one of the tech leads who I soon discovered was leaving. In fact, taking away his days off after the announcement of his departure, I effectively had 3 days for any handover.

Not only were there no docs to work from there were no tests either. The only real knowledge about how any of the service was expected to behave had left firmly inside the head of the author. This pretty much just left doing a spot of software archaeology with the VCS in the hope that the commit messages might contain some extra clues. Many features had been tracked in a feature tracking tool but there were not enough licenses to go round so I had to hassle a teammate to look things up. Even then it often wasn’t worth it as there were no useful details; it felt like the ticket was just there to “tick a box”.

The code relied heavily on the caller “doing the right thing” so any understanding only made sense if you already knew what the caller was supposed to do, and that relied heavily on knowledge of the problem domain and the organisation’s other systems. (At the interview I made it perfectly clear that I still knew little about the problem domain, despite the many years I have worked in it [3].)

Methodology Isolation

Ever since I had my epiphany [4] around testing all those years ago I have become a firm believer in TDD and automated testing as the preferred approach to the sustainable delivery of quality software. Being told early in the project that “you won’t have time to write tests”, despite being asked in the interview about what your approach is, did not bode well.

It soon became apparent that the previous approach had been to rush something out and rely on manual, end-to-end testing and the customer doing things “right”. Validation was almost entirely left to the underlying maths library and so bizarre errors manifested and needed investigating by the developers due to a lack of basic error handling and reporting [5].

With no way of knowing if I had broken anything, because I didn’t know for sure what anything was supposed to do, my only recourse was to write new code with tests and then refactor later when someone (potentially me) could be sure that it was safe to do so. For existing code that I had to change or understand I would write a barrage of tests first to try and ensure I didn’t accidentally break anything. In some cases it was hard to know what was “by design” and what was “by accident”.

Clearly not everyone took this approach, as you can see in “It Compiles, Ship It!”. My pessimism paid off though once the edge cases and little extras started appearing as I could turn around a fix or improvement (safely) in minutes due to my suite of automated unit and regression tests.

Environment Isolation

Sadly, despite my ability to push through changes quickly into the integration test environment, it still took weeks for them to actually appear in the production environment. When my first task, a handful of lines of boilerplate code, took 6 weeks to make it into production I assumed continuous delivery was not something they cared about.

On the contrary, for one aspect of the business, releases were very frequent. It was just that I was on the other side and due to some (IMHO) poor architecture and deployment decisions my part of the distributed system was tightly-coupled to another (major) system’s release cycle.

While it might seem great having my own integration test environment to play with, I ran into issues no one else knew about and I had no idea who was really using it and for what. Once again that information pretty much departed with the author.

Parting Thoughts

On reflection I have to look at my own behaviour first and ask myself whether I was at least partly responsible for feeling left out. Once we moved onto the same floor it was definitely easier to wander over and ask people questions, which I did. However when the response is “well I worked all this out by myself originally” and “that’s more than anyone ever gave me” I think it’s not entirely unfair to assume that knowledge sharing isn’t high on some people’s agenda.

I believe I was as welcoming as I normally am and was happy to help out where possible, given the limited knowledge I had acquired. I guess that culturally there was such a large drive for autonomy that the idea of just chatting about stuff to see what improvements in the system or process would be beneficial just wasn’t on the cards. A couple of times what should have been a constructive comment or question definitely came out of me more as a snide remark which is never a good sign. I’ve been trying hard to be more aware of any sarcasm, which unfortunately comes all too easily to me, and so not add to any unnecessary negativity but I know I failed a few times.

Ultimately I think it says a lot about an organisation that rejects your approach because “they are not a start-up” when your application of that approach has only ever been in large enterprises and none of them has ever had an issue with it before. On the contrary they have often been grateful for the insights and improvements that I’ve brought.

Maybe if I was a lot younger I’d not have known any better and stuck it out a bit more but these days I know it’s just not worth the effort. I feel comfortable that I left the place in a better state than I joined it by documenting various things and writing tests for the code I wrote. After a slightly rocky start my customer seemed pretty pleased with everything I delivered, which I guess is largely what matters most.

As ever, my main regret is leaving behind some people that I wish I could have gotten to know better. Maybe I will, in another life, one where the benefits of collaboration are more positively encouraged.

 

[1] Mostly my tenure has been measured in years, not months.

[2] Only one of which was left when I called it a day – the other two barely lasted a month or so.

[3] See “Problem Domain Expert or Technical Expert or Even Both” for more on this recurring theme.

[4] See “My [Unit] Testing Epiphany” and my more recent ACCU / Agile on the Beach talk “A Test of Strength” for what lead to my enlightenment.

[5] Poor error messages is a popular topic of mine, see “Terse Exception Messages”. Also “The Perils of DateTime.Parse()” covers one specific example.

Fallibility

Chris Oldwood from The OldWood Thing

I’ve generally been pretty fortunate with the people I’ve found myself working with. For the most part they’ve all been continuous learners and there has always been some give and take on both sides so that we’ve learned different things from each other. Many years ago on one particular contract I had the misfortune to be thrown a curveball twice, by two different teammates. This post is a reflection on both theirs and my behaviour.

The Unsolicited Review

The first incident occurred when I had only been working on the project for a few weeks. Whilst adding some new behaviour to one of the support command-line tools I spotted some C++ code similar to this:

std::vector<string*> hosts;

for (. . .)
  hosts.push_back(new string(. . .));

Having been used to using values, the RAII idiom and smart pointers for so long in C++ I was genuinely surprised by it. Naturally I flicked back through the commit log to see who wrote it and whether they could shed any light on it. This was also out of place given the rest of the code I’d seen. I discovered not only who the author was, but realised they were sitting but a few feet away and so decided to tap them up if they weren’t busy to find out a little more.

Although I cannot be sure, I believe that I approached them in a friendly manner and enquired why this particular piece of code used raw pointers instead of one of the more usual resource management techniques [1]. What I expected was the usual kind of “Doh!” reply that we often give when we noticed we’ve done something silly. What I absolutely wasn’t prepared for was the look of anger on their face followed by them barking “Are you reviewing my code? Have I asked you to do that?”

In somewhat of a daze I apologised for interrupting them and left the code as-was for the time being until I had due cause to fix it – I didn’t want to be seen to be going behind someone’s back either at this point as that might only cause even more friction.

Not long after this episode I had to work more closely with them on the build and deployment scripts. They would make code changes but then make no effort to test them, so even when I knew they were wrong I felt I should wait for the build to fail (a 2 hour process!) rather than be seen to “review” it.

Luckily the person left soon after, but I had already been given the remit to fix as many memory leaks as possible so could close out my original issue before that point.

Whose Bug?

The second incident features someone I actually referred to very briefly in a post over 5 years ago (“Can Code Be Too Simple?”), but that was for a different reason a little while after the following one.

I got pulled into a support conversation after some compute nodes appeared to be failing to load the cache file for a newly developed cache mechanism. For some reason the cache file appeared to be corrupted and so every time the compute process started, it choked on loading it. The file was copied from a UNC share on-demand and so the assumption was that this was when the corruption was happening.

What I quickly discovered was that the focus of the investigation was around the Windows API call CopyFile(). The hypothesis was that there was a bug in this function which was causing the file to become truncated.

Personally I found this hypothesis somewhat curious. I suggested to the author that the chances of there being a bug in such a core Windows API call in a version of Windows Server that was five years old was incredibly slim – not impossible of course, but highly unlikely. Their response was that “my code works” and therefore the bug must be in the Windows call. Try as I might to get them to entertain other possibilities and to investigate other avenues – that our code elsewhere might have a problem – they simply refused to accept it.

Feeling their analysis was somewhat lacklustre I took a look at the log files myself for both the compute and nanny processes and quickly discovered the source of the corruption. (The network contention copying the file was causing it to exceed the process start-up timeout and it was getting killed by the nanny during the lengthy CopyFile() call [2].)

Even when I showed them the log messages which backed up my own hypothesis they were still somewhat unconvinced until the fix went in and the problem went away.

Failure is Always an Option

Although I hadn’t heard it back then, this quote from Jeffrey Snover really sums up the attitude I’ve always tried to adopt with my team mates:

“When confronted by conflict respond with curiosity.”

Hence whenever someone has found a fault in my code or I might have done the same with theirs I do not just assume I’m right. In the first example I was 99% sure I knew how to fix the code but that wasn’t enough, I wanted to know if I was missing something I didn’t know about C++ or the codebase, or if the same was true for the author. In short I wanted to fix the root cause not just the symptoms.

In the second example there was clearly a conflict in our approaches. I’m willing to accept that any bug is almost certainly of my own making and that I’ll spend as much time as possible working on that basis until the only option left is it for to be in someone else’s code. Although I was okay to entertain their hypothesis, I also wanted to understand why they felt so sure of their own work as Windows API bugs are, in my experience, pretty rare and well documented [3].

Everyone has their off days and I’m no exception. If these had been one of those I’d not be writing about them. On the contrary these were just the beginning of some further unfortunate experiences. Both people continued to display tendencies that showed they were overconfident in their approach whilst also making it difficult for anyone else to critique their work. For (supposedly) experienced professionals I would have expected a little more personal reflection and openness.

The consequence of being such a closed book is that it is hard for others who may be able to provide valuable insights and learning to want to do so. When you work with people who are naturally reflective and inquisitive you get a buzz from helping them grow, and likewise when they teach you something new in return. With junior programmers you can allow for a certain amount of arrogance [4] and that’s a challenge worth taking on, but with much older programmers the view that “an old dog can’t learn new tricks” makes the prospect far less rewarding.

As an “old dog” myself I know that I probably have to work a little harder these days to appear open and attentive to change and I believe that process starts by accepting I’m far from infallible.

 

[1] In this instance simply using string values directly was more than adequate.

[2] The immediate fix of course was simply to copy to a temporary filename and then rename on completion, see “Copy & Rename (Like Copy & Swap But For File-Systems)”.

[3] The “Intriguing SCHTASKS Bug” that I found back in 2011 was certainly unusual, but a little googling turned up an answer reasonably quickly.

[4] See “The Downs and Ups of Being an ACCU Member” for my own watershed moment about how high the bar really goes.