How Much Does A Mistake Cost?

Kick the can down the road, spread the blame …

Mistakes can be expensive, depending on one’s accounting method.

I can think of four or five instances in my career … without trying to think of instances.

 

1

Once upon a time “working correctly” was more important than cost (for that matter, cost was third: “proper operation” and “meeting schedule” were #1 and #2).

It was a phase-locked loop circuit for a satellite. More to it than that, but a PLL was the core. The cost of a mistake was the cost of launch, the cost of a replacement, and perhaps the cost of bankruptcy.

A moderately experienced engineer (MSEE with 5 yrs or so) was designing the control board. All worked well and the numbers >looked< good … then the brassboard came out of manufacturing … and it didn’t work. Brassboard was the last stage before flight-qualification – in essence, it IS the flight qualification network.

No one takes the time to do more than skim – perhaps only glance – at an engineering report. A proper one is long, boring, and full of math.

As was this one. But the problem might have been caught earlier if the reports had been properly reviewed.

During the “What happened?” phase, it was discovered that the lead engineer used a “memorized” equation – and forgot to square one term in the denominator of a moderately complex expression. Of course the numbers worked – the whole analysis was built on the term with x rather than x^2. The results were all consistent with one another, as they should have been, but the starting point was wrong – and so were the results.
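A minimal sketch of how a dropped square can hide. The story does not say which equation was involved, so the first-order low-pass magnitude response stands in for it here; the function names and the 1 kHz corner frequency are assumptions for illustration only, not the actual PLL expression.

```python
import math

FC_HZ = 1_000.0  # assumed corner frequency, illustration only


def gain_correct(f_hz, fc_hz):
    """First-order low-pass magnitude: 1 / sqrt(1 + (f/fc)^2)."""
    x = f_hz / fc_hz
    return 1.0 / math.sqrt(1.0 + x**2)


def gain_misremembered(f_hz, fc_hz):
    """Same expression with the square forgotten (the hypothetical slip)."""
    x = f_hz / fc_hz
    return 1.0 / math.sqrt(1.0 + x)


def gain_db(gain):
    """Convert a linear magnitude to decibels."""
    return 20.0 * math.log10(gain)


if __name__ == "__main__":
    # Compare both versions below, at, and above the corner frequency.
    for f_hz in (100.0, 1_000.0, 10_000.0):
        good = gain_db(gain_correct(f_hz, FC_HZ))
        bad = gain_db(gain_misremembered(f_hz, FC_HZ))
        print(f"{f_hz:8.0f} Hz  correct {good:7.2f} dB  misremembered {bad:7.2f} dB")
```

In this stand-in, the two versions agree exactly at the corner frequency (x = 1), so a single spot check there would pass. A decade above the corner they differ by close to 10 dB. Everything computed from the wrong version is still consistent with itself, which is why a reviewer who only skims sees nothing wrong.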

The lead engineer was almost terminated – it cost the company much money and time as the mistake had been made months earlier and he should have caught it – but with enough colleague support (and the fact that the reviewers didn’t catch the error either), management kept him on. But he had not yet regained a lead role by the time I left the organization. I never heard anything more.

That part of the project was put on high priority and overtime paid (and cost absorbed by the company) … recall I stated that proper operation and schedule were priority #1 and #2. As far as I recall, the company was successful with the project … but not so much so with the expected profit.

Had the mistake not been caught, the bird would have launched … and failed to meet its objective. Although it’s likely the mistake would have been caught later in the verification process, by that time it would have been too late to meet schedule.

 

I’ve not seen the attitude of “correct is most important” in a number of years. Talked of, yes … but not practiced when the chips fall. I get the idea that nowadays, there’s more money to be made in “progress” than first-round “success”.

I don’t memorize equations; I memorize networks and where the necessary equations can be found (“I want a log function with MOS. What are the proper subthreshold equations?”). And write them down along with derivations and project notes … which leads to many of the articles I may end up posting on this site.

Has anybody noticed how often reference textbooks leave so much out? Not every term equals “1” or goes to “0” in the real world.

 

2

Coming on Christmas. IC project behind schedule; unexpected problems – not least of which were inadequate device models and computers (good computers were – and are – expensive; why spend the money? – the engineers can work around them).

A very senior engineer was forced to work over Christmas to meet an EOY due date. I’m not sure what the stick was but it was there. His H-1B helper was threatened with deportation to the point of tears (which I directly overheard). The engineer did due diligence, got the plans out before the EOY deadline with the materials and models he had … and as expected, the project didn’t work in silicon – the models were wrong through no fault of his own.

The manager was praised and given a bonus; “he” made schedule. The engineer was admonished; his design failed and cost the company money. (The manager didn’t meet schedule, the engineer did; the manager stayed home and enjoyed the holidays. But that’s the way the ball bounces …)

“First to market” with junk doesn’t do anyone any favors … Meeting a schedule does not assure success (see #4).

I was not involved with that project but I terminated my contract with that organization shortly thereafter (I was 1099, not W2). That boss is still a member of my “Top 5” list (he’s held that position for near-on 20 years – so bad bosses may be memorable but not common).

 

3

A software-controlled data-acquisition system. Software development was in lead role. Each of the individual systems worked more or less as desired, but during prototype testing of the entire project, some flaws were discovered – the DAQ system appeared to be dead, no data was recorded.

Much effort was expended and blame abounded; the flaw in the DAQ board could not be found and management was on the backs of the hardware development team to find the problem. The system never did work, the project shut down, and hardware people let go as “not competent”.

During the follow-up debrief, it was discovered that when a parameter was changed in software – say channel gain – the entire page of code needed to be re-written onto EEPROM, but the way the program was designed, only the single changed parameter was re-written. The code failed, not the hardware. But the idea that software was infallible was so ingrained, a problem in code was not even considered.
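A minimal sketch of that failure mode, under loud assumptions: the story does not describe the part’s actual write behaviour, so this toy model assumes each write cycle erases the whole page before programming the supplied bytes. `PagedEeprom`, the 8-byte page size, and the settings values are all hypothetical.

```python
PAGE_SIZE = 8   # assumed page size in bytes
ERASED = 0xFF   # value of an erased cell


class PagedEeprom:
    """Toy model: every write cycle erases the whole page, then programs the data."""

    def __init__(self, pages):
        self.cells = bytearray([ERASED]) * (pages * PAGE_SIZE)

    def write_cycle(self, page, offset, data):
        if offset + len(data) > PAGE_SIZE:
            raise ValueError("write crosses a page boundary")
        start = page * PAGE_SIZE
        # Assumed behaviour: the whole page is erased first ...
        self.cells[start:start + PAGE_SIZE] = bytes([ERASED]) * PAGE_SIZE
        # ... and only the supplied bytes are programmed afterwards.
        self.cells[start + offset:start + offset + len(data)] = data

    def read_page(self, page):
        start = page * PAGE_SIZE
        return bytes(self.cells[start:start + PAGE_SIZE])


def update_param_naive(eeprom, page, offset, value):
    """Write only the changed byte; the rest of the page is left erased."""
    eeprom.write_cycle(page, offset, bytes([value]))


def update_param_full_page(eeprom, page, offset, value):
    """Read-modify-write: rebuild the whole page image, then write all of it."""
    image = bytearray(eeprom.read_page(page))
    image[offset] = value
    eeprom.write_cycle(page, 0, bytes(image))


if __name__ == "__main__":
    settings = bytes([1, 2, 3, 4, 5, 6, 7, 8])  # hypothetical channel settings
    for update in (update_param_naive, update_param_full_page):
        eeprom = PagedEeprom(pages=1)
        eeprom.write_cycle(0, 0, settings)
        update(eeprom, 0, 2, 0x10)  # change one parameter, e.g. a channel gain
        print(f"{update.__name__:24s} {eeprom.read_page(0).hex()}")
```

In this model the naive update keeps the new gain value but leaves every other setting on the page erased, which would look a lot like a dead board. The read-modify-write version changes the one byte and preserves the rest.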

Money wasted and people lost their jobs over an unrealistic belief in “the computer”. The software people were kept on though; “software development is too important to be disrupted by changing the team” as I recall the quote.

 

4

Long-term project – multi-year development. Special procurement procedures required; limited approved manufacturer list; long lead time components. Inexperienced but favoured project lead. You can see where this is headed, eh?

Much time and effort is spent on schedules and cost … even before the design is defined. Funding is awarded based on low cost and quick schedule.

The design process begins. The project gets to the point where the schedule mandates “Parts Ordered” … yet the design hadn’t been completed to the point where it was time to order parts (some “approved” parts are no longer available; some have excessively long lead times – these issues need to be worked around … and that takes time and re-work). But it was mandated that parts be ordered by this date, so parts were ordered – based on short-lead time. Right parts? No, but a checkmark on the schedule was met – and apparently, that was the primary goal. Later events did nothing to assuage that notion.

Time moves on; so does the project lead. He had done such a good job to that point, he was promoted to another project to keep working his scheduling magic.

18 months or so later. The project is in test phase … and doesn’t work. Smell-the-smoke type of doesn’t work. The problem was traced back to the parts that were mandated in order to meet the schedule. This moves the schedule back almost the entire 18 months. The problem board needs to be re-designed, parts re-ordered (long lead times), and PCBs re-tested and certified. Because it was a primary power board, almost all other progress needed to be delayed.

There’s always blame. Where did it fall?

Did the original project lead that made the decision get any blame? No, he was successful, no longer on that project, and had done his job by meeting schedule.

Did the schedule itself get the blame? No, it was developed by people that “know” so it couldn’t be wrong. Besides, it helped win the proposal.

It came down on the engineers for “not pushing back hard enough on management” at the time. That’s easy for management to say well after the fact … I don’t know what your experiences have taught you, but mine suggest that management really, REALLY hates being pushed back on at the time pushing back is necessary. To the point of threatening one’s employment. So duck your head and kick the can down the road. Count on having a chair when the music stops.

I had been on the project early on, but luckily, I had also moved on to a different project.

 

It’s not cynicism, it’s experience.

 

That’s good for now.

 

One more comes to mind in which I was the prime factor. Fairly early as an engineer, I built test verification hardware for an electronics factory. I was charged with re-designing a test module that measured a certain parameter and provided a simple “PASS”/“FAIL” output. To verify operation, I was given a “Gold” card unit. This benchmark was pulled from production after having passed all test procedures of which mine was only one. This directly affected profit – once pulled, it was no longer usable as a sellable product yet had absorbed all the cost of a sellable product.

My design kept failing the card. I worked and worked on my test unit and everything looked fine. As I recall, it was simply an amplifier, a filter, and a comparator. I could not find an error … but I was fairly new out of school. Meanwhile, I had my manager breathing down my neck about how poor an engineer I was and why had I even been hired. Now, unlike many of my peers, I had worked at a high-reliability facility as a technician for a number of years before I got my engineering degree and so I had some degree of legitimate confidence in my ability … but not so much that I felt it impossible to make a mistake. But something this basic? (“Experience: The wisdom to recognize a mistake when you make it again.”)

I should note the production line for this component was shut down during this effort. The Heat Is On …

Finally, I gave up and told management that if I was so bad they should bring in a senior engineer to look over my design. Out on a very skinny limb … but I was out there in any case.

They didn’t bring in someone from a different part of our plant – they brought out a very senior engineer from corporate in a different state. (Oops, gulp!!!). He was not pleased with the situation and in particular, not pleased with me.

He went over my design, my notes, and the actual production test unit I had built. He seemed less peeved with me, and then he disappeared with my box and the benchmark unit. Then he went back home. I wait for the pink slip … this could be termination for cause.

It turned out the gold unit was faulty; my design worked as it was supposed to – it simply didn’t give the results expected. It turned out the calculation test software was generating a PASS rate of just below the target (92% vs. 95%? I don’t recall the exact numbers). The actual pass rate was closer to 55% if I recall. A gross failure.

Was I vindicated? Hm-m-m … at the time it seemed so, but hindsight suggests not. There were no tears when I chose to leave sometime later (and doing so worked out well for me later on). But I was young and stupid in those days. Now I’m old and stupid and see things in a different light.
