DaveWentzel.com            All Things Data

DevOps

Understand accounting and you'll understand the trends in software development and IT

(Overheard at the water cooler many years ago)..."Why are we moving to kanban and agile?  I don't understand it, our teams always deliver."

(Overheard at a recent SQL Saturday I attended)..."One of our ISVs (Independent Software Vendors) is now offering subscription-based licensing in addition to their standard release-based licensing.  The subscriptions are almost twice as much as release-based licensing, yet my stupid company just switched to subscription.  There goes my raise."

(Overheard from a DBA in the hallway)..."I can't believe we are moving all of our data to the cloud.  Security is going to suck and by my calculations it's going to actually cost us more money.  I don't understand the rationale for this.  I guess we'll all be laid off soon."  

There are trends in our industry that baffle IT folks.  The above are just a few examples.  There are (somewhat) hidden reasons for these trends.

Accounting "gimmicks".  

If you are involved in software development or IT, it behooves you to understand this stuff. YOUR CAREER AND FUTURE ARE AT STAKE.  

Let's dive right in.  

Quick Accounting Primer on Expenses

Companies spend money on stuff.  These are expenses.  Expenses are always classified as either Capital Expenses or Operational Expenses.  A table is a good way to visually represent the differences.  

|   | Capital Expense | Operational Expense |
|---|---|---|
| aka | capex | opex |
| definition | the cost of developing or providing non-consumable goods that support the business. | the ongoing cost of operating a business. |
| "value" | the expense has a usable life of more than a year, hence it has longer-term value | the expense is consumed within a year and then has zero value |
| example | a 3D printer | the toner material for the 3D printer |
| another example | a new delivery truck | toilet paper (it has no value once consumed) |
| one last example | building a new warehouse that the company owns | leasing a new warehouse from another company that owns the land and erected the building for you |
| accounting treatment | it is added to an asset account and the company's cash flow statement as an investment. | shown as a current expense, recorded immediately and subtracts from income, reducing net profit and thus taxes |
| effect on profits and taxes | is deducted from earnings/reduces profits over its usable life.  this is called depreciation | deducted from earnings and will reduce profits and taxes in the year it is paid/incurred |
| extremely simple example | a truck is purchased for 100K and is added as a fixed asset with a useful 10 year life.  10K may be deducted to offset profit/taxes in each year for the next 10 years (VERY simplified example).  The annual 10K allotment is handled similarly to opex for that year. | Toilet paper purchases are expensed and deduct from profits/taxes in the year the tp is used. |
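To make the truck row concrete, here is a minimal Python sketch of straight-line depreciation. It assumes no salvage value and no partial-year conventions (real schedules are more involved), and the function name and the 5K toilet paper figure are made up for illustration.

```python
def straight_line_schedule(cost, useful_life_years):
    """Return the amount deducted in each year under straight-line depreciation.

    Simplification: no salvage value and no partial-year conventions.
    """
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    annual = cost / useful_life_years
    return [annual] * useful_life_years


# Capex: the 100K truck with a 10-year useful life from the table.
truck_deductions = straight_line_schedule(100_000, 10)

# Opex: toilet paper is deducted in full in the year it is used.
# The 5K figure is hypothetical.
toilet_paper_deductions = [5_000]

print("truck, per year:", truck_deductions[0])
print("truck, total over its life:", sum(truck_deductions))
print("toilet paper, year 1:", toilet_paper_deductions[0])
```

Either way the full cost is eventually deducted. The only difference is which years take the hit.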

One quick note...R&D Expenses

The above table shouldn't be difficult to understand.  If it is, just trust me, or do your own research.  

Now let's start to get a bit tricky.  GAAP (Generally Accepted Accounting Principles) has a classification for expenses called "research and development".  These expenses are opex.  This is sometimes counter-intuitive for people.  If I'm a pharma company my R&D could lead to a breakthrough drug that nets me millions in profits over decades.  Shouldn't these be capex?  Is this an "investment"?  

Not generally.  The rationale is that at any point an R&D project might be abandoned and there will be no "asset" to the pharma company.  

If you work in software development then you may consider yourself R&D.  Right?  

Not always.  

But let's not get ahead of ourselves.  

Which is better...capex or opex?

First of all, there are a lot of gray areas where a given expense might be classified as capex or opex depending on just how you are willing to bend the GAAP rules and justify your position.  

In situations where an expense could be either capex or opex some companies will prefer one or the other based on how it will benefit them the most. 

A good auditor/investor/accountant can learn a lot about a company's management and goals by looking at how it decides whether an expense goes on the books as capex or opex.  Frankly, many expenses, especially IT expenses like developing and maintaining software, could be either capex or opex depending on your world view.  Here are some generalities that I believe and that others may disagree with:

| Entity | Capex | Opex | Reasoning |
|---|---|---|---|
| pre-IPO companies | x |   | Pre-IPO companies prefer large expenses to be capex because the expense can be split amongst many years which has the appearance of inflating current year's profits at the expense of future profits.  Pre-IPO companies want to look as profitable as possible to get investors excited. |
| Investors |   | x | See above.  Investors would rather see higher opex because nothing is "hidden" by using depreciation methods that hide expenses in later years. |
| Companies interested in minimizing taxes |   | x | Costs are accounted for sooner.  This also has a "smoothing" effect on profits.  Forecasting is easier and you won't see as many huge "one time charges" to profits for some big capex item.  (Note that investors don't want to lose their profits to taxes, which is why investors like companies that don't try to hide big capex expenses). |
| software development expenses (R&D) at ISVs (Independent Software Vendors) |   | x | Many will disagree with me on this.  I'll discuss this more below. |
| IT department expenses (including software development) for non ISVs (banks, pharma companies, finance, etc) | x |   | Here is the sticking point and what I feel is the answer to all of those questions at the beginning of this post.  Companies want to move to Opex nowadays. |

Some companies may want to defer as much expense as possible to look profitable (those pre-IPOs).  Those companies will want to capitalize as much as they can.  Otherwise, generally, companies nowadays prefer opex to capex.  
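A quick sketch of why a pre-IPO company might want to capitalize: the same project cost, booked two ways, produces very different current-year profits. All figures (1M revenue, 700K other costs, a 300K project written off over 3 years) are hypothetical, and taxes are ignored to keep the arithmetic obvious.

```python
def yearly_profit(revenue, other_costs, project_cost, years, capitalize, life_years):
    """Reported profit per year for one project booked as capex or as opex.

    capitalize=True spreads project_cost evenly over life_years (capex);
    capitalize=False charges all of it in the first year (opex).
    """
    profits = []
    for year in range(years):
        if capitalize:
            charge = project_cost / life_years if year < life_years else 0
        else:
            charge = project_cost if year == 0 else 0
        profits.append(revenue - other_costs - charge)
    return profits


# Hypothetical company: same revenue and other costs every year.
as_capex = yearly_profit(1_000_000, 700_000, 300_000, years=3, capitalize=True, life_years=3)
as_opex = yearly_profit(1_000_000, 700_000, 300_000, years=3, capitalize=False, life_years=3)

print("capex profits by year:", as_capex)
print("opex profits by year: ", as_opex)
print("three-year totals:", sum(as_capex), sum(as_opex))
```

The three-year totals match. Capitalizing only moves profit into the current year, which is exactly the appearance a pre-IPO company is after.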

Even within a company there are conflicting interests.  Some folks want opex, some want capex.  A good CFO/CEO will favor opex because that is what their investors want to see.  But within those companies the CIO/CTO may feel differently.  Many companies view IT costs as out-of-control.  How better to make the budget look smaller than to shift expenses to capex?  So now you have the CIO/CTO working at cross-purposes with the goals of the company.  

Isn't this stuff complicated?  

The Rules for Various IT Expenses

Here are some "rules" for various IT expenditures.  Look at the list carefully and you'll see that the trend in our industry today is to move away from capex and towards opex.  This has been the general trend in business since at least Enron and the Dot Com Bubble.  

| Expenditure | Capex | Opex | Reasoning |
|---|---|---|---|
| software licenses | x |   | They have a usable life of more than one year. |
| software subscriptions |   | x | You are only "renting" the software.  Fail to pay the rent and you have no asset. |
| purchased laptop | x |   | Has a usable life of more than one year. |
| leased laptop |   | x | No asset after the lease expires. |
| "cloud" |   | x | "renting"...no asset...do you see the trend? |
| data centers | x |   | Huge upfront costs.  This is an asset. |
| software licenses for software deployed in the cloud |   | x | Yup, you can buy a license for an IDE and deploy it on AWS and move it all to opex.  I'll say it again.  If you purchase software that would otherwise run in your data center, yet deploy it on a VM in the cloud, magically the capex becomes opex. |

 

The Rules for Software Development Expenses

There are actually FASB and IFRS rules that govern how to expense software development.   They are very complex.  This post is a simplification of the issues.  Feel free to use google to confirm.  You may find a lot of information that conflicts with what I have here.  I suggest you actually read the rules; you may find your company is doing things incorrectly, to its detriment.  But I'm not an accountant nor do I play one on TV.  

First, your company's primary business activity determines whether you should be using capex or opex for software development/IT costs.   

|   | ISV | non-ISV |
|---|---|---|
| Core Business | The company's core business is developing software to sell to others. | The company's core business is not software development but it undertakes software development to increase efficiencies in its core competencies. |
| Example | DaveSQL LLC creates tools that are sold to other companies to aid them in SQL Server DBA tasks. | Dave BioPharma, Inc creates new drugs.  It buys DaveSQL's products to help manage its servers. |
| IT expenditures should be... | always opex.  People disagree with this...they are wrong.  Go read the rules.  R&D is always opex.  If the ISV cancels the development effort at any time there is no asset. | at times this is capex, at other times, opex.  More in the next section. |

For non-ISVs...when is software development capex vs opex?

Remember, all software development costs (well, most I guess) should be opex for an ISV.  This table is solely for IT shops at traditional companies.  The "phase" of the software development project at a non-ISV determines how the expense is handled.  

| Expenditure | Capex | Opex | Reasoning |
|---|---|---|---|
| Functional design/"Evaluation Phase" |   | x | If the project is not feasible and is scrapped there is no asset, so it is R&D, which is opex in the traditional sense. |
| Development Phase including detailed technical specifications | x |   | The outcome is an asset.  Even if the software is useless or obsolete by the end of this phase and is scrapped, it is still capex.  There is still an asset.  That asset may be worthless and can't be sold, but it is still an asset. |
| Post-implementation |   | x | This one should be obvious.  This is production support. |
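The phase table boils down to a lookup. The Python sketch below encodes only the simplified rules from this post (not the actual FASB/IFRS text), and the phase names, function name, and project costs are hypothetical.

```python
from enum import Enum


class Phase(Enum):
    EVALUATION = "functional design / evaluation"
    DEVELOPMENT = "development, including detailed technical specifications"
    POST_IMPLEMENTATION = "post-implementation / production support"


# Simplified treatment for a non-ISV, taken from the table above.
TREATMENT = {
    Phase.EVALUATION: "opex",
    Phase.DEVELOPMENT: "capex",
    Phase.POST_IMPLEMENTATION: "opex",
}


def split_costs(costs_by_phase):
    """Total up a project's costs into capex and opex buckets."""
    totals = {"capex": 0, "opex": 0}
    for phase, amount in costs_by_phase.items():
        totals[TREATMENT[phase]] += amount
    return totals


# Hypothetical internal project at a non-ISV.
project = {
    Phase.EVALUATION: 50_000,
    Phase.DEVELOPMENT: 400_000,
    Phase.POST_IMPLEMENTATION: 80_000,
}
print(split_costs(project))
```

Notice how much of the total lands in capex simply because of which phase the work was booked under.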

 

 

 

 

 


The Rules for Software Development Expenses

If you haven't noticed thus far in this post, there is a tendency for most companies to prefer opex over capex.  This is not an iron-clad rule, but that is the trend in the business world today.  So, if we were accountants/CFOs/analysts/investors we would want to figure out ways to get more opex and less capex from our software development efforts.  This helps us pay less taxes.  

First thing you should note is that the last table is very waterfall-ish in its "phases".  Design to development to ops.  But what if we were agile and used cross-functional teams?  Could we make some of that capex into opex?  Yep.  And there's the trick.  

Waterfall generates assets too quickly under accounting rules.  It has detailed design documents after all...and those are assets.  So there's another reason why agilists tout "Working software over comprehensive documentation".  I'll bet you didn't know that.  Agile, if practiced properly and understood by your Finance Guys, will have less capex.  

Agile is the best way I've ever seen to shift almost all non-ISV software development costs to opex.  Just get some working software out the door and then bug fix it.  That probably seems like an oversimplification to a technologist, but not-so-much to an accountant.  

Bending the Rules Even More

You can justify anything if you try hard enough.  For instance, you can front-load opex using waterfall if you lump that comprehensive documentation in with your "evaluation phase" documentation.  Using that trick we could re-classify just about anything.  

Please note that pre-IPO companies can also bend the rules in the reverse direction to generate more capex to make their current year profits higher.  Like I said at the beginning of this post, this is all "accounting gimmicks".  

The Ultimate Rule Bender...the Cloud

Quick thought experiment...your customer comes to you and says, "Your software doesn't work because it doesn't do X properly."  You decide that you agree and proceed to rectify it.  Is this work capex or opex?  Here is the rule...upgrades and enhancements to non-ISV software are capex...maintenance and bug fixes are opex.  So, is the work you are about to undertake capex or opex?  That depends.  Your customer would probably label the "issue" a bug (hence opex), but your company may disagree and deem it a "requirements change", hence an enhancement, hence capex.  

But wait, we don't want capex...we want opex, so do we have to admit our software is buggy to get an opex classification?

Nope.  

Enter the cloud.  

All cloud application development, even enhancements and upgrades, is opex because the software is rented.  Nice.  Now you can get an opex expenditure and never admit that your software was buggy.  

More on the Cloud and Software Subscriptions

With traditional release-based licensing an ISV would only make money when the next release was available.  This had an unpredictable effect on profits.  If you missed a release date you may not make any money.  Subscription-based licensing fixes that by "smoothing" out the profits.  Recently Adobe moved their software packages to a subscription-only model.  When they first released their earnings under this model their profits were down radically based on where most of their customers were in the release cycle.  They basically picked an inopportune time to change their model.  
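Here is a rough Python sketch of the "smoothing" idea, seen from the ISV's side for a single customer. The numbers are invented: a 1,200 license bought at each release every 24 months versus a 50 monthly subscription, which over the same period comes to about twice as much.

```python
def release_revenue(months, release_interval, license_price):
    """Monthly revenue when the customer buys a license at each release."""
    return [license_price if m % release_interval == 0 else 0 for m in range(months)]


def subscription_revenue(months, monthly_fee):
    """Monthly revenue when the customer pays a flat subscription."""
    return [monthly_fee] * months


# Hypothetical pricing over four years for one customer.
by_release = release_revenue(48, release_interval=24, license_price=1_200)
by_subscription = subscription_revenue(48, monthly_fee=100)

print("release model: months with no revenue =", by_release.count(0))
print("release model total:", sum(by_release))
print("subscription model: min/max month =", min(by_subscription), max(by_subscription))
print("subscription model total:", sum(by_subscription))
```

The release model is two spikes and 46 empty months. The subscription model is flat, which is what makes forecasting easier on both sides of the deal.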

The buyer of software loves subscriptions for the same reason.  "Smoothed" expenses and no capex.  

Open Source Software and "Services"

I'm convinced that the opex/capex debate is one of the key reasons for the rise in the open source software (OSS) movement.  Most OSS vendors offer a version of their software for free and then try to make money by offering services.  To the user of OSS this is very appealing.  There is no upfront cost for the software (capex) and the customization services are opex.  

Not all OSS uses this model, but it is prevalent.  

Think of every blogger that offers free software to do performance analysis for SQL Servers.  Altruism aside, they do this to get you to use their tools hoping that you will attend their seminars to learn more.  Or purchase their consulting services.  It's really a great model.  

 

A History Lesson and Concluding Thoughts

Back in the Dot Com Days every company preferred capex to defer the hit to profits.  And times were good for IT guys who didn't look quite so expensive because their salaries were more-or-less capitalized.  Then the dot com bubble burst, the economy tanked, Enron blew up the world, and Sarbox came along.  Now most companies want to be as transparent as possible with their accounting.  And that means opex and less capex "one-time" charges to earnings.  

Every trend in our industry in the last 15 years is geared toward the move to opex.  

  • Why is there a push to be more agile and to use Kanban?  To get us to opex faster.  
  • Why are developers asked to do support work...a la "DevOps"?  To get more people under the support budget (opex).  
  • Why is every company tearing down their 10 year old data centers and moving to the cloud?  (Do I have to say it?)  
  • Why are ISVs moving to subscription-based software models?  So that their customers can opex the "rent".  (This also "smooths" the ISV's monthly recurring revenues.)  
  • Why is your company trying to move your on-premise SQL Servers to SQL Azure (or whatever it is called now)?  I think you got it.  

It behooves all technologists to understand the basics of accounting and economics.  Many of the trends in our industry can be traced to how those trends will ultimately affect profits.  You should be designing your software accordingly.  I have no clue what the future holds in our industry, but I sleep better knowing that I understand the economics of the decisions being made.  


You have just read "Understand accounting and you'll understand the trends in software development and IT" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.  

DevOps in the real world: Who should be responsible for index maintenance?

Larger ISVs usually employ at least two types of data professionals, the data developer and the database administrator.  In some situations, especially if DevOps is practiced, the lines may be a bit blurred regarding responsibilities.  I find it interesting in my consulting engagements which data professional is responsible for index maintenance.  Who is responsible for indexes at your company?  Who decides what indexes should exist?  Who creates them?  Who ensures proper index hygiene occurs (defragmentation, manual update stats)?  Which group should be responsible for these tasks?  

Informally, I've found at my engagements that when both groups exist, the DBAs are responsible for all aspects of indexing.  I think this is just flat-out wrong, though I'm probably in the minority on this.  Here's my reasoning:  

  • DBAs probably don't know all of the data access paths, and the most important ones we wish to optimize in the application.  Unfortunately your developers may not either, but that's a different issue.  
  • If you practice some variant of "database change review", where any and all schema changes to a database go through a review system to ensure they follow standards and are modeled sensibly, then you should have a good feel for what indexes should exist prior to those tables hitting your production systems.  
  • If you are doing performance testing then your developers should further understand the data access patterns.  Performance testing won't find every issue but the most egregious issues should be caught.  
  • Too many shops allow production DBAs to alter indexing schemes without consulting the data developers at all.  In EVERY case I've seen disaster because of this.  Here are a few examples:
    • we implemented queue tables as heaps and the DBAs didn't like that so they added clustered indexes and our queues experienced odd locking and concurrency behavior.  Contrary to google, sometimes heaps are good.  
    • DBAs applied indexes based on recommendations in the missing index DMVs and the plan caches.  The net result was cases where the same column in a table was the leading index column for 5 or more indexes, differing only by INCLUDED columns.  These DMVs must be countered with common sense and knowledge of your data.  
    • DBAs removed indexes because the cardinality at the time when they checked the index was LOW.  It's common for some tables to have a ProcessedStatusCode column that is binary (processed/not processed).  Throughout most of the day every row is in the processed state (1) until overnight batch processing starts.  Indexes on a ProcessedStatusCode column are critical, albeit counter-intuitive.  Many people believe that indexes on low-cardinality columns are useless, but in situations like this they are required.  
    • DBAs removed indexes because sys.dm_db_index_usage_stats showed that data access was heavily skewed to updates and very little to seeks/scans/lookups.  The problem is, the indexes were needed during month-end processing and were suddenly gone when they were required.  
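The "five indexes on the same leading column" problem is easy to check for once you have the index definitions in hand. The Python sketch below assumes you have already exported index metadata into simple records; the `IndexDef` shape, the function name, and the sample index names are all hypothetical, and it does not query SQL Server itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexDef:
    table: str
    name: str
    key_columns: tuple           # ordered key columns of the index
    included_columns: tuple = () # INCLUDED (non-key) columns


def overlapping_indexes(indexes, threshold=2):
    """Group indexes that share a table and identical key columns.

    Indexes in the same group differ only by their INCLUDED columns,
    so they are candidates for consolidation after a human review.
    """
    groups = defaultdict(list)
    for ix in indexes:
        groups[(ix.table, ix.key_columns)].append(ix.name)
    return {key: names for key, names in groups.items() if len(names) >= threshold}


# Hypothetical metadata for one table.
indexes = [
    IndexDef("Orders", "IX_Orders_CustomerId_1", ("CustomerId",), ("OrderDate",)),
    IndexDef("Orders", "IX_Orders_CustomerId_2", ("CustomerId",), ("OrderDate", "Total")),
    IndexDef("Orders", "IX_Orders_Status", ("ProcessedStatusCode",)),
]

for (table, keys), names in overlapping_indexes(indexes).items():
    print(table, keys, "->", names)
```

A report like this is only a conversation starter. As the examples above show, whether an index should stay or go still takes knowledge of the data and the access paths.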

Contrary to what you are probably thinking, this blog post is not an "us vs them", anti-DBA rant.  "But you said above that it is flat-out wrong that DBAs are responsible for indexing."  

Right!

The resolution to all of the issues above is simple.  Collaboration.  Take the last example where important indexes were missing during month-end processing.  Clearly the DBAs had some valid reasons to remove the indexes and the data devs had valid reasons for needing them.  After our month-end processing went from 12 hours to 7 days due to the missing indexes, and after everyone stopped playing the blame game and hootin and hollerin, we collaborated and determined that we could actually build those indexes a day or so before month-end processing and then remove them later.  These indexes are huge and the chargeback for the SAN storage was not insignificant.  

This was a non-zero-sum event.  We managed to save on storage, our system was faster during critical month-end processing (those indexes were less fragmented), and the system was marginally faster during standard daily processing where the update overhead to those indexes could be eliminated. So, here's a case where a bit of collaboration led to one of those truly rare "win-win" situations.  And that's what we all want during conflict resolution.  We don't want to hurt anyone's feelings.  

(I found this graphic on an EXTREMELY popular DevOps site.  Maybe I'm being picky but I still see a separation of Dev and Ops.  Where is the collaboration?  I don't understand how this graphic couldn't be titled "Waterfall")

This is just another example where DevOps works.  In my last few blog posts I pointed out that there is a misconception that DevOps means combining developers and ops guys.  It doesn't.  It is about situations like this.  Nobody is perfect.  I've worked in just as many shops where EVERY column in a table had an index because the developer figured that if a few indexes are good then more must be better.  Indexing is one of those cases where it really is difficult to determine which group should have ultimate responsibility, the developers or the DBAs.  Simple collaboration and following DevOps is the key.  


DevOps: WIP and final thoughts

Another DevOps post today...this one discusses WIP.  WIP is an abbreviation for "Work In Progress".  Limiting the amount of WIP you have (in other words, the amount of work you have started but not yet completed) is an excellent way to increase throughput in your software development pipeline.  

Work In Progress

The theory is that people and processes cannot multi-task as well as they think they can.  You can't really code two modules at the same time, or test two modules at the same time, or work on two sets of documentation at the same time.  When you work on two tasks at the same time you aren't really multi-tasking, you are context switching.  When you reduce the amount of WIP in the system you will reduce "inventory" (think of it as the amount of unsold work that is sitting in the warehouse half-assembled).  By reducing inventory you are limiting the number of bottlenecks that can occur, meaning your cycle time (the amount of wall clock time needed to move a piece of work from "started" to "done") should improve.  
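The link between WIP and cycle time can be put into numbers with Little's Law, a standard queueing result (this post doesn't name it, so treat the connection as my framing): average cycle time equals average WIP divided by average throughput. The Python sketch below assumes a stable system and uses made-up figures.

```python
def average_cycle_time(average_wip, throughput_per_day):
    """Little's Law: average cycle time = average WIP / average throughput.

    Holds for long-run averages in a stable system; it says nothing
    about any individual piece of work.
    """
    if throughput_per_day <= 0:
        raise ValueError("throughput_per_day must be positive")
    return average_wip / throughput_per_day


# Hypothetical team that finishes 2 items per day on average.
before = average_cycle_time(average_wip=12, throughput_per_day=2)
after = average_cycle_time(average_wip=6, throughput_per_day=2)

print("cycle time with 12 items in progress:", before, "days")
print("cycle time with 6 items in progress: ", after, "days")
```

If throughput holds steady, halving the WIP halves the average time an item spends between "started" and "done".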

DevOps improves upon Kanban's notion of WIP by also removing needless work from the system.  In DevOps you should always ask the question, "Is this work really needed?"  It might have been needed last week but not this week, so we should eliminate that work.  Outcomes are what matter, not the process, controls, or the work you complete.   We need to protect the company by not putting meaningless work into the system.   To use the manufacturing analogy again, if you have partially completed work as inventory then you'll find your inventory depreciates as it gets older and is no longer desired by your customers.  In this case the money you spent partially assembling that inventory is wasted.  This is sometimes considered "just-in-time manufacturing" and the equivalent in our industry is "just-in-time" software engineering.  

Here is a classic example of work that was needed but isn't anymore.  A new security patch is released for your DBMS version.  You add it to the backlog but a few months pass before you have bandwidth to work on regression testing the patch.  In the interim a new DBMS version is released that you want to upgrade to and it does not require the patch.  If you eliminate the work required to test the patch and simply apply that time to the DBMS version upgrade, then you've eliminated work.  It's even worse if you started regression testing the patch but didn't have time to fully complete it.  That time would've been wasted.  

DevOps is particularly concerned with this because we know we can eliminate a lot of operational work if we spend some time automating upfront by assigning dev resources to work with ops guys.  In this case we've eliminated work that isn't needed from the system.  

"Stop starting and start finishing"

...is an aphorism that sums up WIP limits quite well.  Obviously there are times when you may have a lot of WIP due to factors outside of your control.  For instance, you begin work on a task and determine that the requirements are not clear enough to continue.  You need to get a BA to refine the reqs.  In this case hard-core Kanban-ers would say that you cannot begin work on another task because if you do you'll never get a BA to refine the requirements properly.  Instead, you should simply wait and force the "steel thread" to the BA resource to get a better requirement.  In some cases that may work but the problem is that this doesn't take into consideration that the BA resource may have other, more pressing issues.  Basically, your team's emergency may not be another team's emergency.  I handle these situations by moving the "card" to the BA resource so it affects that resource's WIP limits and then have my blocked resource work on something else.  I don't like to make WIP limit adherence a religious dogma.  Rather, it is a guide to get us the best throughput.  Something as simple as denoting the steel threads with a special colored card is enough to get management to prioritize better.  

Most importantly WIP limits simply ensure that we don't have a bunch of tasks forever sitting at 80% complete.  It's very easy for a developer that needs requirement clarification to simply set the task aside and move on to something else.  WIP limits ensure there is accountability for task completion.  In the real world, in my experience, WIP limits break down when you have "steel threads" from one team to another.  This is by far the Number One Reason for WIP in large software engineering organizations.  As I said before, just because Team A is blocked by Team B doesn't mean the blocked task is a priority for Team B.  In organizations that struggle with steel thread management, such as large monolithic ISVs, WIP limits will probably not work because there are cultural and political reasons why teams don't collaborate.  Only when the culture is forced to change will WIP limits work.  
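As a rough illustration of moving the card so it counts against the BA's limit, here is a small Python sketch of a board with per-owner WIP limits. The class, method names, limits, and card titles are all hypothetical. Real Kanban tools model this differently.

```python
class WipLimitExceeded(Exception):
    """Raised when an owner would end up holding more cards than allowed."""


class Board:
    def __init__(self, limits):
        # limits maps an owner (person or team) to their WIP limit.
        self.limits = dict(limits)
        self.cards = {owner: [] for owner in self.limits}

    def _check_capacity(self, owner):
        if len(self.cards[owner]) >= self.limits[owner]:
            raise WipLimitExceeded(f"{owner} is already at their WIP limit")

    def add(self, owner, card):
        """Start a new card for an owner, respecting their limit."""
        self._check_capacity(owner)
        self.cards[owner].append(card)

    def hand_off(self, card, from_owner, to_owner):
        """Move a blocked card so it counts against the new owner's limit."""
        if card not in self.cards[from_owner]:
            raise ValueError(f"{from_owner} does not hold {card!r}")
        self._check_capacity(to_owner)
        self.cards[from_owner].remove(card)
        self.cards[to_owner].append(card)


board = Board({"dev": 1, "ba": 2})
board.add("dev", "invoice export")

# Requirements are unclear: the card moves to the BA and frees the developer.
board.hand_off("invoice export", "dev", "ba")
board.add("dev", "report fix")

print(board.cards)
```

The blocked card never disappears from the board. It simply becomes someone else's visible WIP, which is what gets it prioritized.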

The Need for Idle Time 

Everyone (or every resource) needs idle time or slack time.  It should be built in to the sprint schedule.  If no one has slack time then WIP gets stuck in the system in queues.  We've introduced unneeded bottlenecks and constraints to the system.  Therefore it is best to always allow for slack time in your development processes.  
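One way to see why slack matters is a textbook single-server queue (the M/M/1 model). That model is my assumption here, not something this post relies on, and real teams are messier. In it, the average wait before work even starts is utilization / (service rate × (1 − utilization)), which blows up as utilization approaches 100%.

```python
def mean_queue_wait(utilization, service_rate):
    """Mean time a task waits in queue before work starts (M/M/1 model).

    utilization:  fraction of time the resource is busy, 0 <= utilization < 1
    service_rate: tasks the resource can finish per day when busy
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    if service_rate <= 0:
        raise ValueError("service_rate must be positive")
    return utilization / (service_rate * (1 - utilization))


# Hypothetical resource that can finish one task per day when working.
for utilization in (0.5, 0.8, 0.9, 0.95):
    wait = mean_queue_wait(utilization, service_rate=1.0)
    print(f"{utilization:.0%} busy -> about {wait:.1f} days waiting in queue")
```

Going from 50% to 95% busy does not double the wait. In this model it multiplies it many times over, which is the queue buildup described above.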

There are other ways we can simultaneously reduce work and increase slack time.  Every process has constraints that we can identify and wish to protect.  I discussed this in a previous DevOps post.   These resources, once identified, can benefit from a "keeper".  A keeper is basically a helper that can move tasks around from work center to work center without wasting the constraint's time.  We need the constraint to be working on the important work, not on tasks that another resource could handle (checking-in/merging code, filling out paperwork, attending useless meetings).  The keeper can be an intern or a junior member of your team.  

Minimum Marketable Feature

Limiting WIP is one of the most important tenets of DevOps.  By limiting WIP we find that cycle times are reduced and throughput is improved.   This means that we can get the software to the customer faster.  This generates an invaluable feedback loop.  This is why one of the premises of agile is to release early and often.  It is also why Lean software development preaches understanding and delivering only the "minimum marketable feature" (MMF).  This is also sometimes called the Minimum Viable Product.  We only want to do enough to get sufficient feedback and make profits.  Anything else is just polishing the apple.  

"The Best Way to Get Work Done is to Not Do Work"

I'm certain I didn't invent this saying but I use it daily.  If you work in an ISV shop like I do then you are probably inundated every day with sales guys and BAs that want more and better functionality.  Many times the requirements are half-baked, at best.  Before we start doing work on these items I like to give these tasks some soak time.  I find that after some amount of time there is no real demand for these tasks and they die before anyone has wasted any time on them.  Or I find that work that was deemed necessary a month ago is no longer necessary (see my patching example above).  In these cases, don't do the work.  I know this sounds a little waterfall-ish (we only start work when it is fully documented) but this is why I never adhere to development practices as if they were dogma.  Every organization is different and every situation is unique.  But if you suffer from half-baked requirements that never get to a customer then consider a more waterfall-ish approach.  You are wasting time creating inventory that is never sold and WIP that never gets to done.  

How do you move to a DevOps Reference Architecture?

Here are my final thoughts on DevOps.  Obviously I think this is a software engineering method that shows some promise and isn't loaded with a bunch of useless platitudes and metrics gathering that offer no hope of getting better software to customers.  This is how I've approached selling and implementing a DevOps approach at organizations I've worked with.  

  1. Read up.  There is no DevOps Manifesto like the Agile Manifesto.  There's a lot of disjoint stuff out there.  My theories on DevOps tend to radically deviate from those who think DevOps is simply getting your developers to manage your operations.  
  2. DevOps is more culture change than process change.  That makes it different from, say, implementing scrum.  You really can begin to adopt DevOps from the bottom-up if you have to.  Start by having some Dev and Ops teams collaborate.  
  3. When you get some small wins using bottom-up then advertise that fact and try to get the top to buy-in and make the cultural changes to the entire organization.  The best way to get small wins is by having the dev guys help to automate some of the ops tasks that bottleneck the release process.  Management loves to see that stuff.  
  4. Remember that DevOps is not the end, rather the means to the end (which is profit).  Too many people adopt DevOps to claim that they are DevOps.  Meanwhile, their software sucks.  This is the same as people who say "We are an agile shop and don't do comprehensive documentation".  That's just agile for the sake of agile.  
  5. Don't rely too much on metrics.  How do you measure whether your team is collaborating more with DevOps?  You can't.  The metrics must be softer.  Less-sucky software is a good metric, but difficult to measure.  Or fewer meetings where Ops complains about the dev folks.  
  6. Not everyone in your organization needs to do DevOps.  A few good developers can start collaborating with a few senior ops guys.  The more mature IT professionals will see the value immediately.  The ones who incessantly complain because they feel Ops work is beneath them will only be a distraction.  Prove that DevOps works in your organization before shoving it down everyone's throats.  It may not work for you.  (If that is the case, then, of course, nothing has probably ever worked for you either).  
  7. Focus on your CM team first.  At every large ISV that I've worked for the CM team (change review board or whatever else you want to call it) was the life-sucking entity that everyone hated the most.  It was the bottleneck.  The CM process is viewed as a process with negative value in the value stream by the guys in the trenches.  And management hates the CM team too because too many bugs are still getting through.  The Review Board really has an important job to do but is relegated to merely being a rubber-stamp organization.  Change that perception and you've solved a lot.  
  8. Don't fall for the NoOps trap unless you are already a NoOps shop.  

Good luck with your journey to DevOps.  I hope this blog series helps you in your quest for greater profits and better software.  

 

You have just read "DevOps:  WIP and Final Thoughts" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.  


DevOps: Theory of Constraints

Another DevOps post today.  This one is on the Theory of Constraints and how it applies to DevOps.  None of this post is specific to DevOps so even if you don't practice DevOps you might learn something here to apply to your workplace.  The Theory of Constraints applies equally to DevOps and to improving the performance of a SQL Server.  If you understand the concepts and learn to recognize the symptoms that constraints impose on a system you'll find you are a much better IT professional.  

In most companies there are a very small number of resources (men, material, or machines) that dictate the output of the entire system.  These things are called "the constraints."  The company needs a trusted system to manage the flow of work to the constraints.  The constraints are constantly being wasted, which means they are likely being dramatically underutilized.  

That is an extremely strong paragraph and you should re-read it and think about your workplace.  Odds are your shop has a few go-to people who, if they quit or were run over by a bus, would be next-to-impossible to replace.  These people are always busy solving problems and consulting with others.  These people are the constraints.  These people waste most of their important time dealing with tasks that do nothing to better the business or relieve the constraint.  Think again about those go-to people at your workplace.  Do you see them being assigned menial tasks?  Are they constantly being hounded to get to meetings on time?  To do their timesheet on time?  These people are not working to maximum capacity even though they are the busiest people at your shop.  That is why they are actually under-utilized.  

What to do with constraints

Follow these steps to alleviate constraints.  Use the DevOps concepts I've already outlined in other posts to help you.  

Step 1:  Identify the Constraint.  Surprisingly most managers and scrum masters do not know where the most egregious constraints are on their team.  They'll point to those people that are complaining the most that they are overworked.  Sometimes it isn't them.  The constraint will be the person(s) where the work piles up.  Kanban boards can help to show you this.  Once identified, protect the constraint.  Do not allow the constraint to be sidetracked by time-wasting activities.  The constraint is likely the bottleneck because it is being mal-utilized.  Make sure the constraint is never wasted.  
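To make "where the work piles up" concrete, here is a minimal Python sketch that counts the cards sitting in each column of a board and reports the fullest one.  The board contents and the `find_constraint` name are made up for illustration, and it assumes you have already left out your "Done" column.

```python
from collections import Counter


def find_constraint(cards):
    """Return the (stage, count) pair for the stage holding the most cards.

    `cards` is a list of (card_id, stage) pairs describing the board right now.
    """
    queue_sizes = Counter(stage for _, stage in cards)
    return queue_sizes.most_common(1)[0]


# Hypothetical snapshot of a board (the "Done" column is excluded).
board = [
    ("A-1", "Dev"), ("A-2", "QA"), ("A-3", "QA"),
    ("A-4", "QA"), ("A-5", "Ops"), ("A-6", "Dev"),
]

stage, waiting = find_constraint(board)
print(f"Work is piling up in {stage}: {waiting} cards")
```

A single snapshot can mislead, so it is probably worth running something like this daily and watching which column stays full.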

Step 2:  Throttle release of work to the constraint.  Your constraint likely works on numerous projects or wears many hats.  Reduce the number of work centers where the constraint is required.  Note that this flies directly in the face of those DevOps proponents who claim that DevOps is merging your teams together.  That will make things worse!  Is your constraint really needed on all of those projects?  Generate documentation or automate the constraint to relieve the burden.  Or train.  

Even if you hire or get more units of the "constraint" you will never be able to actually increase throughput.  This is what Fred Brooks tried to teach us 40 years ago in The Mythical Man-Month.  If it takes one painter an hour to paint a room then it doesn't follow that 20 painters can paint the room in three minutes.

Just as important as throttling the release of work is managing the handoffs.  This is wait time that must be eliminated.  Example:  you've fast-tracked a bug fix through development but now you have to wait for a QA resource to free up.  You should be looking ahead to determine what work is coming soon.  

Step 3:  Subordinate the Constraint.  At this point the constraint should be less constraining on your system.  You need to keep it that way.  Remember that forcing more work through the system will not increase throughput, it will merely cause the constraint to start bottlenecking again.  Instead, build buffers or slack into the system to ensure that the unforeseen spikes in work will not cause bottlenecks.  

Do whatever it takes to ensure that maintenance occurs to increase availability of the constraint.  Improving daily work is more important than doing daily work.  The whole goal of this step is to make the constraint less constraining.  

Step 4:  Automation/Elimination of the Constraint.  After Step 3 the constraint can still be a constraint under poor circumstances.  We don't want that.  One way to eliminate the constraint is via automation.  If your constraint is certain team members that are overworked/poorly-utilized then you can eliminate by cross-training.  For Ops people the best approach is always automation.  Automate as much as you can and then those constraints should only need to be responsible for maintaining and enhancing the automated system.  

Step 5:  Rinse and Repeat.  By now you've done as much with that given constraint as you can.  It's best now to observe your improvements and spot the next constraint and begin again with Step 1.  

Summary

Constraint Theory works well with DevOps.  Understanding which resource is mal-utilized is the first step in improving your IT organization.  You need to protect any constrained resource to ensure it isn't performing any work that another, unconstrained resource could handle.  After that you need to find or train other resources that can assume some of the work of the constraint.  Lastly, automation is an excellent way to finally eliminate the constraint.  



This can be nicely summarized in the concept of Drum-Buffer-Rope (DBR), which is a scheduling solution to the ToC.  The "drum" is the scarce resource in a plant that sets the pace for all other resources (think of a drummer in a marching band that can't keep up with the group).  The "buffer" is added to protect the performance of the drum.  It absorbs the peak load and provides some slack for the constraint.  Once the drum is working at optimal capacity we then synchronize and subordinate all other resources and decisions to the activity of the drum.  Essentially, a "rope" is stretched from the drum both backwards and forwards.  We can then estimate when work can actually be started and completed based on the drum availability.  We use the rope to pull the work through the system.  
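Here is a deliberately simplified Python sketch of that idea, not a full DBR scheduler.  It assumes a single drum that handles one job at a time in the order given, a fixed time buffer in front of the drum, and made-up job names and hours.  The "rope" is just the release date, which is computed backwards from when the drum will be ready for the job.

```python
def schedule_drum(jobs, drum_hours_per_day, buffer_days):
    """Schedule jobs through the drum one at a time, in the order given.

    jobs: list of (name, drum_hours_needed) pairs.
    Returns one dict per job with the release day (the rope), the day the
    drum starts the job, and the day the drum finishes it.
    """
    schedule = []
    # The first job must already be waiting in the buffer when the drum starts.
    drum_free_at = float(buffer_days)
    for name, hours in jobs:
        start = drum_free_at
        finish = start + hours / drum_hours_per_day
        schedule.append({
            "job": name,
            "release": start - buffer_days,  # rope: release upstream work this early
            "drum_start": start,
            "drum_finish": finish,
        })
        drum_free_at = finish
    return schedule


# Hypothetical work for a constraint who can give 4 focused hours per day.
plan = schedule_drum(
    [("login fix", 4), ("report", 8), ("upgrade", 2)],
    drum_hours_per_day=4,
    buffer_days=1,
)
for row in plan:
    print(row)
```

Nothing is released to the upstream work centers before its release day, which is how the drum, rather than the loudest stakeholder, sets the pace.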

 


 

Work In Progress

Kanban is concerned heavily with limiting WIP.  The theory is that people and processes cannot multi-task as well as they think they can.  You can't really code two modules at the same time.  It's not really multi-tasking, it's context switching.  We want to reduce the amount of WIP in the system, which reduces inventory and improves cycle time.  With DevOps we want to take needless work out of the system too.  We should always be asking the question, "Is this work really needed?"  It might have been needed last week but not this week, so we should eliminate that work.  Outcomes are what matters, not the process, controls, or the work you complete.   We need to protect the company by not putting meaningless work into the system.   

Everyone (or every resource) needs idle time or slack time.  If no one has slack time then WIP gets stuck in the system in queues.  The resources that are constraints can benefit from a "keeper", basically a helper that can move tasks around from work center to work center without wasting the constraint's time.  We need the constraint to be working on the important work, not on tasks that another resource could handle (checking in code, filling out paperwork, attending useless meetings).  

By limiting WIP we find that cycle times are reduced (basically, how long it takes a story to get from inception to done).   This means that we can get the software to customer faster.  This generates an invaluable feedback loop.  This is why one of the premises of agile is to release early and often.  It is also why Lean software development preaches understanding and delivering only the "minimum marketable feature" (MMF).  
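The link between WIP and cycle time can be made concrete with Little's Law, which says that for a stable system the average cycle time equals average WIP divided by average throughput.  Here is a minimal Python sketch.  The numbers are made up, and it assumes throughput stays constant while WIP is cut, which real teams only approximate.

```python
def average_cycle_time(wip_items, throughput_per_week):
    """Little's Law: average cycle time (weeks) = average WIP / average throughput."""
    if throughput_per_week <= 0:
        raise ValueError("throughput must be positive")
    return wip_items / throughput_per_week


# Hypothetical team that finishes 5 stories per week.
for wip in (20, 10, 5):
    weeks = average_cycle_time(wip, throughput_per_week=5)
    print(f"WIP of {wip} stories -> about {weeks:.1f} weeks per story")
```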

 

The best way to get work done is to not do work.  

Unfortunately, operations is considered to be successful if the metrics are stable and unchanging, whereas development is only applauded if many things change. Because conflict is baked into this system, intensive collaboration is unlikely.

Development teams strive for change, whereas operations teams strive for stability (the definitions of change and stability will be discussed in Chapter 2). The conflict between development and operations is caused by a combination of conflicting motivations, processes, and tooling. As a result, isolated silos evolv
 
 
 
5 Steps in the Theory of Constraints
  1. Identify the system's constraints.
  2. Decide how to exploit the constraints.
  3. Subordinate everything else to that decision.
  4. Elevate the system's constraints (automation).
  5. Find the next constraint and repeat the process.  
 
 
With Agile, testers and programmers became developers, and with DevOps, developers and operations become DevOps. 
 


DevOps: The Ways

This post in my DevOps series will cover "The Ways".  

You'll almost never hear The Ways mentioned with DevOps.  That's a shame because when you hit the Third Way, then DevOps is working for you...your development and operations teams are thinking as one.  The Ways is often associated with Kanban.  The Ways sounds like some kind of Kool-Aid cult blathering, but if you understand it, it works.  


 

 

 

 

 

(in this case it should be clear that the bottleneck is the middle column...whatever that is)

The First Way is the visual management of work and pulling work through the system.  A good Kanban board can do this.  We want to see the fast flow of work through the system.  An outsider can look at it and immediately see where the bottlenecks are in your process without any prior knowledge of your business or Kanban.  If your Kanban board isn't doing this then you need to study Kanban some more.  You can think of the First Way as "systems thinking".  How do we get work out the door?  It's kind of very waterfallish.  Things move from left to right.  At this phase of DevOps there is still clear delineation between the developers and the Ops guys.  In this phase the bottlenecks tend to be after development and generally fall with the Ops guys.  This is because we still have developers who are rolling out code without much thought as to how it will be operationalized and who will support it.  

The Second Way is to make wait times visible.  You need to know when the work takes days sitting in someone's queue.  This may not necessarily be a bad thing but it should lead you to ask some difficult questions.  Do we have enough resources at that phase?  Are those resources being utilized at the right "workstation"?  Or is that resource getting pulled in too many directions?  When your manager complains that it takes too long to do a certain task, perhaps it is due to wait times.  

The Second Way is sometimes thought of as moving DevOps into the "feedback loop" phase.  The Ops guys should be providing feedback to the devs regarding what worked and what didn't so that the next release is smoother and there are fewer bottlenecks.  In other words, your "DevOps maturity" is improving at this phase and DevOps should begin to show you dividends.  

I like my definition of the Second Way better (make wait times visible).  In larger organizations where inertia makes dev and ops collaboration politically impossible it is better to think of the Second Way as figuring out what is wrong with the First Way.  Queuing Theory is very concerned with understanding why service times are long and generally you'll find the biggest culprit is waiting on some unavailable resource.  This holds equally true whether you are examining the performance of a SQL Server that is waiting on IO subsystems or development teams waiting on QA resources.  
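Making wait times visible can start very small.  Here is a minimal Python sketch that totals, per stage, how long work sat in the queue before anyone picked it up.  It assumes your tracking tool can export when each item was queued at a stage and when work actually started, and the event data and the `wait_hours_by_stage` name are invented for illustration.

```python
from datetime import datetime


def wait_hours_by_stage(events):
    """Sum the hours that work sat waiting in each stage's queue.

    events: list of (stage, queued_at, started_at) with ISO 8601 timestamps.
    """
    totals = {}
    for stage, queued_at, started_at in events:
        waited = datetime.fromisoformat(started_at) - datetime.fromisoformat(queued_at)
        totals[stage] = totals.get(stage, 0.0) + waited.total_seconds() / 3600
    return totals


# Hypothetical export from a work-tracking tool.
events = [
    ("Dev", "2024-01-08T09:00", "2024-01-08T10:00"),
    ("QA", "2024-01-09T09:00", "2024-01-11T09:00"),
    ("Ops", "2024-01-11T13:00", "2024-01-11T17:00"),
    ("QA", "2024-01-12T09:00", "2024-01-13T09:00"),
]

for stage, hours in sorted(wait_hours_by_stage(events).items(), key=lambda kv: -kv[1]):
    print(f"{stage}: {hours:.0f} hours waiting")
```

The stage at the top of that list is where the difficult questions should start.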

The Third Way is to ensure that we are continually putting tension into the system, so that we are continually reinforcing habits and improving something.  It doesn't matter what you improve, as long as something is being improved.  If you are not improving then entropy guarantees that your competitors will beat you.  I've heard IT people tout this as the equivalent of simple DR fire drills.  I think that is too simplistic.  The fact is, a DR fire drill tends to involve only Ops guys.  I don't know too many developers getting involved in these.  And remember that DevOps is NOT a simple melding of your developers and Ops guys.  It is collaboration, not job redefining.  

You'll know you are ready for the Third Way when your people start to get bored.  That means they understand DevOps and Kanban and are not being challenged by it anymore.  Here's a simple way to move into the Third Way.  When you see entropy set in then make an announcement that some simple feature request will be handled as though it were a P0 ops emergency.  The request will be fast-tracked through the system to see how long it takes to move a request from a BA through to operationalization.  It may take 5 days.  Wait a few weeks and take a similarly-scoped request and repeat.  See if you can get your team to reduce the cycle time to 3 days.  Then figure out what the bottlenecks were that you removed.  

Summary

There is so much that can be written about The Ways.  To be succinct, think of your IT team as working on a factory floor and you as the manager are overseeing the activity from the catwalk.  You want to see no bottlenecks, no unplanned work, and no excessive wait times as the work moves along toward doneness.  Then you want to introduce a defect and watch the floor workers swarm to fix the flow of work through the system.  When the defect has been resolved the factory floor should return to its previous state.  Granted, for an IT manager or scrum master it is difficult to think of your IT staff as factory workers, but this method does work when you really need to determine who and what is your bottleneck.  



DevOps and the Concept of Work

This is another post in my series on DevOps.  In this post I'm going to focus on the finer details of DevOps and how it relates to Kanban.  I've found that as organizations implement DevOps they are also implementing Kanban at the same time.  In the last post I wrote about the common precepts of DevOps and how DevOps gets conflated with other, unrelated ideas.  Kanban has similar issues.  People hear some Kanban buzz phrases and think that they know Kanban.  They don't.  Instead of this being a post on Kanban I'm going to make this a post more about "advanced DevOps" and it will borrow heavily from Kanban.  Don't worry if you've never worked with Kanban, that isn't required to understand the basics for this post.  

What is "Work"?

DevOps is so much more than bringing developers and operations people together.  DevOps is REALLY about understanding work.  If you understand "work" then you'll understand how to make work better.  

Categories of Work

  1. Business Projects:  this is what makes us profits.  This should ALWAYS be a company's primary category of work.  If it isn't then...you've got a problem. 
  2. Internal Projects:  this makes our operations better.  Automation, code refactoring, etc.  This work category doesn't make us top-line money, but it saves us costs.  Controlling costs is another way to increase profits.  
  3. Change:  This type of work occurs whenever we need to alter a system.  Maybe fix a bug, or upgrade an OS, or apply patches.  DevOps always questions EVERY change to ensure it is really needed.  You should be questioning every request for a change too.  In many cases the change really isn't needed or can be delayed and rolled into some other change.  
  4. Unplanned Work:  The first 3 categories above are "planned work."  Unplanned work is anything that interrupts your planned work.  But it is more than just unexpected emergencies and outages.  It is sometimes the direct result of technical debt.  

Technical Debt

"Technical debt" is a metaphor for the eventual consequences of poor software or infrastructure within your organization.  It is "debt" because it is really work that needs to be done before a particular project can be considered complete.

Technical debt isn't always bad, but it must be recognized for what it is.  In some cases it is better to recognize some technical debt and decide to fix it later so your code can get out the door and make you money.  But you MUST recognize when unplanned work is the direct result of technical debt and then prioritize it so it stops being unplanned.  

If you don't pay down technical debt then your unplanned work will continue to increase.  Left unchecked, technical debt will ensure that the only work that gets done is unplanned work.  

Why are technical debt and unplanned work so important to DevOps?  Here's an example...often your developers do not know about operational pain that development decisions are causing the ops folks.  If the teams work together the technical debt (and unplanned work) can be paid down.  

And this is why I really love DevOps.  It recognizes that development decisions that do not take operational concerns into account often result in lots of unplanned work.  Likewise, if Ops is making a change to a system, that should be run past Dev.  As an example, just because a new version of SQL Server has been released doesn't mean our software needs to be updated to support it.  If there is no new feature that will make us money, then it can be postponed.  Not forever of course, but this is a case where pushing back and denying change is a good thing.  

Two more examples

I worked at a company where a developer decided to introduce Service Broker into his application without any operational input.  EVERY customer upgrade failed because SB was not enabled.  The developer was eventually instructed to put together operational documentation that the DBAs would carry out for every upgrade.  

The document was 3 pages long.  Our upgrades already averaged 19 hours and I determined that this "step" of the upgrade averaged 90 minutes because the DBAs did not understand what each step was supposed to do.  The DBAs didn't understand Service Broker.  

At the same time I was working on a different project in the company where I also needed to use SB.  I created an automated script that ensured SB was enabled and all of the plumbing was in place.  If anything was wrong it would attempt to fix the problem without operational intervention.  It added ZERO time to my project's upgrades.  And it worked flawlessly.  

I was unaware of the first project with its cumbersome SB instructions.  Eventually an Ops person asked me if my process could also be used to automate that process.  We plugged in my process with slight tweaking and solved all of those operational issues.  

In my project I realized that operationalizing something like Service Broker was non-trivial.  You can't just throw some new software onto your stack, you have to understand how that is operationalized.  I knew that the unplanned work around upgrade events and SB would be too much technical debt for us to bear.  So, with some forethought we ensured that operationalizing SB tasks was not de-scoped by a project manager trying to cut corners.  
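To give a feel for what that kind of self-healing check looks like, here is a minimal Python sketch.  This is NOT the script from the story above, and it only covers the "is SB enabled?" part, not the rest of the plumbing.  It assumes a DB-API cursor that uses `?` parameter markers (as pyodbc does) on a connection opened in autocommit mode, and the `ensure_service_broker` name is invented for illustration.

```python
def ensure_service_broker(cursor, database):
    """Enable Service Broker on `database` if it is not already enabled.

    Returns True if a change was made, False if nothing needed doing.
    Assumes `cursor` is a DB-API cursor using '?' markers on an
    autocommit connection (ALTER DATABASE cannot run inside a transaction).
    """
    cursor.execute(
        "SELECT is_broker_enabled FROM sys.databases WHERE name = ?",
        (database,),
    )
    row = cursor.fetchone()
    if row is None:
        raise LookupError(f"database {database!r} not found")
    if row[0]:
        return False  # already enabled, so the upgrade step costs nothing
    # Bracket-quote the name; a literal ']' is escaped by doubling it.
    quoted = "[" + database.replace("]", "]]") + "]"
    # ROLLBACK IMMEDIATE disconnects other sessions so the change is not blocked.
    cursor.execute(
        f"ALTER DATABASE {quoted} SET ENABLE_BROKER WITH ROLLBACK IMMEDIATE"
    )
    return True
```

The point is the shape: check the state, fix it only if needed, and make the step safe to run on every upgrade so nobody needs a 3-page document.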

"The Best Way to Do Work is to Not Do Work"

That is a saying (I think) I created that I use almost daily.  When I hear our people saying that they are doing repetitive tasks then I know that it is time to consider paying down that technical debt.  The best way is through automation.  If a developer can spend an 80-hour sprint automating some task that takes ops people 80 hours per quarter in effort, then it is a no-brainer that we need to pay down that technical debt.  The goal is not to do that work AT ALL.  
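The arithmetic behind that no-brainer is simple enough to sketch in a few lines of Python.  The function names are made up, and the sketch assumes the automation needs no upkeep once written, which is optimistic.

```python
def quarters_to_break_even(automation_hours, manual_hours_per_quarter):
    """How many quarters until the automation effort has paid for itself."""
    if manual_hours_per_quarter <= 0:
        raise ValueError("manual effort per quarter must be positive")
    return automation_hours / manual_hours_per_quarter


def net_hours_saved(automation_hours, manual_hours_per_quarter, quarters):
    """Hours saved after `quarters`, net of the one-time automation cost."""
    return manual_hours_per_quarter * quarters - automation_hours


# The example from the text: an 80-hour sprint vs. 80 hours of ops work per quarter.
print(quarters_to_break_even(80, 80))      # quarters until break-even
print(net_hours_saved(80, 80, quarters=4))  # net hours saved after one year
```

With those numbers the sprint pays for itself after one quarter, and every quarter after that is work nobody has to do.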

In the next post I'll cover some additional DevOps concepts you likely won't hear from many other DevOps authorities.



DevOps

There is a lot being written about DevOps, and most of it is not accurate.  The biggest inaccuracy is that DevOps is a simple merging of your development group with your operations group...i.e., "dev" and "ops".  Nothing could be further from the truth.  I am a huge proponent of DevOps, and if it is properly understood and implemented it can turn around a faltering IT organization.  I have seen this and experienced it first hand at both large shops (120 developers/50 admins) and small ones (13 devs/3 ops guys).  

DevOps is the melding of Development and Operations

Let's just get this one out of the way first.  I see everywhere on the internet that DevOps is the merging of your developers and ops guys.  This is a dangerous oversimplification of DevOps, IMHO.  Yes it is true that DevOps takes the best-of-breed approach of both worlds and makes them more compatible, but we aren't just redefining job roles.  

I can think of few things worse than having my development staff also handle operations activities.  That's a recipe for disaster.  This misconception probably started with small startups (or smaller groups in larger organizations) that didn't have the money or time to hire dedicated ops guys.  The easiest solution is to have your developers eat their own dog food.  But this isn't really DevOps.  It's NoOps.  It works great for Netflix but might not work great for you.  At small shops NoOps is a reality and a necessity when you don't have the headcount for anything more than getting the product shipped to a happy customer so you can generate revenue.  Nothing wrong with that.  

But at some point the jack-of-all-trades model won't scale anymore.  Your product will become too big (in terms of code volume or customer volume) for everyone to wear every hat.  It is a core premise of DevOps that there is a degree of cross-training and understanding of every role, but that does NOT mean every engineer should be doing every role.  

Further, some people are better suited to be either a developer or an ops guy.  This is basic division of labor.  I don't want my smartest developer patching a system, just like I don't want my CEO taking out the garbage.  It's not that developers are better than ops, or the CEO is more important than the janitor, it's just the reality that we want our people working on what they are best suited for.  

Are devs and ops guys diametrically opposed?

I've worked at large ISVs where the ops guys spend all of their time keeping things running smoothly and the dev guys spend all of their time making changes for the next release.  Change is the enemy of the former and the friend of the latter.  Sales guys need whiz bang features and product management is tasked with having the devs deliver it.  But after every release the ops guys are scrambling because the systems are no longer stable.  Performance probably sucks again and web containers need hourly recycling.  While product management is declaring victory with the release the ops guys are getting no sleep dealing with the new issues.  Who can blame an ops guy for wanting fewer releases?  But isn't that anti-agile?  

At these large shops I've seen entire management teams that bicker over whether the next release should even be released, and when it should be released, and what it will do to system stability.  At the last minute risky features are removed which causes more risk because this new configuration was never tested.  All of this friction ensures that no real work gets done.  New features aren't being developed and operational tasks aren't being completed.  That is the core of DevOps...stop the bickering and let's ensure the teams are working together toward the same goal -- profitability.  

Lastly, there is far more to DevOps than just devs and ops guys.  QA is a part of it.  Change review boards need to be integrated into DevOps.  Product management, scrum masters, sales...everyone.  

So, What is DevOps?

For me, in a nutshell, it has each group try to think and behave like the other group and instill best practices in each other.  That begs the question, "what are the best practices?"

DevOps embraces tools, methods, and technologies, that can meld your operations and development staff so they work together better.  Some examples:

Example Behavior | How DevOps Helps
Your developers use Visual Studio for stored procedure development but your DBAs use Management Studio. | Get your Ops people using VS.
Your developers have a build/deploy process for their code but your DBAs don't follow SCM for their processes. | Your DBAs should have the same build/deploy process.  Visual Studio makes it simple to keep operational scripts under the same source control as everything else, or you could use a third party build/deploy tool like MD3.  Everyone follows the same processes.
Your developers use Git, your ops people use nothing, and your QA folks use a separate Git repository, or SVN.  The Change Review Board uses a file share with neither versioning nor permissioning. | Everyone uses the same repository and follows the same policies.
Your developers add indexes based on their dev findings.  Your DBAs drop them and add new ones with included columns or whatever. | DBAs need to educate developers as to why their indexing choices are not working and developers need to LISTEN.
Your developers code against systems that aren't properly patched, have varying versions of the application stack installed, or use ancient 32-bit systems that were prod 10 years ago. | As part of a build deployment consider deploying on a new VM with the approved stack.  This ensures that if a developer needs to upgrade WebSphere, it is also upgraded in the VM build.  Nowadays this is easy to script, especially if you use something like Aptitude.  This is called "infrastructure-as-code" or "software-defined environments."  If your ops guys tell you their architecture is too complex to do this then you need to educate them that more things are moving to the cloud because of this mentality.  It's not the 1970s anymore.
Your developers need to change a piece of infrastructure and there is nowhere to version control their script that an Ops person can use.  For instance, you use SQL Replication but there are no scripts that a developer can alter to change an article and re-snapshot.  Or the developer doesn't even know that replication is installed, therefore a table change they make breaks production. | Everything should be scripted and available to your developers.  Good developers can help the Ops people do this.  Good Ops people use the scripts for everything.  No settings are changed via SSMS GUI.  Everything is scripted.
Your Ops people ask your developers to automate a process and they code it in C# and your Ops people don't know C#.  (Hint:  your dev guys should've used PoSh and your Ops guys better learn it if they don't know it). | Everyone should use a scripting language like PoSh for these tasks.  PoSh is simple for Ops people to learn to do a modicum of modifications, and it is very similar to C# so it should not be a huge learning curve for the dev people.
Non-prod deployments are totally automated with TeamCity or Jenkins, but when we move to prod then it is a manual process. | Everyone uses the same processes.  More on this below.
Your developers use VersionOne for defect management.  Your Ops guys use Jira. | Use the same tools for bug tracking.
Your developers pair with your BAs on new features.  Since you are Agile you don't document the nitty gritty details of each business process.  Your QA folks struggle to test it because they know nothing of what is not documented. | You could immediately state that all requirements be documented, or you could simply expand your pair to include a QA person who can then understand the decisions and possibly document them as a QA artifact.

If this seems common-sensical then good for you.  However, if you are a developer who couldn't care less about how your code is operationalized, then maybe you should learn more about DevOps.  But, the above just scratches the surface of what DevOps is.  In the next few blog posts I'll cover the far more important concept of "work" and how to manage it properly in a DevOps environment. 

Even simpler, DevOps is an extension of agile where the ops guys get to participate in agile development processes too.  

 

How Does DevOps Affect Existing Processes

Whenever a new software management method arrives on the scene it freaks people out.  We all know certain parts of our software organization are broken but we don't want wholesale change either.  Here is a list of items that are important to large ISVs and the impact of DevOps upon them:

  • Change Management: if your people think your CM process is broken, then it probably is.  If it is viewed as just a "process" where people know they are going to be harangued because they weren't consulted prior to the change, then the CM process really shows no value.  Collaboration can fix this.  So can automation.  If you automate the change process then people have more confidence in it.  
  • Release Management:  at large companies there is usually total automation for non-prod envs.  Click a button in Jenkins and your build is deployed to your env.  When ops guys move the change to prod the process is thrown out the window and there is manual intervention again.  The rationale is that there needs to be backups and swapping of hardware, etc.  The fact is if there are special "prod-only" processes then those should be baked into the automation process.  Get the Ops guys to collaborate with the devs to make this a reality.  
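
One way to bake "prod-only" steps into the automation is to describe them as data that the same deployment code consumes for every environment.  The sketch below is illustrative only:  the step names and the `deployment_plan` helper are hypothetical, and in practice each step would map to a job in whatever tool you already use, such as Jenkins or TeamCity.

```python
# Hypothetical step names; a real pipeline would map these to build-server jobs.
COMMON_STEPS = [
    "stop_services",
    "deploy_build",
    "run_migrations",
    "start_services",
    "smoke_test",
]

# Prod-only work is declared as data, so it goes through the same automation.
EXTRA_STEPS = {
    "prod": {
        "before": ["backup_databases", "drain_load_balancer"],
        "after": ["restore_load_balancer"],
    },
}


def deployment_plan(environment):
    """Return the ordered list of steps for an environment.

    Every environment runs COMMON_STEPS; environments listed in EXTRA_STEPS
    get additional steps wrapped around them.
    """
    extras = EXTRA_STEPS.get(environment, {})
    return extras.get("before", []) + COMMON_STEPS + extras.get("after", [])


for env in ("qa", "prod"):
    print(env, "->", ", ".join(deployment_plan(env)))
```

Because prod differs from qa only by declared extra steps, the common steps get exercised on every non-prod deployment, and the prod-only ones are visible to devs and Ops alike instead of living in someone's head.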

 

Summary

This post was my summary on the basics of DevOps.  I clearly feel DevOps is not just another software management method that we can jam down the throats of our people.  It does contain a lot of new and controversial ideas, like having your Dev and Ops teams work together.  This is tough to do in highly-siloed organizations.  But in my experience DevOps works VERY well.  In the next post I'll cover some additional tenets of DevOps that you don't read about much that can also help you on your DevOps journey.


You have just read "nodetitle" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.  

The Cargo Cults of Software Development Management

Managing the software development process is broken.  Just my opinion.  I have yet to put my finger on exactly what needs to be fixed, and guess what?  No one else has either.  If they had, the software dev process wouldn't be broken.  Yet corporations and management constantly search for the elusive fix, like Francisco de Orellana searching for the Lost City of Gold.  Fred Brooks' classic, The Mythical Man-Month, comes the closest to flat-out telling us, "There is no El Dorado."  

This is a lead-in post to a DevOps series I'm going to do. DevOps is the latest software development method out there.  It is one that I truly believe can deliver on its promises of fixing some of the ills of our industry.  

The Cargo Cult

Sometimes people observe a successful outcome and believe that if they replicate the circumstances they too can reproduce the same outcome.  When success isn't forthcoming it is because the circumstances had absolutely nothing to do with the outcome.  This is a fallacy called post hoc ergo propter hoc...after this, therefore because of this.  Like any fallacy, understanding when you are susceptible to it is half the battle.  And in my opinion, a software architect's primary responsibility is understanding when the organization is falling for fallacious concepts.  

So why is it a Cargo Cult?  World War II brought western wealth to indigenous peoples of once-isolated, tiny Pacific islands.  These people had little exposure to the West and led more primitive lives.  In many cases these islands didn't see the bloodshed of war, instead they were used as supply depots.  Some jungle would be cut down and a landing strip would be made with landing lights to guide the supply planes.  The cargo was often food that wasn't always easy to get on these islands.  After the war these islands were abandoned by the war powers and suddenly the indigenous people didn't see food and cargo any more.  There are documented cases where these people would cut down more trees and build more landing strips and then light bonfires and wave discarded signal flags, yet no planes would land.  This baffled the people who assumed that these circumstances were responsible for their food.  

And it is this cargo cult behavior that I see today when adopting new software development methods.  Management says (or hears) that some process (agile maybe) was involved in some wildly successful software project.  Management therefore believes that if they adopt the process they too will experience success.  

Other Cargo Cult Examples

Hilarious!

 

Various Software Development Methods

I've worked in, and think I understand, *at least* the following development methods:  

Method | Dime Tour | Does it work?
Waterfall | actually I'm not sure about this one.  I've never had any company actually claim to be waterfall.  I think that is because "waterfall" has a negative connotation.  But some of us know when we've worked in a waterfall environment. | And it isn't all bad.  I've worked in waterfall environments that were wildly profitable and fun.  And I've seen waterfall projects be death marches too.
Scrum/Agile | I lump these together.  They are separate and unique but in every environment I've worked in the practitioners claim they practice both. | Works ok but practitioners sometimes adhere to the tenets like it was a religion.  It isn't, don't drink the Kool-Aid.
XP | another variant of agile | Works for me.
ICONIX | I started out on this.  I guess this is roughly waterfall. | Worked for me.
Rational Unified Process | ok, maybe I didn't understand this one when I was forced to use it.  Kinda waterfall-ish I guess. | We shipped software with it and were profitable, and that's the metric I use for a working method.
Kanban | a just-in-time method.  It attempts to focus developers on what is really important and on getting that done.  Runs the software dev process like a manager would from a catwalk on a manufacturing floor. | Works well when people understand more than the simple platitudes like "stop starting and start finishing" and really understand how "work works".
Microsoft Operations Framework | I actually worked at a place that took MOF and made it into a software engineering process.  M$ even came in and did a lecture series on how to make it work. | Seemed ok...we made money...I got home at a decent hour most nights.
Inception | this seems to be the "old new thing" right now.  Really, it is part of RUP.  Inception is the first phase of software development where you try to broadly grasp the problem and approximate how much effort will be required.  So, there is an up-front focus on requirements gathering and design.  Um, that sounds kinda waterfall-ish to me (but don't tell an Inception Guru that).  I worked at a place that spent millions to bring in a bunch of Incepticons (Inception Consultants) to teach us how to do Inception right.  They were gone within a year, and so was Inception.  We went back to scrum/agile.  We would spend an entire sprint "incepting" only to realize by the end of a release that the product looked nothing like what we incepted.  To save face in the wake of wasting millions, management decided that we should use "just-in-time inception" instead, which is a mini-inception at the beginning of each sprint.  Um, I call that "basic planning". | Complete failure if you follow the religious dogma.
DevOps | a lot like kanban but with an emphasis on developers working with ops people to deliver more supportable products.  I'm actually going somewhere with this post...it's going to be the first of many posts on DevOps and how to move towards DevOps. | This is a game-changer.

So, which one is the best?  

Any of these processes can work if practiced properly.  I have no affinity to any of them (except DevOps which I'm starting to really like).  They each have their good and bad points.  I can work and succeed in any of them.  Or fail in any of them.  

I would NEVER make any of these statements though:

  • "Our project succeeded because we finally realized the truths in <insert software development method here> and practiced it.  "
  • "We failed to meet our objectives because we didn't follow <insert software development method here> the way we should have.  "

There are common denominators that each process has.  Reliance on structure, process, and measurement for example.  But this isn't radical.  

HPCs (Highly Paid Consultants)

If you are looking for a new method, maybe because your software team is not as productive as you feel it should be, you should be aware that you'll find lots of HPCs who will want to sell you their expertise in a given method.  These shills will ALWAYS exhibit these characteristics:  

  • "If you adopt this approach then all of your development problems will go away. " 
  • "We are consultants that would love to come in and train your people on The Way.  We don't come cheap but you can't afford not to learn from our experience."
  • "Every other process is a fraud and will never work.  We have lists of how our process is better than whatever process you currently use.  Use this one to ensure success."
  • The "Who We Are" slide of their deck is always a mugshot of each of the two HPCs giving the presentation.  The first HPC is in the process of a deep belly laugh.  This connotes how they will make your failing projects a joy to work on.  The second HPC mugshot is a contemplative pose..."we have wisdom."  
  • One of the slides will tell you how they've been training The Way for the past x years with success.  They won't list specific companies or software, but they'll tout their successes with both small and large firms.  They don't know how to code in any specific language or domain, but they can manage the process.  They'll have lots of photos of their teams huddled around kanban boards collaborating over a bunch of color-coded Post-It notes, trying to determine what they are going to fix next...world hunger or peace?  They'll conveniently "forget" to mention how this collaboration is going to work with offshoring.  
  • They'll always provide you with lots of swag.  It may be "planning poker" cards with their logo on it, a board game they devised to use in their training classes, or bags with catchy logos like "Stop Starting...Start Finishing".  

All of this is a complete waste of time and money.  Just my two cents.  An HPC cannot possibly know your business domain, your software, or your people.  To assume that what worked previously for the HPC will work again for you is very Cargo Cult-ish.  

So, what does work?  

I don't like to complain without offering alternatives.  This is what I feel are the most important concepts that any method should espouse:

  • Never shackle people.  If your framework commands, "Thou shalt have daily standups where we discuss 'pulling cards from the left' (Kanban)", but the team likes to do Scrum-style standups, then allow people some latitude.  Let the team alter the framework to be assistive, not restrictive.  
  • No multi-tasking.  I call multi-tasking "faulty-tasking", because it DOES NOT WORK.  Human beings are not computers, they cannot multi-task.  Humans, instead, can context switch.  A developer can work on one and only one piece of code at a time.  Allow your IT people to complete a task, do not make them context switch.  
  • Don't rely too much on metrics.  Burn down charts and kanban cards and points are nice, but the focus should be on customer satisfaction.  
  • Don't spend millions on HPCs or special "Kanban TVs" for each team room.  It annoys people when large sums of money are spent on this nonsense yet raises were non-existent the last 2 years.  
  • Ensure everyone is allowed to participate.  Too often I've seen cases where a new method was instituted for the onshore workers, but not for the offshore guys...they aren't important after all.  
  • Practice MBWA.  Management By Wandering Around was practiced by HP back in the 1970's.  It's simple...managers get out of their offices and visit employees randomly, staying quiet and observing.  Good managers will quickly determine lots of little things that are impeding process and can remove those barriers.  
  • Consider adopting a new framework every other year on a trial basis.  Have your people learn the new method and bring it in-house.  Then do retrospectives often to see what is working and what should be discarded.  It's always a good idea to try new things.  I never thought software development management theory could improve but then I started to learn DevOps and realized that it in fact does contain a lot of useful concepts that are obvious, but under-employed by most organizations.  

Over the course of my career I've seen the companies I've worked for bring in lots of management frameworks and paradigms to try to solve what is wrong with their IT departments.  All of these "tools" have a bit of truth and value, but I hold that you can't solve a people problem with a slick framework.  The software development process is broken because fundamentally it is a people problem, not a process problem.  There are patterns and anti-patterns that I list above that I've seen work universally.  Don't be dogmatic about any process and don't be part of a Cargo Cult.  



Metrics That Count (and it ain't "points")

I've lived through my fair share of software productivity and management frameworks.  Kanban, agile, scrum, XP, waterfall...there's probably more that I'm trying to subconsciously suppress as I write this.  Each of these is concerned in some respect with metrics.  How do we evaluate how much work is getting done and who is doing it?  How do we use metrics to improve "cycle times"?  How do we improve "burn down"?  How can we use these metrics at performance review time?  Well, IMHO, none of these "metrics" really matter.  What matters is shipped software, delighted customers, rising stock prices, and stress-free employees who get to go home at the end of the day and spend quality time with their spouses, significant others, and/or kids.  Nothing else matters.  

Management thinks it needs hard numbers and metrics to determine if the program is meeting its software development goals.  It also needs numbers to determine which developers are meeting expectations and which are unsatisfactory.  One problem is that software development is not assembly line work.  In a Toyota factory, management has various mechanisms to determine efficiency of individuals.  Number of defects, speed of the line, number of work stations mastered, the ability to train new employees, etc etc.  

Art vs Science

Software development is *not* assembly line work, no matter what new language or Big Data system the cool kids are all using.  Software development is more "art" than "science".  And management of "art", with its inherent lack of metrics, is so much harder to do than managing something with defined metrics and "numbers"...something like science or math.  

Think I'm exaggerating?  Do you think Pope Julius II evaluated Michelangelo based on the number of cherubs he painted on the ceiling of the Sistine Chapel every day?  It's true that they argued over the scope of the work and budget, but the Pope never tried to evaluate The Master based on some concocted metric. 

There is so much in software development that simply cannot be measured up-front.  We generally call these things the "non-functional requirements."  Some shops call them "-ilities".  Performance is generally considered a non-functional requirement.  We all try very hard to evaluate the performance of our software before it ships, often using tools such as LoadRunner.  But more often than not we find that we have not met the necessary performance metrics once the software is in the customer's hands.  So, how do you measure a performance metric early?  You really can't.  So, do we ding the team or individual for failing this metric? 

The only metric that matters

... in software development is working, released features that a customer wants.  If the feature has not shipped then you get zero credit.  There is no A for Effort.  Even if you are 80% feature-complete, you get no credit.  If you shipped it but the customer doesn't like it, you get no credit either.  I hear developers complain that it isn't their fault that the product failed...the requirements from the analysts were wrong and the developers merely implemented the requirements as given.  I appreciate that argument, and I feel your pain, but the only metric that counts is a happy customer.  When your company goes bankrupt because your product failed because of bad requirements, I'm not sure your mortgage company is going to care.  

Other Metrics

There are lots of metrics management uses to evaluate us.  Here are a few and my rebuttal as to why they don't work for project evaluation:

  • Tickets closed:  I've worked at shops where each branch of code needed its own ticket for check-in purposes.  And we always had 4 supported versions/branches open at any time.  So a given bug may need 4 tickets.  That's called "juking the stats."

  • Lines of code written:  so now we are incentivizing people to write more code instead of elegant, short-and-sweet, supportable code.  More lines of code = more bugs.  

"There are two ways to design a system: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies."  (paraphrasing C.A.R. Hoare)

  • Story Points:  A quick google search for "what is a story point?" yielded this article which pretty much concludes that you shouldn't use story points for metrics.  Oops.  
  • Velocity:  this supposedly shows management the "rate of progress" in our development.  In a perfect world our velocity will improve when we develop new tools that help us work smarter and faster, such as automation.  But many times velocity is merely going up because developers are incentivized to make the velocity improve and they do this the simplest way possible...cut corners.  
  • Code Test Coverage:  there are lots of tools that will analyze how many lines of code you have that have no unit tests.  I covered this in my blog post Paradox of Unit Testing?.  This leads to people juking the stats again...writing a bunch of tests to make the code coverage analysis tool happy.  
  • Unit Tests Written:  see Paradox of Unit Testing? again.  I have worked with people who have refused to add new features to their code because there were too many unit tests that would need to be rewritten.  

The last two are the WORST offenders.  Most developers realize that lines of code, points, and tickets closed are ridiculous metrics, but many otherwise thoughtful developers fall for static code analysis and unit tests.  I've seen whole teams spend entire sprints writing unit tests for code that was 5 years old with no reported bugs because there was no test coverage.  It sucks the life out of the product and the team.  

I once tried to remedy this situation, and I hate to admit this, by merely adding empty test bodies to get the metrics acceptable.  And I've seen lots of people merely comment out broken tests to avoid getting weenied for that.  

Why do we rely on metrics?

Numbers suggest control.  Management likes control.  But this is totally illusory.  I've worked on teams where every sprint had some task called something like "refactor the Widget interface settings" that was assigned 15 points.  If the team had a bad sprint they merely claimed these points to make the numbers look good.  No work was ever really getting done and management had no idea.  That same team, after a 12-month release cycle, had ZERO features to contribute to the product's release.  Management was not happy.  But every sprint showed progress and burndown.  

Heisenberg and Perverse Incentives

When something is measured too much, the measurement itself will skew the system under measurement.  Loosely, this is known as the Heisenberg Uncertainty Principle (strictly speaking, it is the observer effect).  I've worked on teams where there was an over-reliance on points as the metric to determine productivity.  People knew they were being measured and they geared their activities to those things that could generate the most points.  This usually meant pulling simple cards with small points that were low-risk.  The more important, higher point but longer duration "architecture" cards were never worked on.  They were too risky:  you either got all of the points or none of them.  

Summary

I'm sorry about this long rant on software development metrics.  Every team is unique and will determine how best to structure itself for optimal efficiency.  So many of these metrics are an effort to shoe-horn the team into a structure that management is comfortable with, even if that is forsaking the goals of the program.  Let the team figure out what metrics it needs.  Management's only metric of concern should be shipped features.  Nothing else matters.  When management evaluates its development teams on that metric I believe you will see teams that are super-productive because they are no longer juking the stats and wasting time on anything that is not leading them to more shipped features.  

But nothing is going to change.  It is too risky for management to not have a solid metric with which to evaluate us and our products.  Very depressing.   


