QSM Vol 1
QSM: Volume 1.1: How Software is Built
Part 1. Patterns of Quality
Chapter 1: What Is Quality? Why Is It Important?
Summary
- Quality is relative. What is quality to one person may even be lack of quality to another.
- Finding the relativity involves detecting the implicit person or persons in the statement about quality, by asking, "Who is the person behind that statement about quality?"
- Quality is neither more nor less than value to some person or persons. This view allows us to reconcile such statements as "Zero defects is high quality," "Lots of features is high quality," "Elegant coding is high quality," "High performance is high quality," "Low development cost is high quality," "Rapid development is high quality," and "User-friendliness is high quality." All of these statements can be true at the same time.
- Quality is almost always a political/emotional issue, though we like to pretend it can be settled rationally.
- Quality is not identical with freedom from errors. A software product that does not even conform to its formal requirements could be considered of high quality by some of its users.
- Improving quality is so difficult because organizations tend to lock on to a specific pattern of doing things. They adapt to the present level of quality, they don't know what is needed to change to a new level, and they don't really try to find out.
- The patterns adopted by software organizations tend to fall into a few clusters, or subcultures, each of which produces characteristic results.
- Cultures are inherently conservative. This conservatism is manifested primarily in
- the satisfaction with a particular level of quality
- the fear of losing that level in an attempt to do even better
- the lack of understanding of other cultures
- the invisibility of their own culture
Chapter 2. Software Subcultures
Summary
- Philip Crosby's "Quality is Free" ideas can be applied to software, though perhaps with several modifications.
- In software, conformance to requirements is not enough to define quality, because requirements cannot be as certain as in a manufacturing operation.
- Our experience with software tells us that "zero defects" is not realistic in most projects, because there is diminishing value for the last few defects. Moreover, there are requirements defects that tend to dominate once the other defects are diminished.
- Contrary to Crosby's claim, there is an "economics of quality" for software. We are not searching for perfection, but for value, unless we have a psychological need for perfection not justified by value.
- Any software cultural pattern can be a success, given the right customer.
- "Maturity" is not the right word for sub-cultural patterns, because it implies superiority where none can be inferred.
- We can identify at least six software sub-cultural patterns:
- Pattern 0: oblivious
- Pattern 1: variable
- Pattern 2: routine (but unstable)
- Pattern 3: steering
- Pattern 4: anticipating
- Pattern 5: congruent
- Hardly any observations exist on Patterns 4 and 5, as almost all software organizations are found in other patterns.
- In this book, we shall be concerned primarily with Patterns 1-3—how to hold onto a satisfactory pattern or move to a more satisfactory one.
Chapter 3. What Is Needed To Change Patterns?
Summary
- Each pattern has its characteristic way of thinking and communicating.
- The first essential to changing a pattern is changing thought patterns that are characteristic of that pattern.
- Thinking patterns consist of models, and new models can be used to change thinking patterns.
- In the less stable patterns, models need not be precise, but merely convincing. Indeed, precise models wouldn't make any sense without first establishing stability.
- Models help us to:
- discover differences in thinking, before they have big consequences
- work on ideas together, to facilitate team-building
- understand the reasons for various project practices
- record communication so newcomers can get productive much faster
- maintain a record we can use to improve our processes for the next time
- be creative, because projects will never be routine.
- Before you set about choosing a better pattern, you should always ask, "Is our present pattern good enough?"
- The pattern you choose depends on a tradeoff among organizational demands, customer demands, and problem demands. These tradeoffs can be represented by choosing a point in "pattern space."
- There is always a temptation for a software organization to stagnate by not choosing a new pattern, but instead reducing customer demands or problem demands.
- The process of recognizing that a new pattern is needed is hindered by circular arguments that close the organization to the information it needs.
- The key to opening closed circles is the question, "Is your rate of success okay?" Closed circles, however, tend to prevent this question from being asked.
- Lack of trust tends to keep this key question from being answered truthfully, so organizational change often begins with actions for developing trust.
Part II. Patterns Of Managing
Chapter 4: Control Patterns for Management
Summary
- The Aggregate Control Model tells us that if we're willing to spend enough on redundant solutions, we'll eventually get the system we want. Sometimes this is the most practical way, or the only way we can think of.
- The Feedback Control Model tries for a more efficient way of getting what we want. A controller controls a system based on information about what the system is currently doing. Comparing this information with what is planned for the system, the controller takes actions designed to bring the system's behavior closer to plan.
- The job of Engineering Management is to act as controller in engineering projects. Failures of engineering management can be understood in terms of the Feedback Control Model. Pattern 2 managers often lack this understanding, which often explains why they experience so many low quality, or failed, projects.
- Projects can fail when there is no plan for what should happen.
- Projects can fail when the controller fails to observe what significant things are really happening.
- Projects can fail when the controller fails to compare the observed with the planned.
- Projects can fail when the controller cannot or will not take action to bring actual closer to planned.
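The Feedback Control Model described above can be sketched as a small loop: observe the actual state, compare it with the plan, and act on the gap. This is only an illustrative sketch. The function names, the numeric "state," and the `gain` parameter are assumptions made for the example, not part of the book's model.

```python
def steer(actual, planned, gain=0.5):
    """One controller step: observe, compare with the plan, act on the gap."""
    deviation = planned - actual       # compare the observed with the planned
    return actual + gain * deviation   # act to bring actual closer to planned


def run_project(start, planned, steps, gain=0.5):
    """Apply the controller repeatedly and record each observed state."""
    history = [start]
    for _ in range(steps):
        history.append(steer(history[-1], planned, gain))
    return history


# Hypothetical numbers: the plan is 100 units, the project starts at 0.
history = run_project(start=0.0, planned=100.0, steps=5)
print(history)

# A controller that cannot or will not act (gain of zero) never closes the gap.
print(run_project(start=0.0, planned=100.0, steps=5, gain=0.0))
```

Each failure mode in the list corresponds to removing one piece of the loop: no `planned` value, no observation of `actual`, no comparison, or no action.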
Chapter 5: Making Explicit Management Models
Summary
- Every manager and programmer has models of how things work in their software pattern, though many models are implicit in their behavior, rather than stated explicitly. Things go awry in software projects because people are unable to face reality and because they use incorrect system models.
- Linear models are attractive because of additivity. Linear systems are easier to model, easier to predict, and easier to control. Managers often commit scaling fallacies because linear models are so attractive.
- The diagram of effects is a tool for helping model system dynamics to reveal non-linearities. Being a two-dimensional picture, it is more suited than verbal descriptions to the job of describing non-linear systems.
- One way of developing a diagram of effects is to start with the output—the variable whose behavior you wish to control. You then brainstorm and chart backwards effects from that variable—other variables that could affect it. From these, you chart backwards again, unveiling secondary effects, which you can trace through the primary effects to the variable of interest. You may want to explicitly indicate multiplicative effects because of their importance.
- Non-linearity is the reason things go awry, so searching for non-linearity is a major task of system modeling.
Chapter 6: Feedback Effects
Summary
- The Humpty Dumpty Syndrome explains one reason why project managers are unable to be courteously stubborn to their managers, and what happens as a result.
- Projects run away—explode or collapse—because managers believe two fallacies: The Reversible Fallacy (that actions can always be undone) and The Causation Fallacy (that every cause has one effect, and you can tell which is cause and which is effect.)
- One reason management action contributes to runaway is the tendency to respond too late to deviations, which then forces management to big actions which themselves have non-linear consequences. That's why it's necessary to "act early; act small."
- The effect of Brooks's Law can be made worse by management action. Moreover, the same pattern of management action can lead to a Generalized Brooks's Law, which shows how management action is often the leading cause of project collapse.
- Negative feedback is the only mechanism that has the speed and power to prevent runaway due to positive feedback loops in a system. The Pattern 3 controller has two major negative feedback loops with which to exercise control—one involving resources and one involving requirements.
Chapter 7: Steering Software
Summary
- Many otherwise good ideal methodologies fail to help prevent collapse because they don't prescribe negative feedback actions to be taken when the project deviates from the ideal model.
- When the methodologies do prescribe feedback, they often speak only of the product level, or feedback steps that are too large. To be effective for control, feedback must operate in small increments, at all levels, personal, product, process, and cultural.
- Software professionals often overlook the human decision point in models of effects. One reason is their inability to visualize certain states at all, often because they are "other outputs" of the process, and not directly connected with the product.
- To control a project successfully, you have to learn that you need not be a victim of the dynamics. When human decision points are involved, it's not the event that counts, it's your reaction to the event.
Chapter 8: Failing to Steer
Summary
- Many project managers fail to steer well because they believe they are victims, with no control over the destiny of their project. You can easily identify these managers by their use of "victim language."
- Brooks's Law doesn't have to be a victim law if the manager recognizes where the managerial control is, and that this control can take different forms.
- A common dynamic is punishing the messenger who brings accurate but bad news about project progress. This intervention avoids "negative talk," but also diminishes the chance of the manager's making effective interventions needed to keep a project on the road.
- Since the time of the Greeks, people have not only gotten their interventions wrong, they've gotten them backwards. Laying out a clear diagram of effects can help you sort out a situation in which two parties are driving each other to destruction, all the while thinking they are helping the situation.
Part III. Demands That Stress Patterns
Chapter 9: Why It's Always Hard to Steer
Summary
- Human intervention dynamics are those over which we potentially have control, but there is always a set of "natural" dynamics which put a limit on how good a job any controller can do. A large part of the controller's job is devising intervention dynamics that can keep the natural dynamics under the best control possible, which can never be perfect.
- The Square Law of Computation says that computational complexity grows non-linearly as the number of factors in the computation grows.
- Control can be thought of as a game that the controller plays against "Nature." Even games of "perfect information," such as Tic-Tac-Toe and Chess, require non-linear increases in brainpower to play perfectly as the size of the "board" increases.
- Simplification is always needed, because controllers are always playing a "game" well outside their mental capacity. Simplification takes the form of rough dynamic models and approximate rules such as "Always break a project down into modules."
- Software engineering management is harder than Chess, because controlling a project is a game of "imperfect information," and the size of the "board" is not fixed.
- The Size/Complexity Dynamic appears in many forms throughout software engineering, forms such as the Fault Location Dynamic and the Group Interaction Dynamic.
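One simple way to see this kind of non-linear growth is to count the possible pairwise interactions among n factors, which is n(n-1)/2. Treating "complexity" as the number of pairs is an assumption made for this sketch, and the function name is hypothetical. The summary itself says only that the growth is non-linear.

```python
from itertools import combinations


def pairwise_interactions(factors):
    """Count the distinct pairs of factors that could interact."""
    return len(list(combinations(factors, 2)))


# Doubling the number of factors roughly quadruples the pairs to consider.
for n in (2, 4, 8, 16):
    print(n, pairwise_interactions(range(n)))
```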
Chapter 10: What It Takes To Be Helpful
Summary
- Our brains will never be big enough for our ambitions, so we'll always need thinking tools, such as size/effort graphs.
- Size/effort graphs can be used to reason about the Size/Complexity Dynamic, such as when estimating a project or comparing the impact of two different technologies. Graphs, however—such as log-log graphs—can also distort or conceal the non-linear nature of the dynamic. We must learn to see the stable meaning through the variations in the data and the method of presentation.
- Because of the Size/Complexity Dynamic, it's easy to write requirements that the most competent programmers cannot satisfy.
- A single method or tool is seldom the best over the entire range of problem sizes. Size/effort graphs can help managers combine two methods into a composite pattern that adopts the best range for each one.
- "The bottom line" doesn't dictate all technology choices. Managers are often willing to pay a lot on the bottom line to reduce the risk of failure. The size/risk graph can help in reasoning about these choices, especially when used in conjunction with the size/effort graph.
- If you set out to change an organization, the first rule should be the one given to physicians by Hippocrates: "Do no harm."
- We are all subject to the Size/Complexity Dynamic, so interactions intended to be helpful often wind up being irrelevant, or actually destructive. It's a good idea to assume that regardless of how it looks or sounds, everyone is trying to be helpful.
- We can help most when we apply the Principle of Addition to add more effective models to a person's repertoire.
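The idea of combining two methods into a composite pattern can be sketched by comparing two size/effort curves and choosing the cheaper one at each size. Both curves below are invented for illustration (a low-overhead method whose effort grows steeply, and a high-overhead method whose effort grows gently). They are assumptions for this sketch, not data from the book.

```python
def effort_light(size):
    """Hypothetical method: little overhead, but effort grows with size squared."""
    return 1.0 + 0.01 * size ** 2


def effort_heavy(size):
    """Hypothetical method: large fixed overhead, but effort grows linearly."""
    return 50.0 + 0.5 * size


def composite(size):
    """Pick whichever method needs less effort at this problem size."""
    candidates = {"light": effort_light(size), "heavy": effort_heavy(size)}
    name = min(candidates, key=candidates.get)
    return name, candidates[name]


for size in (10, 50, 200):
    print(size, composite(size))
```

With these made-up curves the light method wins for small problems and the heavy method wins for large ones, so neither is best over the whole range.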
Chapter 11: Responses to Customer Demands
Summary
- The relationship with customers is the second important factor driving organizations to particular software cultural patterns.
- Simply increasing the number of customers can wreak vast changes on an organization, such as
- increasing the development load
- increasing the maintenance load
- disrupting the pattern of development work
- On the other hand, a software development organization can be extremely disruptive to its customers. That's why customers try to be controllers of the software development organization, leading to a situation of multiple controllers. The more controllers, the more "randomness" there appears to the other controllers.
- The cast of outsiders who may influence software development is enormous, including such roles as
- customers and users
- the marketing department
- other surrogates
- programmers as self-appointed user surrogates
- testers as official and unofficial surrogates
- other unplanned surrogates
- Many of these outsider roles are planned as attempts to reduce the effective number of customers.
- Because some of the surrogates are much more intimate with the development system, they may negate their reduction of the effective number of customers with the force and frequency of their interactions.
- Interactions with customers are fraught with peril as the number of customers grows. Interruptions increase. Meetings increase in size and frequency. Time lost because of interrupted meetings increases. All of these increases are non-linear.
- With more customers come more configurations to support. More configurations mean additional coding, more complex testing, less effective test coverage, and longer repair times.
- Releases are needed whenever there are multiple customers. As soon as a product is released to customers, it assumes an entirely different dynamic than when it was held in the shadow of the development organization.
- Multiple versions of a software product complicate maintenance enormously, but more customers means more versions, whether official or unofficial. Frequent releases complicate the development/maintenance process, but so do infrequent releases, so that almost all software cultures tend to stabilize releases at around two per year.
QSM: Volume 1.2: Why Software Gets In Trouble
Part IV. Fault Patterns
Chapter 1: Observing and Reasoning About Errors
Summary
- One of the reasons organizations have trouble dealing with software errors is the many conceptual errors they make concerning errors.
- Some people make errors into a moral issue, losing track of the business justification for the way in which they are handled.
- Quality is not the same thing as absence of errors, but the presence of many errors can destroy any other measures of quality in a product.
- Organizations that don't handle error very well also don't talk very clearly about error. For instance, they frequently fail to distinguish faults from failures, or use faults to blame people in the organization.
- Well-functioning organizations can be recognized by the organized way they use faults and failures as information to control their process. The System Trouble Incident (STI) and the System Fault Analysis (SFA) are the fundamental sources of information about failures and faults.
- Error-handling processes come in at least five varieties: detection, location, resolution, prevention, and distribution.
- In addition to conceptual errors, there are a number of common observational errors people make about errors, including Selection Fallacies, getting observations backwards, and the Controller Fallacy.
Chapter 2: The Failure Detection Curve
Summary
- Failure detection is dominated by the tautology that the easiest failures to detect are the first failures to detect, so that as detection proceeds, the work gets harder, producing a characteristic Failure Detection Curve with a long tail.
- The long tail of the Failure Detection Curve is one of the principal reasons managers misestimate failure detection tasks.
- Because the Failure Detection Curve represents a natural dynamic, there is nothing we can do to perform better than it says. We can, however, perform much worse if we're not careful about how we manage the failure detection process.
- The Failure Detection Curve is not all bad news. The pattern of detected failures over time can be used as a predictor of the time to reach any specified level of failure detection, as long as nothing is happening to undermine test coverage.
- Some of the things that can undermine test coverage are blocking faults, masking faults, and late releases to test.
- Late-finishing modules may arise from a cycle of poor coding, which means that they are more likely to be fault-prone modules. Management policies designed to speed testing of late-finishing modules may actually make the problem worse, and may account for much so-called "bad luck" estimating.
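The long tail of the Failure Detection Curve can be illustrated with a toy model in which each failure has its own chance of being detected in any one test cycle. The model, the probabilities, and the function name are assumptions chosen for this sketch, not the book's own formulation. The point is only that easy failures disappear quickly while hard ones linger.

```python
def expected_remaining(detect_probs, cycles):
    """Expected number of failures still undetected after `cycles` test cycles.

    A failure with per-cycle detection probability p survives one cycle with
    probability (1 - p), so it survives `cycles` cycles with (1 - p) ** cycles.
    """
    return sum((1.0 - p) ** cycles for p in detect_probs)


# Hypothetical mix: 10 easy, 10 moderate, and 10 hard-to-detect failures.
probs = [0.5] * 10 + [0.1] * 10 + [0.01] * 10

for cycles in (0, 10, 20, 40, 80):
    print(cycles, round(expected_remaining(probs, cycles), 2))
```

In this toy model the first ten cycles remove far more failures than the next ten, which is why an estimate extrapolated from early progress tends to be optimistic.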
Chapter 3: Locating The Faults Behind The Failures
Summary
- System size has a direct effect on the dynamics of fault location, but there are indirect effects as well. We use divide and conquer to beat the Size/Complexity Dynamic, and we also divide the labor to beat delivery time. These efforts, however, lead to a number of indirect effects of system size on fault location time.
- You can learn a great deal about an organization's culture by observing how it handles its STIs. In particular, you can learn to what degree its cultural pattern is under stress of increased customer or problem demands.
- An important dynamic describes the circulation of STIs, which grows non-linearly the more STIs are in circulation.
- Process errors such as losing STIs also increase location time.
- Political issues, such as status boundaries, can also contribute non-linearly to extending location time. Management action to reduce circulation time by punishing those who hold STIs can lead to the opposite effect.
- In general, poorly controlled handling of STIs leads to an enlarged administrative burden, which in turn leads to even more poorly controlled handling of STIs. When STIs get out of hand, management needs to study what information that gives them about their cultural pattern, then take action to get at the root causes, not merely the symptoms.
Chapter 4: Fault Resolution Dynamics
Summary
- Basic fault resolution dynamics are another case of Size/Complexity Dynamics, with more faults and more complexity per fault leading to a non-linear increase in fault resolution time as systems grow larger.
- Side effects add more non-linearity to fault resolution. Either we take more time to consider side effects, or we create side effects when we change one thing and inadvertently change another.
- The most obvious type of side effect is fault feedback, which can be measured by the Fault Feedback Ratio (FFR). Fault feedback is the creation of faults while resolving other faults. Faults can be either functional or performance faults.
- The FFR is a sensitive measure of project control breakdown. In a well-controlled project, FFR should decline as the project approaches its scheduled end.
- One way to keep the FFR under control is to institute careful reviewing of fault resolutions, even if they are "only one line of code." The assumption that small changes can't cause trouble leads to small changes causing more trouble than bigger changes.
- There are a number of ways in which a system deteriorates besides the addition of faults and performance inefficiencies, and these ways do not show up in ordinary project measurements. For instance, design integrity breaks down, documentation is not kept current, and coding style becomes patchy. All of these lead to a decrease in the system's maintainability.
- When the integrity of a modular, or "black box," design breaks down, the system shows a growing "ripple effect" from each change. That is, one change ripples through to cause many other changes.
- If we are to avoid deterioration of systems, they must not only be maintained, but their maintainability must also be maintained.
- Managers and developers often show overconfidence in the initial design as protection against maintenance difficulties. This kind of overconfidence can easily lead to a Titanic Effect, because the thought that nothing can go wrong with the code exposes the code to all sorts of ways of going wrong.
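As a rough illustration of tracking the FFR, the sketch below assumes the ratio is computed as faults created divided by faults resolved over a reporting period, which matches the definition of fault feedback given in this summary. The function name and the weekly counts are hypothetical.

```python
def fault_feedback_ratio(faults_created, faults_resolved):
    """Faults created per fault resolved in one reporting period.

    Assumed formula: created / resolved.
    """
    if faults_resolved == 0:
        raise ValueError("no faults were resolved in this period")
    return faults_created / faults_resolved


# Hypothetical weekly counts: (faults resolved, faults created by those fixes).
weekly_counts = [(40, 8), (35, 6), (30, 9), (20, 10)]
ffr_by_week = [fault_feedback_ratio(created, resolved)
               for resolved, created in weekly_counts]

# Weeks (0-based) in which the FFR rose compared with the previous week.
rising_weeks = [week for week in range(1, len(ffr_by_week))
                if ffr_by_week[week] > ffr_by_week[week - 1]]

print([round(ffr, 2) for ffr in ffr_by_week])
print("FFR rose in weeks:", rising_weeks)
```

In this made-up series the ratio rises in the last two weeks, which is the opposite of what the summary expects from a well-controlled project nearing its scheduled end.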
Part V. Pressure Patterns
Chapter 5: Power, Pressure, and Performance
Summary
- The Pressure/Performance Relationship says that added pressure can boost performance for a while, then starts to get no response, then leads to collapse.
- Pressure to find the last fault can easily prolong the time to find the last fault, perhaps indefinitely.
- The Stress/Control Dynamic explains that we not only respond to the external pressures, but to internal pressures we place on ourselves when we think we are losing control. This dynamic makes the Pressure/Performance Relationship even more non-linear.
- Breakdown under pressure comes in many forms. Judgment may be the first thing to go, especially in response to peer pressure to see things their way.
- As people leave a project, either physically or mentally, their departure adds pressure on the remaining people, who are then more likely to leave themselves.
- Managers may create a Pile-On Dynamic by choosing to give new assignments only to those people who are already the reigning experts. This adds to their load, and their expertise, which makes it more likely they'll get the next assignment.
- Some people respond to stress with a Panic Reaction, even though the situation is not anything like life-threatening. Such people must not be in high-stress projects, or they will only add to the stress.
- Pressure can be managed. It helps if the workers are self-regulating, the managers are empowering, and responsiveness, rather than performance, is used to measure readiness for more pressure.
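The Pile-On Dynamic in this summary is a reinforcing loop, and a toy simulation makes that visible. This is a sketch under simplifying assumptions that are not from the book: expertise is a single number, every assignment adds one unit of load and one unit of expertise, and the manager always picks the current top expert. All names and starting values are hypothetical.

```python
def simulate_pile_on(expertise, assignments):
    """Give each new assignment to whoever has the most expertise right now."""
    expertise = list(expertise)  # copy so the caller's list is not modified
    load = [0] * len(expertise)
    for _ in range(assignments):
        expert = max(range(len(expertise)), key=lambda i: expertise[i])
        load[expert] += 1       # the assignment adds to the expert's load
        expertise[expert] += 1  # and to the expert's expertise
    return load, expertise


# Three people who start with nearly equal expertise (made-up values).
load, expertise = simulate_pile_on([3, 2, 2], 10)
print("load:", load)
print("expertise:", expertise)
```

With these starting values, the person who began slightly ahead ends up with all ten assignments, because each assignment widens the expertise gap that drives the next choice.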
Chapter 6: Handling Breakdown Pressure
Summary
- Software projects commonly break down when the reality of time finally forces them to realize where they actually are. When this happens, however, the symptoms displayed are unique to each project and each individual.
- Many symptoms are equivalent to shuffling work around, accomplishing nothing or, even worse, actually sending the project backwards. One such backwards dynamic is the attempt to beat Brooks's Law through splitting tasks among existing workers.
- Ineffective priority schemes are common ways of doing nothing. These include setting everything to number one priority, choosing your own priority independent of project priority, or simply doing the easiest task first.
- A final way of doing nothing is to circulate "hot potatoes," which are tasks that management counts against you if they are on your desk when "measurement" time comes.
- There are a number of ways to observe that managers are actually doing nothing. They may
- be accepting poor quality products
- not be accepting schedule slippage
- be accepting of resource overruns
- be unavailable to their workers
- assert that they have no time to do the project right
- A sure sign that a project is breaking down under time pressure is when managers and workers start short-circuiting procedures. This invariably creates a boomerang effect in which the very quality the manager intended to improve is made worse by the short-circuiting action.
- The decision to ship poor quality to save time and resources always creates a boomerang effect. Bypassing quality assurance is similar. Both of these tactics lead, among other things, to destruction of the development process, more emergencies and interruptions, and devastation of morale.
- When morale deteriorates into project depression, process quality will not be maintained, let alone improved. Trust built before the crisis will help an organization recover more quickly, but attempts to build trust during the crisis will probably backfire—especially if they are in the form of telling: "Trust me!"
- Multiple customers increase the pressure on the boomerang cycle, up to the point that the resultant poor quality drives away customers, thus stabilizing the organization—or killing it.
Chapter 7: What We've Managed To Accomplish
Summary
- In spite of the impression we might get from studying our failures, we've managed to accomplish a great deal in the past 4 decades of the software industry.
- One of the reasons we've accomplished a great deal is the quality of our thinking, which is the strongest asset many of us have, when we use it.
- Our industry has probably suffered because of the process by which we select our managers. People who select themselves into programming work probably are not the best "naturals" for management jobs. Nevertheless, they could learn to do a good job of managing, if they were given the training. As long as we don't honor management, however, they're not likely to receive one-tenth the management training they need.
- The accomplishments of the software industry are much greater than you would believe if you listened to the purveyors of software and hardware tools. It is in their interest to make us believe that we're not doing very well, but that their tool will be the magic bullet we need.
- We tend to be suckers for magic bullets because we want to accomplish great things, but great things are usually accomplished through a series of small steps, contrary to the popular image.
- We may fail to recognize how much our productivity has increased because we are so ambitious. Once we succeed in doing something well, we immediately attempt something more grand, without stopping to take stock of our accomplishments.
- Each pattern has contributed to the development of our industry. Pattern 0 has made computers less frightening to the general public. Pattern 1 has made many innovations that have contributed to our productivity. Pattern 2 has strung these innovations together into methodologies that make it possible to complete many larger projects in routine ways. Pattern 3 has taught us what is needed to keep even larger projects under control. The contributions of Patterns 4 and 5 are still more in terms of visions of possibilities, but that's as important to progress as actual accomplishments.
- Meta-patterns are the development patterns of the culture of the industry as a whole. Once again, each pattern has contributed to the development of meta-patterns, and we are not only learning to handle software, but are learning how to learn to handle software.
Main Chapters
Acknowledgments
Preface
I Patterns of Quality
- 1. What Is Quality? Why Is It Important?
  - 1.1 A Tale of Software Quality
  - 1.2 The Relativity of Quality
  - 1.3 Quality Is Value to Some Person
  - 1.4 Precision Cribbage
  - 1.5 Why Improving Is So Difficult
  - 1.6 Software Culture and Subculture
  - 1.7 Helpful Hints and Suggestions
  - 1.8 Summary
  - 1.9 Practice
- 2. Software Subcultures
  - 2.1 Applying Idea to Software
  - 2.2 Six Software Subcultural Patterns
  - 2.3 Pattern 0: Oblivious
  - 2.4 Pattern 1: Variable
  - 2.5 Pattern 2: Routine
  - 2.6 Pattern 3: Steering
  - 2.7 Pattern 4: Anticipating
  - 2.8 Pattern 5: Congruent
  - 2.9 Helpful Hints and Suggestions
  - 2.10 Summary
  - 2.11 Practice
- 3. What Is Needed to Change Patterns?
  - 3.1 Changing Thought Patterns
  - 3.2 Using Models to Choose A Better Pattern
  - 3.3 Opening Patterns to Information
  - 3.4 Helpful Hints and Suggestions
  - 3.5 Summary
  - 3.6 Practice
II Patterns of Managing
- 4. Control Patterns for Management
  - 4.1 Shooting at Moving Target
  - 4.2 Aggregate Control Model
  - 4.3 Patterns and Their Cybernetic Control Models
  - 4.4 Engineering Models
  - 4.5 From Computer Science to Software Engineering
  - 4.6 Helpful Hints and Suggestions
  - 4.7 Summary
  - 4.8 Practice
- 5. Making Explicit Management Models
  - 5.1 Why Things Go Awry
  - 5.2 Linear Models and Their Fallacies
  - 5.3 Diagram of Effects
  - 5.4 Developing a Diagram
  - 5.5 Nonlinearity Is The Reason Things Go Awry
  - 5.6 Helpful Hints and Suggestions
  - 5.7 Summary
  - 5.8 Practice
- 6. Feedback Effects
  - 6.1 The Humpty Dumpty Syndrome
  - 6.2 Runaway, Explosion, and Collapse
  - 6.3 Act Early, Act Small
  - 6.4 Negative Feedback - Why Everything Doesn't Collapse
  - 6.5 Helpful Hints and Suggestions
  - 6.6 Summary
  - 6.7 Practice
- 7. Steering Software
  - 7.1 Methodologies and Feedback Control
  - 7.2 The Human Decision Point
  - 7.3 It's Not the Event That Counts, It's Your Reaction to the Event
  - 7.4 Helpful Hints and Suggestions
  - 7.5 Summary
  - 7.6 Practice
- 8. Failing to Steer
  - 8.1 I'm Just a Victim
  - 8.2 I Don't Want to Hear Any of That Negative Talk
  - 8.3 I Thought I Was Doing the Right Thing
  - 8.4 Helpful Hints and Suggestions
  - 8.5 Summary
  - 8.6 Practice
III Demands That Stress Patterns
- 9. Why It's Always Hard to Steer
  - 9.1 Game of Control
  - 9.2 Size / Complexity Dynamic in Software Engineering
  - 9.3 Helpful Hints and Suggestions
  - 9.4 Summary
  - 9.5 Practice
- 10. What Helps to Stay in Control
  - 10.1 Reasoning Graphically About the Size / Complexity Dynamic
  - 10.2 Comparing Patterns and Technologies
  - 10.3 Helpful Interactions
  - 10.4 Helpful Hints and Suggestions
  - 10.5 Summary
  - 10.6 Practice
- 11. Responses to Customer Demands
  - 11.1 Customers Can Be Dangerous to Your Health
  - 11.2 The Cast of Outsiders
  - 11.3 Interactions with Customers
  - 11.4 Configuration Support
  - 11.5 Releases
  - 11.6 Helpful Hints and Suggestions
  - 11.7 Summary
  - 11.8 Practice
IV Fault Patterns
- 12. Observing and Reasoning About Errors
  - 12.1 Conceptual Errors About Errors
  - 12.2 Misclassification of Error Handling Process
  - 12.3 Observational Errors About Errors
  - 12.4 Helpful Hints and Suggestions
  - 12.5 Summary
  - 12.6 Practice
- 13. The Failure Detection Curve
  - 13.1 The Difference Detection Dynamic
  - 13.2 Living with the Failure Detection Curve
  - 13.3 Helpful Hints and Suggestions
  - 13.4 Summary
  - 13.5 Practice
  - 13.6 Chapter Appendix: Official Differences Between the Pair Pictures in Figure 13-1
- 14. Locating the Faults Behind the Failures
  - 14.1 Dynamics of Fault Location
  - 14.2 Circulation of STI's Before Resolution
  - 14.3 Process Faults: Losing STI's
  - 14.4 Political Time: Status Walls
  - 14.5 Labor Lost: Administrative Burden
  - 14.6 Helpful Hints and Suggestions
  - 14.7 Summary
  - 14.8 Practice
- 15. Fault Resolution Dynamics
  - 15.1 Basic Fault Resolution Dynamics
  - 15.2 Fault Feedback Dynamics
  - 15.3 Deterioration Dynamics
  - 15.4 Helpful Hints and Suggestions
  - 15.5 Summary
  - 15.6 Practice
V Pressure Patterns
- 16. Power, Pressure, and Performance
  - 16.1 The Pressure / Performance Relationship
  - 16.2 Pressure to Find the Last Fault
  - 16.3 Stress / Control Dynamic
  - 16.4 Forms of Breakdown Under Pressure
  - 16.5 Management of Pressure
  - 16.6 Helpful Hints and Suggestions
  - 16.7 Summary
  - 16.8 Practice
- 17. Handling Breakdown Pressures
  - 17.1 Shuffling Work
  - 17.2 Ways of Doing Nothing
  - 17.3 The Boomerang Effect of Short-Circuiting Procedures
  - 17.4 How Customers Affect Boomerang
  - 17.5 Helpful Hints and Suggestions
  - 17.6 Summary
  - 17.7 Practice
- 18. What We've Managed to Accomplish
  - 18.1 Why Systems Thinking?
  - 18.2 Why Manage?
  - 18.3 Estimating Our Accomplishments
  - 18.4 What Each Pattern Has Contributed
  - 18.5 Meta-Patterns
  - 18.6 Helpful Hints and Suggestions
  - 18.7 Summary
  - 18.8 Practice
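Chapter 1: What Is Quality? Why Is It Important?
Quality is relative, because quality is tied to people.
- Quality is meeting requirements.
Quality is meeting some person's requirements.
Any definition of quality therefore has political and emotional dimensions, because it always involves a series of decisions about whose opinions count. Yet the fact that people are involved in quality is easily overlooked, because political and emotional factors are rarely addressed in the software world. Conversely, once you understand that quality is relative because people are involved, you can explain why people hold mutually contradictory ideas about quality.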
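Improving quality is not easy. Why?
- It's not too bad: a stagnant organization has no motivation to improve quality unless it is pressured from outside.
- It's not possible: the belief that the value of quality cannot be measured.
- Lock-on effect: the organization's current circumstances constrain change in many areas.
A software organization can become locked on to a particular level of quality, and change in the organization can be blocked by the conservative nature of culture.
- Satisfaction with the current level of quality
- Fear of losing that level of quality while attempting to improve it
- Lack of understanding of other cultures
- Invisibility of one's own culture
These factors make change difficult. To achieve a new culture of high-quality software, developers and managers must learn to deal with them effectively.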
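To do that, we need to:
- Recognize our current level and support interest in the next level
- Want the benefit, enjoyment, and challenge of attempting improvement more than we fear losing the current level
- Broaden our contact with and experience of other levels and cultures
- Maintain constant visibility into ourselves (Journal, Where Am I)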
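Practice
After using an earlier version of a product or a competing product, discuss how the definition of a particular piece of software's value and quality has changed over time.
- Early development: the wallpapers must look good. Browsing must be fast.
- User demand: there must be a wide variety of wallpapers.
- User demand: the Crop feature must work correctly. The app must provide wallpapers that fit the size of my phone.
If your organization were to standardize its hardware architecture, list the characteristics the organization might become locked on to.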
- OS
- Python library, Python version
- The developer in charge
- Compiler
- The NginX web server that we compile ourselves
- The IDC's emergency support service
- Choice of iPhone or Android market
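What evidence shows that the people in your organization are really satisfied with the product's level of quality? How does the organization deal with people who express dissatisfaction with that level?
Evidence of satisfaction
- Growth in the number of users connecting
How people who express dissatisfaction are handled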
- ...