Friday, February 2, 2007

Software Project Cost Estimates Using COCOMO II Model

Introduction

This article walks through a sample COCOMO II cost estimate for a real project. It concentrates on the basic how-to for a project manager who needs to make a simple cost estimate, and on the tools that can be used.

Audience

The article is intended for those who are new to project cost estimation techniques, and for those who would like some feedback on the COCOMO II model. My objective is to describe the basic cost estimation steps, tools, and assumptions in a simple way, with a real project in mind, supplying only the necessary details about the project itself.

Sample Project Description and Scope

A bioinformatics company that provides advanced methods for data mining of genetic information intends to build a distributed application for analysis and navigation of biological networks. As part of this project, a database provider should be built that exposes simple interfaces to the UI programmer and hides the complexities of the data layer. Once the scope of this task was broadly defined in this way, it was split off into a separate project. This article covers cost estimation for that separate project.

The company has already completed the inception phase and provided a document describing the project concept. The customer has the following preferences:

  1. Transfer the existing SQL Server database (~1 GB) hosted on Windows to a PostgreSQL back end hosted on Linux.
  2. Build the components for this project in Java. The main component is the database provider.
  3. Since the project is small, the elaboration phase (or detailed design phase) is not necessary.
  4. The project has time constraints.

The above is the short scope definition for the project, as provided by the company. As in many projects, the next important step is to make cost estimates.

Introduction to COCOMO II Estimates

COCOMO (Constructive Cost Model) is a model that allows software project managers to estimate project cost and duration. It was initially developed (as COCOMO 81) by Barry Boehm in the early eighties. The COCOMO II model is an update of COCOMO 81 that addresses the software development practices of the 1990s and 2000s. The model is by now a well-established software engineering artifact that has, from the customer's perspective, the following features:

  • The model is simple and well tested
  • Provides about 20% cost and 70% time estimate accuracy

In general, COCOMO II estimates project cost, derived directly from effort in person-months, by assuming that the cost depends basically on the total physical size of all project files, expressed in thousands of source lines of code (KSLOC). The estimation formulas have the form:

Effort (in person-months)     =    a x KSLOC^b

where coefficient a is about 3 (2.94), and scaling factor b is close to 1 (1.0997).
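
As a minimal sketch of this formula in Python, assuming the nominal values quoted above (a = 2.94, b = 1.0997); the function name `nominal_effort` is only illustrative:

```python
# Nominal COCOMO II effort, using the constants quoted in the text.
A_NOMINAL = 2.94     # coefficient a
B_NOMINAL = 1.0997   # exponent b (scaling factor)


def nominal_effort(ksloc, a=A_NOMINAL, b=B_NOMINAL):
    """Return effort in person-months for a size given in KSLOC."""
    if ksloc <= 0:
        raise ValueError("size in KSLOC must be positive")
    return a * ksloc ** b


if __name__ == "__main__":
    for size in (1, 10, 100):
        print(f"{size:>4} KSLOC -> {nominal_effort(size):7.1f} person-months")
```

Because b is slightly greater than 1, doubling the size more than doubles the estimated effort.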

There are similar COCOMO formulas for project duration (expressed in months) and for the average size of the project team. Interestingly, project duration in COCOMO is roughly proportional to the cube root of the effort (in person-months).
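
The article does not spell out the duration formula. The sketch below is an illustration only: the constants 3.67, 0.28, and 0.91 come from the published COCOMO II.2000 calibration rather than from this article, the function names are made up, and the schedule compression adjustment (SCED) is left out.

```python
# Assumed constants from the COCOMO II.2000 calibration (not quoted in the article).
C_SCHEDULE = 3.67   # schedule coefficient
D_SCHEDULE = 0.28   # base schedule exponent
B_BASE = 0.91       # base effort exponent


def nominal_duration(effort_pm, b=1.0997):
    """Return the nominal schedule in calendar months for an effort in person-months."""
    exponent = D_SCHEDULE + 0.2 * (b - B_BASE)  # about 0.32 for nominal b
    return C_SCHEDULE * effort_pm ** exponent


def average_staff(effort_pm, b=1.0997):
    """Return the average team size: effort divided by duration."""
    return effort_pm / nominal_duration(effort_pm, b)


if __name__ == "__main__":
    effort = 100.0  # person-months, arbitrary example value
    print(f"duration: {nominal_duration(effort):.1f} months")
    print(f"average staff: {average_staff(effort):.1f} people")
```

With the nominal exponent b, the schedule exponent comes out near 0.32, which is why the duration behaves roughly like a multiple of the cube root of the effort.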

In practice, COCOMO parameters can differ greatly from their typical values. COCOMO II provides a classification of the factors that can influence project cost, and lets you better approximate the coefficient and the scaling factor for your particular project. As a result of the adjustment, the value of coefficient “a” falls between 0.056 and 120. See Appendix A for the COCOMO II list of adjustment factors that affect the first parameter. Model-driven adjustment of scaling factor “b” is new in the COCOMO II model and reflects the latest trends in software engineering. You can find descriptions of the scale factors in Appendix B.

Many project managers are used to negotiating project costs with a trade-off triangle or trade-off matrix in terms of product functionality, quality, and schedule. If this is the case for you, you might wonder how the COCOMO II adjustment parameters fit into this picture. The answer is that the COCOMO II parameters can be viewed as two sets. The first set is external: it can be loosely matched to the trade-off triangle/matrix view, and its vocabulary is frequently used while negotiating costs with stakeholders. The second set is internal to COCOMO II and usually cannot be used for this purpose. From the trade-off triangle/matrix perspective, schedule loosely corresponds to SCED (required development schedule), quality to RELY (required reliability), and functionality to a combination of CPLX (product complexity), DATA (database size), TIME (execution time constraint), DOCU (documentation match to life-cycle needs), and occasionally the RUSE, STOR, and PVOL parameters. The internal COCOMO II parameters, such as those that evaluate personnel capability and experience, the project tools used, and others, are obviously important for project cost estimates, but are usually not a subject of cost communications with stakeholders.

There are well-proven COCOMO II tools on the market that calculate these parameters for you by asking questions about your particular project in natural language, such as “How experienced is the development team?” or “How tough is the project schedule?”, thus hiding the model details in the background.

COCOMO II supersedes earlier versions of COCOMO, such as COCOMO 81 and Ada COCOMO, which are by now considered outdated. The model simplifies inception phase cost estimates by reducing the total number of parameters to seven (from 15 in the original COCOMO model), and suggests using function points for the inception phase and SLOC for later, more accurate phases. In fact, COCOMO II reduces the controversy over which project metric to use (SLOC or function points), which makes the new model more flexible.

Compared with the classical COCOMO 81, COCOMO II introduces five scale factors. At least three of them are directly related to project management activities, which raises the role of project management in reducing project costs:

  • Takes into account process maturity in the organization (CMM levels)
  • Takes into account the degree to which the project architecture exists and is stabilized before the construction phase
  • Takes into account the relationships perspective: team cohesion and relations with stakeholders

Now I will demonstrate the basic estimation steps in the COCOMO II model.

If you have the project source code and some project details, that is all you need to make a simple cost estimate. Of course, you do not have the source code before starting your project, so this approach is more like a “post-mortem analysis”. That is my intention: some chronological accuracy is lost, but the resulting cost estimate is more accurate.

Counting SLOC in the Project

To make actual estimates, you first need to figure out the SLOC count for each of the components and use these counts in the COCOMO model.

Put simply, a SLOC (source line of code) is a physical line of code in your program, not counting comments and blank lines. Of course, you can open the code in your favorite IDE and look at the status bar to see how many lines it contains. That might be acceptable as a first estimate of SLOC. To be more accurate, you need to exclude comments and blank lines. This is an easy procedure for a small project, but for a larger project with many thousands of SLOC and multiple files, an automated SLOC counting tool is the more natural option.
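
To illustrate what such a counter does, here is a deliberately simplified sketch in Python. It is not how CodeCounter or any other tool mentioned in this article works: it only skips blank lines and whole-line comments that start with an assumed prefix (`//` for Java, `--` for SQL), and it does not handle block comments or comments that follow code on the same line.

```python
from pathlib import Path


def count_sloc(lines, comment_prefixes=("//", "--")):
    """Count lines that are neither blank nor whole-line comments."""
    count = 0
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # blank line
        if stripped.startswith(comment_prefixes):
            continue  # whole-line comment
        count += 1
    return count


def count_sloc_in_file(path, comment_prefixes=("//", "--")):
    """Read a source file and count its SLOC with the simple rules above."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return count_sloc(text.splitlines(), comment_prefixes)


if __name__ == "__main__":
    sample = [
        "// open the connection",
        "Connection c = provider.open();",
        "",
        "return c;",
    ]
    print(count_sloc(sample))
```

A dedicated counting tool applies language-specific rules, so its numbers will differ somewhat from this kind of quick count.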

For these estimates I decided to use the CodeCounter program, which has a friendly UI and can count SLOC in many languages, including Java, C/C++, SQL (and its flavors), and HTML/XML. A 30-day evaluation version of CodeCounter can be downloaded from here.

The case study source project files reflect four tasks accomplished during this project:

  • Create the PostgreSQL database and indexes, and write PL/pgSQL (close to PL/SQL) stored procedures.
  • Build a Java component (composed of multiple JavaBeans) that encapsulates an application’s basic logic, and serves as database provider.
  • Build a servlet, considered as a separate component, to dispatch UI calls to the core Java component.
  • Write simple HTML pages to test database provider.

In other words, the sample project source files correspond to four parts written in SQL, Java, and HTML/XML. Before counting, it is convenient to create a separate folder for each project component and place the files to count there.

The results of the separate SLOC counts for the project folders are as follows:

#    Folder                    Total SLOC
1    SQL Files                 414
2    Java DB Provider Files    345
3    Java Servlet              156
4    Web Files                 113

Table 1.
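
The counts in Table 1 have to be added up and converted to KSLOC before they can be used as the size input of the model. A small Python sketch of that step (the variable names are illustrative):

```python
# SLOC counts per folder, copied from Table 1.
sloc_by_folder = {
    "SQL Files": 414,
    "Java DB Provider Files": 345,
    "Java Servlet": 156,
    "Web Files": 113,
}

total_sloc = sum(sloc_by_folder.values())
total_ksloc = total_sloc / 1000  # COCOMO II expects size in thousands of SLOC

print(f"Total: {total_sloc} SLOC = {total_ksloc:.3f} KSLOC")
# prints "Total: 1028 SLOC = 1.028 KSLOC"
```

So the whole case study project is roughly 1 KSLOC.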

COCOMO generally recommends using standard SLOC counting programs that can count lines in code written in different languages. One of these programs is the C-based CodeCount (a command-line program without a UI). You can download its source from the Center for Software Engineering (you need to compile it yourself, too!) and count SLOC in C, C++, Java, and other languages. The benefit of CodeCount is that it is free and has been used for code counting by many companies for many years.

Making COCOMO II estimates Using Costar Estimation Tool

Of course, you can use COCOMO II as it is: choose the model and formulas, figure out the parameter values, and calculate project costs manually. I believe it is like choosing a tool for filling out tax forms every spring: either a simple calculator or automated software. Presumably you prefer the latter (you can skip this section if I’m wrong!), so in this section I cover one specific tool for making COCOMO II estimates.

I’ve selected the Costar 7.0 COCOMO II tool since it has a convenient interface and concise help. The program is licensed to more than one million customers. A 30-day demo of Costar 7.0 can be downloaded from here.

Generally, you create a main component and subcomponents that correspond to the separate tasks of your project. The main component basically serves as the Costar project root for all other components and is the place where all estimates are summarized. It is not supposed to be associated with your project files. You need to create a subcomponent for every project task and manually enter the total SLOC count for that task, as reported by the SLOC counting program.

For this sample project (see Costar’s costar_dbprovider.cst project file for this project in cocomo2.zip), I’ve created the main (root) component and four subcomponents:

  1. db_provider
  2. servlet
  3. database_scripts
  4. ui_web_testing

Now that the initial state of the components is set up in the Costar project, we can start making actual COCOMO II estimates. Although the total cost of a project in COCOMO models is largely determined by the total SLOC count, the adjustment and scaling parameters for a real project can change project costs by a factor of hundreds. As mentioned above, there are two sets of additional parameters that make COCOMO II estimates more accurate. The first set, the 17 cost drivers, is largely inherited from the COCOMO 81 model, and the second set, the 5 scale factors, is new in the COCOMO II model.

The method of setting these parameters in a real project is rather straightforward. As a project manager, you need to gather information about the most important aspects of your project, such as the required product characteristics, required schedule, required product quality, experience and capability of the project team, and the readiness and maturity of the project infrastructure. Next comes the delicate part: making assumptions about the values of the COCOMO II parameters that correspond to the information collected.

Since my aim is to introduce COCOMO II techniques and terminology based on a real project, I will only briefly describe the assumptions that I made for the case study project. All COCOMO II parameters not mentioned below are set to their nominal values (1.0 for each cost driver). The project was characterized by:

Scaling factors and product characteristics

  • The inception phase was completed, but documented architecture was largely missing, since the stakeholders considered this project rather small and did not run elaboration as a separate phase (RESL scale factor is very low).
  • The development team was familiar with the basic concepts of software development methodology, but the software process maturity was close to the initial CMM stages (PMAT scale factor, describing the CMM level, is low).
  • Medium-size database back end (~1 GB) with about 20 tables and 100 indexes (DATA cost driver is very high).

Schedule and personnel

  • The project schedule had fixed time limits and left little time for new team members to get used to the environment and tasks, which means that the schedule can be considered tough (SCED scheduling parameter is high).
  • Team members had above-average experience in programming languages such as C/C++, but were new to Java development (PCAP, programmer capability, is high, and the LTEX, language/tool experience, cost driver is low).
  • Team members had above-average experience in Windows environments, but were not especially experienced in the latest Linux developments (PLEX, platform experience, cost driver is low).

Interestingly, COCOMO II predicts that selecting a highly capable development team, even one that is not particularly experienced in the language, tools, and platform, justifies itself for a project:

PCAP [high = 0.88] x LTEX [low = 1.09] x PLEX [low = 1.09] = 1.05 ~ 1

  • There was no separate role for an analyst. This decreases the team’s analytical capabilities. It is not a standard situation in the COCOMO model, which has the ACAP parameter to account for analyst capability. It seems possible to account for an unfilled analyst position in COCOMO II by setting the analyst capability parameter to a lower value (ACAP, analyst capability, is low).

Finally, pictures 2 and 3, taken from the Costar project, show all COCOMO II cost drivers and scale factors along with their values:

Picture 2.

Picture 3.

Notice that Costar lets you see the equations that were used for the final cost estimates, as in picture 4:

Picture 4.

Summary of Cost Estimation Results

The overall result of the COCOMO II estimates is shown in the report obtained from the Costar project:

Picture 5.

The report says that the project cost is $23,000 and the project duration is 4.6 months ($5,000 was taken as the average monthly developer salary).
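
As a rough cross-check, assume that the cost in the report is simply the effort in person-months multiplied by the monthly salary. This is an assumption, since the Costar report itself is not reproduced here, and the function names are illustrative. Under that assumption, the effort implied by the quoted figures can be backed out:

```python
MONTHLY_SALARY = 5000   # USD per person-month, as stated above
REPORTED_COST = 23000   # USD, as quoted from the Costar report


def cost_from_effort(effort_pm, monthly_rate=MONTHLY_SALARY):
    """Cost in USD, assuming cost = effort x monthly rate."""
    return effort_pm * monthly_rate


def effort_from_cost(cost, monthly_rate=MONTHLY_SALARY):
    """Effort in person-months implied by a cost, under the same assumption."""
    return cost / monthly_rate


implied_effort = effort_from_cost(REPORTED_COST)
print(f"Implied effort: {implied_effort:.1f} person-months")
# prints "Implied effort: 4.6 person-months"
```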

Conclusion

This brief article shows how to make cost estimates for a sample project, and outlines the basic steps, terms, and tools used. In a small or even intermediate-size project, there is a temptation to cut the traditional aspects of development that were widely accepted in the nineties and are actively used now. Obviously, ad hoc estimates are prone to error. Cost estimation tools make it easy to clarify the expected project cost and duration. They also prompt you to verify all basic aspects of a software project by providing clear, compact, and concise terms and a methodology that have been tested on a wide range of real-life projects. This substantially reduces project risks and provides reasonable grounds for communication with project stakeholders.

References

  1. “Software Cost Estimation with COCOMO II” by Barry Boehm et al., Prentice Hall, 2000.
  2. “Software Engineering Economics” by Barry Boehm, Prentice Hall, 1981.

Appendix A: Cost Drivers

There are multiple factors that affect project cost. The COCOMO II model defines 17 parameters, called cost drivers, that have a major influence on project cost.

A project manager can determine the cost drivers based on project specifics, and use them to adjust the first coefficient in the formula:

a =  2.94 x EAF

where EAF (Effort Adjustment Factor) is obtained by multiplying the 17 parameters, which have the following properties:

  • All have a nominal value equal to 1.0.
  • Most of them can take the ratings very low, low, nominal, high, and very high (extra low and extra high can also be considered).
  • The values of the majority of cost drivers fall between 0.5 and 1.5, and the maximum value exceeds the minimum value by no more than 50%. Two exceptions are product complexity (CPLX) and analyst capability (ACAP), where the max/min ratio can be as high as 2.
  • The SITE, RUSE, DOCU, PVOL, and PCON factors are new in COCOMO II, and reflect trends in software development of the late nineties and the 2000s.
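
The adjustment of the first coefficient can be sketched in Python as follows. This is only an illustration: it uses the three multiplier values quoted earlier in the article (PCAP high = 0.88, LTEX low = 1.09, PLEX low = 1.09) and, as a simplifying assumption, treats every cost driver that is not listed as nominal (1.0). The function names are made up for the example.

```python
from math import prod

A_BASE = 2.94  # baseline coefficient from the formula above


def effort_adjustment_factor(multipliers):
    """EAF: the product of the effort multipliers of all cost drivers.

    Drivers that are not listed are treated as nominal (1.0), so they
    do not change the product.
    """
    return prod(multipliers.values())


def adjusted_coefficient(multipliers, a_base=A_BASE):
    """Coefficient a = 2.94 x EAF."""
    return a_base * effort_adjustment_factor(multipliers)


# Only the multipliers quoted in the article; all other drivers assumed nominal.
multipliers = {"PCAP": 0.88, "LTEX": 1.09, "PLEX": 1.09}

eaf = effort_adjustment_factor(multipliers)
a = adjusted_coefficient(multipliers)
print(f"EAF = {eaf:.4f}, a = {a:.3f}")
```

An empty dictionary gives EAF = 1.0, which corresponds to a fully nominal project with a = 2.94.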

The following table provides a list of COCOMO II adjustment factors. Notice that the second column shows factors for this case study project.

#    Cost Driver   Sample Project Value   Description
1    DATA          high                   Database size.
2    CPLX          nominal                Product complexity.
3    TIME          nominal                Execution time constraint.
4    STOR          nominal                Main storage constraint.
5    RUSE          nominal                Required reusability.
6    DOCU          nominal                Documentation match to life-cycle needs.
7    PVOL          nominal                Platform volatility.
8    SCED          very high              Scheduling factor.
9    RELY          nominal                Required reliability.
10   TOOL          nominal                Use of software tools.
11   APEX          nominal                Application experience.
12   ACAP          low                    Analyst capability.
13   PCAP          high                   Programmer capability.
14   PLEX          low                    Platform experience.
15   LTEX          low                    Language and tools experience.
16   PCON          nominal                Personnel continuity.
17   SITE          nominal                Multisite development.

Appendix B: Scale Factors

Scale factors are new in COCOMO II. They modify the second coefficient in the effort formula (exponent b). The effect of the scale factors is in the 1.01 – 1.26 range. The second column shows the values for this case study project.

#    Scale Factor   Sample Project Value   Description
1    PREC           nominal                Precedentedness.
2    PMAT           CMM Level I (upper)    Process maturity.
3    TEAM           nominal                Team cohesion.
4    FLEX           nominal                Development flexibility.
5    RESL           little (20%)           Architecture and risk resolution.
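
How the scale factors feed into exponent b can be sketched in Python as follows. Note the assumptions: the base value 0.91 and the numeric value of each rating come from the published COCOMO II.2000 calibration, not from this article, and only the ratings needed here are listed. With all five factors at nominal, the sketch reproduces the 1.0997 quoted in the main text. The mapping of “CMM Level I (upper)” to low and “little (20%)” to very low follows the ratings given in the assumptions section above.

```python
B_BASE = 0.91  # assumed base exponent (COCOMO II.2000 calibration)

# Assumed scale factor values per rating (COCOMO II.2000 calibration).
# Only the ratings used in this example are listed.
SCALE_FACTOR_VALUES = {
    "PREC": {"nominal": 3.72},
    "FLEX": {"nominal": 3.04},
    "TEAM": {"nominal": 3.29},
    "PMAT": {"low": 6.24, "nominal": 4.68},
    "RESL": {"very low": 7.07, "nominal": 4.24},
}


def exponent_b(ratings):
    """Exponent b = base + 0.01 x (sum of the scale factor values)."""
    total = sum(SCALE_FACTOR_VALUES[name][rating] for name, rating in ratings.items())
    return B_BASE + 0.01 * total


nominal_ratings = {name: "nominal" for name in SCALE_FACTOR_VALUES}
case_study_ratings = dict(nominal_ratings, PMAT="low", RESL="very low")

print(f"nominal b    = {exponent_b(nominal_ratings):.4f}")
print(f"case study b = {exponent_b(case_study_ratings):.4f}")
```

Lower process maturity and unresolved architecture both push the exponent up, so their cost grows with project size.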
