Saturday, October 24, 2009

Object-oriented programming

posting by : eny rusidah
student ID (NIM) : 124079061
By: Huff, Sid L., Business Quarterly, 00076996, Winter93, Vol. 58, Issue 2

A new method of building and maintaining computer software is emerging. It promises substantial savings in time and money.

Most everyone, from three-year-olds to senior executives, has at one time or another played with Lego sets. Truly amazing structures can be assembled out of collections of simple, standardized pieces. Building complex structures out of simple pieces also characterizes many other, commercially useful activities such as the design of electronic circuits. Why can we not build software the same way -- by assembling collections of simple, reusable "pieces" of code? The concept of software reuse is not new, but it has met with almost no success until quite recently. Constantly reinventing the wheel has characterized most software development until now. A somewhat radical new method, different enough from the traditional approaches to programming to be termed a new paradigm, is emerging out of specialized software labs and into the pragmatic mainstream. It is called the "object-oriented" approach to software development. Reflecting its newness and somewhat controversial nature, object-oriented programming goes by the descriptive acronym OOP. It opens a door, which, over the long term, promises to lead adopting organizations out of today's low-productivity, maintenance-dominated, software crisis.

SOFTWARE BECOMING STANDARDIZED

At a conference sponsored by NATO in 1969, the term "software engineering" was coined. Prior to that time, the designing and writing of computer programs had been treated as an art. No two programmers wrote even a small, simple program the same way. Great emphasis was placed on elegance; very efficient code was prized more highly than was easy-to-understand code. Such attitudes and practices made software difficult and expensive to create, and even more difficult and expensive to maintain, a legacy that is still with us today.

The central concept underlying software engineering has been that software development and maintenance should be made more like traditional engineering work. Emphasis should be placed on standardized approaches, and more care taken at the early design stages, since the cost of fixing design errors escalates rapidly as the development process unfolds. Furthermore, software should be reused wherever possible.

Considerable progress in converting computer programming into an engineering discipline has occurred during the past 20 years. Nearly all organizations have adopted software development methodologies, which encompass sets of rules and guidelines for creating programs in a standardized fashion. Development tools such as Computer Aided Software Engineering (CASE) toolsets have moved programmers and analysts further away from the low-level details of specific computers, allowing them to work more productively at higher levels of abstraction. The huge improvements in the price/performance ratio of computing equipment (greatly increased computing power per dollar) that have occurred in the past two decades, have made possible a shift in the emphasis of software developers from efficiency to productivity. Today, it rarely makes economic sense for software professionals to spend time to make their code more efficient.

Thus, many of the original tenets of the software engineering philosophy have become standard operating procedure in most companies today. One key goal of software engineering has not materialized significantly to date, however -- finding a viable and cost-effective way to allow software components to be reused. The reusability problem has become so serious that today it is widely viewed as the holy grail of the software development world. Object orientation offers the long-term promise of a high degree of software reusability, and the possibility of improving the productivity of software development as much as the improved methods of the previous 20 years have changed its process.

WHY IS SOFTWARE HARD TO REUSE?

Think about a company. What does a company consist of? There are people (employees), pieces of equipment (trucks, say), buildings, documents (cheques, customer statements), other companies (customers, suppliers) and so forth. Now look at the company's computer software. There you will find programs to do things such as issue an order, pay an employee, create a budget, register a payment and keep track of inventory. Application software is fundamentally process-centred. Its purpose is to allow a company to record, organize and manage information about changes to things.

This process orientation is so deeply embedded in the thinking of nearly all computer professionals that it is very difficult to "step out of the box," and see it for what it is: simply an artifact of history. The first computer software was conceived as the expression of an algorithm, that is, a set of mathematical steps for performing a calculation. The fixation of early computer engineers with algorithms -- in the late 1940s and early 1950s -- is still the basis of the world's understanding of software.

A process-centred approach to software amounts to focusing on the problem's solution (the how) rather than the problem itself (the what). When software is designed this way, the various pieces all tend to be shaped towards the solution in the designer's mind. Thus, even when the software itself is built in a modular fashion, the various modules represent pieces of the solution, that is, components of the process being represented. Such pieces cannot easily be plucked out of one application and reused in another that has a different solution. A process orientation greatly restricts the reusability of software. It has only taken 35 years to come to realize that there might be a better way!

A BETTER WAY IS POSSIBLE

Suppose, rather than having programs that represented paying employees, issuing orders or keeping track of inventory, you had pieces of software that represented things such as employees, orders, items that had been manufactured and so forth. For each employee, there would be a separate and distinct chunk of software on the company's computer; similarly there would be one for each customer order, for each piece of company equipment, and so on. Such a software environment would be thing-centred, as opposed to process-centred. That is the essence of the object-oriented approach to software.

In the terminology of this approach, each little chunk of software is termed an object. An object can be thought of as a self-contained combination of code and data. A software object that represents me, for example, would likely be a specific instance of the broad class of objects called employees (of UWO). The data contained within this particular object might include my name, my employee i.d. number, my telephone number and the like -- in other words, data about me. The same type of data would be included in every other employee object.

Now, suppose I get a new telephone number. How is the representation of my phone number within the object changed? As part of the code that comprises the rest of the object, there would be a small routine named, perhaps, ChangePhoneNumber. That routine would be "called" by having some other object send a message to the SidHuff object. The message would instruct the SidHuff object to execute its ChangePhoneNumber routine (or method, as it is called), and would pass the new phone number to the object as part of the message. Other routines would be included in each employee object to perform other actions upon the object's data. The object itself, then, is just a collection of the routines, together with the encapsulated data.
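
The employee object described above can be sketched in a few lines of Python. This is only an illustration, not code from the article: the class name `Employee`, the field names, the sample values and the method name `change_phone_number` (standing in for ChangePhoneNumber) are all assumptions. In Python, "sending a message" to an object is simply a method call.

```python
class Employee:
    """A self-contained combination of data and the routines that act on it."""

    def __init__(self, name, employee_id, phone_number):
        # Encapsulated data about one employee (field names are hypothetical).
        self._name = name
        self._employee_id = employee_id
        self._phone_number = phone_number

    def change_phone_number(self, new_number):
        # The method executed when the object receives the corresponding message.
        self._phone_number = new_number

    def phone_number(self):
        # Other objects ask for the value rather than reaching inside.
        return self._phone_number


# One specific instance of the broad class of employees (sample values are made up).
sid_huff = Employee("Sid Huff", "E-0001", "555-0100")

# Some other part of the program "sends a message" carrying the new number.
sid_huff.change_phone_number("555-0199")
print(sid_huff.phone_number())
```

Every other employee would be a separate instance of the same class, holding the same kinds of data and responding to the same messages.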

In an object-oriented world, each object appears as a black box to the rest of the software. All that needs to be known about an object is its name, and the various types of messages it can accept. In principle, there is no need to look inside an object to make use of it. A useful parallel here is that of electrical engineers designing a new circuit. They know they can buy prefabricated chips from various suppliers, and incorporate them into their overall design. They do not need to know how each chip works internally -- only what the appropriate inputs and corresponding outputs are.

Similarly, an object-oriented software designer only has to know the types of objects available, and the types of messages each object can receive and act upon. Programming, in an object-oriented environment, consists of defining classes of objects, then stringing together appropriate objects to perform whatever task one sets out to accomplish. The focus for an object-oriented software developer is not so much solving the problem (the old process perspective), but rather, determining what objects are required, what the various object types should do (that is, what methods each should embody) and how the various objects relate to each other. It is, in short, a very different perspective on software development.

NEED TO STANDARDIZE

Once a full library of objects has been defined, building software systems can be done extremely rapidly, almost unbelievably so for someone schooled in the traditional procedural approach to software creation. Developing a library of objects in a real-world setting is a major undertaking, however. For this reason, various groups, such as the Open Object Foundation, have been formed during the past few years to try to set standards for many types of common software objects, to standardize the language and concepts surrounding the object-oriented approach, and especially to build sharable libraries of reusable coded objects.

It may at first seem surprising that different organizations are able to share software objects. From an object-oriented perspective, however, much commonality exists across organizations, even those in very different lines of business. While some details will differ, an account object in a manufacturing firm has much the same structure as an account object in an insurance company. The same is true for customer objects, supplier objects, inventory stock objects, employee objects and so forth. Thus, a generic, standardized library of object types (or classes) can greatly accelerate the learning and startup of an organization adopting the object-oriented approach.

Obviously, some industries will have specialized requirements. Insurance companies need to define and employ policy objects, for instance, while banks require loan objects. Even here, however, insurance company associations could collaborate on a generic definition for policy objects that could be shared, and perhaps tailored by specific firms.
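
One way to picture a shared, generic definition that individual firms then tailor is class inheritance. The sketch below is hypothetical throughout: the class names `Policy` and `AcmeAutoPolicy`, the fields, and the discount rule are assumptions made for illustration and do not come from any actual industry library.

```python
class Policy:
    """Generic policy object, as an industry association might define it (hypothetical)."""

    def __init__(self, policy_number, holder, annual_premium):
        self.policy_number = policy_number
        self.holder = holder
        self.annual_premium = annual_premium

    def monthly_payment(self):
        # Shared behaviour that every firm can reuse unchanged.
        return round(self.annual_premium / 12, 2)


class AcmeAutoPolicy(Policy):
    """One firm's tailored version: reuses the generic class and adds a detail."""

    def __init__(self, policy_number, holder, annual_premium, loyalty_discount):
        super().__init__(policy_number, holder, annual_premium)
        self.loyalty_discount = loyalty_discount  # a fraction, e.g. 0.10 for 10%

    def monthly_payment(self):
        # Tailor the shared routine instead of rewriting it from scratch.
        return round(super().monthly_payment() * (1 - self.loyalty_discount), 2)


generic = Policy("P-1", "A. Customer", 1200.0)
tailored = AcmeAutoPolicy("P-2", "B. Customer", 1200.0, 0.10)
print(generic.monthly_payment(), tailored.monthly_payment())
```

The tailored class writes only what differs for that firm; everything else is inherited from the shared definition.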

BENEFITS OF THE OBJECT-ORIENTED APPROACH

A number of benefits are expected to accrue to firms adopting the object-oriented approach to software, including:

* A dramatically improved productivity resulting from the reusability of the software, once a company's library of object classes has been fully developed;
* Higher quality programs, because the bulk of a company's software will be built out of reusable objects, which will have been thoroughly tested and debugged well ahead of time;
* Improved software flexibility, since, just as with toys made from Lego pieces, applications programs will be able to be disassembled and reassembled quickly and easily as the organization's shape and requirements change;
* Simplified and reduced software maintenance;
* Enhanced data integrity, resulting from the black box nature of software objects;
* Improved ability to handle complexity, since, as experience to date has shown, it is possible to build far more complex and challenging systems, with fewer people, when they are built using an object-oriented approach.

THERE ARE SOME DRAWBACKS

The adoption of the object-oriented approach is not something that can be undertaken quickly; it requires a long-term perspective. Some of the potential problems an organization faces in moving to it include:

* The object-oriented approach requires a company's software professionals to learn a whole new way of thinking. In particular, it requires new, unfamiliar programming methods and languages, such as C++, Smalltalk or perhaps visually-oriented languages, such as Visual Basic or ObjectVision.
* At present, libraries of object classes are limited and far from complete. Companies adopting the object-oriented approach will have to spend considerable time investing in the design and creation of their own object class libraries. Sharing the effort with other firms will mitigate this problem somewhat, however.
* The object-oriented approach is not compatible, at least today, with most purchased software.
* There is a human barrier to be overcome. Many computer professionals are still skeptical of the value of the object-oriented approach, much as they have been, until recently, skeptical of the benefits of personal computers. They are often unwilling to surrender the years of investment they have made in the process approach to software creation, to learn the new methods.

BEGIN BY BEGINNING

The drawbacks to adopting the object-oriented approach are considerable and should not be underestimated. Nonetheless, as a long-term solution to the problem of software reuse and development productivity generally, the object-oriented approach offers a compelling alternative -- indeed, arguably the only realistic alternative -- to current process-oriented methods. Appropriate steps you might consider for your organization to develop a better appreciation for object-oriented methods include:

* Get your organization involved with one or more groups developing standardized objects;
* Provide appropriate training for a group of your computer professionals to get them up to speed in object-oriented development methods;
* Arrange some small-scale experiments that will allow this advance team to try out different object-oriented techniques and languages;
* Undertake a series of pilot projects of small but increasing scale, in which the advance team applies the new methods to real problems in the organization;
* Work towards architecting and developing your organization's own object class libraries;
* Actively seek out other organizations with which you can share your firm's experiences in the object-oriented approach.

OOP IS NOT A QUICK FIX

Very few organizations, other than software companies, have yet undertaken wholesale conversion to object orientation. Partly this reluctance is due to the huge mountain of existing software most companies currently own, which was built using traditional methods and tools, and which must be maintained for years to come. Even if the world experienced a massive conversion to the object-oriented philosophy tomorrow, it would take many years to replace the existing inventory of company programs with new object-oriented systems. Any talk of instant benefits from adoption of the object-oriented approach, therefore, is ridiculous. Conversely, for years companies have sought solutions to their software crises -- costly development, low productivity, difficult maintenance, inability to reuse their code -- with little to show for it. The object-oriented approach, an entirely new paradigm for software creation and maintenance, holds the key to a door leading out of the current morass. While not a short-term fix, by enabling the sharing and reuse of software, the object-oriented approach offers a long-term solution to today's software crisis.

BQ Reprint # BQ93202

PHOTO (COLOR): A Computer Diskette

PHOTO (BLACK & WHITE): SID L. HUFF

~~~~~~~~

By SID L. HUFF

Sid L. Huff is the Hewlett Packard Professor of Information Technology Management at the Western Business School. He is the MIS Area Editor for the Canadian Journal of Administrative Sciences and a member of the Editorial Board of the Journal of Strategic Information Systems and of the Computer Personnel Research Journal. He studied at Queen's University and MIT.

Wednesday, October 21, 2009

Gale Supports Michigan Library Association Providing Bus Service to Rally Scheduled for September 10, 2009

Farmington Hills, Mich., August 27, 2009 – Gale, part of Cengage Learning, will support the Michigan Library Association’s September 10th rally to oppose state budget cuts to library funding. The rally, occurring on the steps of the state capitol, will encourage legislators in the House of Representatives to vote against library funding cuts proposed in the Senate-passed budget.


“The Senate version of the state budget calls for slashing library funding to $7.5 million even though state law calls for it to be funded at approximately $15.4 million,” explained Gretchen Couraud, executive director, Michigan Library Association. “If these budget cuts are enacted, the entire statewide resource sharing and interlibrary loan system could collapse, eliminating the savings realized through group purchasing. This will put greater pressure on local communities to fund resources, or citizens will lose this information altogether. We are proud to have Gale joining us in this fight to keep these important services alive and invite librarians and library supporters across the state to take action.”

“With our headquarters located in Michigan, we see first-hand how our state is experiencing some of the highest unemployment in the nation,” said John Barnes, Gale’s executive vice president of strategic marketing and business development. “This is not the time to cut vital library services as more Michigan residents than ever are relying on the resources of their libraries for job-searching, skills development and life resources.

“Libraries across the state report increases in library use, a testament to how our libraries are a vital part of our state’s transitions to a knowledge-based economy dependent on technology and access to information,” continued Barnes. “We urge the legislature to protect library funding and continue to provide free access to valuable resources.”

Gale will provide bus service for more than 250 people from two Michigan locations: its Farmington Hills headquarters and Otsego County Library in Gaylord. Those interested in using the free bus service should register in advance at www.gale.com/michlibraries.

“Gale opposes the proposed budget cuts that threaten essential library resources, and we encourage librarians and library advocates alike to take advantage of the free Gale bus service and join us in Lansing for the September 10th rally,” said Barnes.

The Library of Michigan brings libraries together in many ways with statewide group purchasing and resource sharing, saving millions of dollars and benefiting all Michigan residents. Additionally, the Library of Michigan is responsible for the Michigan Electronic Library (MeL.org), which provides online resources to students, faculty, small businesses, job seekers and all other Michigan residents. Included in the free services are iMeL Tests and Tutorials providing online GED, police, fire and nursing certification tests and more. MeLCat, a statewide interlibrary loan system, allows residents free access to a wealth of resources from their homes, offices or libraries at tremendous cost savings. MeLCat also enables libraries to share books, CDs, DVDs and more from other communities.

More information about the proposed cuts and their impact is available at www.milibrariesforthefuture.com/.

For more information, please contact Linda Busse at linda.busse@cengage.com.

name : Dany Anggoro Kusumo

Scientists advance knowledge gained from empirical and modeled data and observations. It follows that scientists who do not publish or release their data are compromising scientific development and, arguably, leaving their work unfinished. Considering that science is based on observations, it is astonishing that the publication of primary data is not a universal and mandatory part of science. The reasonable expectation of society that science will make data available for further research -- especially if the research that produced the data was publicly funded -- is supported by a wide range of international and national policies, and in principle by the science community and publishers. If data are not made publicly available or lodged in a permanent archive like any unpublished research, they are likely to be lost over time (Heidorn 2008).

The availability of data from local to global scales is critical for dealing with current issues affecting society, such as climate change, public health, and biodiversity loss. Society expects that scientists will make their data available because most data are paid for directly (i.e., government funded) or indirectly (e.g., university salaries) by public funds, or are collected for the public good (e.g., public health, product safety, environmental monitoring data). This interest of society is demonstrated by the Guardian newspaper campaign for the release of government-funded geographic data in Britain (www.guardian.co.uk/technology/freeourdata and www.freeourdata.org.uk/blog) and by the emergence of organizations such as the Open Knowledge Foundation (www.okfn.org).

Many international, intergovernmental, and funding agencies have policies calling on member countries or grant recipients to make data available (Edwards et al. 2000, Arzberger et al. 2004a, 2004b, Costello et al. 2008); among these agencies are the Organization for Economic Cooperation and Development, the International Council for Science (ICSU), the Intergovernmental Oceanographic Commission (IOC) of the United Nations Educational, Scientific and Cultural Organization, the Global Biodiversity Information Facility (GBIF), the European Research Council, UK Research Councils, and the US National Science Foundation and National Institutes of Health, and even some treaties (such as the Antarctic Treaty System; www.scar.org/treaty). Some journals, including Science and Nature, explicitly expect data to be made publicly accessible, and they list suitable repositories for certain types of data. The Association of Learned and Professional Society Publishers and the International Association of Scientific, Technical and Medical Publishers (2006) recommend public access to data that support publications.

A comparison of national policies regarding the availability of government data showed that open access conferred significant economic benefits by stimulating entrepreneurial use of the data by commercial companies (McMahon 1996, Weiss 2002). In contrast, restrictive data-release policies and fees for data use (which provide negligible financial return) discouraged innovation and development of data products. However, despite the recognized societal benefits, most primary data remain unavailable. For example, a recent review of national ocean data centers, part of a 30-year-old network established by the IOC, found that the centers generally had less than half the data they should have for each country, and many countries still lack such data centers (Kohnke et al. 2005). More than 70 percent of the organizations publishing data through GBIF and the Ocean Biogeographic Information System (OBIS) are from government organizations (including museums), and less than 20 percent are from universities and individual scientists, which may reflect the greater influence of government and international policies on the former group of organizations. The policies and calls for data sharing have not been sufficient to make data sharing the normal practice throughout science.

More than 70 countries and 50 other organizations make up the Group on Earth Observations (GEO; www.earthobservations.org), which aims to establish a Global Earth Observation System of Systems (GEOSS) by 2015. The GEOSS will cover all observation data, from climate to biodiversity, including those recorded by satellites, buoys, in situ sampling, and observations; GEOSS data will be available and integrated through a common portal (www.geoportal.org). However, this system will be successful only if all data are readily available: historic data will have to be digitized and the world scientific community will have to contribute data through common standards, protocols, and open-access agreements (Scholes et al. 2008). Although it has recognized this problem, GEO has not yet proposed a solution.

When new species, proteins, gene sequences, microarrays, cell lines, and bacterial strains are described, the scientific community expects that type specimens will be deposited in suitable collections (e.g., museums, herbaria) and molecular data will be deposited in specialist data centers (e.g., Protein Data Bank in the United States, Cambridge Crystallography Data Centre in the United Kingdom, GenBank), and most journal editors make such action a prerequisite for publication of a print paper. Howe and colleagues (2008) list 21 molecular biology databases on model organisms and 3 additional ones with data on numerous species. GenBank comprises mirror sites at the European Molecular Biology Laboratory, the National Center for Biotechnology Information in the United States, and the DNA Databank of Japan. Each database is government funded and all the data are freely accessible online. Thus, making data publicly available is already part of the culture in some sciences, such as physics (e.g., the arXiv.org preprint series), astronomy, climatology, and molecular biology (RIN 2008). However, even in such well-established fields as bioinformatics, in which one can get a degree and become a "biocurator," incentives such as improving the citability of contributions have been called for (Howe et al. 2008).

Very large databases are curated by professional data managers, because the highly standardized data in them (collected automatically by sensors on satellites, buoys, or other platforms) demand it (Heidorn 2008). A similar amount of more diverse data spread through many small data sets and individual scientists is not being professionally curated (Heidorn 2008), yet the size of a data set is not necessarily an indicator of the data's value to science now or in the future. If some of these small data sets could be standardized, they could be published through facilities such as GenBank and GBIF. The development of more standards for publication of different data types is thus to be encouraged.

Although governments, funding agencies, and the scientific community appreciate the benefit of making data publicly available, individual scientists may not find the benefits quite as evident. This is because individual scientists' concerns about making data openly available and introducing measures to motivate online publication have not been addressed (Klump et al. 2006, Parr 2006, Blagoderov et al. 2008, Heidorn 2008, RIN 2008). The main obstacle to making more primary scientific data available is not policy or money but misunderstandings and inertia within parts of the scientific community. In this article, I seek to answer the responses I have heard repeatedly from scientists when asked why they do not publish their data online. Their reservations must be addressed to change scientists' behavior from data hoarding (and occasional data sharing) to online data publication.

Some benefits of data publication

Online data publication will boost scientists' recognition, generate invitations to meetings, present consulting and collaboration opportunities, and increase citation rates because their productivity will be more visible (box 1; Froese et al. 2004, Eysenbach 2006, RIN 2008). Compared with publishing data in print media or archiving it in libraries, publishing data online is less expensive and it exposes the author's work to a far wider audience. Making data available online maximizes the potential return on the investment in research, and those data can be repatriated to the countries from which they may have been collected by foreign scientists. The cost of saving and reusing data published online is also likely to be lower than the cost of collecting them again (Heidorn 2008). Without the ability to reanalyze the original data from which a scientific conclusion was reached, the conclusion cannot be independently tested (Cassey and Blackburn 2006), and some data cannot be replicated because of unique combinations of environmental conditions (Heidorn 2008). Furthermore, making data availability mandatory may help discourage or expose scientific misconduct (Klump et al. 2006). Concern over the modification of images published in science journals has led to recommendations that the primary data and images be made available (Couzin 2006). Such calls would be unnecessary if primary data, whether alphanumeric, sound, or images, were automatically available on the Internet by the time of print publication.

Data publication can also bring benefits at a corporate level. If an organization is required to provide data to the public upon request, making data publication a routine practice can eliminate the tedium of attending to individual data requests piecemeal. Efforts to disseminate data sets through license agreements can also be time consuming, and because user needs vary, it can be difficult to standardize these agreements online without raising questions about liability should the data be incorrectly used (Freeman et al. 1998). Instead of licenses, "publication" is simpler conceptually and practically, and responsibility for use of the data more clearly lies with the reader.

In contrast to interpretations and opinions derived from data, the value of primary environmental and ecological data grows in time as they become harder to replace. Such data are inevitably a sample of what could be collected at different spatial scales and over time. Comparing new data with other data collected in the same or different places and times may reveal previously unknown patterns over larger areas and timescales. This immediate added value can be further multiplied by the opportunities provided for unforeseen uses and benefits, as found for genomic and proteomic data (Smalheiser 2002).

Why more data are not publicly available

In box 2 are a dozen reasons scientists gave me for not making their data publicly available online. They have been compiled from numerous meetings with researchers over the past decade. Although these statements do not constitute a quantitative survey of the community, they are considered representative. Indeed, some of these reasons were also reported in a survey of ocean data centers (Kohnke et al. 2005), and all arose in a survey of UK researchers (RIN 2008) published while this article was under review. The relative frequency or importance of the reasons is not considered, because they have a common solution -- namely, to follow the practice of publication rather than data sharing.

There may be valid reasons for not publishing in any form, such as significant errors in the data, protection of individual privacy with medical or survey data, threats from overexploitation of species or resources, national security concerns, or matters subject to legal action. However, these concerns can be overcome by delaying publication for appropriate periods (Glover et al. 2006), generalizing the data in some way (e.g., giving only a region for the location of a rare species), or not publishing all of the data (e.g., excluding data allowing personal identification).
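
Two of these remedies, generalizing a location and withholding identifying fields, can be sketched in Python. This is a minimal illustration under stated assumptions: the function name `generalize_record`, the field names, the one-degree grid and the sample record are all invented for the example and are not taken from the article or from any data standard.

```python
import math


def generalize_record(record, cell_degrees=1.0, private_fields=("collector_name",)):
    """Return a copy that is safer to publish: coarse location, no personal fields."""
    # Leave out any field that could identify a person.
    public = {key: value for key, value in record.items() if key not in private_fields}
    # Snap coordinates to the south-west corner of a grid cell,
    # so that only a region is disclosed rather than the exact site.
    public["latitude"] = math.floor(record["latitude"] / cell_degrees) * cell_degrees
    public["longitude"] = math.floor(record["longitude"] / cell_degrees) * cell_degrees
    return public


# Made-up sighting of a rare species, for illustration only.
sighting = {
    "species": "rare orchid",
    "latitude": -36.8485,
    "longitude": 174.7633,
    "collector_name": "J. Doe",
}

published = generalize_record(sighting)
print(published)
```

The original record is left untouched, so the precise data can still be archived privately and released later if the reason for withholding them lapses.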

Too often, scientists release or make data available with conditions that restrict their use or distribution, and thereby create obstacles to their use. Such conditions may be the requirement that the data not be used without the author's permission, that they not be used for commercial purposes, and that any use requires coauthorship on any publications that arise from the data. The same scientists make no such conditions when they publish in print media, and they usually sign away copyright to the publisher and pay the publisher page charges for this service. In contrast, organizations involved in online data publication let copyright remain with the data providers, and to date they have not charged publication fees. The quasi-release of data by attaching conditions to their use is unnecessarily cumbersome, contrary to the scientific publication process, a disincentive to others to explore their potential, and often impractical to enforce.

The term "commercial" is rarely defined and is subject to different interpretations. A developing country may argue that any knowledge gained by scientists in a developed country profits their "knowledge economy" and may result in direct or indirect commercial benefits. A scientist may profit personally by gaining professional promotion or obtaining research funding as a consequence of a paper published from the data. Furthermore, if a person or organization should use the data to produce new knowledge or products that can be sold, they should be compensated for creating added value. It may be difficult to distinguish what is commercial on the basis of a scientist's employer. Some research institutes are legally commercial companies (e.g., the National Institute of Water and Atmospheric Research in New Zealand), and government agencies and universities often do contract work for commercial companies.

As is the case for print media, there should be no discrimination as to who should have access to data published online. In turn, the requirement for data publication should apply to all instances in which the data served as the basis for published papers, regardless of who funded or conducted the study. For example, if a company wishes to publish a paper with graphs and statistics demonstrating the safety and efficacy of its new method or product, it should also be required to publish the data on which the results were founded.

How to motivate online data publication

The primary motivations for individual scientists to publish in print are to demonstrate their contribution to science, and the consequent peer-recognition that influences one's reputation and employment opportunities, promotion at work, and ability to win further research funding. Other factors may also exist, such as personal satisfaction in completing a study and enthusiasm about communicating findings and opinions to society. These motivating factors should also be brought to bear on data publication.

One common metric of peer-recognition is citation of papers. Citation also shows who is responsible for the information cited and provides its authority, a key aspect of quality assessment. There is a concern that data sets will not be cited in the same way that print publications should be when they are the source of information. This concern is justified, as most online databases do not provide a citation for each data set in a manner similar to that of print media, and data users tend to cite the Web site URL (Uniform Resource Locator) where the data set is found rather than the actual data set and its authors or editors, regardless of whether this information is available. Such incorrect citation is equivalent to authors' citing a journal rather than the papers published in that journal.

There is a precedent for this failure to cite the original source. The publications that describe new species are rarely cited when the species are mentioned in subsequent studies. Indeed, even the practice of citing identification guides and sources of species nomenclature in scientific publications seems to have waned (Agnarsson and Kuntner 2007). If they were cited, taxonomic papers, revisions, and identification guides would be among the most highly cited publications, and they would have very long citation lives (Minelli 2003). To better recognize the contributions of taxonomists to science, different metrics are required, such as how often a species name is used both overall and in particular fields of study, such as agriculture or genetics.

Data are diverse in origin and format. They may (a) be biological, chemical, or physical; (b) constitute environmental or physiological measurements by instruments, experimental results, or observations of species, animal behavior, and phenomena; (c) be derived or modeled from primary data; and (d) take the form of numerical, text, sound, or image files. Their value may be in being a reference or baseline, or in their potential for combination with other data to create new data sets (RIN 2008).

Data linked to species names can be published in GBIF, OBIS, Scratchpads (Roberts et al. 2008), and related systems that integrate standardized species data (e.g., Mayo et al. 2008). Physiochemical ocean data (including primary and model data) can be archived in IOC's network of ocean data centers, which increasingly make these data available online. The ICSU World Data Centers can accept a wide range of biological and environmental data. The old, pre-Internet model of data centers as archives of data is changing to one that provides an editorial service of quality control (which adds value) and online data publication. Users should have the opportunity to examine the original data, and to easily combine the data with other data.

Tracking data

The increased interoperability and linking between online resources can mean that data may be visible from several locations. The original source of data should be the basis for citation. To facilitate citation, data centers should track data access using automated tools and should display the results on their Web site (Costello and Vanden Berghe 2006, Blagoderov et al. 2008, Heidorn 2008, Roberts et al. 2008); that is, an index should be maintained that tracks data viewed, searched, downloaded, linked to, or cited. Thus, providers can refer to the Web site to see how often their data set was accessed. Authors of publications that use such data should cite the data sets in their list of references, as they do for print media.
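The access index described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class name AccessIndex, the data set identifier "ds-001", and the event labels are hypothetical and do not describe how any particular data center implements tracking.

```python
from collections import Counter, defaultdict

# Event types taken from the text: data viewed, searched, downloaded,
# linked to, or cited. The short labels themselves are assumptions.
EVENT_TYPES = {"viewed", "searched", "downloaded", "linked", "cited"}


class AccessIndex:
    """Hypothetical per-data-set counter of access events."""

    def __init__(self):
        # Maps a data set identifier to a Counter of event types.
        self._counts = defaultdict(Counter)

    def record(self, dataset_id, event):
        """Register one access event for a data set."""
        if event not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {event!r}")
        self._counts[dataset_id][event] += 1

    def summary(self, dataset_id):
        """Return counts for every event type, including zeros."""
        counts = self._counts.get(dataset_id, Counter())
        return {event: counts[event] for event in sorted(EVENT_TYPES)}


index = AccessIndex()
index.record("ds-001", "viewed")
index.record("ds-001", "viewed")
index.record("ds-001", "downloaded")
report = index.summary("ds-001")
```

A provider could consult such a summary to see how often a data set was accessed. A real service would also need persistence and some way to filter out automated traffic, both of which are outside this sketch.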

There are several methods to track the origin of data. The unique ISBN (International Standard Book Number) and ISSN (International Standard Serial Number) assigned to a printed publication can be used to track and locate the product in bookshops and libraries. However, ISBNs and ISSNs are not assigned to individual articles within a journal. Because the URLs used for Web addresses change over time, registration systems for unique and persistent identifiers of items published on the Internet are being developed (Beit-Arie et al. 2001). A centralized registry now provides and administers a unique identifier for geoscience samples, the International Geo Sample Number, or IGSN (www.geosamples.org). The Handle System (www.handle.net) codes resources, whether journal articles or metadata, so that if their location changes, users can use these codes to find the items at the new URL. A development from the Handle System is the Digital Object Identifier (DOI), which is now widely used by journals and abstracting services to identify papers and their appendices published online; DOIs link to a full citation (i.e., author, title, etc.), and although the DOI is unique to the publication, more than one DOI representing the same item or object may arise (e.g., as would happen if different indexing services assign DOIs to the same publication). The PANGAEA information system at the World Data Center in Germany uses DOIs for primary data sets (Klump et al. 2006); corrected or updated versions of a data set receive a new DOI.
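The idea behind persistent identifiers, that a fixed code continues to resolve even when the item moves to a new URL, can be illustrated with a toy registry. This is a minimal sketch, not the actual protocol of the Handle System or the DOI system; the class IdentifierRegistry, the identifier "10.9999/example.1", and the example.org URLs are all hypothetical.

```python
class IdentifierRegistry:
    """Toy registry mapping a persistent identifier to a current URL."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, url):
        """Assign an identifier once; reusing an identifier is an error."""
        if identifier in self._locations:
            raise ValueError(f"identifier already registered: {identifier}")
        self._locations[identifier] = url

    def relocate(self, identifier, new_url):
        """Record that the item has moved; the identifier does not change."""
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_url

    def resolve(self, identifier):
        """Return the current URL for the identifier."""
        return self._locations[identifier]


registry = IdentifierRegistry()
registry.register("10.9999/example.1", "http://old.example.org/data/1")
registry.relocate("10.9999/example.1", "http://new.example.org/datasets/1")
current_url = registry.resolve("10.9999/example.1")
```

A citation that quotes the identifier rather than the URL stays valid after the move, because only the registry entry changes.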

Automated methods to assign globally unique Life Science Identifiers (LSIDs) have been demonstrated for species names (Page 2006). Resolvable LSIDs for tracking species names have been implemented for the Catalogue of Life, Index Fungorum, and ZooBank using ontology standards developed by the Taxonomic Data Working Group. The LSIDs could, in principle, be used for data sets (Orme et al. 2008), but some organization would have to assume responsibility for creating and maintaining the registration system to ensure automated resolution of the identifying numbers, or the community could adopt one of the existing systems, such as the DOI.

For online published data to be cited and abstracted as scientific print papers are, the data set would need to clearly display the following information: author or editor, author's address, the data set's informative and unique title, abstract, keywords, a list of publications related to the data (e.g., publications describing methods or analyses derived from the data), and the name of the online publication Web site (Testa 2004). The data publisher should demonstrate scientific editorial standards, including transparency of the editorial process, names and addresses of editorial board members, quality control procedures, a peer-review system, and a list of data sets published and details about them; the online data publication should be open to international contributions (i.e., it should not be an in-house publication). The publisher should archive the data publication indefinitely at a publicly accessible location, such that future researchers can access the data that were used by others. Online data publications can conform to most of the typical publication standards for print journals, but there are important differences. Notably, in contrast to print papers, a data set published online may be corrected or enlarged over time and thus have several versions, and its size is better measured in data units or bytes than in pages. The more dynamic nature of electronic publications allows them to improve in quality and quantity over time.
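To make the citation requirement concrete, the sketch below builds a print-style citation string from a small data set record. The field selection is an assumption: authors, title, and publisher come from the list above, the version field reflects the point that data sets may have several versions, and the year and identifier fields are added for illustration. The citation layout and all example values are hypothetical, not a standard endorsed by the article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataSetRecord:
    """Hypothetical minimal description of a published data set."""
    authors: List[str]
    year: int
    title: str
    version: str
    publisher: str   # name of the online publication Web site
    identifier: str  # e.g., a DOI or other persistent identifier


def format_citation(record: DataSetRecord) -> str:
    """Format the record as a single citation line."""
    authors = ", ".join(record.authors)
    return (
        f"{authors} ({record.year}). {record.title}, "
        f"version {record.version}. {record.publisher}. {record.identifier}"
    )


record = DataSetRecord(
    authors=["Doe J", "Roe R"],
    year=2009,
    title="Example reef survey counts",
    version="2",
    publisher="Example Data Center",
    identifier="doi:10.9999/example.2",
)
citation = format_citation(record)
```

Including the version in the citation line lets a reader tell which state of a corrected or enlarged data set was actually used.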

The future for online data publication

Printing machines were invented more than 500 years ago. Anyone with the means could print anything they wished. In time, editorial and peer-review systems for scholarly publications came into being and quality improved. Similarly, but within the past 20 years, the Internet has allowed many people to publish whatever they wish on the World Wide Web. Editorial and peer-review systems are now evolving, and they will set a quality mark for online publications. Already, most print-based science journals publish online. Scholarly online data publication should include editorial oversight, standard formats and vocabularies, quality control checks, the ability to correct data found to be in error, quality indicators, and peer review (before or after publication). As with print media, the online data publication process must ensure that data survive and are accessible, that their integrity is maintained, and, critically, that they are citable.

Increasingly, environmental data collected by instruments on, for example, monitoring stations, satellites, buoys, and research ships can be immediately and automatically uploaded to a data center (e.g., Glover et al. 2006; see also National Ecological Observatory Network, www.neoninc.org/about-neon/overview.html). This ensures that the data are backed up, timely, and ready for use immediately (where appropriate, as with weather data) or for release after a certain period. Thus, where possible, the automated publication of data immediately upon collection is to be encouraged.

Journals that specialize in data publication are emerging, such as Acta Crystallographica E in chemistry, Data Briefs of the electronic earth science journals Geochemistry, Geophysics, Geosystems (G3) and Earth System Science Data, and Ecological Archives in ecology. Nonnumerical data, such as text and images describing species, can be published online using Scratchpads (http://scratchpads.eu/; Roberts et al. 2008). Ideally, as in these examples, data should be open access and in standard formats if these exist for the type of data published. Such journals publish data sets with a citation, abstract, and associated information, as papers are published. This information gives clear credit to the data creators and makes it possible to search for the data sets through bibliographic databases. Because users with this information are likely to cite the online data sets just as they do print papers, the data sets will enter the system of citation statistics. There is no reason in principle that data centers could not similarly provide conventional citations. Indeed, Scratchpads and OBIS do so, and GBIF is considering it.

Copyright issues are less likely to compromise data publication than they are in the print media, because facts, names, and short statements are not copyrightable, although some names and phrases may be trademarked. Thus, information is routinely extracted from the literature without infringing copyright, and may then be compiled into databases through manual or automated means. For example, descriptions of species are not "literary and artistic works" in the sense of copyright legislation, because they are formulated in a standardized language along standardized criteria. They can therefore be excerpted without infringing copyright and republished (Agosti and Egloff 2009). They may then be reaggregated into databases to provide guides to species identification and facilitate online taxonomic collaboration (Mayo et al. 2008).

Data centers usually add significant value to data sets through quality control procedures, ensuring adequate metadata, aggregating data from different sources, and providing online tools to explore, visualize (e.g., maps, graphs), and download the data in formats suitable for further research. Libraries may also archive data in print (and perhaps electronic) form, and some institutions now provide archival services for data. However, data deposited in libraries or institutional archives (or repositories) and published as appendices to journal articles do not get the same editorial quality control and peer-review attention as either journal articles or data lodged in special data centers. In other words, data centers can provide quality control as publishers do for print media and archiving as provided by libraries, and they add value through data integration, indexing, exploration, and visualization services. Preferably, data holders will publish not on their own Web site, where long-term maintenance can be an issue, but in international specialized data centers (e.g., GenBank, GBIF) that will maximize data availability and give it added value. This is the policy of the American Geophysical Union for its journals, and journals such as Proceedings of the Royal Society and Nature. The latter requires data to be sent to the journal for publication; data "cannot be hosted solely on the authors' own websites."

When data sets are published, they may be described using a standard set of information fields such as the "Dublin Core" metadata (and by an extension of it called "Darwin Core" if it includes biological species information). Increasingly, authors are required to enter their names, contact details, keywords, and abstracts into Web-based forms when submitting papers for publication. One can envisage this metadata being extended to provide standard descriptions of online data sets and key terms (e.g., name of a species newly described), which can be forwarded to abstracting services and other databases. This metadata is invaluable for allowing people to discover data sets that may be useful to them, but the metadata may not be sufficient to enable them to use the data. Procedures for publication of "use metadata" were recently described by publishers of geochemical journals at a meeting of Editors Roundtable on 16 July 2008 at the Goldschmidt Conference in Vancouver, Canada.
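As an illustration of describing a data set with standard fields, the sketch below writes a record using element names from the Dublin Core element set and its usual XML namespace. The function name, the "record" wrapper element, and the example values are assumptions made for this example. It does not cover the Darwin Core extension or the "use metadata" discussed above.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements of the Dublin Core element set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def to_dublin_core_xml(fields):
    """Serialize a dict of Dublin Core fields as a small XML record."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # wrapper element name is an assumption
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name!r}")
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


xml_record = to_dublin_core_xml({
    "title": "Example reef survey counts",
    "creator": "Doe J",
    "subject": "coral reefs",
    "identifier": "doi:10.9999/example.2",
})
```

Rejecting unknown field names keeps records comparable across publishers, which is the property abstracting services and other databases would rely on when harvesting them.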

Data should not only be published, they should be published in a way that facilitates integration with other data, that is, in a standard, atomized format on the World Wide Web. Although not all data are easily integrated with other data sets, such as laboratory experiments, the low cost of online publication means that these data can still be published in a nonintegrated way (e.g., as an online appendix that future integration services may use). Where suitable online publishers do not exist for data, authors may publish them in data centers and, less ideally, as online appendices. The latter are generally not as useful as data centers because they lack standards for file formats, data organization, and metadata (Santos et al. 2005).

New data integration services are emerging, such as for geological maps (www.onegeology.org). In addition to the physical and geochemical sciences, scientists with interests in evolutionary, ocean, and biodiversity data have initiatives under way to further the publication of data of interest to them: the (a) National Evolutionary Synthesis Center in the United States, (b) Scientific Committee on Ocean Research and IOC's International Oceanographic Data and Information Exchange (Costello et al. 2008), and (c) GBIF, respectively.

As is the case in print media, researchers and journal editors need to judge which data merit publication. These decisions could be guided by criteria such as whether specialist publishers exist for the data in question, whether others have published similar data, and whether the data are needed to enable independent reproduction of study findings.

The only valid reasons for scientists not to publish their data online are the same as for not publishing in print media: the data are of such poor quality that they could have no useful purpose, scientists lack the competence or time-management skills required to prepare data for publication, or publication is not a priority in the scientists' work or career. Thus, those who fail to publish data online should be viewed in a similar light as those who do not publish in print media. Withholding data after they have been analyzed and a study has already been published, with the intention of professionally profiting further, raises ethical concerns about whether the scientist is really motivated to advance science.

The next steps for data publication

The well-established and successful contemporary model of publishing scientific findings should be complemented by a system of data publication, ideally through data centers, in a way that enables the scientific creators to be credited and cited (figure 1). Greater accessibility and reuse of data will provide additional resources for research, and hence greater benefits to science and society. However, benefits to individual scientists will be fully realized only if the data are published formally and cited by users (box 1). The following actions are critical for full data availability:

* Before data collection, principal investigators must plan for data publication so the preparation of the data for publication is simplified and low cost.

* Scientists involved in the peer-review process should ask that, where appropriate, the data on which studies were based be publicly accessible (without preconditions) so they may be subject to independent analysis and their findings reproduced.

* Journal editors should require authors to publish their data online in standard formats, and, where available, through data centers that offer integration and archival services.

* Online data centers should publish clear, standard citations for data sets; track data-set access; and develop editorial processes to maximize data quality, data integration, accountability, visibility, and usability.

* Authors must cite online data sources as they would print publications.

* Citation services must include online data publications in their metrics.

* Employers of scientists must recognize the efforts of those who publish their data online as they do those who publish in print media, question why scientists have not published their data online, and include data publication as a measure of productivity and performance.

* Funders of research must (a) ensure that research proposals have a data management plan and an appropriate budget for data publication, (b) contractually require data publication upon completion of a project, and (c) withhold further funding from contractors who have failed to fulfill this requirement.

* Governments must financially support online data publication centers.

The main problem in data availability is not a lack of policy, technology, financial resources, or publication outlets, although data centers do need financial support (Merali and Giles 2005). Rather, it is that the science reward system has not kept pace with the new opportunities provided by the Internet, and does not sufficiently recognize online data publication. A change in science culture as a result of the Internet is under way (Kinne 1999, Costello and Vanden Berghe 2006), and we must adapt approaches to scholarly publication accordingly. A confluence of the availability of open-access online resources with the quality control systems that professional editorial processes bring may be the optimal way forward.

Crash course

sent by hanief fahmiana

The government is obsessed with creating databases, says Ross Clark, but its failure to use IT effectively will cost us billions

I have some native sympathy with the lackeys struggling to handle the Inland Revenue's computers which, like a berserk one-arm bandit, have just spewed out an excess £1.9 billion in tax credits. I am not sure I am the best-qualified person to expound on the inadequacy of government IT systems. My own computer bears the large indelible bootprint of the Clark school of systems technology. It was imprinted a fortnight ago when the machine crashed, erasing two years' worth of work, or at least sending it somewhere deep into the bowels of the hard drive where it could only be recovered by the kind of forensic nerds who do kiddie-porn investigations. It is fair to say that if I were put in charge of some government computer system, it would have found some way of transferring the nation's currency reserves to Botswana before I finally took to the thing with an axe.

But something tells me that the government's super-nerds ought to have a slightly better affinity with computers. And at present there is scant sign that they do. The fiasco of child tax credits is merely the latest in a long list of government computer failures. Last November the Department for Work and Pensions's computer system, built by American IT contractor EDS Systems and Microsoft, crashed during an attempt to upgrade it. A £50 million Capita IT system designed to handle the government's Individual Learning Accounts had to be scrapped in 2001 after it was discovered to be fatally open to fraud. There was the Passport Agency's computer breakdown in 1999, and the failure of the Home Office's notorious computer system for handling asylum applications, which led to a huge backlog of applications and the resulting wave of illegal immigration, in 2000-01. That is not to mention the Probation Office IT system, which was scrapped after £120 million had been spent, nor the NHS computer system, whose costs have already run to more than three times its original £6.2 billion estimate. And on it goes.

Admittedly, computer failures are hardly unknown in the private sector. An American study recently claimed that only a quarter of IT systems installed by businesses performed as they were supposed to. But government IT failures seem to be on a scale way beyond that which would be tolerated in the private sector. In the first 20 months of operation of the Child Support Agency's new £456 million computer system, four out of every five claimants were sent the wrong payments, which resulted in the resignation of Doug Smith, the head of the CSA, last November. If the computers of a high street bank had made errors in 80 per cent of customers' statements, it would be a bank no longer.

It isn't easy to find people in the IT business prepared to speak on the record about the government's inability to use computers effectively - perhaps not surprisingly, given their eagerness to secure a slice of the £2.3 billion which the public sector spent on IT contracting last year. However, one whose company has been involved in government work was prepared to have a blast at the public sector's bad planning off the record. 'It is possible to build computer systems on the scale of the government's systems and to have them delivered on time and on budget; Mastercard and Vodafone do it,' he says. 'But what it requires is good project management and no political interference while the system is being built. What tends to happen with IT systems in the public sector is that every minister wants to ride his own hobbyhorse. The project suffers mission creep, which leads to greater complexity as bits have to be "bolted on". This is going to be the problem with the ID card scheme: the government has failed to articulate its goals for the scheme; first it was about fighting terrorism, now it seems to be about identity fraud. By the time the scheme is up and running its scope will have been changed so many times that it is bound to function less well than it would have done had it been designed for its ultimate purpose at the outset. Even if the ID database fails in just 1 per cent of 1 per cent of cases, with 40 million people on the system it could still mean 4,000 people fitting the same description.'

Another IT contractor makes the point that the public sector condemns itself to lousy work by always accepting the lowest bid, regardless of quality. In spite of the public sector's obsession with national databases, it is failing to use IT in simpler, more proven ways which could cut administrative costs considerably. 'All our orders are now processed electronically,' says the head of one IT firm. 'By doing so it is possible to realise huge productivity gains without any increase in staff. Yet e-commerce isn't being used to anything like the same extent in government yet.'

Last year the National Audit Office (NAO) published a damning report on the Libra project, an IT system to link all 22,000 staff employed by magistrates' courts. The Lord Chancellor's department, found the NAO, failed to take any external advice when signing up for a system which was supposed to cost £146 million when commissioned in 1998. Moreover, civil servants handled negotiations with the contractor, ICL, about as feebly as an octogenarian faced with a cowboy plumber armed with a monkey wrench. Twice ICL came back and demanded more money; yet rather than telling ICL to get on with the job, the Lord Chancellor's department simply agreed to pay more. ICL never did finish the project, and in the end taxpayers paid £390 million for the system.

Coming from a government which has leaned over backwards to preach the virtues of the e-this and e-that, it is an abysmal record. Three years ago Tony Blair stated his dream of having every Briton online by 2005, to which end the government handed out free computers to the inhabitants of Liverpool. Showing their usual entrepreneurial talents, some of them promptly flogged their machines for £100 a time. That incident sums up the government's starry-eyed attitude to IT. But the shocking thing is that ministers never seem to learn. Having suffered a litany of computer failures, the last thing the public sector needs is another string of initiatives which would rely on even bigger and costlier IT systems. Yet what are the government's latest wheezes? An ID card scheme involving a database of biometric data on 40 million adults, and a national road-pricing scheme which would have to track the daily movements of 30 million vehicles.

It isn't hard to predict where these two schemes are heading: computer crashes, identity fraud, costs over-running by billions, Bob from Bognor surrounded by armed police at Dover because his eyes match those of bin Laden; motorists sent bills bearing no relation to where they have been. Already a study by the London School of Economics has denounced as wildly inaccurate the government's estimate that the ID card scheme will cost a mere £7 billion. The lowest it could possibly cost, say the academics, is £10.6 billion, and that is before the inevitable bugs and cost over-runs are taken into account. More likely it would cost more than £19 billion.

But at least somebody will get something out of it: the public sector's avid porn-viewers. After all, this is the one undoubted success of the government's drive on IT: to provide lurid entertainment for public servants during their lunch breaks and other quiet moments. According to a study by the Audit Commission, 2.3 million pages of porn were called up on the government's computers in just 10 months. Never mind your overdue tax credit, your delayed passport application or the spurious bill for driving through Stoke last Tuesday morning; Big Brother is too busy salivating in front of a screen full of tit and bum while he awaits the man from systems.

A Two-Way Bioinformatic Street

sent by ratna tiyani

The rapid emergence of Web-based bioinformatics systems reflects the research community's attempts to embrace the biological complexity uncovered by high-throughput genome, transcriptome, and proteome data acquisition and the sheer size of the modern scientific endeavor. If information systems can match this complexity, biology will be enriched as a result. If not, scientific excitement may paradoxically be dampened by data flow. The question is, how should biological information systems and the relationship between those who use them and contribute to them further evolve?

Before the advent of high-throughput research genres such as genomics and proteomics, fields already replete with information such as cell signaling (focused on uncovering the flow of information through a cell) advanced through scientists cross-communicating and assembling and synthesizing their own information. Because deciphering cell signal transduction is crucial to understanding normal and diseased biological processes, curating reliable data in the field has become at once a necessity and an enormous challenge, given the massive increase in available data. Cross-communication between the users and curators (also enlisted as experts, authorities, and gurus) of databases is now at the heart of enhancing data reliability. Efforts including the Connections Maps at Science's Signal Transduction Knowledge Environment (STKE) and pathway-building at Biocarta, Inc., exemplify Web-based databases that include an avenue for making the curator/user interface a two-way street. Enhancing curator/user exchanges might make visiting these environments a more lively and entertaining experience and increase their usage, large-scale participation being the sine qua non of usefulness to the scientific community.

A primary ingredient for massive exchange of information among multiple bioinformatics tools and databases is curator tagging of input information to enable proofreading and data correction. Minor changes in a protein or DNA sequence entered into a gene or protein database can be corrected and generally will not propagate error throughout the entire informational system. Bad information in a protein interaction or pathways database is trickier. If information gatherers skip a step (for example, entering interaction information based on one experimental approach before it is confirmed by another), the line between potential and actual information is blurred, and the data must be filtered for reliability to constrain legitimate signaling possibilities. Users should assert the primacy of stubborn experimental facts at all stages of signaling bioinformatics analysis, and curators must respond quickly to this input. At STKE, for example, information is encoded as either established or speculative, the latter to be deemed reliable or jettisoned in response to user input. Coupling a robust curator/user interface with the obligate entry of signaling data into a centralized repository upon publication, analogous to obligate submission of new DNA sequence information, is one way to combine greater intensity of curator/user interaction with increased database population, fostering greater data reliability. This might help both to accelerate the growth of cell signaling bioinformatics and to increase genuine open access to the knowledge derived from taxpayer-supported research.

Another critical element in developing cell signaling databases is providing access to the raw data for swapping among various software platforms for visualization and analysis of biological information, including cell signaling pathways. Molecular interaction data from the Biomolecular Interaction Network Database (BIND), for instance, can be exported to an assembly-based information software system such as Cytoscape, greatly enhancing the value of the underlying data set. The availability of curator-tagged input data wrapped for portability should promote efficient distribution of data entered at any port, into the entire network of signaling tools. It will also improve curation, avoid duplication of effort, and eliminate tools that lack content for application. The gurus should argue strongly for it.

Used intensively, a well-connected array of bioinformatic tools can form a computational "working memory" for assembling biological information from specialized organism, cell system, and molecular data that the scientist can access for designing new experiments that are maximally informative. Movement toward centralized electronic pathway submission and improved data portability will make it possible to integrate new sources of data, including cellular locations of signaling complexes and components, quantitative aspects of signaling, and pharmacological data, into current pathway analysis databases and tools. This should be a strong motivation for the scientific community to increase its collective investment in the next phase of signal transduction bioinformatics development.

Monday, October 19, 2009


name : Sukma Yuliardhiana
nim : 124.07.1049
In addition, First DataBank announced that it has partnered with PharmaSURVEYOR, a founding member of the Health 2.0 Accelerator (H2A), to launch a pilot project for direct online access to the company's proprietary drug information and standardized drug codes. The Drug Code Lookup Service (DCLS) will provide Web developers with easy online access to First DataBank's standardized drug codes, with the aim of promoting interoperability among Internet-based healthcare services and the portability of patient records across healthcare providers.

"We're excited to work with PharmaSURVEYOR and H2A to make reliable up-to-date drug information available and affordable to this emerging ecosystem of patient-centric healthcare technologies," said Don Nielsen, MD, president, First DataBank. "It is our intention that this pilot program will facilitate ease of information exchange so that ultimately the healthcare consumer can easily access the information they need to make informed health care decisions or improve the health care they receive."

Drug codes accessed through the DCLS are provided from First DataBank's National Drug Data File (NDDF(TM)) Plus, one of the healthcare industry's most widely used sources of up-to-date drug information utilized at the point of care. PharmaSURVEYOR will provide online access to NDDF Plus codes through Web services using a pay-per-use pricing model.

"The Drug Code Lookup Service couples the highest standards of drug data curation with the access, ease of use and limitless scalability of Web 2.0," said Erick Von Schweber, Executive co-chair of PharmaSURVEYOR, Inc.

The partnership between First DataBank and PharmaSURVEYOR also illustrates a key focus of the Health 2.0 Accelerator consortium: collaboration. H2A works with its participants to encourage collaboration around specific business or technical objectives to benefit consumers in ways that are challenging or inefficient for companies to tackle alone. For established health care companies, such as First DataBank, H2A provides an opportunity to collaborate with Health 2.0 companies and pursue innovative solutions for empowering patients to more actively engage in managing their health.

"We see this partnership as a potential model for how innovative Health 2.0 companies like PharmaSURVEYOR and seasoned health IT leaders like First DataBank can come together as H2A participants to make reliable data and powerful tools available to the community," said Julie Murchinson, executive director of the Health 2.0 Accelerator.

The DCLS pilot will be highlighted during the Health 2.0 Conference in San Francisco, on October 7, 2009, and will be made available by PharmaSURVEYOR to members of the Health 2.0 Accelerator. The pilot program is the first in a series of offerings to make drug data available through online services. Following the pilot program, general availability of the Drug Code Lookup Service is planned for early 2010.

About Health 2.0 Accelerator

H2A is a nonprofit organization whose goal is to advance consumer-centric health care by driving integration of technology and the consumer experience across a network of new and established technology companies and health care organizations. Through the development of a common understanding of consumer value and promoting trusted use of Health 2.0 technology solutions, Health 2.0 Accelerator aims to remove integration barriers that limit mutually beneficial collaborative business opportunities, while facilitating the integration of Health 2.0 technology solutions in "traditional" health care environments and with provider-centric technologies. For more information visit www.health2accelerator.org.

About PharmaSURVEYOR

PharmaSURVEYOR, the most advanced drug safety utility, goes far beyond drug interaction checkers, is free to consumers and is also available as an online service for integration with healthcare and medical websites, applications and record systems. Powered by combinatorial risk assessment and resolution, consumers, clinicians, providers and payers can now identify, diagnose and resolve adverse drug effects responsible for 1 in 10 American fatalities at a cost of $177 billion annually. For more information visit www.pharmasurveyor.com.

About First DataBank

First DataBank, a subsidiary of Hearst Corporation, drives patient safety and healthcare quality by providing drug databases that are used within information systems that touch every aspect of healthcare. For 30 years, we have partnered with system developers to integrate and optimize our drug information to improve user workflow and enhance clinical decision making by those entrusted with treating patients at the point-of-need. For more information about First DataBank, call 800-633-3453 or visit www.firstdatabank.com.

name : I Putu Adi Pratama Yuda
nim : 124.07.9021
Adaptive and science-based management is widely accepted as necessary to safeguard wildlife and their habitats in the future (Bookhout 1994, Walters 2001). However, many decisions in this management field are still based on unsupported ideas, soft concepts, qualitative evidence, beliefs, and even myths and folklore that lack a true back-up assessment, validation with real data, and quantitative analysis available for a public review. Often, data for such decisions are missing, do not provide clear support towards the decision, or even show results opposite to what the public assumes to be the case (Morris and Doak 2002, Anderson et al. 2003). A widely accepted excuse for such situations is that "we have no better data," that "we cannot do any better with the current funding and technology," or that "science did its best job under the circumstances." There are many examples where the lack of data can be used strategically to delay decisions, which is not conducive to efficient wildlife management (Tear et al. 1995, Ehrlich and Ehrlich 1996).

Decisions based on soft foundations can harm wildlife and habitat and threaten their future survival (Lichatowich 1999, Noss 2000, Beissinger and McCullough 2002). To assure quality and minimum standards in times of certification, I suggest a new management approach that is now possible, if not imperative, with improved computer and information technology and with increasing access to available and free data over the internet. This new management approach is partly a consequence of the effective implementation of the U.S. Freedom of Information Act (5 U.S.C. § 552) and the National Spatial Data Infrastructure (Organic Act of March 3, 1879, 43 U.S.C. 31; Executive Order 12906, April 13, 1994) inaugurated by President Clinton in 1994 (see also Czech and Krausman 2001 for decisions regarding endangered wildlife species). The approach I suggest is in line with the ISO certification processes that are going on worldwide and are believed to standardize and improve the output of the workplace.

The proposed concept and implementation of this new management scheme are relatively simple and follow common sense (see Eisner et al. 1995). The scheme also presents great promise and true progress for wildlife biology, a discipline that has become somewhat stuck with the public perception of being conservative, politically driven, somewhat opaque, and suffering from technophobia.

NEW MANAGEMENT PARADIGM

The new management paradigm I propose is founded on scientific databases.

First, to justify management decisions relating to wildlife, habitat, and conservation, the listing, use, and full investigation of all available and relevant databases need to be implemented. Second, all the situations and wildlife and conservation topics where no or insufficient data exist need to be pointed out boldly and improved immediately. In the meantime, the current lack of specific information must not be used to hinder wildlife management and conservation efforts. If data are missing, the simple knowledge of their non-existence needs to further facilitate funding of specific research to fill such gaps. I suggest that such cases receive prioritized funding attention. The results and products from this funding will then be made available to the scientific and management communities and the general public. Managers are sometimes forced to make quick decisions without adequate information (Morris and Doak 2002), but they must not be penalized for doing so, even if they are proved wrong when the research and data analysis are eventually completed. Otherwise, a manager who has to make a decision in the wildlife and habitat discipline might be reluctant to push hard for the research to be done and the data collected, out of fear that the new research and data would not support the former decision.

Perhaps the problem outlined above could be circumvented by implementing a strategy that is similar to ISO standards. Each major wildlife management decision would add a standard management documentation system as a support-backup that will clearly describe what data and research were available at the time of the decision. The documentation system will also recommend what future research needs to address to fill the data gaps and reduce the data uncertainty. For instance, a self-rating system for decisions might be included. Based on a scale of 1-5, a 1 would mean the decision was based on no research data at all, and 5 would mean that complete research and data were available and acted upon. The rating system would also include an explanation of why the decision was made if it was based on incomplete research and data (Smallwood and Morrison 1999).
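The documentation record described above could be sketched as a small data structure. The following Python sketch is only an illustration: the class name `DecisionRecord`, its field names, and the rule that any rating below 5 counts as "incomplete" and therefore requires a written explanation are assumptions made here, not part of the proposal itself.

```python
from dataclasses import dataclass


@dataclass
class DecisionRecord:
    """Hypothetical documentation record attached to one management decision."""
    decision: str
    data_sources: list[str]    # data and research available at decision time
    research_needs: list[str]  # gaps that future research should address
    rating: int                # 1 = no research data at all, 5 = complete data
    rationale: str = ""        # why the decision was made on incomplete data

    def __post_init__(self) -> None:
        if not 1 <= self.rating <= 5:
            raise ValueError("rating must be on the 1-5 scale")
        # Assumption: any rating below 5 is treated as incomplete data
        # and must come with an explanation.
        if self.rating < 5 and not self.rationale.strip():
            raise ValueError("a rating below 5 requires an explanation")


# Usage: a decision made on partial data, with its rationale recorded.
record = DecisionRecord(
    decision="Close area to harvest",
    data_sources=["aerial survey"],
    research_needs=["population trend estimate"],
    rating=3,
    rationale="Decision was needed before the trend analysis was finished",
)
```

Rejecting a low rating that lacks a rationale at construction time is one way to make the explanation mandatory rather than optional; a real system might instead flag such records for review.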

In addition, the underlying raw data of major management decisions will then be presented to the public over the internet to ensure transparency. These internet sites will further link to and be authorized by the decision-makers. Drawing on existing and established policies, these data will be peer-reviewed to assure minimum quality standards (i.e., in regards to a valid research design and analysis for drawing sound inference from data). A peer-review system for approved internet data has not yet been established, and will represent an additional work requirement for the peers involved, in addition to reviewing loads that often are already heavy. However, technological advances and changes require new approaches, new job descriptions, and new visions. The wildlife profession needs to embrace all digital opportunities and responsibilities.

The publication of authoritative digital data sets, instead of traditional and difficult to use hardcopy text publications of data interpreted by experts, is urgently needed for the betterment of efficient wildlife management. The sole use of expert opinion has been widely criticized by an increasingly critical public (Beazley and Boardman 2001, Lomborg 2001). The wildlife discipline will be flooded with data (for applications see Smallwood and Morrison 1999, Huettmann 2004). Therefore, a reviewing system of quality assurance for wildlife and habitat-related data sets needs to be established, preferably with appropriate professional rewards for reviewers.

With the raw data for the management decision will come its interpreted version as a peer-reviewed publication that outlines all steps in detail and in a transparent fashion to derive conclusions. Focusing on the detailed presentation of the methods becomes increasingly important for such publications and data sets because technological and statistical details are becoming more and more sophisticated and complex. Thus, they are more prone to error; for instance, if one methods step was carried out wrongly or done in an unjustified fashion, the wrong results could be obtained. For traditional reviewers, journals require that papers contain sufficient information to allow the reader to fully replicate the experiment (Romesburg 1981). It is often the case that this requirement appears to be met in current publications and datasets, but critical information often is lacking or obscured (Riisgard 2000, Anderson et al. 2003). This may be due to lack of journal space, to concern about intellectual property protection, or to a lack of review or editorial attention to and awareness of the problem. Such issues need to become transparent.

IMPLEMENTATION OF THE NEW PARADIGM

Wider legal frameworks exist in support of the implementation of the paradigm (IUCN 1991; UNCED 1992 in Strong 2000). Data, raw and interpreted, can easily be made available to managers and the public over the internet or by using other methods (e.g., in-house servers or digital media). In this way, data could become directly incorporated into management decisions (i.e., in a legal or public context). Availability through authors' personal websites is a stop-gap measure that provides neither a long-term storage guarantee nor adequate search capabilities. Web-based journal archive sites and other long-term solutions are preferable to assure data availability for a long time.

In times of mixed project funding, where industrial funds play a crucial role and where governments closely collaborate with industrial partners (Boardman 2001), not all projects may be willing to make their raw data fully available. Due to visionless funding situations, this problem occurs commonly, e.g., with some cosponsored university projects and Canadian Model Forests. Often the commercial advantage of data is in its timeliness. For many wildlife and habitat applications, the release of industrial data can be delayed a few months until it is no longer strategically useful to the industry. Most governmental project data already have a qualifier that they will be made available after project completion (e.g., 4 years). At the end of that time, these data should be made available for a truly transparent decision-making process benefiting wildlife and habitat. However, this concept has not yet been fully integrated (with suitable metadata, for instance) into most of the world's national forest, fisheries, and oil resource inventory data, or data on species that experience hunting and harvest pressure such as grizzly bear (Ursus horribilis), wolf (Canis spp.), deer (Odocoileus spp.), moose (Alces alces), whales (Cetaceans), and salmon (Oncorhynchus spp.). The situation described is particularly relevant to industries that are heavily supported by governmental tax funds and which, therefore, must conform to the national Freedom of Information Act and contribute to the common good. The legal situation regarding data availability needs to be clear, or otherwise improved and solved immediately. Violations need to be pursued and situations resolved.

Difficulties Implementing the New Paradigm

My suggested new management scheme will neither be simple to implement in the real and public world of wildlife and habitat management, nor will it be readily accepted by the current governmental administration and industrial support system: it requires a new digital culture. The biggest hurdle might be the amount of time and money required for adequate documentation of the management decision. Also, there may be vested interests of data keepers who make a living from knowing data sources, specifics of data, or the ability to operate data sources to their own advantage. Such people do not share data with anybody, or at least not with unwanted groups. Last, an inherent inertia against all new developments, the public exposure of workplace issues, technology, and new scientific approaches (Starzomski et al. 2004) has to be overcome (Franklin [2001] for Remote Sensing Applications, Foreman [2001] for Landscape Ecology, Young and Clarke [2000] for Genetics, and Morris and Doak [2001] for Population Viability Analysis).

A recent addition to these potential problems is presented by security and privacy regulations, different in various jurisdictions, that regulate who may access what information under what circumstances. Other relevant questions are what level of raw wildlife and habitat data is to be provided and whether these data have to be peer-reviewed and undergo any sort of quality control and standard, such as the widely suggested Federal Geographic Data Committee (FGDC) Metadata reference (http://www.fgdc.gov) by U.S. Geological Survey (USGS). Smaller but still expensive problems are presented by the necessity of database maintenance and by the required long-term computer infrastructure (e.g., transferring files to updated software versions). Further known problems can occur with environmentally sensitive data. The classic case is that of the location of endangered and precious species (e.g., falcon nests being exposed to the public for potential harvest and destruction). A real threat is often not well founded, and this concern is used as an argument to hold back the release of data to the public. If the release of data is valid, this issue can be addressed with a sensitive time delay or with other measures that coarsen the resolution of the information.
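One way to "coarsen the resolution" of sensitive locations, as mentioned above, is to snap each coordinate to the centre of a grid cell before release. The Python sketch below is an illustration only: the function name `coarsen`, the default cell size of 0.5 degrees, and the example coordinates are assumptions chosen here, and a real release policy would need to pick the cell size per species and region.

```python
import math


def coarsen(lat: float, lon: float, cell_deg: float = 0.5) -> tuple[float, float]:
    """Replace a point with the centre of the grid cell that contains it."""
    if cell_deg <= 0:
        raise ValueError("cell_deg must be positive")

    def snap(value: float) -> float:
        # floor() picks the cell index; adding 0.5 moves to the cell centre.
        return (math.floor(value / cell_deg) + 0.5) * cell_deg

    return snap(lat), snap(lon)


# Usage: two distinct (hypothetical) nest sites in the same cell
# become indistinguishable in the published data.
site_a = coarsen(64.83, -147.72)
site_b = coarsen(64.80, -147.60)
```

Published records then reveal only which cell a site falls in, so exact nest locations cannot be recovered from the released data set, while coarse-scale distribution patterns remain usable.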

If science-based management is really to be carried out, the above noted situations need to be addressed and solved rather than delayed and ignored. There is no reason to give political arguments preference over well-thought-out science-based arguments. Peer-reviewed scientific facts are not always immediately available when needed. Nevertheless, as long as science-based management is the legally binding, official paradigm, the availability of peer-reviewed information needs to be the only goal, and this goal must not be sacrificed. Decisions that satisfy political reasoning alone, not giving much weight to data-driven scientific ones, have not proven to be effective for species and habitat management in the short or long term (Harris [1998] and Weber [2002] for fisheries, and Myers and Kent [2001] for other resources worldwide). If safeguarding wildlife and their habitats, as well as adhering to the professional ethics of wildlife managers, is the goal, then it is crucial to identify all situations in which there is a policy of ignoring or not developing mutually accepted scientific databases. These situations have to be improved rather than perpetuated by ignoring scientific facts.

Most importantly, managers need to be data-aware and preferably be able to handle databases themselves; they need to deal both with raw data sets and their interpretation. I perceive this also as an important topic for our wildlife biology and management education (e.g., at the university). Being "data aware" and "data fluent" will allow wildlife biologists and managers to make efficient decisions based on data and reality. Instead, in many wildlife workplaces, the computer hardware, software, and network infrastructure are not up to date. It is harmful to use unrepresentative database subsets, or slim, already pre-interpreted but "manager-friendly" data sets for drawing management conclusions. Representative high quality wildlife and habitat data need to exist first. Wildlife biologists and scientists across disciplines need to work hard to provide these data with a sound scientific and biological design, put them in useful formats for easy analysis and interpretation, and present their conclusions in a transparent fashion. The use of the World Wide Web as a transparent data repository for raw data and as an interactive presentation tool of interpreted data facilitates all these steps, as occurs for meteorological and many other data sets. If I want to examine temperature or rainfall patterns for some time and place, I can usually readily find the data on publicly available web sites. In contrast, if I want to examine plant or animal populations or changes in range data for some species, the data are unlikely to be readily available. A noteworthy institutional exception to this is the posting of breeding bird survey data on the web by the Office of Migratory Bird Management (http://www.pwrc.usgs.gov/bbs/). Unfortunately, data for some species are not provided because of fear that people will visit sites (e.g., for falconry). I suggest the wildlife profession needs to embrace technology and its exciting achievements further.

At first glance, the outlined approach seems to apply only to quantitative approaches. Quantitative solutions dealing with wildlife and habitat are still not possible in many cases, and where they are possible, they can go hand in hand with qualitative approaches. For instance, if one does not know the exact population size and trend of an endangered species, as derived from surveys and a statistical research design, one could go with expert estimations, local and traditional knowledge derived from questionnaires and interviews, or at least with 3 scenarios such as best, medium, and worst. There is also increasing use of predictive modeling, machine-learning, decision-support software, and similar approaches, all of which should document the facts and processes used in an open, repeatable, and intelligible way. In the case of wildlife population estimates, rather than quantitative measurements, the latest methods, tools, and data used to derive these estimates need to be made available and then applied. This has not been fully achieved yet even in many progressive wildlife capture-recapture, telemetry, and behavior studies.

Specific Database Issues

Data in a database can be nonrepresentative, sampled in a biased fashion, or be originally collected for a different purpose. Large databases with many representative data points are therefore preferred. However, they can have additional problems such as lack of quality assurance for every individual record. Adequate metadata-reporting should make this situation obvious for the specific wildlife species and for a habitat application. If wildlife and habitat management decisions are based on undocumented and weak data foundations, effects could prove highly negative. Such data shortcomings can more easily be fixed in future monitoring schemes and studies when their lack is made clear and documented. The role of science to cater to the needs of managers and the public remains of crucial importance. However, today many publications in the field of wildlife management and research still do not show the entire data set as part of the research methods, nor do they make the data freely and easily available for evaluation of wildlife management decisions (e.g., PDF files). This potential should be used to our full advantage.

Proof of Concepts from Elsewhere

The suggested approach is already proven to work. For example, it is common in many statistical publications such as classic statistical sample data sets used in S-PLUS to calibrate and evaluate analysis techniques (Venables and Ripley 2002) and the latest policy in Journal of Ecological Modelling (http://www.elsevier.com/pub/9/17/elan.htt?jn 1=ecomod). There is no reason why it will not be successful in the discipline of management and decision-making (Smallwood and Morrison 1999). Philosophies outlined in the OpenCola and CopyLeft movements (Lawton, G. 2003. The Great Giveaway, http://www.newscientist.com/hottopics/copyleft/copyleftart.jsp), open source software such as the R-Project (http://www.r-project.org/), MYSQL Project (http://dev.mysql.com/), and LINUX (http://www.linux.org/) already enjoy global success. Therefore, there is no reason why this transparent approach will not also be possible and become mandatory in the field of wildlife management and its science. It can be implemented in any research publication if insisted upon by reviewers, editors, and the general public. Following these procedures, progress can be made and professionalism can be further improved. In contrast, many datasets from wildlife-related fieldwork are still bunkered and strictly guarded, or they are hidden in hardcopy publications rather than being readily digitally available for implementation (Beazley and Boardman 2001).

Over the last 50 years, it appears that sections of the wildlife discipline have developed a certain subculture that is not inclined to release strategic wildlife and habitat data on which decisions are based. The specific reasons for this phenomenon are hard to locate. In support of this claim, and despite increasing use of telemetry projects worldwide, one might try in vain to locate available wildlife telemetry data that are freely available for public download from the internet. (Note that BirdLife International is in the process of compiling and releasing global seabird satellite telemetry data). None of the major telemetry companies promote this concept either, and therefore they cater to the development of "secret wildlife data fiefdoms" to be used in the political arena as data holders please.

Traditionally, the public had to believe the expert. After decades of experience with this system, the public developed ways around these experts (e.g., ballot initiatives). Despite new technology available and applied elsewhere, this expert tradition, without public or external peer review, is extremely harmful to the credibility of the profession, to efficient wildlife management, and to the long-term survival of wildlife and habitats (see Czech and Krausman 2001 for further elaboration about "Technocracy"). The artificial nature of the expert tradition becomes clear when we understand that all of these accepted policies are man-made. For instance, besides lacking wildlife telemetry data ready for download, one will also rarely find complete wildlife survey datasets for public display, or even ready for free download and assessment of decisions.

The widely used policy in management circles to refer to their own in-house and gray research literature to back up management decisions is ambiguous and difficult to defend. The public has already noticed that peer review of an article does not automatically assure transparency (Romesburg 1981); external, public review remains of major importance. This can only be achieved by making the raw data available for public evaluation (e.g., on the internet). The opinion and hierarchical position of the data collector's manager often come across in publications related to wildlife management decisions rather than the pure data and objective facts. The wildlife community needs to identify such situations, access all relevant data, investigate them fully, evaluate management decisions and, if necessary, improve them for the sake of efficient and reliable management of wildlife and its habitats. This follows an outline of professional ethics (Bookhout 1994); we owe it to the wildlife.

Importance of Habitat and Available Analysis Tools for Wildlife

Besides data on wildlife populations and ecology, the crucial importance of habitat information has long been recognized, even within our discipline (Bookhout 1994). It is already a well known fact that urgently needed demographic data on individual wildlife populations of interest to management are usually missing (Morris and Doak 2002, Anderson et al. 2003), and few are freely available on the internet. For general habitat data and when using Geographic Information Systems (GIS) to describe habitats, the situation is a little better because data are available online: for example, via the GAP project (http://www.gap.uidaho.edu/) and GeoGratis (http://www.geogratis.ca/). However, it is still difficult to obtain even basic and biologically meaningful habitat data and GIS maps, outside of the United States and its influence zone, that are relevant for specific wildlife species. Consistent landcover maps, vegetation structure information, food and prey layers, easy-to-use high resolution satellite images, and Digital Elevation Models (DEM) are still hard to come by. This lack of specific habitat data is particularly true for applications in the developing world and for the ocean (these 2 regions cover over 80% of the globe and capture huge proportions of the global biodiversity). In addition, sufficient scientific and management tools to analyze and handle wildlife and habitat data are still not available (exceptions are, for instance, the free Animal Movement Extension for ArcView GIS [http://www.absc.usgs.gov/glba/gistools/animal_mvmt.htm], PopTools [http://www.cse.csiro.au/poptools], DISTANCE Sampling software [http://www.ruwpa.st-and.ac.uk/distance/], and the Wildlife Ecology Software Clearinghouse [http://nhsbig.inhs.uiuc.edu/]). Sound and efficient decision-making for wildlife and habitat is not assured without the availability of such crucial high quality databases and tools. First moves towards such approaches are already underway, such as the National Wetlands Inventory database (http://www.nwi.fws.gov/), Direct Access to Climate Data (http://ferret.wrc.noaa.gov/las/main.pl?cookieCheck=1), the United States-wide GAP Analysis, Breeding Bird Survey (BBS, http://www.mp2-pwrc.usgs.gov/bbs/retrieval/menu.cfm), and others. However, once again this covers primarily the continental United States, and species-specific and updated wildlife food and predator layers are still missing.

Global View Required

Despite national or continent-based approaches and standards, a truly global approach and vision are still required, especially for our migratory wildlife. Fortunately, the Global Register of Migratory Species (GROMS; http://www.biologie.uni-freiburg.de/data/riede/groms.html), the International Steering Committee for Global Mapping (http://www.iscgm.org), the SAGE (Center for Sustainability and the Global Environment) and HYDE Datasets (Historical Land Use Changes over the past 300 years; http://www.sage.wisc.edu/pages/datamodels.html), and a few others have already started a global habitat mapping approach, but much more effort is still required. Standardized survey protocols and efforts are necessary to assure high quality data provision for the survival of wildlife and habitats.

The fragmented European Union and Asia fall behind on most of these matters and do not make use of their full potential and national wealth. Considering the tremendous global importance of data, the U.N. (United Nations) also has too few relevant global funding initiatives to improve the entire global data situation and does not yet provide a well-funded and science-based leadership for a sound and fair infrastructure for decision-making for the global community. Considering the importance of data for wildlife and habitats of the world, the Global Biodiversity Information Facility (GBIF; http://www.gbif.org/) can hardly fulfill even its basic mandate because it must be considered underfunded and understaffed. Of interest should be the data efforts by powerful NGOs such as World Wildlife Fund, Worldwatch Institute, National Geographic, Smithsonian, Conservation International, Audubon Society, and others: these organizations often have effective data branches and may even own the only available data sets, for example, the only consistent map of the Remaining North American Coastal Old-Growth Forest (Sierra Legal Fund) or African species data (World Conservation Society).

In times of globalization, the global outlook must not be ignored in the database context. Currently, most free wildlife and habitat data warehouse approaches are found in the U.S. However, some other countries such as Canada (GeoGratis http://www.geogratis.ca/, CBIF [Canadian Biodiversity Facility] http://www.cbif.ca) and Mexico (CONABIO, http://www.conabio.gob.mx/) are readily following these approaches. Pressing conservation and habitat problems are largely found in the tropical regions and international oceans. In these regions, underdeveloped countries neither gather much data nor release them free of charge, nor do they have the means to maintain databases at great expense. There might still be huge potential for international NGOs to contribute. Adopting such policies could be made an integral part of scientific wildlife assistance programs, development aid, or international trade agreements. I hope that approaches like GBIF on a global level and GAP and the Canadian GeoGratis and CBIF on a national level will be further used to support the suggested data-driven approach to the management of wildlife and habitat.

MANAGEMENT IMPLICATIONS

It is crucial that scientific wildlife and habitat databases now enter the legal management process and that the importance of databases is fully acknowledged by managers. I appreciate that policies suggested in this article can result in a redistribution of funds and reorganization, if not redefinition, of governmental funding agencies, universities, research positions, scientific agencies, and scientific publications towards better serving the global community when it comes to wildlife, habitat, and human society. Ignoring data and their availability in the management process leads to fatal decisions that harm wildlife and humans alike. There is a lot of work to do to enable sound decision-making in the field of global wildlife management; the wildlife biology community is encouraged to take the outlined issues seriously and to start implementing them.