How to Avoid Inappropriate Data Processing (避免不當的數據處理)

Speakers:
• Prof. 陳鴻震, Institute of Biomedical Sciences, National Chung Hsing University (96.1.24)
• Research Fellow 陳瑞華, Institute of Biological Chemistry, Academia Sinica (96.1.26)
• Research Fellow 孫以瀚, Institute of Molecular Biology, Academia Sinica (96.2.6)

Main purpose: to avoid unintentional mistakes, not to prevent deliberate fabrication
Audience: students, assistants, postdocs, faculty, academic administrators

Definition of research misconduct
"Research Misconduct" means fabrication, falsification, plagiarism, or other practices that seriously deviate from those that are commonly accepted within the scholarly community for proposing, conducting, publishing or otherwise reporting research. It does not include honest error or honest differences in interpretations or judgments of data.
http://www.ncsu.edu/sparcs/policy/references/interim.html

• Important results will always be repeated by others, so fabrication will always be discovered! The consequences are severe!
• Deliberate fabrication, deception
• Everyone is a detective; with modern technology, fabrication is easily discovered!
• Avoid fooling yourself and seeing only the results you want
• Improper handling, not deliberate deception
• Proper handling
• Avoid misrepresentation and exaggerating results

The most prevalent problem
Sloppy experimentation and misrepresentation of data
• Data acquisition (data selection or questionable statistics)
• Improperly handling image data

• Experiments
• Data acquisition
• Data analysis
• Data presentation

Guidelines for image manipulation (J. Cell Biol.)
• No specific feature within an image may be enhanced, obscured, moved, removed, or introduced.
• Adjustments of brightness, contrast, or color balance are acceptable if they are applied to the whole image and as long as they do not obscure or eliminate any information present in the original. Nonlinear adjustments (e.g., changes to gamma settings) must be disclosed.
• The grouping of images from different parts of the same gel, or from different gels, fields, or exposures must be made explicit by the arrangement of the figure (e.g., using dividing lines) and in the text of the figure legend.
(http://www.jcb.org/misc/ifora.shtml#digim)

Not recommended (pixel number changed)
Do not use nonlinear adjustments
Make sure the "Resample Image" box in the "Image Size" dialog window is not checked, and that the "Width", "Height", and "Resolution" boxes are linked by the graphic chain.
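The advice to leave "Resample Image" unchecked can be illustrated with a minimal sketch in plain Python. Everything here is an assumption for illustration: the 4x4 image values are made up, the function names are hypothetical, and the block-averaging and pixel-repetition schemes are simple stand-ins, not the algorithms Photoshop actually uses. The point is only that once pixels have been merged, scaling the image back up cannot restore them.

```python
# Toy 4x4 grayscale "image" (made-up intensity values, 0-255).
original = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]


def downsample_2x(img):
    """Halve width and height by averaging each 2x2 block of pixels."""
    out = []
    for r in range(0, len(img), 2):
        row = []
        for c in range(0, len(img[0]), 2):
            block = [img[r][c], img[r][c + 1], img[r + 1][c], img[r + 1][c + 1]]
            row.append(sum(block) // 4)  # integer mean of the four pixels
        out.append(row)
    return out


def upsample_2x(img):
    """Double width and height by repeating each pixel (nearest neighbour)."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]  # repeat each value twice
        out.append(wide)
        out.append(list(wide))  # repeat the whole row once more
    return out


small = downsample_2x(original)   # 2x2 image: the checkerboard is averaged away
restored = upsample_2x(small)     # 4x4 again, but the pattern is gone

print("original:", original)
print("small:   ", small)
print("restored:", restored)
print("round trip preserved the data:", restored == original)
```

In this toy case the checkerboard pattern collapses to a uniform gray, so the "restored" image has the original pixel count but none of the original information.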
(JCB)
[Screenshot labels: "linked"; "Do not check this box"]

Resampling refers to changing the “pixel dimensions” (and therefore display size) of an image. When you downsample (decrease the number of pixels), information is deleted from the image. When you resample up (increase the number of pixels), new pixels are added.
[Figure panels: Downsampled (decrease the number of pixels); Original; Resampled up (increase the number of pixels)]

Pitfalls in image acquisition from a microscope
• Sample preparation: fixation, permeabilization, mountant, sealant
• Objectives (NA, refractive index of immersion medium)
• Fluorochromes and filters
• Chromatic aberration
• Spherical aberration
• Acquisition settings
• Quantitation
North, A. J. (2006) Seeing is believing? A beginners’ guide to practical pitfalls in image acquisition. JCB 172: 9-18.

Guidelines for digital images (Nat. Cell Biol.)
• Authors should list all image acquisition tools and image software packages used.
• Authors should document key image-gathering settings and processing manipulations in the Supplementary information.
• Images gathered at different times or from different locations should not be combined into a single image, unless it is stated that the resultant image is a product of time-averaged data or time-lapse sequence.
• If juxtaposing images is essential, the borders should be clearly demarcated in the figure and described in the legend.

Guidelines for digital images (Nat. Cell Biol.) (cont)
• The use of touch-up tools, such as cloning and healing tools in Photoshop, or any feature that deliberately obscures manipulation, is to be avoided.
• Processing (such as changing brightness and contrast) is appropriate only when applied equally across the entire image and when it is applied equally to controls.
• Contrast should not be adjusted so that data disappear.
• When submitting revised final figures upon conditional acceptance, authors may be asked to submit original, unprocessed images.
(http://www.nature.com/ncb/about/ed_policies/index.html#images)

Guidelines for gels and blots (NCB)
• Vertically sliced gels that juxtapose lanes that were not contiguous in the experiment must have a clear separation or a black line delineating the boundary between the gels.
• Cropped gels in the paper must retain important bands.
• Cropped blots in the body of the paper should retain at least six band widths above and below the band.
• High-contrast gels and blots are discouraged. Multiple exposures should be presented in supplementary information if high contrast is unavoidable.
• Immunoblots should be surrounded by a black line to indicate the borders of the blot, if the background is faint.
• For quantitative comparison, appropriate reagents, controls and imaging methods with linear signal ranges should be used.

Guidelines for microscopy images (NCB)
• Cells from multiple fields should not be juxtaposed in a single field; instead multiple supporting fields of cells should be shown as supplementary information.
• Adjustments should be applied to the entire image. Threshold manipulation, expansion or contraction of single ranges and the altering of high signals should be avoided.
• Pseudo-coloring and nonlinear adjustment (e.g., gamma changes) are only allowed if unavoidable and must be disclosed.
• Adjustments of individual color channels are sometimes necessary on “merged” images, but this should be noted in the figure legend.
Note: Any manipulation that violates these guidelines is a misrepresentation of the original data and is a form of misconduct (JCB 166: 11-15, 2004).

Tips from “Digital Imaging: Ethics”
• Digital images that will be compared should be acquired under identical conditions.
• Intensity measurements of digital images should be performed on raw data and the data should be calibrated to a known standard.
• Manipulation of a digital image should always be done with a copy of the raw image data.
• Simple adjustment to the entire image or cropping an image is usually acceptable.
• Manipulations that are specific to one area of an image and are not performed on other areas are questionable.
• Use of software “filters” to improve image quality is usually not recommended.
• Cloning objects into an image, or from other parts of an image, is very questionable.
• Avoid the use of lossy compression (JPEG files: it may change the resolution of the image and the intensity value of any given pixel).

Tips from “Digital Imaging: Ethics” (cont)
• Be careful when changing the size (in pixels) of a digital image.
• Decreasing the image size can cause the XY resolution in an image to be reduced. If the size reduction is not by a power of two, the software program has to be creative in determining the intensity values of each pixel (guessing).
• Increasing image size causes the software to interpolate (guessing) to create pixels in between the existing pixels, which does not increase resolution. In fact, it may make it more difficult to resolve features because of aliasing artifacts.
http://micro.magnet.fsu.edu/primer/java/digitalimaging/processing/jpegcompression/

Examples of improper image manipulations
(JCB 166: 11-15, 2004)
Adjusting the intensity of a single band.
[Figure labels: O (acceptable); X (not acceptable)]
Improper adjustment of contrast
(JCB 166: 11-15, 2004)
Manipulated images (left): individual bands cut and pasted into a new image, revealed by contrast adjustment (right)
NCB 5: 320-329, 2003/Corrigendum in NCB 6: 373, 2004
(JCB 166: 11-15, 2004)
Enhancing a specific feature: the immunogold particles.
Acceptable ways to highlight a feature such as an immunogold particle:
1. Add arrows
2. Pseudocolor the particles without altering the brightness of individual pixels (e.g., the colorize function of Photoshop); this should be disclosed in the figure legend.
Misrepresentation of a microscope field: combining images from separate microscope fields into a single field.
(JCB 166: 11-15, 2004)

Detecting image manipulation in the Hwang et al. stem cell paper (JCB 176: 131-132, 2007)
The image in the top row purports to show negative staining for a particular cell surface marker in four different cell lines. Adjustment of tonal range in Photoshop clearly shows that the two middle images are identical (lower panel).

Cell 123: 833-847, 2005/Erratum in Cell 124: 645, 2006

Image checking system: used by JCB, JEM, JGP (Rockefeller University Press)
• All digital images in manuscripts accepted for publication will be scrutinized by our production department for any indication of improper manipulation.
• Questions raised by the production department will be referred to the editors, who will request the original data from the authors for comparison to the prepared figures.
• If the original data cannot be produced, the acceptance of the manuscript may be revoked.
• Cases of deliberate misrepresentation of data will result in revocation of acceptance, and will be reported to the corresponding author’s home institution or funding agency.
(http://www.jcb.org/misc/ifora.shtml#digim)

Data acquisition
Garbage in = Garbage out
Sample selection: unbiased, representative (patients, cells from a population)
Define the question. => What information is needed?
• Human tissue samples (well defined clinical manifestation; pathology; pedigree)
• Tumor vs. surrounding (normal) tissues
Question => required information => collection of relevant data

Importance of Controls
Nishinomiya Shrine in Hyogo Prefecture, Japan, held its annual "Lucky Man" race (福男賽跑大賽) early in the morning of the 10th. It is said that the top three finishers will enjoy good fortune for the whole year. This year's champion "Lucky Man" is the same person as last year, winning for the second year in a row. The repeat champion said that throughout last year he was often scolded by his boss at work and things went badly, and he hopes his luck will be better this year.
• Last year's Lucky Man had a year of bad luck, so why did he come back this year?
• If he had not been last year's Lucky Man, his luck might have been even worse!
No controls! The hypothesis can never be checked.
• Each control experiment controls for one variable.
• Be explicit about the purpose of the control experiment.

Data selection
• It is easy to see the results you expect, and ignore the rest. Don’t fool yourself!
• The unexpected, unfit data may be meaningful.
• Prof. 孫's blood pressure => decides whether he can have a big meal?
• Mendel’s peas?
300/150, 145/103, 138/97, 125/85, 60/60
Which data point do you choose?
• Selecting or discarding certain data should have an explicit rationale.
• Best to describe the rationale clearly.
• Double-blind test

Presenting ALL information could be meaningful!
Inverse PCR from an Olfactory Receptor (OR) gene, to look for association with an H enhancer element
Expected result:
• 2.3 kb (expected size)
• DNA sequence => M71-H
Unexpected results:
• 1 kb (localized to chr. 13)
• Not associated with any other OR
• OR regulated by another trans element?
Lomvardas et al. (2006) Cell 126: 403-13.
• Not specific to olfactory tissue.
• Common to nose and spleen? Ubiquitous regulation? Chromosomal association?

Describe and discuss the criteria for data acquisition and analysis
Colocalization of the H Enhancer with OR Promoters (identified by Anti-M50)
Define colocalization as 25% overlap of pixels from the H and OR signals on sections from a Z series
DNA FISH
• M50/M71/OMP: DIG; FITC-conjugated anti-DIG antibody (green)
• H: biotin; rhodamine-conjugated neutravidin (red)
• Anti-M50/M71 (blue)
The inability to detect colocalization of H with OR genes in all cells is likely due to our inability to retain these interchromosomal interactions stoichiometrically throughout the fixation and denaturation procedures.
Lomvardas et al. (2006) Cell 126: 403-13.

Are these images representative (typical)?
Describe the phenotypic range and distribution
Yao & Sun (2005) EMBO J. 24: 2602-12.
Sample size

Keep clear lab notes
• Organize your thoughts
• Historical records
• Helps you to remember
• Allows other people to replicate your experiment
• Defense against fraud
Belongs to the lab.
Content:
• Date (bound notebook, not loose leaf; numbered pages)
• Title of the experiment
• Brief statement of purpose
• Description of the experiment
• Summary (interpretation) of the results
Record everything as soon as you can.
Ref: At the Bench: A Laboratory Navigator. Chapter 5, Laboratory Notebooks.
CSHL Press

Always preserve raw data
Microscopy Society of America on ethical digital image processing (as published in Microscopy Today Nov/Dec 2003, p61):
Ethical digital imaging requires that the original uncompressed image file be stored on archival media (e.g., CD-R) without any image manipulation or processing operation. All parameters of the production and acquisition of this file, as well as any subsequent processing steps, must be documented and reported to ensure reproducibility.
Generally, acceptable (non-reportable) imaging operations include gamma correction, histogram stretching, and brightness and contrast adjustments. All other operations (such as Unsharp-masking, Gaussian blur, etc.) must be directly identified by the author as part of the experimental methodology. However, for diffraction data or any other image data that is used for subsequent quantification, all imaging operations must be reported.

Reproducibility
• A reliable result should be reproducible (by you and by others).
• Single case: a single event (patient, comet), but multiple observations.
• A single observation: not reliable.
• Extraordinary claims demand extraordinary proof.

Responsibility of the scientist
You may be pushed to obtain certain results.
In many cases of fraud, the perpetrator has blamed the PI, saying that the PI expected a particular result, and the researcher felt compelled to produce it. It is true that a PI may want a result. But your data are your responsibility, and it is up to you to be sure the data are recorded honestly and accurately.
Ref: At the Bench: A Laboratory Navigator. Chapter 5, Laboratory Notebooks. CSHL Press

Responsibility of PI
• Check all primary data? (may be difficult)
• Pay attention to details. Do not look only at final assembled figures.
• Establish lab culture, attitude for proper ethics

Responsibility of the coauthors
• Authorship should be justified
• Each coauthor should write down his/her specific contribution.
Plagiarism
• Passing off someone else’s work as your own
• Self-plagiarism: ranging from duplicate publication to “salami-slicing”, where authors add small amounts of new data to a previous paper (e.g., the previous paper analyzed 15 patient samples and the new one adds 15 more).

General rules to avoid plagiarism
• The article cannot contain a large chunk of material (e.g., a whole paragraph) that has been published previously by others, including in the Introduction, Materials and Methods, etc.
• If reuse of a substantial part of your own work is necessary, clear citation of the previous publication is required.
• Warning: a number of publishers are planning to set up “plagiarism detection software” to tackle this problem. (Nature 435: 258-259, 2005)

Dealing with fraud
• Prevention is better than treatment.
• Catch it within the lab.
• Whistle-blowing.
• Do not treat it lightly. A single case may destroy your career. Credibility is key in science.
• Investigation: internal and external
http://www.ncsu.edu/sparcs/policy/references/interim.html
• Damage control.
• Admission of wrongdoing. Honesty is the best policy.
(http://www.indiana.edu/~poynter/)

Additional Info Sources:
• Responsible Conduct in Research, UC San Diego
http://ethics.ucsd.edu/resources/resources-topics.html
• Online Research Ethics Course
http://ori.hhs.gov/education/products/montana_round1/research_ethics.html
• Office of Research Integrity
http://ori.dhhs.gov/

Progress of science is built on trust.
Jean Shepherd, “In God We Trust, All Others Bring Data.”

Thanks to Prof. 孫以瀚 and Prof. 陳瑞華 of Academia Sinica for providing the teaching materials.