The ACS Publications portfolio of journals offers the highest levels of rigor in the review and publication of scientific articles and research data. To help authors include data as a prioritized publication component, we have developed the ACS Research Data Guidelines.
Browse data guidelines for reporting
- Biological data
- Simulations, Machine Learning, Computational Data
- Organic chemistry data
- Coming Soon Nanoscience / Materials science / Energy (in 2022)
Biological data
-
BIOLOGICAL SPECIMENS
- Antibodies: Authors are required to report the name of the antibody, the host species in which the antibody was produced and whether it is monoclonal or polyclonal. For commercial antibodies, report the company and catalog or code number and the antibody identifier obtained from The Antibody Registry. For academic antibodies, report the source laboratory and relevant reference. Clearly state the application for each antibody used in the manuscript. Include batch numbers for experiments in which variability is found among different antibody batches. Clearly state the final antibody concentration or dilution. Whenever possible, report the antigen or antigen location.
- Cell Lines and Microorganisms: To avoid inadvertent use of cross-contaminated or misidentified cell lines/microorganisms, authors are urged to validate each cell line/microorganism used. Authors must report the source of all cell lines/microorganisms in their manuscript, the date of authentication (must be within a year of manuscript submission date) and a description of the authentication method. Authors should be able to provide the authentication test results upon request. If no testing was done, provide the date when cells/microorganisms were purchased from authenticated source. For mammalian cell lines, authors must state whether the cell line has recently been tested for mycoplasma contamination. Resources for using cell lines as biological models:
- Human Subjects: A statement confirming that the research has been approved by relevant ethical committees and performed under The Code of Ethics of the World Medical Association (Declaration of Helsinki) must be provided. Details listed in the latest version of the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines and description of informed consent protocols must also be provided. Authors reporting clinical trials should follow the CONSORT Statement for recommendations regarding the reporting of clinical trial results.
- Animal Subjects: Research involving animals must be performed in accordance with institutional guidelines as defined by the Institutional Animal Care and Use Committee for U.S. institutions or an equivalent regulatory committee in other countries. A statement confirming that all animal experiments performed for the manuscript were conducted in compliance with these guidelines is required. In the experimental section, the source, age, sex, species, and strain of animals should be included. For each treatment group, the number of animals used and sex should be clearly stated. Appropriate statistical methods should be used to test the significance of differences in results, and claims thereof. Authors are encouraged to include in all figure and table captions the number of animals and sex for each treatment group, the method of statistical analysis, and the corresponding p-values where significant differences are found.
- Further information on research involving animal and human subjects can be found in the following resources:
- For key reagents and tools, we recommend the use of Research Resource Identifiers:
- Biological Assays: Assay interference can cause misleading results. Thus, appropriate control experiments should be performed to exclude common artefacts caused by reactive molecules (covalent and redox activity), colloidal aggregation, decomposition, and interference with the spectroscopic method. Authors should consult the recent ACS Editorial on assay interference compounds. The routes of administration of test compounds and vehicles should also be indicated. Benchmarks should be included in the form of appropriate positive or negative control substances or reference materials. Especially for studies on nanomaterials, additional steps and controls are needed, such as sterilization procedures, checking assays for optical or chemical interferences, reporting different measuring units related to dose (e.g. surface area, mass, particle number per surface area, volume, cell number), and others as described in:
-
SEQUENCE DATA
Authors should submit sequence data to a public repository prior to submission and include accession numbers in their paper where appropriate.
- High-throughput sequencing data: GEO
- DNA and RNA sequences: GenBank or NCBI Sequence Read Archive
- Nucleic acid sequencing data: NCBI Trace Archive or NCBI Sequence Read Archive (SRA)
- Protein sequences: UniProt
-
COMPOUND CHARACTERIZATION
Manuscripts should at least provide exemplary characterization and purity data for key compounds, including 1H NMR, 13C NMR, and HRMS and preferably full characterization of all compounds described. Infrared absorbance (IR), specific optical rotation, and melting points should be included when appropriate. The purity and method of purity determination should be reported for all key compounds. For compounds prepared in a library format, a general experimental procedure should be provided, including full experimental details, with yields, for a representative selection of library members. The synthesis protocols and selected characterized compounds must reflect the reliability and scope of the reaction sequence. The purity of all reported library compounds should be explicitly stated. The submission of manuscripts purely based on mixture synthesis and/or mixture analysis is discouraged. Peptides should be characterized by HRMS and HPLC. For nanomaterials, the stability and reactivity of the nanomaterials as well as influence of external parameters like the composition of cell culture media, buffers etc. on nanomaterial properties should be addressed, for example by measuring dissolution, formation of reactive oxygen species or agglomeration under experimental conditions, and physical characterization techniques for nanomaterials should be sufficient, including description of algorithms and methods used to analyze the data. Please review the following resource:
-
PROTEOMICS
-
Sequence Analysis
- The method and/or program (including version number) used to create the "peak list" from the raw data and the parameters used in the creation of this peak list, particularly any which might affect the quality of the subsequent database search. Examples include whether smoothing was applied, any signal-to-noise criteria, whether charge states were calculated or peaks de-isotoped, etc. In cases where additional customized processing of the collections of peak lists has been performed, e.g. clustering or filtering, the method and/or program (including version number) should be referenced.
- The name and version of the program(s) used for database searching and the values of search parameters. Examples include precursor-ion mass tolerance, fragment-ion mass tolerance, modifications allowed for, any missed cleavages, protein cleavage chemistry (if any), etc.
- The name and version of the sequence database(s) used. If a database was compiled in-house, a complete description of the source of the sequences is required. The number of entries actually searched from each database should be included. Authors should justify the use of a very small database or database that excludes common contaminants, since this may generate misleading assignments.
- Methods used to interpret MS/MS data, thresholds and values specific to judging certainty of identification, whether any statistical analysis was applied to validate the results, and a description of how it was applied.
- For large-scale experiments, provide the results of any additional statistical analyses that indicate or establish a measure of identification certainty, or allow a determination of the false-positive rate, e.g., the results of randomized database searches or other computational approaches.
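One common way to obtain such a measure is a target-decoy (randomized database) search. The following is a minimal sketch, assuming each peptide-spectrum match carries a score and a flag marking decoy hits. The function name, the data layout, and the simple decoys/targets estimate are illustrative assumptions, not a procedure prescribed by these guidelines.

```python
def estimate_fdr(matches, threshold):
    """Estimate the false discovery rate at a given score threshold.

    matches: iterable of (score, is_decoy) pairs (hypothetical data layout).
    The estimate is the ratio decoys / targets among matches scoring at or
    above the threshold.
    """
    matches = list(matches)  # allow generators; we iterate twice
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return decoys / targets


# Made-up scores, for illustration only.
example_matches = [(95, False), (90, False), (88, True),
                   (80, False), (75, False), (40, True)]
fdr_at_70 = estimate_fdr(example_matches, 70)
print(f"Estimated FDR at score >= 70: {fdr_at_70:.2f}")
```

Reporting the threshold together with the estimate lets a reader see how identification certainty was judged.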
-
Information for each protein sequence identified
- Include accession number and database source
- Report score(s) and any associated statistical information obtained for searches conducted
- Report sequence coverage, expressed as the number of amino acids spanned by the assigned peptides divided by the sequence length
- Report the total number of peptides assigned to the protein
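The sequence coverage definition above (amino acids spanned by the assigned peptides divided by the sequence length) can be sketched as follows. The protein and peptide strings are invented for illustration, and the helper name is hypothetical. Peptides are located by exact substring match, which is a simplifying assumption.

```python
def sequence_coverage(protein, peptides):
    """Fraction of residues in `protein` spanned by at least one peptide."""
    if not protein:
        raise ValueError("protein sequence must not be empty")
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        # Mark every occurrence of the peptide, including overlapping ones.
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)


# Invented 22-residue sequence and two assigned peptides.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
peptides = ["TAYIAK", "ISFVK"]
coverage = sequence_coverage(protein, peptides)
print(f"Sequence coverage: {coverage:.0%}")
```

Overlapping peptides are counted once per residue, so the value never exceeds 1.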
-
IMAGES
-
Western Blots
- Manuscripts must report the primary antibody species, the secondary antibody species, isotypes, and generated epitopes. Catalog and lot numbers for commercially obtained antibodies must be reported, as well as blotting membrane and blocking agents.
- Full scans or images of uncropped blots must be provided
-
Other images
- Original, uncropped images must be provided, at a minimum in the Supplementary Information. Any image manipulation must be described and justified in the text.
- Biological Macromolecules from Electron Microscopy Experiments: Density maps should be deposited at either the Protein Data Bank in Europe (UK) or RCSB (USA) EMDB deposition site. Once the map has been deposited, any fitted atomic coordinates should be deposited with the Protein Data Bank (PDB) by following the link provided from the EMDB deposition session. The EMDB and PDB IDs should be included in the manuscript. Both the map and the coordinate data will be made public when the associated article is published.
Simulations, Machine Learning, Computational Data
-
QUANTUM CHEMISTRY:
Calculations should include coordinates of all key stationary points in a machine-readable format, such as .xyz, .mol2, or .pdb as appropriate. A table in the SI or a PDF is not an acceptable source for coordinates. Authors should make coordinate files available through ioChem-BD (www.iochem-bd.org), NOMAD (www.nomad-lab.eu), or another FAIR compliant repository. To find an appropriate repository, authors may refer to re3data.org and FAIRsharing.org for information on available repositories, their certification status, and services offered.
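As an illustration of a machine-readable coordinate file, the sketch below writes a geometry in the common .xyz layout (atom count, a comment line, then one "element x y z" line per atom). The water geometry is an illustrative placeholder, not a computed stationary point, and the helper name is hypothetical.

```python
def to_xyz(atoms, comment=""):
    """Return an .xyz-format string for a list of (symbol, x, y, z) tuples."""
    lines = [str(len(atoms)), comment]
    for symbol, x, y, z in atoms:
        lines.append(f"{symbol:<2s} {x:14.8f} {y:14.8f} {z:14.8f}")
    return "\n".join(lines) + "\n"


# Placeholder geometry in angstroms, for illustration only.
water = [
    ("O", 0.0, 0.0, 0.1173),
    ("H", 0.0, 0.7572, -0.4692),
    ("H", 0.0, -0.7572, -0.4692),
]
xyz_text = to_xyz(water, comment="water, illustrative geometry (angstrom)")
print(xyz_text)
```

A file written this way can be deposited directly, whereas coordinates in a PDF table would have to be retyped or scraped by the reader.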
Absolute energies of all key stationary points, as well as relevant vibrational frequencies should be provided as text within the body of the manuscript or the Supplementary Information.
Inclusion of information on the normal modes, either directly or in the form of a Hessian matrix, is also encouraged where appropriate and relevant. This information can be supplied as text within the body of the manuscript or in the Supplementary Information.
Quantum chemistry derived properties such as NMR shifts, key TD-DFT excitation energies, and oscillator strengths must also be supplied in a machine-readable format.
In all cases, sufficient data must be supplied to reproduce the calculations. For calculations of molecular systems, this includes all keywords indicating method, basis set, solvation model, etc. The inclusion of the input file for each reported calculation, with indication of the package version used, is required. For periodic calculations of materials, information must include method, K-point list, and cutoffs. If widely distributed open-source or commercial software is used, full details identifying the specific version of software used must be provided. If in-house software is used, references must be provided to work that describes the software in detail and reports benchmark computations. In all cases a reader must be able to access the software and reproduce the calculations in the paper.
Papers making use of quantum chemistry calculations must include a discussion of the method uncertainty, system size, DFT functional, and basis set.
-
SIMULATION
All relevant simulation details should be described in the main text: the level of theory used, solvent model, integration method, software versions, and input parameters.
Starting structures must be made available. Preference is for a link to a structure in the PDB or CCDC databases. Unpublished experimentally obtained structures should be deposited into appropriate repositories and referenced in the manuscript. If substantial modifications are made to an experimental structure prior to simulation, coordinates of the modified structures should be provided. Structure preparation, including solvation, and equilibration steps, must be described with sufficient detail to be reproduced.
Simulation software must be described, including version and any plug-ins used. Simulation input files must be provided. In cases where a large number of simulations were performed, a template input file should be provided along with a description of the changes for each individual run, with enough detail for the entire set of simulations to be reproduced. Any adjustments to standard parameters must be described, and the modified parameters provided.
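One way to document a large set of runs is a single template plus a table of per-run changes. The sketch below assumes a made-up input format with keywords `temperature`, `seed`, and `steps`. None of these names belong to a real simulation package, and the run names are hypothetical.

```python
from string import Template

# Hypothetical input format, not tied to any real simulation package.
INPUT_TEMPLATE = Template(
    "# template input (illustrative keywords)\n"
    "temperature = $temperature\n"
    "seed = $seed\n"
    "steps = 500000\n"
)

# Table of changes for each individual run.
RUNS = [
    {"name": "run_300K_a", "temperature": 300, "seed": 1},
    {"name": "run_300K_b", "temperature": 300, "seed": 2},
    {"name": "run_310K_a", "temperature": 310, "seed": 1},
]


def render_inputs(template, runs):
    """Return a dict mapping run name to the rendered input text."""
    return {
        run["name"]: template.substitute(
            temperature=run["temperature"], seed=run["seed"]
        )
        for run in runs
    }


inputs = render_inputs(INPUT_TEMPLATE, RUNS)
print(inputs["run_310K_a"])
```

Depositing the template and the run table together allows the full set of input files to be regenerated.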
Output trajectories must be provided with an appropriate level of detail. Full trajectories are required except where the full trajectory is prohibitively large; in such cases, at a minimum, snapshots along the trajectory must be provided, with solvent/counterions included if used. For free energy calculations, energy files should also be included. For simulations with large outputs, authors are encouraged to discuss the level of required trajectory detail with the Editor.
Statistical uncertainties for all properties calculated from a trajectory must be described, and a description of how these were determined must be provided.
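Block averaging is one widely used way to estimate the uncertainty of a trajectory average when successive frames are correlated. The guidelines do not mandate a specific method, so the following is only a sketch. The time series is synthetic, and the function name and block count are assumptions.

```python
import math
import random


def block_average(values, n_blocks):
    """Return (mean, standard error) from non-overlapping block means.

    Trailing values that do not fill a complete block are ignored.
    """
    block_size = len(values) // n_blocks
    if n_blocks < 2 or block_size < 1:
        raise ValueError("need at least 2 blocks with 1 value each")
    means = [
        sum(values[i * block_size:(i + 1) * block_size]) / block_size
        for i in range(n_blocks)
    ]
    grand_mean = sum(means) / n_blocks
    variance = sum((m - grand_mean) ** 2 for m in means) / (n_blocks - 1)
    return grand_mean, math.sqrt(variance / n_blocks)


# Synthetic, uncorrelated data standing in for a property time series.
rng = random.Random(0)
series = [rng.gauss(1.0, 0.2) for _ in range(1000)]
mean, std_err = block_average(series, n_blocks=10)
print(f"mean = {mean:.3f} +/- {std_err:.3f}")
```

In practice the block length should exceed the correlation time of the property; reporting the block count used makes the estimate reproducible.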
-
MACHINE LEARNING
- Data sources
Data sources must be listed and must be publicly available. Guidance on acceptable repositories is provided in the ACS Research Data Policy. If data is stored in an external database, an access date or a version number must be provided. Authors should discuss in the manuscript text any potential biases in the source data, as well as what approaches were used to mitigate any bias.
-
Data Cleaning
Data cleaning steps must be clearly and fully described, either in the text or as a code pipeline in publicly accessible code.
The amount of source data removed must be explicitly listed and evaluated.
Any instances of combining data from multiple sources must be clearly identified, with mitigation of potential issues discussed.
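A cleaning pipeline can record how much data each step removes so that the amounts can be listed in the manuscript. The sketch below uses invented records and two hypothetical filters; it illustrates the bookkeeping, not a recommended set of cleaning rules.

```python
def clean_with_log(records, filters):
    """Apply named filters in order and log how many records each removes.

    filters: list of (step_name, keep_function) pairs.
    Returns (cleaned_records, log) where log is a list of (step_name, n_removed).
    """
    current = list(records)
    log = []
    for name, keep in filters:
        kept = [r for r in current if keep(r)]
        log.append((name, len(current) - len(kept)))
        current = kept
    return current, log


# Invented records, for illustration only.
records = [
    {"id": 1, "value": 0.5},
    {"id": 2, "value": None},
    {"id": 3, "value": -1.0},
    {"id": 4, "value": 2.0},
]
filters = [
    ("missing value", lambda r: r["value"] is not None),
    ("negative value", lambda r: r["value"] >= 0),
]
cleaned, removal_log = clean_with_log(records, filters)
for step, n_removed in removal_log:
    print(f"{step}: removed {n_removed} of {len(records)} records")
```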
-
Data Representations
The methods for representing data as features or descriptors must be clearly articulated, with software implementations.
Comparisons against standard feature sets must be provided.
-
Model Choice
The implementation of models used must be provided such that it can be trained and tested with new data.
Baseline comparisons to simple/trivial models must be provided (e.g. 1-nearest neighbor, random forest, most frequent class).
Baseline comparisons to current state-of-the-art must be provided.
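The trivial baselines named above can be computed with a few lines of code. The following sketch implements a most-frequent-class predictor and a 1-nearest-neighbor predictor in plain Python on a tiny invented data set. The data and helper names are assumptions made for illustration.

```python
from collections import Counter


def most_frequent_class(train_labels):
    """Return the label that occurs most often in the training set."""
    return Counter(train_labels).most_common(1)[0][0]


def one_nearest_neighbor(train_x, train_y, query):
    """Return the label of the training point closest to `query`."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i], query))
    return train_y[best]


def accuracy(predicted, actual):
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Invented two-feature data set.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1)]
train_y = ["a", "a", "a", "b", "b"]
test_x = [(0.05, 0.05), (1.0, 0.9), (0.95, 1.0)]
test_y = ["a", "b", "b"]

majority = most_frequent_class(train_y)
majority_acc = accuracy([majority] * len(test_y), test_y)
nn_acc = accuracy(
    [one_nearest_neighbor(train_x, train_y, q) for q in test_x], test_y
)
print(f"most frequent class: {majority_acc:.2f}, 1-NN: {nn_acc:.2f}")
```

A proposed model is only informative if it is reported alongside numbers like these on the same test set.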
-
Model Training and Validation
The model must clearly split data into different sets for training (model selection), validation (hyperparameter optimization), and testing (final evaluation).
The method of splitting data must be clearly stated, and mimic the anticipated real-world application.
The data splitting procedure must avoid leakage (e.g. the same composition present in training and test sets).
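One way to avoid the kind of leakage described above is to split by group, so that every record sharing a composition lands in the same set. The sketch below is a minimal plain-Python version. The records, the `composition` key, and the function name are hypothetical, and it produces only a train/test split (the same function could be applied again to carve out a validation set).

```python
import random


def group_split(records, group_key, test_fraction=0.2, seed=0):
    """Split records so that no group appears in both train and test."""
    groups = sorted({group_key(r) for r in records})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [r for r in records if group_key(r) not in test_groups]
    test = [r for r in records if group_key(r) in test_groups]
    return train, test


# Invented records; several share a composition.
records = [
    {"composition": "NaCl", "value": 1.0},
    {"composition": "NaCl", "value": 1.1},
    {"composition": "MgO", "value": 2.0},
    {"composition": "TiO2", "value": 3.0},
    {"composition": "TiO2", "value": 3.2},
    {"composition": "Fe2O3", "value": 4.0},
    {"composition": "ZnO", "value": 5.0},
]
train, test = group_split(records, lambda r: r["composition"])
train_groups = {r["composition"] for r in train}
test_groups = {r["composition"] for r in test}
print("shared compositions:", train_groups & test_groups)
```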
-
Code and Reproducibility
The code or workflow must be available in a public repository. GitHub is preferred, but other repositories compliant with FAIR principles are also acceptable. See the ACS Data Policy guidance on choosing appropriate repositories. Any scripts used to produce findings in the paper should be provided in full.
Organic chemistry data
-
COMPOUND CHARACTERIZATION
Authors are required to report the data used to characterize compounds included in a manuscript to establish their identity and demonstrate purity. The requirement applies both to new compounds and to known compounds whose isolation or preparation by a new or modified method is being reported.
New compounds: data should include a proton NMR spectrum, a carbon NMR spectrum, and either high-resolution mass spectrometry (HRMS) or elemental analysis data.
Known compounds: for compounds synthesized by a new or improved method, a reference should be included in the experimental details section and one or more of the following should be provided: 1H or 13C NMR spectra, elemental analysis, HPLC, GC.
If required data cannot be obtained (a compound is too insoluble to record a carbon NMR, or too unstable to obtain a good elemental analysis, etc.), the reason for the absence of the data should be noted in the Experimental Section.
Authors are responsible for retaining their original data or having available original data from collaborators or from contractors who perform analyses on their behalf. Authors may be asked to provide copies of spectra or analytical reports if an editor or reviewer raises a question about reported results.
-
NMR DATA
-
Reporting NMR Data in the Experimental Details
Proton and carbon NMR resonances should be listed for each new compound.
A typical example of 1H and 13C NMR data reported in ACS Style Guide format (shifts listed from high to low) is:
1H NMR (C6D6, 400 MHz): δ 6.00 (t, 1H, J = 4.0 Hz), 5.62 (t, 1H, J = 4.0 Hz), 1.95 (d, 1H, J = 4.0 Hz), 1.73 (s, 15H), 1.62 (s, 3H), 1.58 (s, 15H), 0.98 (s, 1H), 0.72 (d, 1H, J = 4.0 Hz), -0.53 (s, 1H). 13C{1H} NMR (C6D6, 125 MHz): δ 88.7, 88.0, 81.0, 80.8, 60.6, 54.2, 51.5, 38.3, 17.4, 10.6, 10.2.
- The use of broadband decoupling should be indicated with braces, for example: 13C{1H} for proton-decoupled carbon data.
- Proton NMR shifts, reported to 0.01 ppm precision, should be accompanied by an abbreviation for any multiplet structure, the number of atoms represented by the peak or multiplet, and coupling constants where applicable.
- Carbon NMR peak shifts should be rounded off to the nearest 0.1 ppm except when greater precision is needed to distinguish closely spaced peaks.
Information about numbers of attached hydrogen atoms (reported as C, CH, CH2, CH3) from DEPT, DEPTQ, PENDANT, or 2D spectra may be included with the carbon peak shifts. For compounds with carbon-bonded fluorine atoms, the carbon peak multiplicity (d, t, q) and coupling in Hz should be reported.
Detailed peak assignments (including "ArH" for aromatic protons and "C=O" for carbonyl carbons) should not be reported in the Experimental Section unless one or more 2D methods have been used to establish atom connectivities and spatial relationships, and the type(s) of 2D methods are identified in a General Experimental Methods paragraph or in the individual compound data listings. Authors using software for automated data analysis are reminded to check numerical data, including proton counts and coupling constants, before including them in the manuscript.
For products isolated as inseparable isomer mixtures, if the NMR absorptions can be attributed to individual isomers, the NMR chemical shift data for those isomers should be reported in two or more separate lists, one for each isomer, instead of as a single list. For proton NMR data, the integrals in each isomer’s list should be reported in whole numbers of protons.
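Because authors using automated analysis are asked to check their numbers, it can help to generate the listing from a peak table with a small script. The sketch below formats 1H shifts to 0.01 ppm and 13C shifts to 0.1 ppm, ordered from high to low, using a few of the peaks from the example above. The function names and the tuple layout are assumptions.

```python
def format_1h_nmr(solvent, mhz, peaks):
    """peaks: list of (shift_ppm, multiplicity, n_protons, J_Hz or None)."""
    parts = []
    for shift, mult, n_h, j in sorted(peaks, key=lambda p: p[0], reverse=True):
        fields = [mult, f"{n_h}H"]
        if j is not None:
            fields.append(f"J = {j:.1f} Hz")
        parts.append(f"{shift:.2f} ({', '.join(fields)})")
    return f"1H NMR ({solvent}, {mhz} MHz): δ " + ", ".join(parts) + "."


def format_13c_nmr(solvent, mhz, shifts):
    """Format proton-decoupled 13C shifts, rounded to 0.1 ppm."""
    listed = ", ".join(f"{s:.1f}" for s in sorted(shifts, reverse=True))
    return f"13C{{1H}} NMR ({solvent}, {mhz} MHz): δ {listed}."


# A subset of the peaks from the example listing, entered out of order.
proton_peaks = [
    (1.95, "d", 1, 4.0),
    (6.00, "t", 1, 4.0),
    (1.73, "s", 15, None),
    (-0.53, "s", 1, None),
]
proton_line = format_1h_nmr("C6D6", 400, proton_peaks)
carbon_line = format_13c_nmr("C6D6", 125, [10.2, 88.7, 54.2])
print(proton_line)
print(carbon_line)
```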
-
Spectra images should include the following:
- Spectra are labeled with a representation of the compound on the spectrum—please use ChemDraw or a related program.
- The compound identifier used in the manuscript should be included on the spectrum, typically a compound number is used as the identifier.
- Spectra are legible and images are not faint or blurry.
- Spectra are at least a half page in size; horizontal orientation is preferred.
-
NMR baseline is displayed and all peaks should be visible on the spectrum; typical range is:
- Proton NMR: -1 ppm to 10 ppm
- Carbon NMR: 0 ppm to 200 ppm
- Insets are encouraged to show expanded regions
- Extended range for functional groups that resonate from 9 ppm to 14 ppm
- Solvent identified on each spectrum
- Instrument frequency listed on each spectrum
- Peaks in the 1H NMR spectrum are integrated
- Chemical shift values are included for all peaks in the 1H NMR and 13C NMR spectra
It is not acceptable to use peak-editing software or other means to suppress or obscure peaks arising from impurities (including byproducts, unconsumed reactants, and incompletely removed extraction, chromatography, or recrystallization solvents). Peak suppression may be used on the NMR solvent peak for samples run in protic solvents, but it is never necessary for samples run in deuterated solvents.
-
Primary NMR Data Files
Authors are highly encouraged to submit primary NMR data files (FID files, acquisition data, processing parameters). All original primary NMR data supporting a submission should be retained and provided if requested. For more information on packaging primary NMR data and metadata for submission, see the ACS Research Data Center.
When submitting FID files:
- One folder should be created for each compound
- Folder should be named clearly, using the compound number
- Include the FID files, acquisition data and processing parameters for each experiment
- Name each spectrum according to the type of nucleus measured: 1H, 13C, DEPT, COSY, etc.
- NMR files should be compressed into zip file(s)
- Name the zipped file, "FID for Publication."
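The packaging steps above can be sketched with the Python standard library. The example builds a throwaway directory with one folder per compound and one subfolder per nucleus, then zips it as "FID for Publication.zip". The folder names and the placeholder `fid` files are assumptions; real data sets would contain the instrument's own acquisition and processing files.

```python
import tempfile
import zipfile
from pathlib import Path


def package_fid(root, archive_dir):
    """Zip every file under `root`, keeping the compound folder layout."""
    root = Path(root)
    archive = Path(archive_dir) / "FID for Publication.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(root).as_posix())
    return archive


with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data"
    # One folder per compound, one subfolder per experiment (hypothetical names).
    for compound, nucleus in [("compound_1", "1H"), ("compound_1", "13C"),
                              ("compound_2", "1H")]:
        folder = data / compound / nucleus
        folder.mkdir(parents=True)
        (folder / "fid").write_bytes(b"placeholder")
    archive = package_fid(data, tmp)
    archive_name = archive.name
    with zipfile.ZipFile(archive) as zf:
        archived_names = sorted(zf.namelist())

print(archive_name)
print(archived_names)
```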
-
X-RAY CRYSTALLOGRAPHIC DATA
-
CIF Preparation and Validation
Authors submitting work containing new, unpublished organic, metal-organic, and inorganic crystallographic data intended for publication with their manuscript must prepare this data in the Crystallographic Information File (CIF) format. It is the responsibility of the author(s) to check all CIFs prior to submission for syntax errors, numerical self-consistency of the data, and possible higher symmetry in the space group assignment.
Two programs are recommended for validating CIFs:
enCIFer syntax checks are integrated into the CCDC deposition process and a standalone version is distributed freely by the Cambridge Crystallographic Data Centre (CCDC), at http://www.ccdc.cam.ac.uk/free_services/encifer.
CIF-checking software is also available free of charge from the International Union of Crystallography, at http://checkcif.iucr.org. CheckCIF is now available during deposition to the CCDC.
-
Depositing CIFs and Related Files
Note that CIFs, structure factor tables, and checkCIF reports must be submitted to the CCDC prior to manuscript submission. CCDC will accept organic, metal-organic, and inorganic compounds, including extended molecular solids and also powder data where a constrained refinement has been used. Structural data for inorganic compounds will be transferred by CCDC to the Inorganic Crystal Structure Database after publication and will maintain the original deposition number(s). Any subsequent revisions to the CIFs or structure factor tables should be deposited directly with the CCDC before resubmitting the manuscript in ACS Paragon Plus. For all other crystallographic data that are not accommodated by the CCDC (for example protein structures, nucleic acids, or metals & alloys), authors are encouraged to deposit into other available databases (see below) in addition to uploading the data in ACS Paragon Plus during manuscript submission. In addition, authors are required to upload the checkCIF output files (combined into one PDF file) as Supporting Information for Review Only. Any A and/or B level alerts must also be addressed prior to submission or otherwise explained in the checkCIF PDF.
-
Submission of CIFs and Structure Factor Tables to CCDC
To facilitate the reviewing of CIFs, ACS requires that, prior to manuscript submission, organic, metal-organic, and inorganic CIFs be deposited with the CCDC via http://www.ccdc.cam.ac.uk/services/structure_deposit, even if text tables of crystallographic data are included with the manuscript.
In addition, authors are required to deposit structure factor tables with the CCDC alongside their CIFs. Structure factor tables should include h, k, l, Fo, Fc, and σ|Fo| values. The original, unmerged, uncut, unmasked hkl file must be embedded in the CIF. Upon depositing their CIFs (and structure factor tables) with the CCDC, authors will be provided with a CCDC deposition number for each CIF. This number must be entered into ACS Paragon Plus when prompted during the submission process.
-
Reporting Chromatography, Elemental Analysis, HRMS, IR, Melting Point, and Optical Rotation Data
-
Chromatography
When flash chromatography is used for product purification, both the support and solvent should be identified for each compound, and including an Rf value is recommended. Below are representative examples of how to report chromatography data:
- The crude material was purified by silica gel chromatography (Biotage SNAP Ultra 25 g, 10 – 40% EtOAc in hexanes over 15 CV, product coelutes with any residual starting material at 9 CV) to afford the final product as a clear, colorless oil that solidified to a white solid upon standing (165 mg, 82% yield).
- TLC: (33% EtOAc in hexanes, Rf): 0.33 (UV, I2)
-
Elemental Analysis
The ACS Style Guide format for reporting elemental analysis data is: Anal. Calcd for C13H17NO3: C, 66.36; H, 7.28; N, 5.95. Found: C, 66.55; H, 7.01; N, 6.22
Elemental analysis Found values for carbon, hydrogen, and nitrogen should be within 0.4% of the Calcd values for the proposed formula. The need to include fractional molecules of solvent or water in the molecular formula to improve the fit of the data usually reflects incomplete purification of the sample.
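The 0.4% criterion can be checked with a short calculation. The sketch below computes the Calcd percentages for C13H17NO3 from standard atomic weights and compares them with the Found values from the example above. The atomic-weight table is abbreviated to the four elements needed, and the simple formula parser (no parentheses or hydrates) is an assumption made for brevity.

```python
import re

# Abridged standard atomic weights (only the elements used here).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def percent_composition(formula):
    """Return mass percentages for a simple formula such as 'C13H17NO3'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + int(number or 1)
    total = sum(ATOMIC_MASS[s] * n for s, n in counts.items())
    return {s: 100 * ATOMIC_MASS[s] * n / total for s, n in counts.items()}


def within_tolerance(formula, found, tolerance=0.4):
    """Map each reported element to True if |Found - Calcd| <= tolerance."""
    calcd = percent_composition(formula)
    return {el: abs(found[el] - calcd[el]) <= tolerance for el in found}


calcd = percent_composition("C13H17NO3")
found = {"C": 66.55, "H": 7.01, "N": 6.22}
check = within_tolerance("C13H17NO3", found)
print({el: round(calcd[el], 2) for el in found}, check)
```

For this example the Calcd values round to C 66.36, H 7.28, and N 5.95, and all three Found values fall within 0.4%.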
-
HRMS
Please report the complete molecular formula, including added atoms or fragments such as H or Na, with reported HRMS data. The ACS Style Guide format for reporting accurate mass data is: HRMS (ESI-TOF) m/z: [M + Na]+ Calcd for C13H17NO3Na 258.1101; Found 258.1074.
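The Calcd value for an adduct ion is the sum of monoisotopic masses of all atoms in the complete formula, less the electron mass for a singly charged cation. The sketch below reproduces the Calcd value for [M + Na]+ of C13H17NO3 and computes the deviation of the Found value in ppm. The mass table is abridged, and the function names are assumptions.

```python
# Monoisotopic masses (u) of the most abundant isotopes; abridged table.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Na": 22.98976928,
}
ELECTRON_MASS = 0.00054858


def cation_mz(counts, charge=1):
    """m/z of a cation whose complete formula (adduct included) is `counts`."""
    neutral = sum(MONOISOTOPIC[s] * n for s, n in counts.items())
    return (neutral - charge * ELECTRON_MASS) / charge


def ppm_error(found, calcd):
    """Signed deviation of the found m/z from the calculated m/z, in ppm."""
    return (found - calcd) / calcd * 1e6


# Complete formula of the ion: C13H17NO3Na.
ion = {"C": 13, "H": 17, "N": 1, "O": 3, "Na": 1}
calcd_mz = cation_mz(ion)
error_ppm = ppm_error(258.1074, calcd_mz)
print(f"Calcd {calcd_mz:.4f}; deviation of Found value {error_ppm:.1f} ppm")
```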
-
IR
When IR data is reported, only include IR absorptions diagnostic for major functional groups. Please include the unit, cm-1, for all reported IR data. Below is a representative example:
FTIR (neat) cm-1: 3351 (m), 3276 (m), 2971 (w), 2867 (w), 2844 (w), 2451 (w), 1739 (w), 1653 (w), 1543 (w), 1512 (s), 1468 (m), 1365 (m), 1318 (w), 1218 (m), 1142 (m), 1058 (m), 1033 (s).
-
Melting Point
A melting point range should be reported for every new crystalline solid product. Melting point ranges may be reported to document the purity of known, but not new, synthesis products. When possible, report melting point ranges for recrystallized samples of known compounds that were previously reported only in noncrystalline (and presumably less pure) form. Below is a representative example:
mp 175.5 °C (lit.25 mp 175-176)
-
Optical Rotation
Specific optical rotations should be reported for isolated natural products and enantioenriched compounds when sufficient sample is available. Specific rotations based on the equation [α] = (100·α)/(l·c) should be reported as unitless numbers as in the following example:
[α]D20 -25 (c 1.9, CHCl3), where the concentration c is in g/100 mL and the path length l is in decimeters. The units of the specific rotation, (deg·mL)/(g·dm), are implicit and are not included with the reported value. A "+" or "-" sign should precede the value of the optical rotation.
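As a worked example of the relation [α] = (100·α)/(l·c), with α the observed rotation in degrees, l the path length in decimeters, and c the concentration in g/100 mL, the sketch below computes and formats a specific rotation. The observed rotation of -0.475° is an invented value chosen to reproduce the example string, and the function names are assumptions.

```python
def specific_rotation(alpha_deg, path_dm, conc_g_per_100ml):
    """Specific rotation from observed rotation, path length, and concentration."""
    return 100.0 * alpha_deg / (path_dm * conc_g_per_100ml)


def format_rotation(value, conc, solvent, temp_c=20):
    """Format as a unitless number with an explicit sign, sodium D line."""
    sign = "+" if value >= 0 else "-"
    return f"[α]D{temp_c} {sign}{abs(value):.0f} (c {conc}, {solvent})"


# Invented measurement: -0.475 deg observed in a 1 dm cell at c = 1.9 g/100 mL.
rotation = specific_rotation(-0.475, 1.0, 1.9)
report = format_rotation(rotation, 1.9, "CHCl3")
print(report)
```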
Author Support
If you have questions about the ACS Research Data Guidelines or other research data questions, please contact researchdata@acs.org. If you have questions about a specific ACS journal’s data policy, please refer to the Author Guidelines for that journal.