lambda-ber-schema
lambda-ber-schema is a comprehensive schema for representing multimodal structural biology imaging data, from atomic-resolution structures to tissue-level organization. It supports diverse experimental techniques including cryo-EM, X-ray crystallography, SAXS/SANS, fluorescence microscopy, and spectroscopic imaging.
Schema Organization
The schema follows a relational design with flat entity collections and explicit association tables for many-to-many relationships. This maps cleanly to SQL databases while supporting flexible data reuse across studies.
The top-level entity is a Dataset, which serves as a container for related research. A dataset might represent all data from a specific grant, collaboration, or publication.
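
As a rough illustration of this flat layout, the sketch below models a Dataset with plain Python dataclasses. The collection names `experiment_runs` and `experiment_sample_associations` and the `experiment_id` field follow slot names listed later in this document. The `samples` collection, the `sample_id` and `role` fields, and all identifier strings are assumptions made for illustration, not the schema's actual definitions.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    id: str
    description: str = ""


@dataclass
class ExperimentRun:
    id: str


@dataclass
class ExperimentSampleAssociation:
    experiment_id: str    # reference to an ExperimentRun by id
    sample_id: str        # assumed field name: reference to a Sample by id
    role: str = "target"  # assumed field name for the relationship metadata


@dataclass
class Dataset:
    id: str
    # Entities live in flat lists rather than being nested inside each other.
    samples: list[Sample] = field(default_factory=list)  # assumed collection name
    experiment_runs: list[ExperimentRun] = field(default_factory=list)
    # Links between entities are separate rows that refer to ids.
    experiment_sample_associations: list[ExperimentSampleAssociation] = field(
        default_factory=list
    )


# All identifiers below are hypothetical.
dataset = Dataset(
    id="example:dataset-1",
    samples=[Sample(id="example:sample-1", description="purified complex")],
    experiment_runs=[ExperimentRun(id="example:run-1"), ExperimentRun(id="example:run-2")],
    experiment_sample_associations=[
        ExperimentSampleAssociation("example:run-1", "example:sample-1"),
        ExperimentSampleAssociation("example:run-2", "example:sample-1", role="control"),
    ],
)

# The same sample is referenced by two runs without being duplicated.
runs_using_sample = [
    a.experiment_id
    for a in dataset.experiment_sample_associations
    if a.sample_id == "example:sample-1"
]
```

Because the links are held as separate rows, adding a third run that uses the same sample only appends one association record and leaves the sample record untouched.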
Entity Tables
All entities are stored in flat collections at the Dataset level:
Biological Materials
- Samples: The biological specimens being studied (proteins, nucleic acids, complexes, cells, tissues). Each sample includes detailed molecular composition, buffer conditions, and storage information. For example, a sample record might describe a purified protein with its sequence, concentration, and buffer pH.
- Sample Preparations: How samples were prepared for specific techniques. This includes cryo-EM grid preparation (vitrification parameters), crystallization conditions for X-ray studies, or staining protocols for fluorescence microscopy.
Data Collection
- Instruments: The equipment used, from Titan Krios microscopes to synchrotron beamlines. Each instrument type (CryoEMInstrument, XRayInstrument, SAXSInstrument) has specific parameters like accelerating voltage, detector type, or beam energy.
- Experiment Runs: Individual data collection sessions. An experiment run captures when, how, and under what conditions data was collected, including quality metrics like resolution and completeness.
Data Processing
- Workflow Runs: Computational processing steps applied to raw data. This includes motion correction for cryo-EM movies, 3D reconstruction, model building, or phase determination for crystallography. Each workflow tracks the software used, parameters, and computational resources.
Data Products
- Data Files: Any files generated or used, from raw data to final models. Each file is tracked with checksums for data integrity and typed (micrograph, particles, volume, model).
- Images: Specialized classes for different imaging modalities:
  - Image2D: Micrographs, diffraction patterns
  - Image3D: 3D reconstructions, tomograms
  - FTIRImage: Molecular composition maps from infrared spectroscopy
  - FluorescenceImage: Fluorophore-labeled cellular components
  - OpticalImage: Brightfield/phase contrast microscopy
  - XRFImage: Elemental distribution maps
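
The checksum tracking described for data files can be sketched as follows. The slot table in this document describes `checksum` as a SHA-256 checksum, and `file_name`, `file_size_bytes`, and `data_type` are also listed slots. The file name, the file contents, and the literal `"volume"` value are hypothetical, and the helper `sha256_of_file` is an illustrative name rather than part of the schema or its tooling.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large volumes or movies are never
        # loaded into memory all at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "volume.mrc"  # hypothetical file name
    path.write_bytes(b"example volume bytes")  # stand-in for real data
    record = {
        "file_name": path.name,
        "file_size_bytes": path.stat().st_size,
        "data_type": "volume",  # assumed value; the schema may use a different literal
        "checksum": sha256_of_file(path),
    }
```

Recomputing the digest later and comparing it with the stored `checksum` is one way to detect a corrupted or modified file.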
Logical Groupings
- Studies: Lightweight groupings representing focused investigations of specific biological questions. For example, a study might investigate "Heat stress response in Arabidopsis" or "Structure of the human ribosome under different conditions."
Association Tables
Many-to-many relationships are represented via explicit association tables, which can carry relationship metadata (e.g., the role of a sample in an experiment):
- StudySampleAssociation: Links samples to studies (with role: target, control, reference)
- StudyExperimentAssociation: Links experiments to studies
- StudyWorkflowAssociation: Links workflows to studies
- ExperimentSampleAssociation: Links samples to experiments (with role and preparation used)
- ExperimentInstrumentAssociation: Links instruments to experiments (with role: primary, detector)
- WorkflowExperimentAssociation: Links source experiments to workflows
- WorkflowInputAssociation: Links input files to workflows
- WorkflowOutputAssociation: Links output files to workflows
This relational design enables:
- Sample reuse: The same sample can be used in multiple studies and experiments
- Multi-instrument experiments: An experiment can use multiple instruments with different roles
- Integrative workflows: A workflow can combine data from multiple experiments
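
To make the SQL mapping concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names are snake_case guesses derived from the class names, `experiment_id` matches a listed slot, and `sample_id`, `role`, and every row value are assumptions for illustration. The DDL the schema tooling actually generates may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sample (id TEXT PRIMARY KEY);
    CREATE TABLE experiment_run (id TEXT PRIMARY KEY);
    -- Association table: one row per (experiment, sample) link,
    -- carrying the role as relationship metadata.
    CREATE TABLE experiment_sample_association (
        experiment_id TEXT NOT NULL REFERENCES experiment_run(id),
        sample_id     TEXT NOT NULL REFERENCES sample(id),
        role          TEXT,
        PRIMARY KEY (experiment_id, sample_id)
    );
    """
)
conn.executemany("INSERT INTO sample VALUES (?)", [("sample-1",), ("sample-2",)])
conn.executemany("INSERT INTO experiment_run VALUES (?)", [("run-1",), ("run-2",)])
conn.executemany(
    "INSERT INTO experiment_sample_association VALUES (?, ?, ?)",
    [
        ("run-1", "sample-1", "target"),
        ("run-1", "sample-2", "control"),
        ("run-2", "sample-1", "target"),
    ],
)

# Sample reuse: find every experiment run that used sample-1.
rows = conn.execute(
    """
    SELECT experiment_id, role
    FROM experiment_sample_association
    WHERE sample_id = ?
    ORDER BY experiment_id
    """,
    ("sample-1",),
).fetchall()
```

Here `rows` contains one entry per run that used `sample-1`, each with the role the sample played in that run.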
Example Usage
A typical cryo-EM study of a protein complex would include:
1. Sample records for the purified complex with molecular weight and buffer composition
2. Grid preparation details with vitrification parameters
3. Microscope specifications and data collection parameters
4. Processing workflows from motion correction through 3D refinement
5. Final reconstructed volumes and fitted atomic models
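
For the cryo-EM case, the processing chain can be followed backwards through the workflow input and output associations. The sketch below assumes each association reduces to a `(workflow_id, file_id)` pair. `file_id` is a listed slot, while `workflow_id`, the function `trace_lineage`, and all identifiers are hypothetical names chosen for illustration.

```python
# Each tuple is (workflow_id, file_id), mirroring WorkflowInputAssociation
# and WorkflowOutputAssociation rows. All identifiers are hypothetical.
workflow_inputs = [
    ("wf-motioncorr", "movies.tiff"),
    ("wf-refine3d", "micrographs.mrc"),
]
workflow_outputs = [
    ("wf-motioncorr", "micrographs.mrc"),
    ("wf-refine3d", "volume.mrc"),
]


def trace_lineage(file_id, inputs, outputs):
    """Return the workflows upstream of file_id, nearest first."""
    produced_by = {f: wf for wf, f in outputs}
    inputs_of = {}
    for wf, f in inputs:
        inputs_of.setdefault(wf, []).append(f)

    lineage, queue, seen = [], [file_id], set()
    while queue:
        current = queue.pop(0)
        workflow = produced_by.get(current)
        if workflow is None or workflow in seen:
            continue  # raw file with no producer, or workflow already visited
        seen.add(workflow)
        lineage.append(workflow)
        # Continue the walk from the files this workflow consumed.
        queue.extend(inputs_of.get(workflow, []))
    return lineage


lineage = trace_lineage("volume.mrc", workflow_inputs, workflow_outputs)
```

Walking from the final volume yields the refinement run first and the motion correction run second. The raw movie file ends the walk because no workflow produced it.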
A multimodal plant imaging study might combine:
1. Whole plant optical imaging for morphology
2. XRF imaging to map nutrient distribution
3. FTIR spectroscopy to identify stress-related molecular changes
4. Fluorescence microscopy to track specific protein responses
5. Cryo-EM of isolated organelles for ultrastructural details
Key Features
- Relational design: Flat entity tables with explicit association tables for M:N relationships
- SQL-friendly: Maps directly to normalized database tables
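- Technique-agnostic core: The same core entities (samples, experiment runs, workflow runs, data files) describe data from every supported technique
- Rich metadata: Tracks samples, preparations, instruments, experiments, and processing from sample to structure
- Workflow provenance: Records the software, parameters, and compute resources of each processing step to support reproducibility
- Multimodal support: Integrates data across scales and techniques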
- Standards-compliant: Follows FAIR principles and integrates with existing ontologies
URI: https://w3id.org/lambda-ber-schema/
Name: lambda-ber-schema
Classes
| Class | Description |
|---|---|
| Any | |
| Attribute | A domain, measurement, attribute, property, or any descriptor for additional ... |
| AttributeGroup | A grouping of related data attributes that form a logical unit |
| BiophysicalProperty | Measured or calculated biophysical properties |
| BufferComposition | Buffer composition for sample storage |
| ComputeResources | Computational resources used |
| ConformationalState | Individual conformational state |
| CrystallizationConditions | Crystal growth conditions for X-ray crystallography (NSLS2 Crystallization ma... |
| CTFEstimationParameters | Parameters specific to CTF estimation workflows |
| DatabaseCrossReference | Cross-references to external databases |
| DataCollectionStrategy | Strategy for data collection |
| ExperimentalConditions | Environmental and experimental conditions |
| FSCCurve | Fourier Shell Correlation curve data |
| ImageFeature | Semantic annotations describing features identified in images using controlle... |
| LigandInteraction | Small molecule/ligand interactions with proteins |
| MolecularComposition | Molecular composition of a sample |
| MotionCorrectionParameters | Parameters specific to motion correction workflows |
| ParticlePickingParameters | Parameters specific to particle picking workflows |
| QualityMetrics | Quality metrics for experiments |
| RefinementParameters | Parameters specific to 3D refinement workflows |
| StorageConditions | Storage conditions for samples |
| TechniqueSpecificPreparation | Base class for technique-specific preparation details |
| CryoEMPreparation | Cryo-EM specific sample preparation |
| SAXSPreparation | SAXS/WAXS specific preparation |
| XRayPreparation | X-ray crystallography specific preparation |
| AttributeValue | The value for any attribute of an entity |
| DateTimeValue | A date or date and time value |
| QuantityValue | A simple quantity value, representing a measurement with a numeric value and ... |
| TextValue | A value described using a text string, optionally with a controlled vocabular... |
| ExperimentInstrumentAssociation | M:N link between ExperimentRun and Instrument |
| ExperimentSampleAssociation | M:N link between ExperimentRun and Sample with role metadata |
| NamedThing | A named thing |
| AggregatedProteinView | Aggregated view of all structural and functional data for a protein |
| ConformationalEnsemble | Ensemble of conformational states for a protein |
| DataFile | A data file generated or used in the study |
| Dataset | Root container holding flat entity collections and association tables |
| ExperimentRun | An experimental data collection session |
| Image | An image file from structural biology experiments |
| FTIRImage | Fourier Transform Infrared (FTIR) spectroscopy image capturing molecular comp... |
| Image2D | A 2D image (micrograph, diffraction pattern) |
| FluorescenceImage | Fluorescence microscopy image capturing specific molecular targets through fl... |
| Micrograph | Motion-corrected micrograph derived from movie |
| Movie | Raw cryo-EM movie with frame-by-frame metadata for motion correction |
| OpticalImage | Visible light optical microscopy or photography image |
| XRFImage | X-ray fluorescence (XRF) image showing elemental distribution |
| Image3D | A 3D volume or tomogram |
| Instrument | An instrument used to collect data |
| BeamlineInstrument | Multi-technique synchrotron beamline that supports multiple experimental meth... |
| CryoEMInstrument | Cryo-EM microscope specifications |
| SAXSInstrument | SAXS/WAXS instrument specifications |
| XRayInstrument | X-ray diffractometer or synchrotron beamline specifications |
| MeasurementConditions | Conditions under which biophysical measurements were made |
| OntologyTerm | A term from a controlled vocabulary or ontology |
| ProteinAnnotation | Base class for all protein-related functional and structural annotations |
| EvolutionaryConservation | Evolutionary conservation information |
| FunctionalSite | Functional sites including catalytic, binding, and regulatory sites |
| MutationEffect | Effects of mutations and variants on protein structure and function |
| PostTranslationalModification | Post-translational modifications observed or predicted |
| ProteinProteinInteraction | Protein-protein interactions and interfaces |
| StructuralFeature | Structural features and properties of protein regions |
| ProteinConstruct | Detailed information about a protein construct including cloning and sequence... |
| Sample | A biological sample used in structural biology experiments |
| SamplePreparation | A process that prepares a sample for imaging |
| Study | A logical grouping of related experiments investigating a research question |
| WorkflowRun | A computational processing workflow execution |
| StudyExperimentAssociation | M:N link between Study and ExperimentRun |
| StudySampleAssociation | M:N link between Study and Sample with role metadata |
| StudyWorkflowAssociation | M:N link between Study and WorkflowRun |
| WorkflowExperimentAssociation | M:N link between WorkflowRun and source ExperimentRuns |
| WorkflowInputAssociation | Links input DataFiles to WorkflowRun |
| WorkflowOutputAssociation | Links output DataFiles to WorkflowRun |
Slots
| Slot | Description |
|---|---|
| accelerating_voltage | Accelerating voltage in kV |
| acquisition_date | Date image was acquired |
| acquisition_group | Acquisition group identifier (e.g., ...) |
| acquisition_software | Acquisition software used (e.g., ...) |
| acquisition_software_version | Version of acquisition software |
| additional_software | Additional software used in pipeline |
| additives | Additional additives in the buffer |
| affinity_column | Affinity column specifications |
| affinity_type | Type of affinity chromatography |
| aggregation_assessment | Assessment of protein aggregation state |
| alignment_depth | Number of sequences in alignment |
| aliquoting | How the protein was aliquoted for storage |
| allele_frequency | Population allele frequency (range: 0-1) |
| amplitude_contrast | Amplitude contrast value |
| anatomy | Anatomical part or tissue (e.g., ...) |
| anisotropic_correction | Whether anisotropic motion correction was applied |
| annotation_method | Computational or experimental method used |
| anom_corr | Anomalous correlation |
| anom_sig_ano | Anomalous signal strength |
| anomalous_completeness | Completeness of anomalous data as a percentage (0-100) |
| anomalous_multiplicity | Multiplicity of anomalous data |
| anomalous_used | Whether anomalous signal was used |
| antibiotic_selection | Antibiotic or selection agent used |
| apodization_function | Mathematical function used for apodization |
| astigmatism | Astigmatism value, typically specified in Angstroms |
| astigmatism_angle | Astigmatism angle, typically specified in degrees |
| astigmatism_target | Target astigmatism in Angstroms |
| atmosphere | Storage atmosphere conditions |
| attenuator | Attenuator setting used |
| attribute | The attribute being represented |
| autoloader_capacity | Number of grids the autoloader can hold |
| autoloader_slot | Autoloader slot identifier |
| average_b_factor_a2 | Average B-factor in Angstroms squared |
| backbone_flexibility | B-factor or flexibility measure |
| background_correction | Method used for background correction |
| beam_center_x | Beam center X coordinate, typically specified in pixels ([px]) |
| beam_center_x_px | Beam center X coordinate in pixels |
| beam_center_y | Beam center Y coordinate, typically specified in pixels ([px]) |
| beam_center_y_px | Beam center Y coordinate in pixels |
| beam_energy | X-ray beam energy, typically specified in kiloelectronvolts (keV) |
| beam_shift_x | Beam shift X in microradians |
| beam_shift_y | Beam shift Y in microradians |
| beam_size | X-ray beam size, typically specified in micrometers |
| beam_size_max | Maximum beam size in micrometers |
| beam_size_min | Minimum beam size in micrometers |
| beam_size_um | Beam size, typically specified in micrometers |
| beamline | Beamline identifier (e.g., ...) |
| beamline_id | Beamline identifier at synchrotron/neutron facility |
| bfactor_dose_weighting | B-factor for dose weighting, typically specified in Angstroms squared |
| binding_affinity | Binding affinity value |
| binding_affinity_type | Type of binding measurement (Kd, Ki, IC50) |
| binding_affinity_unit | Unit of binding affinity |
| binding_energy | Calculated binding energy (kcal/mol) |
| binding_site_residues | Residues involved in ligand binding |
| binning | Binning factor applied during motion correction |
| biological_assembly | Whether this represents a biological assembly |
| biophysical_properties | Measured or predicted biophysical properties |
| blot_force | Blotting force setting |
| blot_number | Number of blots applied |
| blot_time | Blotting time, typically specified in seconds (range: 0...) |
| blotter_height | Blotter height setting |
| blotter_setting | Blotter setting value |
| box_size | Particle box size in pixels |
| buffer_composition | Buffer composition including pH, salts, additives |
| buffer_matching_protocol | Protocol for buffer matching |
| c2_aperture | C2 aperture size in micrometers |
| calibrated_pixel_size | Calibrated pixel size in Angstroms per pixel |
| calibration_standard | Reference standard used for calibration |
| camera_binning | Camera binning factor |
| cc_anomalous | Anomalous correlation coefficient |
| cc_half | Half-set correlation coefficient CC(1/2) |
| cell_path_length | Path length, typically specified in millimeters (mm) |
| cell_type | Cell type if applicable (e.g., ...) |
| chain_id | Chain identifier in the PDB structure |
| chamber_temperature | Chamber temperature, typically specified in degrees Celsius |
| channel_name | Name of the fluorescence channel (e.g., ...) |
| characteristic_features | Key features of this conformation |
| checksum | SHA-256 checksum for data integrity |
| clashscore | MolProbity clashscore |
| cleavage_site | Protease cleavage site sequence |
| cleavage_temperature_c | Temperature during cleavage in Celsius |
| cleavage_time_h | Duration of protease cleavage in hours |
| clinical_significance | Clinical significance |
| cloning_method | Method used for cloning (e.g., ...) |
| clustering_method | Method used for conformational clustering |
| codon_optimization_organism | Organism for which codons were optimized |
| coevolved_residues | Pairs of coevolved residues |
| collection_mode | Mode of data collection |
| color_channels | Color channels present (e.g., ...) |
| coma | Coma aberration in nanometers |
| completed_at | Workflow completion time |
| completeness | Data completeness, typically specified as a percentage (0-100) |
| completeness_high_res_shell_percent | Completeness in highest resolution shell, typically specified as a percentage... |
| completeness_percent | Data completeness as a percentage (0-100) |
| complex_stability | Stability assessment of the complex |
| components | Buffer components and their concentrations |
| compute_resources | Computational resources used |
| concentration | Sample concentration, typically specified in mg/mL or µM |
| concentration_method | Method used to concentrate protein |
| concentration_series | Concentration values for series measurements |
| confidence_score | Confidence score for the annotation (range: 0-1) |
| conformational_ensemble | Conformational states and dynamics |
| conformational_state | Conformational state descriptor |
| conformational_states | Individual conformational states |
| conservation_method | Method used for conservation analysis |
| conservation_score | Evolutionary conservation score (range: 0-1) |
| conserved_residues | Highly conserved residues |
| construct | Construct description (e.g., ...) |
| construct_description | Human-readable description of the construct |
| construct_id | Unique identifier for this construct |
| contrast_method | Contrast enhancement method used |
| control_system | Low-level control system for device communication |
| cpu_hours | CPU hours used, measured in hours |
| creation_date | File creation date |
| cross_references | Database cross-references |
| cryo_protectant | Cryoprotectant used for crystal cooling |
| cryoprotectant | Cryoprotectant used |
| cryoprotectant_concentration | Cryoprotectant concentration, typically specified as a percentage |
| crystal_cooling_capability | Crystal cooling system available |
| crystal_id | Identifier for the specific crystal used |
| crystal_notes | Additional notes about crystal quality and handling |
| crystal_size_um | Crystal dimensions in micrometers (length x width x height) |
| crystallization_conditions | Complete description of crystallization conditions including precipitant, pH,... |
| crystallization_method | Method used for crystallization |
| cs | Spherical aberration (Cs) in millimeters |
| cs_corrector | Spherical aberration corrector present |
| cs_used_in_estimation | Spherical aberration (Cs) value used during CTF estimation, typically specifi... |
| ctf_estimation_params | CTF estimation specific parameters |
| ctf_quality_score | CTF estimation quality score |
| culture_volume_l | Culture volume, typically specified in liters (L) |
| current_status | Current operational status |
| daq_system | Data acquisition system used for experiment orchestration |
| data_collection_strategy | Strategy for data collection |
| data_files | All data files |
| data_type | Type of data in the file |
| database_cross_references | Cross-references to external databases |
| database_id | Identifier in the external database |
| database_name | Name of the external database |
| database_url | URL to the database entry |
| date_added | Date when sample was added to study |
| definition | The formal definition or meaning of the ontology term |
| defocus | Defocus value, typically specified in micrometers |
| defocus_range_increment | Defocus range increment in micrometers |
| defocus_range_max | Maximum defocus range in micrometers |
| defocus_range_min | Minimum defocus range in micrometers |
| defocus_search_max | Maximum defocus search range, typically specified in micrometers |
| defocus_search_min | Minimum defocus search range, typically specified in micrometers |
| defocus_step | Defocus search step, typically specified in micrometers |
| defocus_target | Target defocus value in micrometers |
| defocus_u | Defocus U, typically specified in micrometers |
| defocus_v | Defocus V, typically specified in micrometers |
| delta_delta_g | Change in folding free energy (kcal/mol) |
| deposited_to_pdb | Whether structure was deposited to PDB |
| description | A detailed textual description of this entity |
| detector_dimensions | Detector dimensions in pixels (e.g., ...) |
| detector_distance | Distance from sample to detector, typically specified in millimeters (mm) |
| detector_distance_max | Maximum detector distance in mm |
| detector_distance_min | Minimum detector distance in mm |
| detector_distance_mm | Detector distance, typically specified in millimeters |
| detector_manufacturer | Detector manufacturer (e.g., ...) |
| detector_mode | Supported or default detector operating mode |
| detector_model | Detector model (e.g., ...) |
| detector_position | Physical position of detector in microscope (e.g., ...) |
| detector_technology | Generic detector technology type |
| dimensions_x | Image width, typically specified in pixels |
| dimensions_y | Image height, typically specified in pixels |
| dimensions_z | Image depth, typically specified in pixels or slices |
| disease_association | Associated disease or phenotype |
| disorder_probability | Probability of disorder (range: 0-1) |
| dissociation_constant | Experimental Kd if available |
| domain_assignment | Domain database assignment (CATH, SCOP, Pfam) |
| domain_id | Domain identifier from domain database |
| dose | Electron dose in e-/Ų |
| dose_per_frame | Electron dose per frame in e-/Angstrom^2 |
| dose_rate | Dose rate in e-/pixel/s or e-/Angstrom^2/s |
| dose_weighting | Whether dose weighting was applied |
| drift_total | Total drift, typically specified in Angstroms |
| drop_ratio_protein_to_reservoir | Ratio of protein to reservoir solution in drop (e.g., ...) |
| drop_volume | Total drop volume, typically specified in nanoliters |
| drop_volume_nl | Total drop volume, typically specified in nanoliters |
| druggability_score | Druggability score of the binding site (range: 0-1) |
| duration | Storage duration |
| dwell_time | Dwell time per pixel, typically specified in milliseconds |
| ec_number | Enzyme Commission number for catalytic sites |
| effect_on_function | Effect on protein function |
| effect_on_stability | Effect on protein stability |
| elements_measured | Elements detected and measured |
| elution_buffer | Buffer composition for elution |
| emission_filter | Specifications of the emission filter |
| emission_wavelength | Emission wavelength, typically specified in nanometers |
| end_time | Data collection end timestamp |
| energy_filter_make | Energy filter manufacturer |
| energy_filter_model | Energy filter model |
| energy_filter_present | Whether energy filter is present |
| energy_filter_slit_width | Energy filter slit width in eV |
| energy_landscape | Description of the energy landscape |
| energy_max | Maximum X-ray energy in keV |
| energy_min | Minimum X-ray energy in keV |
| enzyme | Enzyme responsible for modification |
| error | Experimental error or uncertainty |
| ethane_temperature | Ethane temperature, typically specified in degrees Celsius |
| evidence_code | Evidence and Conclusion Ontology (ECO) code |
| evidence_type | Type of evidence supporting this annotation |
| evolutionary_conservation | Evolutionary conservation data |
| excitation_filter | Specifications of the excitation filter |
| excitation_wavelength | Excitation wavelength, typically specified in nanometers |
| experiment_code | Human-friendly laboratory or facility identifier for the experiment (e.g., ...) |
| experiment_date | Date of the experiment |
| experiment_id | Reference to the experiment run |
| experiment_instrument_associations | Links between experiments and instruments (M:N) |
| experiment_runs | All experiment runs (data collection sessions) |
| experiment_sample_associations | Links between experiments and samples (M:N with role) |
| experimental_conditions | Environmental and experimental conditions |
| experimental_method | Specific experimental method for structure determination (particularly for di... |
| exposure_time | Exposure time per image, typically specified in seconds (s) |
| exposure_time_per_frame | Exposure time per frame in milliseconds |
| expression_system | Expression system used |
| facility_name | Name of the research facility where the instrument is located |
| facility_ror | Research Organization Registry (ROR) identifier for the facility |
| feature_type | Type of structural feature |
| file_format | File format |
| file_id | Reference to the input data file |
| file_name | Name of the file |
| file_path | Path to the file |
| file_role | Role of the file (raw, intermediate, final, diagnostic, metadata) |
| file_size_bytes | File size in bytes |
| final_buffer | Final buffer composition after purification |
| final_concentration_mg_per_ml | Final protein concentration in mg/mL |
| flash_cooling_method | Flash cooling protocol |
| fluorophore | Name or type of fluorophore used |
| flux | Photon flux at sample position, typically specified in photons per second |
| flux_density | Photon flux density in photons/s/mm² |
| flux_end | Photon flux at end of data collection, typically specified in photons per sec... |
| flux_photons_per_s | Photon flux, typically specified in photons per second |
| frame_grouping | Number of frames grouped together |
| frame_rate | Frame rate, typically specified in frames per second |
| frames | Number of frames in the movie |
| frames_per_movie | Number of frames per movie |
| free_energy | Relative free energy (kcal/mol) |
| fsc_curve | Fourier Shell Correlation curve data |
| fsc_value | FSC values corresponding to each resolution |
| functional_effect | Known functional effect of this PTM |
| functional_impact_description | Description of functional impact |
| functional_importance | Description of functional importance |
| functional_sites | Functional site annotations for proteins in the sample |
| gene_name | Gene name |
| gene_synthesis_provider | Company or facility that synthesized the gene |
| glow_discharge_applied | Whether glow discharge treatment was applied |
| glow_discharge_atmosphere | Glow discharge atmosphere (air, amylamine) |
| glow_discharge_current | Glow discharge current, typically specified in milliamperes |
| glow_discharge_pressure | Glow discharge pressure, typically specified in millibars |
| glow_discharge_time | Glow discharge time, typically specified in seconds |
| go_terms | Associated Gene Ontology terms |
| gold_standard | Whether gold-standard refinement was used |
| goniometer_type | Type of goniometer |
| gpu_hours | GPU hours used, measured in hours |
| grid_material | Grid material |
| grid_square_id | Grid square identifier |
| grid_type | Type of EM grid used |
| growth_temperature_c | Growth temperature, typically specified in degrees Celsius |
| gunlens | Gun lens setting |
| harvest_timepoint | Time point when cells were harvested |
| hic_column | Hydrophobic interaction column used |
| hole_id | Hole identifier within grid square |
| hole_size | Hole size, typically specified in micrometers (range: 0...) |
| holes_per_group | Number of holes per group |
| host_strain_or_cell_line | Specific strain or cell line used (e.g., ...) |
| humidity | Humidity, typically specified as a percentage (0-100) |
| humidity_percentage | Chamber humidity during vitrification (range: 0-100), typically specified as ... |
| i_over_sigma | Mean I/sigma(I) - signal to noise ratio |
| i_zero | Forward scattering intensity I(0) |
| ice_thickness_estimate | Estimated ice thickness, typically specified in nanometers |
| id | Globally unique identifier as an IRI or CURIE for machine processing and exte... |
| iex_column | Ion-exchange column used |
| illumination_type | Type of illumination (brightfield, darkfield, phase contrast, DIC) |
| images | All images |
| imaging_mode | Imaging mode for electron microscopy |
| indexer_module | Indexing module used (e.g., ...) |
| inducer_concentration | Concentration of induction agent |
| induction_agent | Agent used to induce expression (e.g., ...) |
| induction_temperature_c | Temperature during induction, typically specified in degrees Celsius |
| induction_time_h | Duration of induction, typically specified in hours |
| initial_hit_condition | Description of initial crystallization hit condition |
| input_type | Type of input for the workflow |
| insert_boundaries | Start and end positions of insert in vector |
| installation_date | Date of instrument installation |
| instrument_category | Category distinguishing beamlines from laboratory equipment |
| instrument_code | Human-friendly facility or laboratory identifier for the instrument (e.g., ...) |
| instrument_id | Reference to the instrument |
| instruments | All instruments used across studies |
| integrator_module | Integration module used |
| interaction_distance | Distance criteria for interaction (Angstroms) |
| interaction_evidence | Evidence for this interaction |
| interaction_type | Type of interaction |
| interface_area | Buried surface area at interface (Ų) |
| interface_residues | Residues at the interaction interface |
| ionic_strength | Ionic strength, typically specified in molar (mol/L) |
| is_cofactor | Whether the ligand is a cofactor |
| is_drug_like | Whether the ligand has drug-like properties |
| ispyb_auto_proc_program_id | ISPyB AutoProcProgram |
| ispyb_auto_proc_scaling_id | ISPyB AutoProcScaling |
| ispyb_data_collection_id | ISPyB DataCollection |
| ispyb_session_id | ISPyB BLSession |
| keywords | Keywords or tags describing the dataset for search and categorization |
| label | The human-readable label or name of the ontology term |
| laser_power | Laser power, typically specified in milliwatts |
| last_updated | Date of last update |
| ligand | Ligand or small molecule bound to sample |
| ligand_id | Ligand identifier (ChEMBL, ChEBI, PubChem) |
| ligand_interactions | Small molecule interaction annotations |
| ligand_name | Common name of the ligand |
| ligand_smiles | SMILES representation of the ligand |
| ligands | Bound ligands or cofactors |
| ligands_cofactors | Ligands or cofactors modeled in the structure |
| lims_system | Laboratory Information Management System used at this beamline |
| loop_size | Loop size, typically specified in micrometers |
| lysis_buffer | Buffer composition for lysis |
| lysis_method | Method used for cell lysis |
| magnification | Magnification used during data collection |
| mail_in_service | Whether mail-in sample service is available |
| manufacturer | Instrument manufacturer |
| map_sharpening_bfactor | B-factor used for map sharpening, typically specified in Angstroms squared (Å... |
| mass_shift | Mass change due to modification (Da) |
| maximum_numeric_value | The maximum value part, expressed as a number, of the quantity value when the... |
| mean_i_over_sigma_i | Mean I/sigma(I) |
| measurement_conditions | Conditions under which measurement was made |
| medium | Growth medium used |
| memory_gb | Maximum memory used, typically specified in gigabytes (GB) |
| method | Crystallization method used |
| microscope_software | Microscope control software (e.g., ...) |
| microscope_software_version | Software version |
| minimum_numeric_value | The minimum value part, expressed as a number, of the quantity value when the... |
| model | Instrument model |
| model_file_path | Path to deep learning model file if using a local or custom trained model fil... |
| model_name | Name or identifier of the deep learning model (e.g., ...) |
| model_source | Source or software associated with the model (e.g., ...) |
| modification_group | Chemical group added (e.g., ...) |
| modification_type | Type of PTM |
| modifications | Post-translational modifications or chemical modifications |
| modified_residue | Residue that is modified |
| molecular_composition | Description of molecular composition including sequences, modifications, liga... |
| molecular_signatures | Identified molecular signatures or peaks |
| molecular_weight | Molecular weight, typically specified in kilodaltons (kDa) |
| molprobity_score | Overall MolProbity score |
| monochromator_type | Type of monochromator |
| motion_correction_params | Motion correction specific parameters |
| mounting_method | Crystal mounting method |
| mounting_temperature | Temperature during mounting, typically specified in Kelvin |
| multiplicity | Data multiplicity (redundancy) |
| mutation | Mutation in standard notation |
| mutation_effects | Effects of mutations present in the sample |
| mutation_type | Type of mutation |
| mutations | Mutations present in the sample |
| n_total_observations | Total number of observations (before merging) |
| n_total_unique | Total number of unique reflections |
| ncbi_taxid | NCBI Taxonomy ID for source organism |
| ncc_score | Normalized cross-correlation score threshold |
| ncs_used | Whether Non-Crystallographic Symmetry restraints were used |
| nominal_defocus | Nominal defocus value, typically specified in micrometers |
| number_of_images | Total number of diffraction images collected |
| number_of_scans | Number of scans averaged for the spectrum |
| number_of_waters | Number of water molecules modeled |
| numeric_value | The numerical part of a quantity value, expressed as a number |
| numerical_aperture | Numerical aperture of the objective lens |
| objective_aperture | Objective aperture size in micrometers |
| od600_at_induction | Optical density at 600nm when induction was started |
| omim_id | OMIM database identifier |
| ontology | The ontology or controlled vocabulary this term comes from |
| operator_id | Identifier or name of the person who performed the sample preparation |
| optimization_strategy | Strategy used to optimize crystals |
| optimized_condition | Final optimized crystallization condition |
| organism | Source organism for the sample |
| organism_id | NCBI taxonomy ID |
| origin_movie_id | Reference to original movie file |
| oscillation_angle | Oscillation angle per image, typically specified in degrees |
| oscillation_per_image_deg | Oscillation angle per image, typically specified in degrees |
| outlier_rejection_method | Method for rejecting outlier reflections |
| output_binning | Output binning factor |
| output_files | Output files generated |
| output_type | Type of output from the workflow |
| parameters_file_path | Path to parameters file or text of key parameters |
| parent_sample_id | Reference to parent sample for derivation tracking |
| particle_picking_params | Particle picking specific parameters |
| partner_chain_id | Chain ID of interacting partner |
| partner_interface_residues | Partner residues at the interaction interface |
| partner_protein_id | UniProt ID of interacting partner |
| patch_size | Patch size for local motion correction |
| pdb_entries | PDB entries representing this state |
| pdb_entry | PDB identifier |
| pdb_id | PDB accession code if deposited |
| ph | pH of the buffer (range: 0-14) |
| phase_plate | Phase plate available |
| phase_plate_type | Type of phase plate if present |
| phasing_method | Phasing method used for X-ray crystallography structure determination |
| picking_method | Particle picking method used (manual, template_matching, deep_learning, LoG, Topaz, other) |
| pinhole_size | Pinhole size, typically specified in Airy units for confocal microscopy |
| pixel_size | Pixel size, typically specified in Angstroms |
| pixel_size_calibrated | Calibrated pixel size for this experiment, typically specified in Angstroms (... |
| pixel_size_physical | Physical pixel size in micrometers |
| pixel_size_physical_um | Physical pixel size of the detector in micrometers |
| pixel_size_unbinned | Unbinned pixel size, typically specified in Angstroms per pixel |
| pixel_size_x | Pixel size X dimension, typically specified in micrometers (µm) |
| pixel_size_y | Pixel size Y dimension, typically specified in micrometers (µm) |
| plasma_treatment | Plasma treatment details |
| population | Relative population of this state (range: 0-1) |
| power_score | Power score threshold |
| preparation_date | Date of sample preparation |
| preparation_id | Specific preparation used for this sample in this experiment |
| preparation_method | Method used to prepare the sample |
| preparation_type | Type of sample preparation |
| pressure | Pressure, typically specified in kilopascals (kPa) |
| principal_motions | Description of principal motions |
| processing_level | Processing level (0=raw, 1=corrected, 2=derived, 3=model) |
| processing_notes | Additional notes about processing |
| processing_parameters | Parameters used in processing |
| processing_status | Current processing status |
| promoter | Promoter used for expression |
| property_type | Type of biophysical property |
| protease | Protease used for tag cleavage |
| protease_inhibitors | Protease inhibitors added |
| protease_ratio | Ratio of protease to protein |
| protein_buffer | Buffer composition for protein solution |
| protein_concentration | Protein concentration for crystallization in mg/mL |
| protein_concentration_mg_per_ml | Protein concentration for crystallization in mg/mL |
| protein_constructs | All protein constructs |
| protein_id | UniProt accession number |
| protein_interactions | Protein-protein interaction annotations |
| protein_name | Name of the protein |
| protocol_description | Detailed protocol description |
| ptm_annotations | Post-translational modification annotations |
| ptms | All post-translational modifications |
| publication_ids | IDs of one or more publications supporting this annotation |
| purification_steps | Ordered list of purification steps performed |
| purity_by_sds_page_percent | Purity percentage by SDS-PAGE |
| purity_percentage | Sample purity, typically specified as a percentage (range: 0-100) |
| q_range_max | Maximum q value in inverse Angstroms |
| q_range_min | Minimum q value in inverse Angstroms |
| quality_metrics | Quality control metrics for the sample |
| quantum_yield | Quantum yield of the fluorophore |
| r_anomalous | Anomalous R-factor |
| r_factor | R-factor for crystallography (deprecated, use r_work) |
| r_free | R-free (test set) |
| r_merge | Rmerge - merge R-factor |
| r_pim | Rpim - precision-indicating merging R-factor |
| r_work | Refinement R-factor (working set) |
| ramachandran_favored | Percentage of residues in favored Ramachandran regions (0-100) |
| ramachandran_favored_percent | Percentage of residues in favored Ramachandran regions |
| ramachandran_outliers | Percentage of Ramachandran outliers (0-100) |
| ramachandran_outliers_percent | Percentage of Ramachandran outliers |
| raw_data_location | Location of raw data files |
| raw_value | The value that was specified in raw form |
| reconstruction_method | Method used for 3D reconstruction |
| refinement_params | 3D refinement specific parameters |
| refinement_resolution_a | Resolution cutoff used for refinement in Angstroms |
| regulatory_role | Role in regulation |
| related_entity | ID of the entity that owns this file |
| removal_enzyme | Enzyme that removes modification |
| reservoir_volume_ul | Reservoir volume, typically specified in microliters |
| residue_range | Range of residues |
| residues | List of residues forming the functional site |
| resolution | Resolution at edge of detector, typically specified in Angstroms (Å) |
| resolution_0_143 | Resolution at FSC=0.143 |
| resolution_0_5 | Resolution at FSC=0.5 |
| resolution_angstrom | Resolution values in Angstroms |
| resolution_at_corner | Resolution at corner of detector, typically specified in Angstroms (Å) |
| resolution_fit_limit | Resolution fit limit, typically specified in Angstroms |
| resolution_high | High resolution limit, typically specified in Angstroms (Å) |
| resolution_high_shell_a | High resolution shell limit, typically specified in Angstroms |
| resolution_low | Low resolution limit, typically specified in Angstroms (Å) |
| resolution_low_a | Low resolution limit, typically specified in Angstroms |
| restraints_other | Other restraints applied during refinement |
| rfree | R-free (test set) |
| rg | Radius of gyration, typically specified in Angstroms |
| rmerge | Rmerge - merge R-factor |
| rmsd_angles | RMSD from ideal bond angles, typically specified in degrees |
| rmsd_bonds | RMSD from ideal bond lengths, typically specified in Angstroms (Å) |
| rmsd_from_reference | RMSD from reference structure |
| rmsd_threshold | RMSD threshold for clustering (Angstroms) |
| role | Role of sample in study |
| rpim | Rpim - precision-indicating merging R-factor |
| rwork | Refinement R-factor (working set) |
| sample_applied_volume | Volume of sample applied, typically specified in microliters |
| sample_cell_type | Type of sample cell used |
| sample_changer_capacity | Number of samples in automatic sample changer |
| sample_code | Human-friendly laboratory identifier or facility code for the sample |
| sample_id | Reference to the sample being prepared |
| sample_preparations | All sample preparations |
| sample_type | Type of biological sample |
| samples | All samples across all studies |
| scaler_module | Scaling module used |
| screen_name | Name of crystallization screen used |
| search_model_pdb_id | PDB ID of search model for molecular replacement |
| sec_buffer | Buffer for size-exclusion chromatography |
| sec_column | Size-exclusion column used |
| second_affinity_reverse | Second affinity or reverse affinity step |
| secondary_structure | Secondary structure assignment |
| seed_stock_dilution | Dilution factor for seed stock |
| seeding_type | Type of seeding used (micro, macro, streak) |
| selectable_marker | Antibiotic resistance or other selectable marker |
| sequence_file_path | Path to sequence file |
| sequence_length_aa | Length of the protein sequence in amino acids |
| sequence_verified_by | Method or person who verified the sequence |
| sequences | Amino acid or nucleotide sequences |
| shots_per_hole | Number of shots taken per hole |
| sig_anomalous | Mean anomalous difference signal |
| signal_peptide | Signal peptide sequence if present |
| signal_to_noise | Signal to noise ratio |
| site_name | Common name for this site |
| site_type | Type of functional site |
| slit_gap_horizontal | Horizontal slit gap aperture, typically specified in micrometers (µm) |
| slit_gap_vertical | Vertical slit gap aperture, typically specified in micrometers (µm) |
| soak_compound | Compound used for soaking (ligand, heavy atom) |
| soak_conditions | Conditions for crystal soaking |
| software_name | Software used for processing |
| software_version | Software version |
| solvent_accessibility | Relative solvent accessible surface area (range: 0-1) |
| source_database | Source database or resource that provided this annotation |
| source_type | Type of X-ray source |
| space_group | Crystallographic space group |
| spectral_resolution | Spectral resolution, typically specified in inverse centimeters (cm⁻¹) |
| split_strategy | Strategy for data splitting |
| spotsize | Electron beam spot size setting |
| stage_position_x | Stage X position, typically specified in micrometers |
| stage_position_y | Stage Y position, typically specified in micrometers |
| stage_position_z | Stage Z position, typically specified in micrometers |
| stage_tilt | Stage tilt angle in degrees |
| start_angle | Starting rotation angle, typically specified in degrees |
| start_time | Data collection start timestamp |
| started_at | Workflow start time |
| state_id | Identifier for this state |
| state_name | Descriptive name |
| storage_conditions | Storage conditions for the sample |
| storage_gb | Storage used, typically specified in gigabytes (GB) |
| storage_uri | Storage URI (S3, Globus, etc.) |
| strategy_notes | Notes about data collection strategy |
| structural_features | Structural feature annotations |
| structural_motif | Known structural motif |
| studies | All studies in this dataset |
| study_experiment_associations | Links between studies and experiments (M:N) |
| study_id | Reference to the study |
| study_sample_associations | Links between studies and samples (M:N) |
| study_workflow_associations | Links between studies and workflows (M:N) |
| super_resolution | Whether super-resolution mode was used |
| support_film | Support film type |
| symmetry | Symmetry applied (C1, Cn, Dn, T, O, I) |
| synchrotron_mode | Synchrotron storage ring fill mode |
| tag | Affinity tag |
| tag_cterm | C-terminal tag |
| tag_nterm | N-terminal tag |
| tag_removal | Whether and how affinity tag was removed |
| taxonomic_range | Taxonomic range of conservation |
| technique | Technique used for data collection |
| techniques_supported | Experimental techniques available at this beamline |
| tem_beam_diameter | TEM beam diameter in micrometers |
| temperature | Storage temperature, typically specified in degrees Celsius |
| temperature_c | Crystallization temperature, typically specified in degrees Celsius |
| temperature_control | Temperature control settings |
| temperature_control_range | Temperature control range in Celsius |
| temperature_k | Data collection temperature, typically specified in Kelvin |
| terms | Ontology terms describing features identified in the image |
| threshold | Picking threshold |
| timestamp | Acquisition timestamp |
| title | A human-readable name or title for this entity |
| tls_used | Whether TLS (Translation/Libration/Screw) refinement was used |
| total_dose | Total electron dose in e-/Angstrom^2 |
| total_exposure_time | Total exposure time in milliseconds |
| total_frames | Total number of frames/images |
| total_rotation | Total rotation range collected, typically specified in degrees |
| total_rotation_deg | Total rotation range, typically specified in degrees |
| transition_pathways | Description of transition pathways between states |
| transmission | X-ray beam transmission as a percentage (0-100) |
| transmission_percent | Beam transmission, typically specified as a percentage (0-100) |
| undulator_gap | Undulator gap setting, typically specified in millimeters (mm) |
| uniprot_id | UniProt accession for the target protein |
| unit | The unit of measurement |
| unit_cell_a | Unit cell parameter a, typically specified in Angstroms (Å) |
| unit_cell_alpha | Unit cell angle alpha, typically specified in degrees |
| unit_cell_b | Unit cell parameter b, typically specified in Angstroms (Å) |
| unit_cell_beta | Unit cell angle beta, typically specified in degrees |
| unit_cell_c | Unit cell parameter c, typically specified in Angstroms (Å) |
| unit_cell_gamma | Unit cell angle gamma, typically specified in degrees |
| unit_cv_id | The unit of the quantity, expressed as a CURIE from the Unit Ontology |
| validation_report_path | Path to validation report |
| value | The value, as a text string |
| value_cv_id | For values that are in a controlled vocabulary (CV), this attribute should ca... |
| variable_residues | Highly variable residues |
| vector_backbone | Base plasmid backbone used |
| vector_name | Complete vector name |
| verification_notes | Notes from sequence verification |
| vitrification_instrument | Vitrification instrument used |
| vitrification_method | Method used for vitrification |
| voltage_used_in_estimation | Accelerating voltage value used during CTF estimation, typically specified in... |
| voxel_size | Voxel size, typically specified in Angstroms |
| wait_time | Wait time before blotting, typically specified in seconds |
| wash_buffer | Buffer composition for washing |
| wavelength | X-ray wavelength, typically specified in Angstroms (Å) |
| wavelength_a | X-ray wavelength, typically specified in Angstroms |
| wavenumber_max | Maximum wavenumber, typically specified in inverse centimeters (cm⁻¹) |
| wavenumber_min | Minimum wavenumber, typically specified in inverse centimeters (cm⁻¹) |
| website | Beamline website URL |
| white_balance | White balance settings |
| wilson_b_factor | Wilson B-factor, typically specified in Angstroms squared (Ų) |
| wilson_b_factor_a2 | Wilson B-factor in Angstroms squared |
| workflow_code | Human-friendly identifier for the computational workflow run |
| workflow_experiment_associations | Links between workflows and source experiments (M:N) |
| workflow_id | Reference to the workflow run |
| workflow_input_associations | Links between workflows and input files |
| workflow_output_associations | Links between workflows and output files |
| workflow_runs | All workflow runs (computational processing) |
| workflow_type | Type of processing workflow |
| yield_mg | Total yield in milligrams |
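
Several slot descriptions in the table above state a numeric range (for example `ph` 0-14, `population` 0-1, `purity_percentage` 0-100). The following is a minimal sketch of how such ranges could be checked on a plain dictionary record before loading it. The `SLOT_RANGES` mapping and the `check_ranges` helper are hypothetical names introduced for illustration; they are not part of the schema or its tooling, and the bounds are copied by hand from the descriptions.

```python
# Hypothetical helper: bounds are copied from the slot descriptions above,
# not read from the schema itself.
SLOT_RANGES = {
    "ph": (0.0, 14.0),
    "population": (0.0, 1.0),
    "solvent_accessibility": (0.0, 1.0),
    "purity_percentage": (0.0, 100.0),
    "transmission": (0.0, 100.0),
    "ramachandran_favored": (0.0, 100.0),
    "ramachandran_outliers": (0.0, 100.0),
}


def check_ranges(record):
    """Return one message per slot whose value lies outside its documented range."""
    problems = []
    for slot, (low, high) in SLOT_RANGES.items():
        value = record.get(slot)
        if value is None:
            continue  # slot not present in this record; nothing to check
        if not low <= value <= high:
            problems.append(f"{slot}={value} is outside [{low}, {high}]")
    return problems


if __name__ == "__main__":
    print(check_ranges({"ph": 7.5, "population": 0.4}))
    print(check_ranges({"ph": 15}))
```

A check like this only covers the ranges that appear in the descriptions; slots whose units are "typically specified" are not constrained here.
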
Enumerations
| Enumeration | Description |
|---|---|
| AffinityUnitEnum | Units for affinity measurements |
| AnnotationSourceEnum | Sources of functional annotations |
| BeamlineEnum | Specific beamline instances at DOE and other major structural biology facilit... |
| BindingAffinityTypeEnum | Types of binding affinity measurements |
| BiophysicalMethodEnum | Methods for biophysical measurements |
| BiophysicalPropertyEnum | Types of biophysical properties |
| ClinicalSignificanceEnum | Clinical significance of variants |
| CollectionModeEnum | Data collection modes |
| ComplexStabilityEnum | Stability of protein complexes |
| ConformationalStateEnum | Conformational states |
| ControlSystemEnum | Low-level control systems and middleware frameworks for device communication ... |
| CrystallizationMethodEnum | Methods for protein crystallization |
| DataAcquisitionSystemEnum | Data acquisition (DAQ) systems for orchestrating experimental data collection... |
| DatabaseNameEnum | External database names |
| DataTypeEnum | Types of data |
| DetectorModeEnum | Operating modes for detectors during data collection |
| DetectorTechnologyEnum | Generic detector technologies for structural biology imaging |
| DetectorTypeEnum | DEPRECATED: Use DetectorTechnologyEnum instead |
| EvidenceTypeEnum | Types of evidence |
| ExperimentalMethodEnum | Experimental methods for structure determination |
| ExperimentSampleRoleEnum | Role of a sample in an experiment |
| ExpressionSystemEnum | Expression systems for recombinant protein production |
| FacilityEnum | Major synchrotron and structural biology research facilities worldwide |
| FacilityTypeEnum | Types of research facilities |
| FileFormatEnum | File formats |
| FunctionalEffectEnum | Effect on protein function |
| FunctionalSiteTypeEnum | Types of functional sites in proteins |
| GridMaterialEnum | Materials used for EM grids |
| GridTypeEnum | Types of EM grids |
| IlluminationTypeEnum | Types of illumination for optical microscopy |
| ImagingModeEnum | Imaging modes for electron microscopy |
| InputTypeEnum | Type of input for a workflow |
| InstrumentCategoryEnum | Categories of instruments based on their nature and location |
| InstrumentRoleEnum | Role of an instrument in an experiment |
| InstrumentStatusEnum | Operational status of instruments |
| InteractionEvidenceEnum | Evidence for interactions |
| InteractionTypeEnum | Types of molecular interactions |
| LIMSSystemEnum | Laboratory Information Management Systems (LIMS) used at structural biology f... |
| MutationTypeEnum | Types of mutations |
| OutputTypeEnum | Types of outputs from computational workflows |
| PhasingMethodEnum | Methods for phase determination in X-ray crystallography |
| PreparationTypeEnum | Types of sample preparation |
| ProcessingStatusEnum | Processing status |
| PTMTypeEnum | Types of post-translational modifications |
| PurificationStepEnum | Protein purification steps and methods |
| SampleRoleEnum | Role of a sample in a study |
| SampleTypeEnum | Types of biological samples |
| SecondaryStructureEnum | Secondary structure types |
| StabilityEffectEnum | Effect on protein stability |
| StructuralFeatureTypeEnum | Types of structural features |
| SymmetryEnum | Crystallographic and non-crystallographic symmetry groups for cryo-EM |
| TechniqueEnum | Structural biology techniques |
| VitrificationMethodEnum | Methods for vitrification |
| WorkflowTypeEnum | Types of processing workflows |
| XRaySourceTypeEnum | Types of X-ray sources |
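
As an illustration of working with enumerated values, the `symmetry` slot description lists the families C1, Cn, Dn, T, O and I. The sketch below parses a label of that shape into a family letter and an optional order. It assumes labels are written as a letter followed by a positive integer (such as `C1` or `D7`) or as a single letter `T`, `O` or `I`; the actual permissible values of `SymmetryEnum` are not shown in this section, and `parse_symmetry` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Cyclic (C) or dihedral (D) groups carry a positive integer order.
_CYCLIC_OR_DIHEDRAL = re.compile(r"^([CD])([1-9][0-9]*)$")
# Tetrahedral, octahedral and icosahedral groups have no order suffix.
_POLYHEDRAL = {"T", "O", "I"}


def parse_symmetry(label: str) -> Tuple[str, Optional[int]]:
    """Split a symmetry label into (family, order); order is None for T, O, I."""
    match = _CYCLIC_OR_DIHEDRAL.match(label)
    if match:
        return match.group(1), int(match.group(2))
    if label in _POLYHEDRAL:
        return label, None
    raise ValueError(f"unrecognized symmetry label: {label!r}")


if __name__ == "__main__":
    for label in ("C1", "D7", "I"):
        print(label, parse_symmetry(label))
```
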
Types
| Type | Description |
|---|---|
| Boolean | A binary (true or false) value |
| Curie | A compact URI |
| Date | A date (year, month and day) in an idealized calendar |
| DateOrDatetime | Either a date or a datetime |
| Datetime | The combination of a date and time |
| Decimal | A real number with arbitrary precision that conforms to the xsd:decimal speci... |
| DecimalDegree | A decimal degree expresses latitude or longitude as decimal fractions |
| Double | A real number that conforms to the xsd:double specification |
| Float | A real number that conforms to the xsd:float specification |
| Integer | An integer |
| Jsonpath | A string encoding a JSON Path |
| Jsonpointer | A string encoding a JSON Pointer |
| Ncname | Prefix part of CURIE |
| Nodeidentifier | A URI, CURIE or BNODE that represents a node in a model |
| Objectidentifier | A URI or CURIE that represents an object in the model |
| SmilesString | A SMILES representation of a chemical structure |
| Sparqlpath | A string encoding a SPARQL Property Path |
| String | A character string |
| Time | A time object represents a (local) time of day, independent of any particular... |
| Uri | A complete URI |
| Uriorcurie | A URI or a CURIE |
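
The `Curie` and `Uriorcurie` types above matter for slots such as `unit_cv_id`, which holds a CURIE. A rough sketch of splitting a CURIE into its prefix (the `Ncname` part) and its local reference follows. The `split_curie` helper, the simplified ASCII-only prefix pattern, and the `EX:` prefix in the example are assumptions made for illustration; full validation would follow the CURIE and NCName specifications.

```python
import re

# Simplified, ASCII-only approximation of "NCName prefix : reference".
_CURIE = re.compile(r"^([A-Za-z_][A-Za-z0-9_.-]*):(\S+)$")


def split_curie(text):
    """Return (prefix, reference) for a CURIE, or raise ValueError."""
    match = _CURIE.match(text)
    if match is None:
        raise ValueError(f"not a CURIE: {text!r}")
    prefix, reference = match.group(1), match.group(2)
    if reference.startswith("//"):
        # Looks like scheme://authority, so treat it as a full URI instead.
        raise ValueError(f"looks like a URI, not a CURIE: {text!r}")
    return prefix, reference


if __name__ == "__main__":
    # "EX" is a placeholder prefix, not a real ontology.
    print(split_curie("EX:0000001"))
```

Rejecting references that start with `//` is a pragmatic way to keep full URIs and CURIEs apart when a slot accepts either, as with `Uriorcurie`.
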
Subsets
| Subset | Description |
|---|---|