extraction_stage

Schema for data extraction stage of the review workflow.

URI: https://open-and-sustainable.github.io/revaise-model/schema/stages/extraction

Name: extraction_stage

Classes

Class Description
Affiliation Institutional affiliation
AgreementMetrics Metrics for measuring agreement between reviewers/extractors
AIAssistance Configuration for AI assistance in any review stage
AIParameter AI model parameter configuration
AIPrompt Prompt configuration for AI interaction
AISession Session details for AI-assisted work
Author Author of a review or publication
        Participant A person participating in the review process
BatchProcessing Batch processing within a session
ChecklistItem Individual item in a quality checklist
ConfidenceInterval Confidence interval specification
ConflictPattern Pattern analysis of conflicts
ConflictResolution Resolution of conflicts or disagreements between reviewers
DatasetRef External dataset reference
DateRange A simple date range representation
DomainAssessment Assessment of a specific quality domain
ExternalTool External tool or platform used in the review process
ExtractedDataPoint Individual data point extracted from a study
ExtractedStudy A study with extracted data
ExtractionForm Template or form used for data extraction
FieldAgreement Agreement metrics for a specific field or item
FieldDefinition Definition of a field used in forms, templates, or data collection
FieldDependency Dependency between fields
FieldOption Option for select/multi-select fields
FormSection Section within a form or template
FundingSource Source of funding for the review
HumanModification Record of human modification to AI output
LiteratureRecord A literature record representing a publication that persists across review st...
LiteratureRecordCollection A collection of literature records
ParticipantWorkload Workload statistics for a participant
PerformanceMetrics General performance metrics
        AIPerformanceMetrics Performance metrics for AI assistance, extending general performance metrics ...
ProcessMetrics Metrics related to process efficiency and completion
Protocol Scoping and registration details of the review
QualityAssessment Quality or risk of bias assessment for a study or item
QualityChecklist Checklist for quality assessment
QualityControl Quality control measures and checks
RegistrationTemplate A template used for registering systematic reviews, defining required and opt...
ResolutionMethodCount Count of a specific resolution method usage
ResolutionStatistics Statistics about conflict resolution
Review Root container for a systematic review
SessionEnvironment Environment details for a work session
SessionMetrics Metrics for a work session
SoftwareEnv Software environment configuration
StageExecution Base class for executing a specific stage of the review
        ExtractionStage Data extraction from included studies
StageOutput Output file or log produced by a review stage (results, logs, code, visualiza...
StageOutputRef Reference to a stage output in this or another RevAIse bundle
StageProtocol Base class for protocols defining how a review stage should be conducted
        ExtractionProtocol Protocol for the data extraction process
StageQualityControl Base class for quality control measures across review stages
        ExtractionQualityControl Quality control measures for data extraction
StageStatistics Base class for statistics collected about review stages
        ExtractionStatistics Statistics about the extraction process
StageTrainingInfo Shared training information for review stages
StudyCharacteristics Key characteristics of a study
TemplateField A field within a registration template
ToolParameter Configuration parameter for a tool
ValidationRule Validation rule for a field
WorkSession A work session by a participant on review tasks
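
The indented entries in the table above mark subclasses (for example, ExtractionStage sits under StageExecution). As a rough illustration of how a few of these classes might fit together, here is a minimal Python sketch. Only the class and slot names come from this page. Which slots belong to which class, their types, their optionality, and the example values (including the `"extraction"` stage type) are assumptions made for illustration, and this is not the generated RevAIse Python API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExtractedDataPoint:
    """Individual data point extracted from a study."""
    datapoint_id: str
    extracted_value: str
    extracted_unit: Optional[str] = None  # assumed to be optional


@dataclass
class ExtractedStudy:
    """A study with extracted data."""
    extracted_study_id: str
    source_record_id: str  # assumed to point at a LiteratureRecord
    extracted_data: List[ExtractedDataPoint] = field(default_factory=list)


@dataclass
class StageExecution:
    """Base class for executing a specific stage of the review."""
    stage_type: str  # a plain string here; the schema lists a StageType enumeration
    stage_label: Optional[str] = None


@dataclass
class ExtractionStage(StageExecution):
    """Data extraction from included studies."""
    extracted_studies: List[ExtractedStudy] = field(default_factory=list)


# Hypothetical identifiers and values, for illustration only.
stage = ExtractionStage(stage_type="extraction", stage_label="Data extraction")
study = ExtractedStudy(extracted_study_id="es-001", source_record_id="rec-042")
study.extracted_data.append(
    ExtractedDataPoint(
        datapoint_id="dp-1",
        extracted_value="120",
        extracted_unit="participants",
    )
)
stage.extracted_studies.append(study)
```

Because ExtractionStage inherits from StageExecution in this sketch, code written against the base class also accepts the extraction stage, which mirrors the base-class and subclass pairs listed in the table.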

Slots

Slot Description
abstract Abstract or summary of the publication
accuracy Overall accuracy
active_duration Active working time in minutes
address Street address of the institution
affiliations Institutional affiliations of the author
agreement_rate Agreement rate for this field
agreement_sample_size Number of items compared
ai_assistance_config AI assistance configuration for extraction
ai_assisted Whether AI assisted
ai_confidence AI confidence score
ai_confidence_threshold Minimum confidence score for accepting AI outputs
ai_config_id Reference to the AIAssistance configuration used
ai_error_handling Strategy for handling AI errors or failures
ai_extraction_session AI extraction session
ai_id Unique identifier for AI assistance configuration
ai_model Name of the AI model
ai_parameters Model parameters and settings
ai_performance_metrics Performance metrics for the AI assistance
ai_prompts Prompts used to interact with the AI
ai_provider AI service provider
ai_purpose Purposes for which AI is used
ai_session_id Unique identifier for AI session
ai_training_description Description of training data or fine-tuning
ai_validation_rules Rules for validating AI outputs
ai_version Version of the AI model
api_calls Number of API calls made
area_under_curve Area under the ROC curve
assessment_date Date and time of assessment
assessment_id Unique identifier for this assessment
assessment_notes Notes about the assessment
assessment_time_minutes Time taken to complete assessment in minutes
assessment_tool Tool used for quality assessment
assessor Person who performed the assessment
assigned_items_count Number of items assigned to this participant
authors List of authors of the publication
average_items_per_day Average items processed per day
average_resolution_time Average time to resolve in minutes
average_time_per_study Average time per study (minutes)
backlog_size Number of items remaining
balanced_accuracy Average of sensitivity and specificity
base_agreement_metrics Base agreement metrics
base_calibration_performed Base calibration performed flag
base_completed_date Base stage completion date
base_completion_rate Base completion rate
base_discrepancy_resolutions Base discrepancy resolutions
base_double_checking_rate Base double-checking rate
base_items_completed Base items completed
base_participant_workloads Base participant workloads
base_protocol_date Base protocol date
base_protocol_description Base protocol description
base_protocol_deviations Base protocol deviations
base_protocol_id Base protocol identifier
base_protocol_notes Base protocol notes
base_protocol_software Base protocol software
base_protocol_tools Base protocol tools
base_protocol_training Base protocol training
base_protocol_version Base protocol version
base_qc_id Base QC identifier
base_qc_notes Base QC notes
base_qc_type Base QC type
base_spot_check_rate Base spot check rate
base_started_date Base stage start date
base_stats_id Base statistics identifier
base_stats_notes Base statistics notes
base_total_hours Base total hours
base_total_items Base total items
batch_completion_rate Proportion of batch completed
batch_id Unique identifier for this batch
batch_number Sequential number of this batch
batch_size Number of items in the batch
batch_started_at When batch processing started
blinding_level Level of blinding
break_frequency Average time between breaks in minutes
browser Browser used if applicable
checklist_id Unique identifier for the checklist
checklist_items Items in the checklist
checklist_name Name of the checklist
checklist_version Version of the checklist
checks_performed Specific checks performed
checksum Checksum of the output file
city City where the institution is located
cohen_kappa Cohen's kappa for inter-rater agreement
collection_created_at Timestamp when the collection was created
collection_description Description of the collection's purpose or contents
collection_id Unique identifier for the collection containing this record
collection_name Name of the collection containing this record
common_causes Common causes of this conflict pattern
comparator_details Details of the comparator
completed_at When batch processing completed
completed_items Number of items completed
completed_items_count Number of items completed by this participant
completion_date Date when the review was completed
completion_rate Proportion of items completed
conditional_on Field ID that this field's visibility depends on
confidence_calibration Calibration of AI confidence scores
confidence_interval Confidence interval for the agreement rate
confidence_level Confidence level
confidence_rating Assessor's confidence in the assessment
conflict_of_interest_declared Whether conflicts of interest were declared
conflict_type Type of conflict being resolved
conflicting_items IDs of items in conflict
conflicts_of_interest Declared conflicts of interest
connection_quality Network connection quality
containers Container images used
corrective_actions Corrective actions taken
corresponding_author Primary contact author for the review
cost Estimated cost
count Number of times this method was used
country Country where the institution is located
created_at Timestamp when the review record was created
created_date Date when the template was created
datapoint_id Data point identifier
dataset_name Name of the dataset
department Department or division within the institution
dependency_condition Condition for the dependency
dependency_type Type of dependency
dependency_value Value to check against
depends_on_field Field this depends on
deprecation_date Date when this template will be or was deprecated
diagnostic_odds_ratio Ratio of odds of positive test in diseased vs non-diseased
disagreement_count Number of disagreements
disagreement_resolution_method Method for resolving disagreements
disagreement_types Types of disagreements observed
doi Digital Object Identifier of the publication
domain_assessments Assessment of individual quality domains
domain_justification Justification for the rating
domain_name Name of the quality domain
domain_rating Rating for this domain
eligibility_criteria Inclusion and exclusion criteria
email Email address of the author
ended_at End time
error_message Error message when validation fails
error_rate Rate of errors made
errors_encountered Errors encountered during the session
ethical_approval Ethical approval information if applicable
expertise_areas Areas of expertise relevant to the review
extracted_data Extracted data points
extracted_studies Studies with extracted data
extracted_study_id Unique identifier for extracted study
extracted_unit Unit of measurement
extracted_value Extracted value
extraction_confidence Confidence in extraction
extraction_form_id Reference to extraction form
extraction_forms Data extraction forms used
extraction_guidance Detailed extraction guidance
extraction_notes Notes about extraction
extraction_order Order in which studies are extracted
extraction_protocol Protocol for data extraction
extraction_quality_control Quality control for extraction
extraction_sessions Extraction work sessions
extraction_specific_checks Extraction-specific quality checks
extraction_statistics Statistics for extraction stage
extraction_status Status of extraction
extraction_timestamp When extracted
extractor_id ID of extractor
extractor_notes Extractor notes
f1_score Harmonic mean of precision and recall
false_discovery_rate Expected proportion of false discoveries
false_negative_rate Type II error rate
false_omission_rate Proportion of false negatives among negative calls
false_positive_rate Type I error rate
family_name Family/last name of the author
field_category Category or section this field belongs to
field_default_value Default value for the field
field_definitions Formal definition of template fields
field_dependencies Dependencies on other fields
field_description Detailed description of the field
field_group Group this field belongs to
field_help_text Help text to guide users
field_id Unique identifier for this field
field_identifier Identifier of the field or item
field_label Human-readable label for the field
field_level_agreements Agreement metrics for individual fields
field_metadata Additional metadata as JSON
field_name Machine-readable name for the field
field_options Options for select/multi-select fields
field_order Display order of the field
field_placeholder Placeholder text for the field
field_required Whether this field is required
field_type Data type of the field
field_validation_rules Validation rules for the field
final_decision Final resolved decision or value
final_extracted_values Final extracted values
first_item_date Date of first item processed
fleiss_kappa Fleiss' kappa for multiple raters
focus_score Measure of session focus/consistency
form_created_at Form creation time
form_created_by Form creator
form_description Form description
form_id Form identifier
form_last_modified Last modification time
form_name Form name
form_sections Sections in form
form_uri URI to form
form_version Form version
format File format or MIME type
frequency How often this pattern occurs
full_text_url URL to access the full text of the publication
funder_id Unique identifier for the funding organization
funder_name Name of the funding organization
funding_source Source of study funding
funding_sources Sources of funding for the review
given_name Given/first name of the author
grant_number Grant or award number from the funder
guidance_text Guidance for answering this item
human_agreement_rate Rate of agreement with human reviewers
human_modifications Modifications made by humans to AI outputs
human_oversight_level Level of human review required
independent_extraction Whether extraction is independent
inputs Input artifacts
institution_name Name of the affiliated institution
interpretation_guide Guide for interpreting scores
intervention_details Details of the intervention
involved_fields Fields commonly involved in this pattern
involved_participants Participants commonly involved
is_default Whether this is a default selection
is_repeatable Whether this field can have multiple values
issue Issue number of the journal
issues_found Number of issues identified
issues_resolved Number of issues resolved
item_category Category this item belongs to
item_id ID of the item being assessed
item_ids_processed IDs of items processed in this session
item_modified Item or field that was modified
item_text Text of the checklist item
items_completed Number of items completed
items_in_batch IDs of items in this batch
items_per_day Average items processed per day
items_per_hour Items processed per hour
items_processed_count Total items processed
journal Name of the journal or publication venue
keywords Keywords associated with the publication
kind Kind or type of stage output
krippendorff_alpha Krippendorff's alpha reliability coefficient
last_item_date Date of last item processed
last_updated Timestamp of last update to the review
license License governing the dataset
literature_records Literature records associated with this review
location Location or setting of work
lockfiles Environment lockfiles
lower_bound Lower bound of the interval
matthews_correlation Matthews correlation coefficient
median_time_per_item Median time per item in minutes
method The resolution method
minimum_extractors_per_study Minimum number of extractors per study
missing_data_rate Rate of missing data points
modification_id Unique identifier for modification
modification_reason Reason for the modification
modification_timestamp When the modification was made
modified_value Value after human modification
modifier_id ID of the person who made the modification
most_common_conflict_type Most frequently occurring conflict type
name Full name of the author
negative_predictive_value Negative predictive value
notes Additional notes
option_description Description of the option
option_group Group this option belongs to
option_label Display label for the option
option_order Display order of the option
option_value Value of the option
optional_fields Fields that may be completed in this template
orcid ORCID identifier for the author
original_ai_value Original value produced by AI
original_decisions Original conflicting decisions or values
os Operating system description
outcome_measures Outcome measures used
output_created_at Creation timestamp
output_label Human-readable label
outputs Output artifacts
overall_agreement Overall agreement rate across all items
overall_quality Overall quality rating
page_number Page number
page_references Page references for the assessment
pages Page numbers
param_category Parameter category
param_default Default value
param_description Parameter description
param_id Parameter identifier
param_name Parameter name
param_required Whether parameter is required
param_type Parameter data type
param_value Parameter value
parameter_description Description of what this parameter controls
parameter_name Name of the parameter
parameter_value Value of the parameter
participant Participant conducting this session
participant_agreement_matrix Matrix showing agreement patterns between participants
participant_avg_time_per_item Average time spent per item (minutes)
participant_characteristics Description of participants
participant_id Reference to the participant
participant_notes Additional notes about the participant
participant_role Role(s) of the participant in the review
pattern_id Unique identifier for this pattern
pattern_type Type of conflict pattern
paused_duration Total paused time in minutes
peak_items_per_day Maximum items processed in a single day
pending_conflicts Number of conflicts pending resolution
percent_agreement Simple percentage agreement
picos Population, Intervention, Comparison, Outcomes, Study design details
pilot_test_date Date of pilot test
pilot_test_notes Pilot test notes
pilot_tested Whether pilot tested
platform Platform or system used
pmid PubMed identifier of the publication
precision Positive predictive value
process_avg_time_per_item Average time per item in minutes
process_total_time_hours Total time spent in hours
processing_speed Average processing time per item (seconds)
prompt_examples Few-shot examples included in the prompt
prompt_id Unique identifier for prompt
prompt_text The actual prompt text
prompt_type Type of prompting strategy
prompt_version Version of this prompt
protocol Protocol information for the review
protocol_start_date Protocol registration date
publication_language Language of the publication
publication_status Publication status of the study
publication_type Type of publication
publication_year Year of publication
python_version Python version
qc_id Unique identifier for this quality control check
qc_pass_rate Proportion of items passing QC
qc_performed_by Who performed the QC
qc_sample_size Size of the sample checked
qc_timestamp When QC was performed
qc_type Type of quality control performed
quality_assessment Quality/risk of bias assessment
quote Supporting quote
r_version R version
range_description Description of a date range
range_end End of a date range
range_start Start of a date range
record_count Number of records in the collection
record_created_at Timestamp when the record was created in the system
record_id Unique identifier for the literature record
record_updated_at Timestamp when the record was last updated
records List of literature records in the collection
registration_id Identifier assigned by the registry
registration_url URL to the registry entry
registry Registry where the protocol is recorded
required_fields Fields that must be completed in this template
research_question Primary research question
resolution_confidence Confidence in the resolution
resolution_duration Time taken to resolve in minutes
resolution_id Unique identifier for this resolution
resolution_method Method used to resolve the conflict
resolution_methods_used Count of each resolution method used
resolution_notes Notes about the resolution process
resolution_rate Proportion of conflicts resolved
resolution_success_rate Success rate of resolutions
resolution_timestamp When the conflict was resolved
resolved_conflicts Number of conflicts resolved
resolver Person or process that resolved the conflict
resource_uri URI locating the resource
response_options Possible responses to this item
review_abstract Abstract or summary of the review
review_artifacts Artifacts and quality assurance documents
review_authors Authors involved in the review
review_country Country where the review is conducted
review_id Unique identifier for the systematic review
review_keywords Keywords describing the review topic
review_language Primary language of the review
review_metadata Additional metadata about the review
review_question Primary research question addressed by the review
review_status Current status of the review
review_title Title of the systematic review
review_type Type of review (systematic review, scoping review, etc.)
rework_rate Rate of items requiring rework
ror_id Research Organization Registry identifier for the institution
rule_type Type of validation rule
rule_value Value or pattern for the rule
sample_method Method used for sampling
scoring_method Method for scoring the checklist
scoring_weight Weight of this item in scoring
section_conditions Conditions for showing this section
section_description Description of the section
section_fields Fields in this section
section_id Unique identifier for this section
section_label Human-readable label for the section
section_name Machine-readable name for the section
section_order Display order of the section
section_repeatable Whether this section can be repeated
section_required Whether this section is required
sensitivity True positive rate (recall)
session_avg_time_per_item Average time per item in minutes
session_duration Duration of the session in seconds
session_ended_at When the session ended
session_environment Environment details for the session
session_id Unique identifier for this session
session_item_ids IDs of items processed in this session
session_notes Notes about the session
session_started_at When the session started
session_status Current status of the session
session_timestamp When this AI session occurred
session_type Type of work session
severity Severity of validation failure
snapshot_hash Checksum of the dataset snapshot
software_env Software environment used for the review
software_environments Software environments used in the review
source_record_id Reference to source record
specificity True negative rate
stage_description Detailed description
stage_label Human-readable label
stage_outputs Outputs generated by review stages
stage_training_completion_rate Stage training completion rate
stage_training_duration_hours Stage training duration in hours
stage_training_materials_uri URI to stage training materials
stage_training_notes Stage training notes
stage_training_provided Whether stage training was provided
stage_type Type of stage executed
stages Stages of the review process
start_date Date when the review started
started_at Start time
study_characteristics Key characteristics of the study
study_country Countries where study was conducted
study_design Design of the study
study_duration Duration of the study
study_registration_id Clinical trial registration ID
study_sample_size Total sample size
study_setting Setting where study was conducted
success_rate Success rate of this method
successor_template_id ID of the template that supersedes this one
supporting_quotes Quotes supporting the assessment
target_field Field or task this prompt targets
template_description Description of the template and its purpose
template_doi DOI of the template publication if available
template_field_constraints Constraints on the field value
template_field_description Description of what should be entered in this field
template_field_example Example value for this field
template_field_format Expected format for the field value
template_field_help_text Help text to guide users
template_field_id Unique identifier for this template field
template_field_label Human-readable label for the field
template_field_name Technical name of the field
template_field_order Display order of the field
template_field_section Section of the template this field belongs to
template_field_type Data type of the field
template_id Unique identifier for the registration template
template_language Language of the template content
template_last_updated Date and time when the template was last updated
template_name Human-readable name of the template
template_provider Organization or registry providing the template
template_type Type of registration template
template_uri URI where the template can be accessed
template_version Version number of the template
throughput_trend Trend in processing speed
time_spent_hours Total time spent in hours
title Title of the publication
token_usage Number of tokens used
tool_ai_models AI models used by the tool
tool_api_endpoint API endpoint URL
tool_api_version API version
tool_citation Recommended citation
tool_configuration_file URI to configuration file
tool_configuration_parameters Configuration parameters
tool_cost Cost information
tool_documentation_url URL to documentation
tool_export_formats Supported export formats
tool_features_used Features of the tool that were used
tool_id Unique identifier for the tool
tool_import_formats Supported import formats
tool_license_type Type of license
tool_limitations Known limitations
tool_name Name of the tool or platform
tool_notes Additional notes
tool_purpose Purpose for which the tool was used
tool_settings Tool settings description
tool_subscription_level Subscription level or plan
tool_url URL to access the tool
tool_vendor Vendor or organization providing the tool
tool_version Version of the tool
tools_used Tools or software used during session
total_comparisons Total number of comparisons made
total_conflicts Total number of conflicts identified
total_data_points Total data points extracted
total_items Total number of items to process
total_studies_extracted Total studies extracted
training_completed Training or calibration exercises completed
updated_at Timestamp when the review record was last updated
upper_bound Upper bound of the interval
validation_rules Rules for validating template field values
version Version number of the review
volume Volume number of the journal
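
Several slots in the table above are defined in terms of one another: balanced_accuracy is the average of sensitivity and specificity, f1_score is the harmonic mean of precision and recall, and cohen_kappa measures inter-rater agreement. The sketch below shows one way these values could be derived, using the standard textbook formulas. The function names, the use of fractions rather than percentages, and the sample counts are assumptions; the schema itself only names the slots and does not prescribe how they are computed.

```python
from collections import Counter
from typing import Dict, Sequence


def performance_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Derive several metric slots from confusion-matrix counts.

    Assumes every denominator is non-zero; a real implementation
    would need to decide how to report undefined values.
    """
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)    # positive predictive value
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "negative_predictive_value": tn / (tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1_score": 2 * precision * sensitivity / (precision + sensitivity),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "false_discovery_rate": fp / (fp + tp),
        "false_omission_rate": fn / (fn + tn),
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }


def cohen_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    # Observed agreement: share of items where both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts and decisions, for illustration only.
metrics = performance_metrics(tp=40, fp=10, fn=10, tn=40)
kappa = cohen_kappa(
    ["include", "include", "exclude", "exclude"],
    ["include", "exclude", "exclude", "exclude"],
)
```

With these sample decisions the two raters agree on three of four items, but half of that agreement is expected by chance, which is why kappa comes out lower than percent_agreement would.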

Enumerations

Enumeration Description
AIPurpose Purpose for which AI is being used
BlindingLevel Level of blinding in extraction
ConfidenceLevel Confidence levels
ConflictType Types of conflicts that can occur
DatasetKind Types of datasets used in systematic reviews
DependencyType Types of field dependencies
ExtractionOrder Order for extracting studies
ExtractionStatus Status of data extraction
FieldType Data types for template fields
HumanOversightLevel Level of human oversight required
ModificationReason Reason for modifying AI output
ParameterType Data types for tool parameters
ParticipantRole Roles that participants can have in the review process
PromptType Type of prompting strategy
PublicationStatus Publication status of studies
QualityAssessmentTool Tools for quality assessment
QualityControlType Types of quality control checks
QualityRating Quality rating levels
ResolutionMethod Methods for resolving conflicts
ReviewStatus Status of the review
ReviewType Types of reviews
SamplingMethod Methods for sampling items for quality control
SessionStatus Status of a work session
SessionType Types of work sessions
StageOutputKind Types of outputs that can be produced by review stages
StageType Logical stages of a systematic review (typical flow)
StudyDesign Types of study designs
TemplateType Types of registration templates
ToolLicenseType Types of tool licenses
ToolPurpose Purposes for which tools can be used
ValidationRuleType Types of validation rules
ValidationSeverity Severity levels for validation failures
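
This page lists the enumerations but not their permissible values. As a hedged sketch of how a consumer might model one of them, the example below uses ReviewType with two members taken from the review_type slot description ("systematic review, scoping review"). The member names, their string values, and the `parse_review_type` helper are hypothetical; the real permissible values should be read from the schema itself.

```python
from enum import Enum


class ReviewType(Enum):
    """Types of reviews (hypothetical values, not taken from the schema)."""
    SYSTEMATIC_REVIEW = "systematic_review"
    SCOPING_REVIEW = "scoping_review"


def parse_review_type(raw: str) -> ReviewType:
    """Normalise free text and look up the matching member.

    Raises ValueError when the text does not match any member.
    """
    key = raw.strip().lower().replace(" ", "_")
    return ReviewType(key)


review_type = parse_review_type(" Scoping Review ")
```

Rejecting unknown values with an error, rather than silently accepting them, keeps records aligned with whatever closed value set the enumeration defines.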

Types

Type Description
Boolean A binary (true or false) value
Curie A compact URI
Date A date (year, month and day) in an idealized calendar
DateOrDatetime Either a date or a datetime
Datetime The combination of a date and time
Decimal A real number with arbitrary precision that conforms to the xsd:decimal speci...
Double A real number that conforms to the xsd:double specification
Float A real number that conforms to the xsd:float specification
Integer An integer
Jsonpath A string encoding a JSON Path
Jsonpointer A string encoding a JSON Pointer
Ncname Prefix part of CURIE
Nodeidentifier A URI, CURIE or BNODE that represents a node in a model
Objectidentifier A URI or CURIE that represents an object in the model
Sparqlpath A string encoding a SPARQL Property Path
String A character string
Time A time object represents a (local) time of day, independent of any particular...
Uri A complete URI
Uriorcurie A URI or a CURIE
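
When reading RevAIse data into a program, the types above need counterparts in the host language. The sketch below shows one plausible mapping to Python standard-library types, together with a small helper that parses string values. The mapping table, the `coerce` helper, and the choice to keep URI-like types as plain strings are assumptions for illustration and may differ from what LinkML's own code generators produce.

```python
import datetime as dt
from decimal import Decimal

# Assumed mapping from schema type names to Python types.
LINKML_TO_PYTHON = {
    "Boolean": bool,
    "Integer": int,
    "Float": float,
    "Double": float,
    "Decimal": Decimal,
    "String": str,
    "Date": dt.date,
    "Datetime": dt.datetime,
    "Time": dt.time,
    "Uri": str,
    "Curie": str,
    "Uriorcurie": str,
}


def coerce(type_name: str, raw: str):
    """Convert a string value to the Python type assumed for type_name."""
    target = LINKML_TO_PYTHON[type_name]
    if target is bool:
        # bool("false") would be True, so compare the text explicitly.
        return raw.strip().lower() == "true"
    if target in (dt.date, dt.datetime, dt.time):
        # ISO 8601 text, e.g. "2024-03-01" or "2024-03-01T10:30:00".
        return target.fromisoformat(raw)
    return target(raw)


assessment_date = coerce("Datetime", "2024-03-01T10:30:00")
agreement_rate = coerce("Decimal", "0.85")
```

Decimal is kept distinct from Float and Double here because the schema describes it as arbitrary precision, which a binary floating-point value cannot guarantee.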

Subsets

Subset Description