extraction_stage

Schema for the data extraction stage of the review workflow.

URI: https://open-and-sustainable.github.io/revaise-model/schema/stages/extraction

Name: extraction_stage

Classes

Class	Description
Affiliation	Institutional affiliation
AgreementMetrics	Metrics for measuring agreement between reviewers/extractors
AIAssistance	Configuration for AI assistance in any review stage
AIParameter	AI model parameter configuration
AIPrompt	Prompt configuration for AI interaction
AISession	Session details for AI-assisted work
Author	Author of a review or publication
Participant	A person participating in the review process
BatchProcessing	Batch processing within a session
ChecklistItem	Individual item in a quality checklist
ConfidenceInterval	Confidence interval specification
ConflictPattern	Pattern analysis of conflicts
ConflictResolution	Resolution of conflicts or disagreements between reviewers
DatasetRef	External dataset reference
DateRange	A simple date range representation
DomainAssessment	Assessment of a specific quality domain
ExternalTool	External tool or platform used in the review process
ExtractedDataPoint	Individual data point extracted from a study
ExtractedStudy	A study with extracted data
ExtractionForm	Template or form used for data extraction
FieldAgreement	Agreement metrics for a specific field or item
FieldDefinition	Definition of a field used in forms, templates, or data collection
FieldDependency	Dependency between fields
FieldOption	Option for select/multi-select fields
FormSection	Section within a form or template
FundingSource	Source of funding for the review
HumanModification	Record of human modification to AI output
LiteratureRecord	A literature record representing a publication that persists across review st...
LiteratureRecordCollection	A collection of literature records
ParticipantWorkload	Workload statistics for a participant
PerformanceMetrics	General performance metrics
AIPerformanceMetrics	Performance metrics for AI assistance, extending general performance metrics ...
ProcessMetrics	Metrics related to process efficiency and completion
Protocol	Scoping and registration details of the review
QualityAssessment	Quality or risk of bias assessment for a study or item
QualityChecklist	Checklist for quality assessment
QualityControl	Quality control measures and checks
RegistrationTemplate	A template used for registering systematic reviews, defining required and opt...
ResolutionMethodCount	Count of a specific resolution method usage
ResolutionStatistics	Statistics about conflict resolution
Review	Root container for a systematic review
SessionEnvironment	Environment details for a work session
SessionMetrics	Metrics for a work session
SoftwareEnv	Software environment configuration
StageExecution	Base class for executing a specific stage of the review
ExtractionStage	Data extraction from included studies
StageOutput	Output file or log produced by a review stage (results, logs, code, visualiza...
StageOutputRef	Reference to a stage output in this or another RevAIse bundle
StageProtocol	Base class for protocols defining how a review stage should be conducted
ExtractionProtocol	Protocol for the data extraction process
StageQualityControl	Base class for quality control measures across review stages
ExtractionQualityControl	Quality control measures for data extraction
StageStatistics	Base class for statistics collected about review stages
ExtractionStatistics	Statistics about the extraction process
StageTrainingInfo	Shared training information for review stages
StudyCharacteristics	Key characteristics of a study
TemplateField	A field within a registration template
ToolParameter	Configuration parameter for a tool
ValidationRule	Validation rule for a field
WorkSession	A work session by a participant on review tasks
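
AgreementMetrics and FieldAgreement describe agreement between extractors, and the Slots table below names statistics such as percent_agreement and cohen_kappa. As a rough illustration of what those two numbers measure, here is a minimal Python sketch. The function names, the sample labels, and the choice to report agreement as a fraction between 0 and 1 are assumptions made for this example; the schema index does not say how the values are computed or scaled, nor which class carries which slot.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of items on which both raters gave the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: sum over labels of the product of marginal proportions.
    p_expected = sum(
        (counts_a[label] / n) * (counts_b[label] / n)
        for label in set(counts_a) | set(counts_b)
    )
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical labels from two extractors on the same six items.
extractor_1 = ["yes", "yes", "no", "no", "yes", "no"]
extractor_2 = ["yes", "no", "no", "no", "yes", "yes"]

# Keys reuse slot names from this page; the dict layout itself is illustrative.
agreement = {
    "percent_agreement": percent_agreement(extractor_1, extractor_2),
    "cohen_kappa": cohen_kappa(extractor_1, extractor_2),
    "total_comparisons": len(extractor_1),
}
```

Kappa is lower than raw agreement here because two raters who each say "yes" half the time would already agree on about half the items by chance.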

Slots

Slot	Description
abstract	Abstract or summary of the publication
accuracy	Overall accuracy
active_duration	Active working time in minutes
address	Street address of the institution
affiliations	Institutional affiliations of the author
agreement_rate	Agreement rate for this field
agreement_sample_size	Number of items compared
ai_assistance_config	AI assistance configuration for extraction
ai_assisted	Whether AI assisted
ai_confidence	AI confidence score
ai_confidence_threshold	Minimum confidence score for accepting AI outputs
ai_config_id	Reference to the AIAssistance configuration used
ai_error_handling	Strategy for handling AI errors or failures
ai_extraction_session	AI extraction session
ai_id	Unique identifier for AI assistance configuration
ai_model	Name of the AI model
ai_parameters	Model parameters and settings
ai_performance_metrics	Performance metrics for the AI assistance
ai_prompts	Prompts used to interact with the AI
ai_provider	AI service provider
ai_purpose	Purposes for which AI is used
ai_session_id	Unique identifier for AI session
ai_training_description	Description of training data or fine-tuning
ai_validation_rules	Rules for validating AI outputs
ai_version	Version of the AI model
api_calls	Number of API calls made
area_under_curve	Area under the ROC curve
assessment_date	Date and time of assessment
assessment_id	Unique identifier for this assessment
assessment_notes	Notes about the assessment
assessment_time_minutes	Time taken to complete assessment in minutes
assessment_tool	Tool used for quality assessment
assessor	Person who performed the assessment
assigned_items_count	Number of items assigned to this participant
authors	List of authors of the publication
average_items_per_day	Average items processed per day
average_resolution_time	Average time to resolve in minutes
average_time_per_study	Average time per study (minutes)
backlog_size	Number of items remaining
balanced_accuracy	Average of sensitivity and specificity
base_agreement_metrics	Base agreement metrics
base_calibration_performed	Base calibration performed flag
base_completed_date	Base stage completion date
base_completion_rate	Base completion rate
base_discrepancy_resolutions	Base discrepancy resolutions
base_double_checking_rate	Base double-checking rate
base_items_completed	Base items completed
base_participant_workloads	Base participant workloads
base_protocol_date	Base protocol date
base_protocol_description	Base protocol description
base_protocol_deviations	Base protocol deviations
base_protocol_id	Base protocol identifier
base_protocol_notes	Base protocol notes
base_protocol_software	Base protocol software
base_protocol_tools	Base protocol tools
base_protocol_training	Base protocol training
base_protocol_version	Base protocol version
base_qc_id	Base QC identifier
base_qc_notes	Base QC notes
base_qc_type	Base QC type
base_spot_check_rate	Base spot check rate
base_started_date	Base stage start date
base_stats_id	Base statistics identifier
base_stats_notes	Base statistics notes
base_total_hours	Base total hours
base_total_items	Base total items
batch_completion_rate	Proportion of batch completed
batch_id	Unique identifier for this batch
batch_number	Sequential number of this batch
batch_size	Number of items in the batch
batch_started_at	When batch processing started
blinding_level	Level of blinding
break_frequency	Average time between breaks in minutes
browser	Browser used if applicable
checklist_id	Unique identifier for the checklist
checklist_items	Items in the checklist
checklist_name	Name of the checklist
checklist_version	Version of the checklist
checks_performed	Specific checks performed
checksum	Checksum of the output file
city	City where the institution is located
cohen_kappa	Cohen's kappa for inter-rater agreement
collection_created_at	Timestamp when the collection was created
collection_description	Description of the collection's purpose or contents
collection_id	Unique identifier for the collection containing this record
collection_name	Name of the collection containing this record
common_causes	Common causes of this conflict pattern
comparator_details	Details of the comparator
completed_at	When batch processing completed
completed_items	Number of items completed
completed_items_count	Number of items completed by this participant
completion_date	Date when the review was completed
completion_rate	Proportion of items completed
conditional_on	Field ID that this field's visibility depends on
confidence_calibration	Calibration of AI confidence scores
confidence_interval	Confidence interval for the agreement rate
confidence_level	Confidence level (e.g. ...)
confidence_rating	Assessor's confidence in the assessment
conflict_of_interest_declared	Whether conflicts of interest were declared
conflict_type	Type of conflict being resolved
conflicting_items	IDs of items in conflict
conflicts_of_interest	Declared conflicts of interest
connection_quality	Network connection quality
containers	Container images used
corrective_actions	Corrective actions taken
corresponding_author	Primary contact author for the review
cost	Estimated cost
count	Number of times this method was used
country	Country where the institution is located
created_at	Timestamp when the review record was created
created_date	Date when the template was created
datapoint_id	Data point identifier
dataset_name	Name of the dataset
department	Department or division within the institution
dependency_condition	Condition for the dependency
dependency_type	Type of dependency
dependency_value	Value to check against
depends_on_field	Field this depends on
deprecation_date	Date when this template will be or was deprecated
diagnostic_odds_ratio	Ratio of odds of positive test in diseased vs non-diseased
disagreement_count	Number of disagreements
disagreement_resolution_method	Method for resolving disagreements
disagreement_types	Types of disagreements observed
doi	Digital Object Identifier of the publication
domain_assessments	Assessment of individual quality domains
domain_justification	Justification for the rating
domain_name	Name of the quality domain
domain_rating	Rating for this domain
eligibility_criteria	Inclusion and exclusion criteria
email	Email address of the author
ended_at	End time
error_message	Error message when validation fails
error_rate	Rate of errors made
errors_encountered	Errors encountered during the session
ethical_approval	Ethical approval information if applicable
expertise_areas	Areas of expertise relevant to the review
extracted_data	Extracted data points
extracted_studies	Studies with extracted data
extracted_study_id	Unique identifier for extracted study
extracted_unit	Unit of measurement
extracted_value	Extracted value
extraction_confidence	Confidence in extraction
extraction_form_id	Reference to extraction form
extraction_forms	Data extraction forms used
extraction_guidance	Detailed extraction guidance
extraction_notes	Notes about extraction
extraction_order	Order in which studies are extracted
extraction_protocol	Protocol for data extraction
extraction_quality_control	Quality control for extraction
extraction_sessions	Extraction work sessions
extraction_specific_checks	Extraction-specific quality checks
extraction_statistics	Statistics for extraction stage
extraction_status	Status of extraction
extraction_timestamp	When extracted
extractor_id	ID of extractor
extractor_notes	Extractor notes
f1_score	Harmonic mean of precision and recall
false_discovery_rate	Expected proportion of false discoveries
false_negative_rate	Type II error rate
false_omission_rate	Proportion of false negatives among negative calls
false_positive_rate	Type I error rate
family_name	Family/last name of the author
field_category	Category or section this field belongs to
field_default_value	Default value for the field
field_definitions	Formal definition of template fields (e.g. ...)
field_dependencies	Dependencies on other fields
field_description	Detailed description of the field
field_group	Group this field belongs to
field_help_text	Help text to guide users
field_id	Unique identifier for this field
field_identifier	Identifier of the field or item
field_label	Human-readable label for the field
field_level_agreements	Agreement metrics for individual fields
field_metadata	Additional metadata as JSON
field_name	Machine-readable name for the field
field_options	Options for select/multi-select fields
field_order	Display order of the field
field_placeholder	Placeholder text for the field
field_required	Whether this field is required
field_type	Data type of the field
field_validation_rules	Validation rules for the field
final_decision	Final resolved decision or value
final_extracted_values	Final extracted values
first_item_date	Date of first item processed
fleiss_kappa	Fleiss' kappa for multiple raters
focus_score	Measure of session focus/consistency
form_created_at	Form creation time
form_created_by	Form creator
form_description	Form description
form_id	Form identifier
form_last_modified	Last modification time
form_name	Form name
form_sections	Sections in form
form_uri	URI to form
form_version	Form version
format	File format or MIME type
frequency	How often this pattern occurs
full_text_url	URL to access the full text of the publication
funder_id	Unique identifier for the funding organization
funder_name	Name of the funding organization
funding_source	Source of study funding
funding_sources	Sources of funding for the review
given_name	Given/first name of the author
grant_number	Grant or award number from the funder
guidance_text	Guidance for answering this item
human_agreement_rate	Rate of agreement with human reviewers
human_modifications	Modifications made by humans to AI outputs
human_oversight_level	Level of human review required
independent_extraction	Whether extraction is independent
inputs	Input artifacts
institution_name	Name of the affiliated institution
interpretation_guide	Guide for interpreting scores
intervention_details	Details of the intervention
involved_fields	Fields commonly involved in this pattern
involved_participants	Participants commonly involved
is_default	Whether this is a default selection
is_repeatable	Whether this field can have multiple values
issue	Issue number of the journal
issues_found	Number of issues identified
issues_resolved	Number of issues resolved
item_category	Category this item belongs to
item_id	ID of the item being assessed
item_ids_processed	IDs of items processed in this session
item_modified	Item or field that was modified
item_text	Text of the checklist item
items_completed	Number of items completed
items_in_batch	IDs of items in this batch
items_per_day	Average items processed per day
items_per_hour	Items processed per hour
items_processed_count	Total items processed
journal	Name of the journal or publication venue
keywords	Keywords associated with the publication
kind	Kind or type of stage output
krippendorff_alpha	Krippendorff's alpha reliability coefficient
last_item_date	Date of last item processed
last_updated	Timestamp of last update to the review
license	License governing the dataset
literature_records	Literature records associated with this review
location	Location or setting of work
lockfiles	Environment lockfiles
lower_bound	Lower bound of the interval
matthews_correlation	Matthews correlation coefficient
median_time_per_item	Median time per item in minutes
method	The resolution method
minimum_extractors_per_study	Minimum number of extractors per study
missing_data_rate	Rate of missing data points
modification_id	Unique identifier for modification
modification_reason	Reason for the modification
modification_timestamp	When the modification was made
modified_value	Value after human modification
modifier_id	ID of the person who made the modification
most_common_conflict_type	Most frequently occurring conflict type
name	Full name of the author
negative_predictive_value	Negative predictive value
notes	Additional notes
option_description	Description of the option
option_group	Group this option belongs to
option_label	Display label for the option
option_order	Display order of the option
option_value	Value of the option
optional_fields	Fields that may be completed in this template
orcid	ORCID identifier for the author
original_ai_value	Original value produced by AI
original_decisions	Original conflicting decisions or values
os	Operating system description
outcome_measures	Outcome measures used
output_created_at	Creation timestamp
output_label	Human-readable label
outputs	Output artifacts
overall_agreement	Overall agreement rate across all items
overall_quality	Overall quality rating
page_number	Page number
page_references	Page references for the assessment
pages	Page numbers (e.g. ...)
param_category	Parameter category
param_default	Default value
param_description	Parameter description
param_id	Parameter identifier
param_name	Parameter name
param_required	Whether parameter is required
param_type	Parameter data type
param_value	Parameter value
parameter_description	Description of what this parameter controls
parameter_name	Name of the parameter
parameter_value	Value of the parameter
participant	Participant conducting this session
participant_agreement_matrix	Matrix showing agreement patterns between participants
participant_avg_time_per_item	Average time spent per item (minutes)
participant_characteristics	Description of participants
participant_id	Reference to the participant
participant_notes	Additional notes about the participant
participant_role	Role(s) of the participant in the review
pattern_id	Unique identifier for this pattern
pattern_type	Type of conflict pattern
paused_duration	Total paused time in minutes
peak_items_per_day	Maximum items processed in a single day
pending_conflicts	Number of conflicts pending resolution
percent_agreement	Simple percentage agreement
picos	Population, Intervention, Comparison, Outcomes, Study design details
pilot_test_date	Date of pilot test
pilot_test_notes	Pilot test notes
pilot_tested	Whether pilot tested
platform	Platform or system used
pmid	PubMed identifier of the publication
precision	Positive predictive value
process_avg_time_per_item	Average time per item in minutes
process_total_time_hours	Total time spent in hours
processing_speed	Average processing time per item (seconds)
prompt_examples	Few-shot examples included in the prompt
prompt_id	Unique identifier for prompt
prompt_text	The actual prompt text
prompt_type	Type of prompting strategy
prompt_version	Version of this prompt
protocol	Protocol information for the review
protocol_start_date	Protocol registration date
publication_language	Language of the publication
publication_status	Publication status of the study
publication_type	Type of publication (e.g. ...)
publication_year	Year of publication
python_version	Python version
qc_id	Unique identifier for this quality control check
qc_pass_rate	Proportion of items passing QC
qc_performed_by	Who performed the QC
qc_sample_size	Size of the sample checked
qc_timestamp	When QC was performed
qc_type	Type of quality control performed
quality_assessment	Quality/risk of bias assessment
quote	Supporting quote
r_version	R version
range_description	Description of a date range
range_end	End of a date range
range_start	Start of a date range
record_count	Number of records in the collection
record_created_at	Timestamp when the record was created in the system
record_id	Unique identifier for the literature record
record_updated_at	Timestamp when the record was last updated
records	List of literature records in the collection
registration_id	Identifier assigned by the registry
registration_url	URL to the registry entry
registry	Registry where the protocol is recorded
required_fields	Fields that must be completed in this template
research_question	Primary research question
resolution_confidence	Confidence in the resolution
resolution_duration	Time taken to resolve in minutes
resolution_id	Unique identifier for this resolution
resolution_method	Method used to resolve the conflict
resolution_methods_used	Count of each resolution method used
resolution_notes	Notes about the resolution process
resolution_rate	Proportion of conflicts resolved
resolution_success_rate	Success rate of resolutions
resolution_timestamp	When the conflict was resolved
resolved_conflicts	Number of conflicts resolved
resolver	Person or process that resolved the conflict
resource_uri	URI locating the resource
response_options	Possible responses to this item
review_abstract	Abstract or summary of the review
review_artifacts	Artifacts and quality assurance documents
review_authors	Authors involved in the review
review_country	Country where the review is conducted
review_id	Unique identifier for the systematic review
review_keywords	Keywords describing the review topic
review_language	Primary language of the review
review_metadata	Additional metadata about the review
review_question	Primary research question addressed by the review
review_status	Current status of the review
review_title	Title of the systematic review
review_type	Type of review (systematic review, scoping review, etc.)
rework_rate	Rate of items requiring rework
ror_id	Research Organization Registry identifier for the institution
rule_type	Type of validation rule
rule_value	Value or pattern for the rule
sample_method	Method used for sampling
scoring_method	Method for scoring the checklist
scoring_weight	Weight of this item in scoring
section_conditions	Conditions for showing this section
section_description	Description of the section
section_fields	Fields in this section
section_id	Unique identifier for this section
section_label	Human-readable label for the section
section_name	Machine-readable name for the section
section_order	Display order of the section
section_repeatable	Whether this section can be repeated
section_required	Whether this section is required
sensitivity	True positive rate (recall)
session_avg_time_per_item	Average time per item in minutes
session_duration	Duration of the session in seconds
session_ended_at	When the session ended
session_environment	Environment details for the session
session_id	Unique identifier for this session
session_item_ids	IDs of items processed in this session
session_notes	Notes about the session
session_started_at	When the session started
session_status	Current status of the session
session_timestamp	When this AI session occurred
session_type	Type of work session
severity	Severity of validation failure
snapshot_hash	Checksum of the dataset snapshot
software_env	Software environment used for the review
software_environments	Software environments used in the review
source_record_id	Reference to source record
specificity	True negative rate
stage_description	Detailed description
stage_label	Human-readable label
stage_outputs	Outputs generated by review stages
stage_training_completion_rate	Stage training completion rate
stage_training_duration_hours	Stage training duration in hours
stage_training_materials_uri	URI to stage training materials
stage_training_notes	Stage training notes
stage_training_provided	Whether stage training was provided
stage_type	Type of stage executed
stages	Stages of the review process
start_date	Date when the review started
started_at	Start time
study_characteristics	Key characteristics of the study
study_country	Countries where study was conducted
study_design	Design of the study
study_duration	Duration of the study
study_registration_id	Clinical trial registration ID
study_sample_size	Total sample size
study_setting	Setting where study was conducted
success_rate	Success rate of this method
successor_template_id	ID of the template that supersedes this one
supporting_quotes	Quotes supporting the assessment
target_field	Field or task this prompt targets
template_description	Description of the template and its purpose
template_doi	DOI of the template publication if available
template_field_constraints	Constraints on the field value
template_field_description	Description of what should be entered in this field
template_field_example	Example value for this field
template_field_format	Expected format for the field value
template_field_help_text	Help text to guide users
template_field_id	Unique identifier for this template field
template_field_label	Human-readable label for the field
template_field_name	Technical name of the field
template_field_order	Display order of the field
template_field_section	Section of the template this field belongs to
template_field_type	Data type of the field
template_id	Unique identifier for the registration template
template_language	Language of the template content
template_last_updated	Date and time when the template was last updated
template_name	Human-readable name of the template
template_provider	Organization or registry providing the template
template_type	Type of registration template
template_uri	URI where the template can be accessed
template_version	Version number of the template
throughput_trend	Trend in processing speed
time_spent_hours	Total time spent in hours
title	Title of the publication
token_usage	Number of tokens used
tool_ai_models	AI models used by the tool
tool_api_endpoint	API endpoint URL
tool_api_version	API version
tool_citation	Recommended citation
tool_configuration_file	URI to configuration file
tool_configuration_parameters	Configuration parameters
tool_cost	Cost information
tool_documentation_url	URL to documentation
tool_export_formats	Supported export formats
tool_features_used	Features of the tool that were used
tool_id	Unique identifier for the tool
tool_import_formats	Supported import formats
tool_license_type	Type of license
tool_limitations	Known limitations
tool_name	Name of the tool or platform
tool_notes	Additional notes
tool_purpose	Purpose for which the tool was used
tool_settings	Tool settings description
tool_subscription_level	Subscription level or plan
tool_url	URL to access the tool
tool_vendor	Vendor or organization providing the tool
tool_version	Version of the tool
tools_used	Tools or software used during session
total_comparisons	Total number of comparisons made
total_conflicts	Total number of conflicts identified
total_data_points	Total data points extracted
total_items	Total number of items to process
total_studies_extracted	Total studies extracted
training_completed	Training or calibration exercises completed
updated_at	Timestamp when the review record was last updated
upper_bound	Upper bound of the interval
validation_rules	Rules for validating template field values
version	Version number of the review
volume	Volume number of the journal
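
Several slots in the table above are defined in terms of one another (for example, balanced_accuracy as the average of sensitivity and specificity, and f1_score as the harmonic mean of precision and recall). The sketch below shows how these could be derived from the four confusion-matrix counts. The function name, the `tp`/`fp`/`tn`/`fn` parameters, and the decision to return `None` when a denominator is zero are assumptions for illustration; the schema lists the slots but does not prescribe how they are calculated.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Derive rate-style metrics from confusion-matrix counts.

    Keys reuse slot names from the Slots table. A metric whose
    denominator is zero is reported as None rather than guessed.
    """
    def ratio(num, den):
        return num / den if den else None

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)     # positive predictive value

    f1 = None
    if precision is not None and sensitivity is not None:
        if precision + sensitivity > 0:
            # Harmonic mean of precision and recall.
            f1 = 2 * precision * sensitivity / (precision + sensitivity)

    balanced = None
    if sensitivity is not None and specificity is not None:
        balanced = (sensitivity + specificity) / 2

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "negative_predictive_value": ratio(tn, tn + fn),
        "false_positive_rate": ratio(fp, fp + tn),   # Type I error rate
        "false_negative_rate": ratio(fn, fn + tp),   # Type II error rate
        "false_discovery_rate": ratio(fp, fp + tp),
        "false_omission_rate": ratio(fn, fn + tn),
        "f1_score": f1,
        "balanced_accuracy": balanced,
        "matthews_correlation": ratio(tp * tn - fp * fn, mcc_den),
        "diagnostic_odds_ratio": ratio(tp * tn, fp * fn),
    }


# Hypothetical counts, e.g. AI decisions compared against human decisions.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Returning `None` for undefined ratios keeps an empty or one-sided sample from producing a misleading 0 or a division error; a different convention may suit a given pipeline better.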

Enumerations

Enumeration	Description
AIPurpose	Purpose for which AI is being used
BlindingLevel	Level of blinding in extraction
ConfidenceLevel	Confidence levels
ConflictType	Types of conflicts that can occur
DatasetKind	Types of datasets used in systematic reviews
DependencyType	Types of field dependencies
ExtractionOrder	Order for extracting studies
ExtractionStatus	Status of data extraction
FieldType	Data types for template fields
HumanOversightLevel	Level of human oversight required
ModificationReason	Reason for modifying AI output
ParameterType	Data types for tool parameters
ParticipantRole	Roles that participants can have in the review process
PromptType	Type of prompting strategy
PublicationStatus	Publication status of studies
QualityAssessmentTool	Tools for quality assessment
QualityControlType	Types of quality control checks
QualityRating	Quality rating levels
ResolutionMethod	Methods for resolving conflicts
ReviewStatus	Status of the review
ReviewType	Types of reviews
SamplingMethod	Methods for sampling items for quality control
SessionStatus	Status of a work session
SessionType	Types of work sessions
StageOutputKind	Types of outputs that can be produced by review stages
StageType	Logical stages of a systematic review (typical flow)
StudyDesign	Types of study designs
TemplateType	Types of registration templates
ToolLicenseType	Types of tool licenses
ToolPurpose	Purposes for which tools can be used
ValidationRuleType	Types of validation rules
ValidationSeverity	Severity levels for validation failures

Types

Type	Description
Boolean	A binary (true or false) value
Curie	A compact URI
Date	A date (year, month and day) in an idealized calendar
DateOrDatetime	Either a date or a datetime
Datetime	The combination of a date and time
Decimal	A real number with arbitrary precision that conforms to the xsd:decimal speci...
Double	A real number that conforms to the xsd:double specification
Float	A real number that conforms to the xsd:float specification
Integer	An integer
Jsonpath	A string encoding a JSON Path
Jsonpointer	A string encoding a JSON Pointer
Ncname	Prefix part of CURIE
Nodeidentifier	A URI, CURIE or BNODE that represents a node in a model
Objectidentifier	A URI or CURIE that represents an object in the model
Sparqlpath	A string encoding a SPARQL Property Path
String	A character string
Time	A time object represents a (local) time of day, independent of any particular...
Uri	A complete URI
Uriorcurie	A URI or a CURIE
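
The table distinguishes Decimal (arbitrary precision) from Double and Float, and Date from Datetime, with DateOrDatetime accepting either. The following sketch shows one way these distinctions could surface when values are handled in Python. The mapping onto `decimal.Decimal`, `float`, `datetime.date`, and `datetime.datetime`, and the helper `parse_date_or_datetime`, are assumptions for this example and are not taken from the schema or its tooling.

```python
from datetime import date, datetime
from decimal import Decimal


def parse_date_or_datetime(text):
    """Parse an ISO 8601 string as a date if it has no time part, else a datetime."""
    if "T" in text or " " in text:
        return datetime.fromisoformat(text)
    return date.fromisoformat(text)


# A date-only value and a full timestamp, both valid for a DateOrDatetime slot.
only_date = parse_date_or_datetime("2024-03-01")
with_time = parse_date_or_datetime("2024-03-01T10:30:00")

# Decimal keeps exact decimal fractions; binary floats do not.
exact_sum = Decimal("0.1") + Decimal("0.2")    # exactly Decimal("0.3")
float_sum = 0.1 + 0.2                          # not exactly 0.3
```

The difference matters for slots such as cost or rate values if they are later compared for equality or summed across many records.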

Subsets

Subset	Description