Sparrow.jl
Ship, Plane, and Anchored Radar Research and Operational Workflows
Sparrow.jl is a flexible, distributed workflow system for processing radar data. It provides a framework for building custom data processing pipelines with built-in support for quality control, gridding, merging, and visualization.
Features
- Flexible Workflow System: Define custom workflows with multiple processing steps
- Distributed Processing: Built-in support for parallel processing across multiple workers
- Radar Data Processing: Specialized tools for quality control, gridding, and merging radar data
- Extensible Architecture: Easy to add custom processing steps and workflow types
- Message System: Configurable logging with multiple severity levels
- Integration with HPC: Support for Slurm and other cluster managers
Quick Start
Installation
using Pkg
Pkg.add(url="https://github.com/csu-tropical/Sparrow.jl")
Basic Usage
- Define a workflow type using the @workflow_type macro:
using Sparrow
@workflow_type MyRadarWorkflow
- Define workflow steps using the @workflow_step macro:
@workflow_step QualityControl
@workflow_step Gridding
- Create a workflow instance with parameters:
workflow = MyRadarWorkflow(
    base_working_dir = "/path/to/working/dir",
    base_archive_dir = "/path/to/archive",
    base_data_dir = "/path/to/data",
    steps = [
        # Format: (step_name, step_type, input_directory, archive)
        ("qc", QualityControl, "base_data", false),
        ("grid", Gridding, "qc", true)
    ],
    # Add other parameters as needed
)
- Implement step functions:
function Sparrow.workflow_step(workflow::MyRadarWorkflow, ::Type{QualityControl},
                               input_dir::String, output_dir::String;
                               step_name::String="", step_num::Int=0, kwargs...)
    # Your processing logic here
    msg_info("Running quality control on data from $(input_dir)")
    # ... process files ...
    return num_files_processed
end
- Run the workflow from the command line:
julia sparrow my_workflow.jl --datetime 20240101_000000
Command Line Interface
The sparrow script provides a command-line interface for running workflows:
julia sparrow [options]
Options:
  workflow              Workflow file to execute (required, positional)
  --datetime DATETIME   Process specific time YYYYmmdd_HHMMSS (default: "now")
  --realtime            Process an incoming realtime datastream
  --force_reprocess     Force reprocessing of previously processed data
  --threads N           Number of threads
  --num_workers N       Number of worker processes
  -v, --verbose LEVEL   Message verbosity level (0-4, default: 2)
  --slurm               Use Slurm cluster manager
  --sge                 Use Sun Grid Engine
  --paths_file FILE     File overriding data paths
Documentation Contents
Index
Module Overview
For detailed API documentation, see the API Reference.
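The --datetime option listed under Command Line Interface expects a timestamp of the form YYYYmmdd_HHMMSS. As a rough sketch of how such a string maps onto a Julia DateTime, the snippet below uses only the standard Dates library. The helper name parse_sparrow_datetime is hypothetical and is not part of Sparrow.jl, and treating "now" as the current time is an assumption based on the option's stated default; Sparrow's own argument handling may differ.

```julia
using Dates

# Format string matching YYYYmmdd_HHMMSS, e.g. "20240101_000000".
const DATETIME_FORMAT = dateformat"yyyymmdd_HHMMSS"

# Hypothetical helper (not part of Sparrow.jl): convert a --datetime
# argument into a DateTime. Assumes "now" means the current local time.
function parse_sparrow_datetime(arg::AbstractString)
    arg == "now" && return now()
    return DateTime(arg, DATETIME_FORMAT)
end

dt = parse_sparrow_datetime("20240101_000000")
println(dt)                                 # 2024-01-01T00:00:00
println(Dates.format(dt, DATETIME_FORMAT))  # 20240101_000000
```

Formatting with the same DATETIME_FORMAT round-trips the value, which can be handy when building a --datetime argument in a script that launches several runs.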