Miller (mlr) is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON. It processes records as a stream and refers to fields by name rather than by position.
Basic Usage
- mlr --csv cat file.csv - Read CSV
- mlr --tsv cat file.tsv - Read TSV
- mlr --json cat file.json - Read JSON
- mlr --csv head -n 10 file.csv - First 10 rows
- mlr --csv tail -n 10 file.csv - Last 10 rows
Field Selection
- mlr --csv cut -f field1,field2 file.csv - Select fields
- mlr --csv cut -x -f field1 file.csv - Exclude field
- mlr --csv cut -o -f field2,field1 file.csv - Select fields in the order given to -f
- mlr --csv rename field1,newfield1 file.csv - Rename field
- mlr --csv rename -r '^old(.*)$,new\1' file.csv - Rename with regex (\1 is the first capture group)
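The cut -o and rename -r forms are easiest to see on a tiny file. This is a minimal sketch, assuming Miller 6 is installed as mlr; fields.csv and its column names are made up for illustration.

```shell
# Hypothetical sample data, for illustration only.
cat > fields.csv <<'EOF'
old_a,old_b,c
1,2,3
EOF

# -o emits the selected fields in the order given to -f (c before old_a).
mlr --csv cut -o -f c,old_a fields.csv > cut_ordered.csv
cat cut_ordered.csv

# -r treats the first name as a regex; \1 refers to its first capture group,
# so old_a and old_b become new_a and new_b while c is left alone.
mlr --csv rename -r '^old_(.*)$,new_\1' fields.csv > renamed.csv
cat renamed.csv
```

Without -o, cut keeps the selected fields in their original input order.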
Filtering
- mlr --csv filter '$field == "value"' file.csv - Equal filter
- mlr --csv filter '$field > 10' file.csv - Numeric filter
- mlr --csv filter '$field =~ "pattern"' file.csv - Regex filter
- mlr --csv filter '$field != "value"' file.csv - Not equal
- mlr --csv filter '$field1 == "a" && $field2 > 5' file.csv - Multiple conditions
- mlr --csv filter -x '$field == "value"' file.csv - Exclude matching
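To make the filter expressions concrete, here is a small sketch assuming Miller 6 is installed as mlr. The file people.csv and its fields are hypothetical.

```shell
# Hypothetical sample data, for illustration only.
cat > people.csv <<'EOF'
name,age,status
alice,34,active
bob,17,active
carol,52,inactive
EOF

# Keep rows where status is "active" AND age is greater than 18.
mlr --csv filter '$status == "active" && $age > 18' people.csv > kept.csv
cat kept.csv

# -x inverts the test: keep only the rows that do NOT match.
mlr --csv filter -x '$status == "active" && $age > 18' people.csv > dropped.csv
cat dropped.csv
```

Here kept.csv holds only alice, and dropped.csv holds bob and carol.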
Transformation
- mlr --csv put '$new = $old * 2' file.csv - Add computed field
- mlr --csv put '$total = $price * $quantity' file.csv - Calculate
- mlr --csv put '$date = strftime($timestamp, "%Y-%m-%d")' file.csv - Format epoch seconds as a date
- mlr --csv put '$upper = toupper($field)' file.csv - Uppercase a string
- mlr --csv put '$len = strlen($field)' file.csv - String length
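Several assignments can go in one put expression, separated by semicolons. The sketch below assumes Miller 6 is installed as mlr; items.csv, its fields, and the ts column (epoch seconds) are invented for illustration.

```shell
# Hypothetical sample data; ts holds seconds since the Unix epoch.
cat > items.csv <<'EOF'
item,price,qty,ts
pen,3,4,0
book,12,2,86400
EOF

# New fields are appended to each record in the order they are assigned.
mlr --csv put '
  $total = $price * $qty;
  $day = strftime($ts, "%Y-%m-%d");
  $item_upper = toupper($item);
  $item_len = strlen($item)
' items.csv > items_out.csv
cat items_out.csv
# item,price,qty,ts,total,day,item_upper,item_len
# pen,3,4,0,12,1970-01-01,PEN,3
# book,12,2,86400,24,1970-01-02,BOOK,4
```

Note that strlen is the string-length function. Miller's length counts the entries of a map or array and returns 1 for a scalar.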
Sorting
- mlr --csv sort -f field1 file.csv - Sort by field
- mlr --csv sort -f field1,field2 file.csv - Sort by multiple fields
- mlr --csv sort -nr field1 file.csv - Numeric sort, descending
- mlr --csv sort -f field1 -nr field2 file.csv - Sort by field1, then break ties by field2 (numeric, descending)
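Each sort flag (-f, -r, -nf, -nr) takes its own field list, and flags are applied left to right as successive sort keys. A minimal sketch, assuming Miller 6 is installed as mlr, with made-up data in scores.csv:

```shell
# Hypothetical sample data, for illustration only.
cat > scores.csv <<'EOF'
team,score
red,5
blue,12
red,20
blue,3
EOF

# Primary key: team, lexical ascending. Tie-break: score, numeric descending.
mlr --csv sort -f team -nr score scores.csv > sorted.csv
cat sorted.csv
# team,score
# blue,12
# blue,3
# red,20
# red,5
```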
Grouping & Aggregation
- mlr --csv stats1 -a count -f field1 -g category file.csv - Count by group
- mlr --csv stats1 -a sum -f amount -g category file.csv - Sum by group
- mlr --csv stats1 -a mean -f value -g category file.csv - Mean by group
- mlr --csv stats1 -a min,max -f value -g category file.csv - Min/max
- mlr --csv group-by category file.csv - Make records with the same category adjacent
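stats1 names each output column after the input field and the accumulator, and emits one record per group. The sketch below assumes Miller 6 is installed as mlr; the file name sales_sample.csv and its contents are hypothetical.

```shell
# Hypothetical sample data, for illustration only.
cat > sales_sample.csv <<'EOF'
category,amount
books,10
toys,5
books,15
toys,7
EOF

# Several accumulators can be requested at once with a comma-separated -a list.
mlr --csv stats1 -a count,sum -f amount -g category sales_sample.csv > by_category.csv
cat by_category.csv
# category,amount_count,amount_sum
# books,2,25
# toys,2,12
```

Groups appear in the order their first record was seen in the input.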
Joining
- mlr --csv join -j field1 -f file1.csv file2.csv - Inner join
- mlr --csv join --ul -j field1 -f file1.csv file2.csv - Left join (also emit unpaired file1.csv records)
- mlr --csv join --ur -j field1 -f file1.csv file2.csv - Right join (also emit unpaired file2.csv records)
- mlr --csv join --ul --ur -j field1 -f file1.csv file2.csv - Full outer join
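In Miller's join, -f names the left file and the positional file is the right one. The --ul and --ur flags add unpaired left and right records to the output. This is a sketch assuming Miller 6 is installed as mlr, with hypothetical files left_users.csv and right_orders.csv.

```shell
# Hypothetical sample data: user 2 has no orders.
cat > left_users.csv <<'EOF'
user_id,name
1,alice
2,bob
3,carol
EOF
cat > right_orders.csv <<'EOF'
order_id,user_id,amount
100,1,25
101,1,40
102,3,15
EOF

# Inner join: only users with at least one order appear.
mlr --csv join -j user_id -f left_users.csv right_orders.csv > inner.csv
cat inner.csv
# user_id,name,order_id,amount
# 1,alice,100,25
# 1,alice,101,40
# 3,carol,102,15

# Left join: --ul also emits bob, who has no matching order.
# unsparsify fills his missing order fields with empty values so the CSV stays rectangular.
mlr --csv join --ul -j user_id -f left_users.csv then unsparsify right_orders.csv > left.csv
cat left.csv
```

Unpaired records lack the other file's fields, which is why the left-join example chains unsparsify before writing CSV.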
Format Conversion
- mlr --icsv --ojson cat file.csv - CSV to JSON (shorthand: --c2j)
- mlr --ijson --ocsv cat file.json - JSON to CSV (shorthand: --j2c)
- mlr --icsv --otsv cat file.csv - CSV to TSV (shorthand: --c2t)
- mlr --icsv --omd cat file.csv - CSV to Markdown (shorthand: --c2m)
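Flags such as --csv and --json each set both the input and the output format, so a conversion needs a separate input flag (--icsv) and output flag (--ojson). A small round-trip sketch, assuming Miller 6 is installed as mlr, with made-up file names:

```shell
# Hypothetical sample data, for illustration only.
printf 'id,name\n1,alice\n2,bob\n' > convert_in.csv

# CSV to TSV: same records, tab-separated.
mlr --icsv --otsv cat convert_in.csv > convert_out.tsv

# CSV to JSON, then JSON back to CSV.
mlr --icsv --ojson cat convert_in.csv > convert_out.json
mlr --ijson --ocsv cat convert_out.json > roundtrip.csv

cat convert_out.tsv convert_out.json roundtrip.csv
```

For this input, roundtrip.csv should match convert_in.csv. The exact JSON layout can differ between Miller versions.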
Common Examples
Select Fields
mlr --csv cut -f name,email users.csv
Extract specific columns.
Filter Rows
mlr --csv filter '$status == "active"' users.csv
Filter by condition.
Add Field
mlr --csv put '$total = $price * $qty' orders.csv
Calculate new field.
Sort
mlr --csv sort -r date sales.csv
Sort by date descending (lexical order, which is chronological for ISO 8601 dates such as 2024-01-31).
Group Statistics
mlr --csv stats1 -a sum -f amount -g category sales.csv
Sum by category.
Join Files
mlr --csv join -j user_id -f users.csv orders.csv
Join on common field.
Convert Format
mlr --icsv --ojson cat data.csv > data.json
Convert CSV to JSON.
Multiple Operations
mlr --csv filter '$age > 18' then cut -f name,email then sort -f name users.csv
Chain operations with 'then'.
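The chained command can be tried end to end on a throwaway file. This sketch assumes Miller 6 is installed as mlr; chain_users.csv and its rows are invented for illustration.

```shell
# Hypothetical sample data, for illustration only.
cat > chain_users.csv <<'EOF'
name,email,age
zoe,zoe@example.com,30
adam,adam@example.com,17
mia,mia@example.com,45
EOF

# Each verb after 'then' receives the records emitted by the previous verb:
# drop minors, keep two columns, then sort the survivors by name.
mlr --csv filter '$age > 18' then cut -f name,email then sort -f name chain_users.csv > adults.csv
cat adults.csv
# name,email
# mia,mia@example.com
# zoe,zoe@example.com
```

Order matters: filter runs before cut here because cut removes the age field that the filter needs.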
Tips
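- Use --csv, --tsv, or --json when input and output share a format; use --icsv, --ojson, and similar flags (or shorthands such as --c2j) when they differ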
- Chain operations with 'then' keyword
- Use $field to reference fields in expressions
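- Most verbs stream records one at a time, so large files need little memory; verbs such as sort must hold all records before emitting any
- Reads standard input when no file is given, so it fits into shell pipelines
- Fields are addressed by name ($price) rather than by position as in awk ($3), so commands keep working when columns are reordered
- stats1 supports accumulators such as count, sum, mean, min, max, median, and percentiles (for example p25 and p75)
- Run mlr sort --help (or the same for any other verb) for built-in usage text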