Converting Windows Text to Linux Format

Master line ending conversions across platforms

Line ending inconsistencies between Windows and Linux systems cause formatting issues, Git warnings, and script failures. This comprehensive guide covers detection, conversion, and prevention strategies.

Understanding Line Ending Differences

Operating systems use different conventions to mark the end of a line in text files, creating compatibility challenges in cross-platform development:

  • Windows: Carriage Return + Line Feed (\r\n or CRLF, hex 0D 0A)
  • Linux/Unix: Line Feed only (\n or LF, hex 0A)
  • Classic Mac OS: Carriage Return only (\r or CR, hex 0D)

This historical difference stems from typewriter mechanics. Windows inherited the CRLF convention from DOS, which maintained compatibility with teletype machines that required both a carriage return (move to line start) and line feed (advance paper).
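To see these byte sequences directly, the following sketch writes the same short line with each convention and dumps the bytes. The helper name show_bytes is invented for this illustration; printf, od, and xargs are standard tools.

```shell
#!/bin/sh
# show_bytes (hypothetical helper): print stdin as space-separated hex bytes.
show_bytes() {
    # od emits the hex values; xargs with no command echoes them on one trimmed line
    od -An -tx1 | xargs
}

printf 'hi\r\n' | show_bytes   # Windows (CRLF):      68 69 0d 0a
printf 'hi\n'   | show_bytes   # Linux/Unix (LF):     68 69 0a
printf 'hi\r'   | show_bytes   # Classic Mac OS (CR): 68 69 0d
```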

Common Problems Caused by Line Ending Mismatches

1. Script Execution Failures

Bash scripts with Windows line endings fail with cryptic errors:

bash: ./script.sh: /bin/bash^M: bad interpreter: No such file or directory

The ^M character (carriage return) becomes part of the shebang line, causing the interpreter lookup to fail.
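A quick way to confirm this diagnosis is to test whether the first line of a script ends in a carriage return. The sketch below assumes bash; the function name shebang_has_cr and the temporary sample script are made up for the example.

```shell
#!/bin/bash
# shebang_has_cr (hypothetical helper): succeed if the file starts with "#!"
# and its first line ends in a carriage return.
shebang_has_cr() {
    local first
    # IFS= and -r keep the line byte-for-byte; || true tolerates a missing final newline
    IFS= read -r first < "$1" || true
    [[ $first == '#!'*$'\r' ]]
}

# Build a sample script with Windows line endings and check it.
script=$(mktemp)
printf '#!/bin/bash\r\necho ok\r\n' > "$script"
if shebang_has_cr "$script"; then
    echo "shebang ends in CR: the interpreter path is read as /bin/bash<CR>"
fi
rm -f "$script"
```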

2. Git Warnings and Diff Noise

When committing Windows files to Git on Linux, you’ll see:

warning: CRLF will be replaced by LF in file.txt.
The file will have its original line endings in your working directory

Git diffs may show entire files as changed when only line endings differ, obscuring actual code changes.

3. Visual Artifacts in Editors

Linux text editors that don’t auto-detect line endings display ^M characters at line ends, making files difficult to read and edit. This is especially problematic in Hugo markdown files where it can break frontmatter parsing.

4. Data Processing Issues

Scripts parsing text files may include carriage returns in extracted data, causing comparison failures and unexpected behavior in data pipelines.
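As an illustration of such a comparison failure, the sketch below compares a value carrying a trailing carriage return against the expected string, then strips the CR and compares again. It assumes bash; the value "active" and the helper name strip_cr are hypothetical.

```shell
#!/bin/bash
# Hypothetical input: a field as it would arrive from a file with CRLF endings.
line=$'active\r'

# The invisible trailing CR makes the strings differ.
if [ "$line" = "active" ]; then echo "match"; else echo "no match"; fi   # prints "no match"

# strip_cr (hypothetical helper): drop one trailing CR, if present.
strip_cr() {
    local value=$1
    value=${value%$'\r'}
    printf '%s\n' "$value"
}

clean=$(strip_cr "$line")
if [ "$clean" = "active" ]; then echo "match"; else echo "no match"; fi  # prints "match"
```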

Detecting Windows Line Endings

Before converting files, identify which ones need conversion to avoid unnecessary modifications.

Method 1: Using the file Command

The most reliable detection method:

file content/post/my-post/index.md

Output examples:

# Windows line endings:
index.md: UTF-8 Unicode text, with CRLF line terminators

# Linux line endings:
index.md: UTF-8 Unicode text

# Mixed line endings (problematic):
index.md: UTF-8 Unicode text, with CRLF, LF line terminators

Method 2: Visual Inspection with cat

Display control characters:

cat -A filename.txt

Windows files show ^M$ at line ends, while Linux files show only $.

Method 3: Using grep

Search for carriage returns:

grep -rl $'\r' content/post/2025/11/

The -l flag lists each file in the specified directory that contains a carriage return, instead of printing every matching line.

Method 4: Hexdump Analysis

For detailed byte-level inspection:

hexdump -C filename.txt | head -n 20

Look for 0d 0a (CRLF) versus 0a (LF) sequences.
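When the file command is unavailable, or when you want counts rather than a description, the detection methods above can be combined into a small summary. This is a minimal sketch, assuming bash and a grep that supports -c; the function name eol_summary is invented here.

```shell
#!/bin/bash
# eol_summary (hypothetical helper): count CRLF-terminated and LF-only lines.
eol_summary() {
    local total crlf
    # grep -c exits non-zero when the count is 0, so guard it with || true
    total=$(grep -c '' "$1" || true)       # every line in the file
    crlf=$(grep -c $'\r$' "$1" || true)    # lines whose last character is CR
    echo "crlf=$crlf lf=$((total - crlf))"
}

# A file with mixed endings: two CRLF lines and one LF line.
sample=$(mktemp)
printf 'one\r\ntwo\nthree\r\n' > "$sample"
eol_summary "$sample"    # prints: crlf=2 lf=1
rm -f "$sample"
```

A result where both counts are non-zero indicates mixed line endings.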

Converting Windows to Linux Format

Multiple tools provide reliable conversion with different trade-offs in availability, features, and performance.

Solution 1: dos2unix

dos2unix is the most robust and feature-rich option, built specifically for line ending conversion.

Installation

# Ubuntu/Debian
sudo apt install dos2unix

# Red Hat/CentOS/Fedora
sudo yum install dos2unix

# macOS (Homebrew)
brew install dos2unix

# Arch Linux
sudo pacman -S dos2unix

Basic Usage

# Convert single file (modifies in-place)
dos2unix filename.txt

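# Keep the original untouched: write the converted copy to a new file
# (dos2unix has no backup flag; -b means "keep byte order mark")
dos2unix -n filename.txt filename.unix.txt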

# Convert multiple files
dos2unix file1.txt file2.txt file3.txt

# Convert with wildcards
dos2unix *.txt
dos2unix content/post/2025/11/*/index.md

Advanced Options

# Preview without modifying: report DOS/Unix/Mac line-break counts
dos2unix -i filename.txt

# Keep modification timestamp
dos2unix -k filename.txt

# Force conversion of files detected as binary (skipped by default)
dos2unix -f filename.txt

# Recursive conversion
find . -name "*.md" -exec dos2unix {} \;

# Convert all markdown files in directory tree
find content/post -type f -name "*.md" -exec dos2unix {} \;

Batch Processing Hugo Posts:

# Convert all index.md files in 2025 posts (bash expands ** only with globstar enabled)
shopt -s globstar
dos2unix content/post/2025/**/index.md

# Convert all markdown files excluding specific directories
find content/post -name "*.md" ! -path "*/drafts/*" -exec dos2unix {} \;

Solution 2: sed Command

sed ships with every Unix system, so nothing needs installing, though it is less efficient for large batches. The \r escape and the -i syntax shown here follow GNU sed.

# Convert single file
sed -i 's/\r$//' filename.txt

# Convert multiple files with loop
for file in content/post/2025/11/*/index.md; do 
    sed -i 's/\r$//' "$file"
done

# Convert with backup
sed -i.bak 's/\r$//' filename.txt

# Recursive with find
find . -name "*.txt" -exec sed -i 's/\r$//' {} \;

Important Notes

  • sed -i modifies files in-place
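  • On macOS (BSD sed), use sed -i '' $'s/\r$//' filename.txt; the $'...' quoting makes the shell supply the carriage return in case sed does not recognize \r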
  • Creates temporary files during processing
  • Slower than dos2unix for large file sets

Solution 3: tr Command

Pipe-based approach useful in data processing workflows:

# Basic conversion (requires output redirection)
tr -d '\r' < input.txt > output.txt

# Process and convert in pipeline
tr -d '\r' < input.txt | process_data.sh

# Cannot modify in-place - use temp file
tr -d '\r' < input.txt > temp.txt && mv temp.txt input.txt

Advantages

  • Available on all Unix systems
  • Excellent for streaming data
  • Integrates well in pipes

Disadvantages

  • Cannot modify files in-place
  • Requires manual backup handling
  • Less convenient for batch operations
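The in-place limitation can be worked around with a small wrapper around a temporary file. The sketch below is one way to do it, assuming bash and mktemp; the function name strip_cr_inplace is hypothetical. It copies the converted bytes back into the original file instead of using mv, so the file keeps its permissions.

```shell
#!/bin/bash
# strip_cr_inplace (hypothetical helper): emulate in-place editing with tr.
strip_cr_inplace() {
    local src=$1 tmp status
    tmp=$(mktemp "${src}.XXXXXX") || return 1
    # Overwrite the original only if tr succeeded
    tr -d '\r' < "$src" > "$tmp" && cat "$tmp" > "$src"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example: strip_cr_inplace input.txt
```

Like the plain tr command, this removes every carriage return, not only those at line ends.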

Solution 4: Using awk

Alternative for complex text processing:

awk '{sub(/\r$/,"")}1' input.txt > output.txt

# Or split records on CRLF (a multi-character RS needs gawk or mawk):
awk 'BEGIN{RS="\r\n"} {print}' input.txt > output.txt

Comparison Table

Tool      In-place  Batch             Backup              Speed   Availability
dos2unix  Yes       Yes               Via -n (new file)   Fast    Requires install
sed       Yes       Via loop or find  Yes (-i.bak)        Medium  Built-in
tr        No        No                Manual              Fast    Built-in
awk       No        No                Manual              Medium  Built-in

Prevention Strategies

Preventing Windows line endings is more efficient than repeatedly converting files.

Git Configuration

Configure Git to automatically normalize line endings across platforms.

Option 1: Repository-level (.gitattributes)

Create .gitattributes in repository root:

# Auto detect text files and normalize to LF
* text=auto

# Explicitly declare text files
*.md text
*.txt text
*.sh text eol=lf
*.py text eol=lf
*.go text eol=lf
*.js text eol=lf
*.json text eol=lf

# Binary files
*.jpg binary
*.png binary
*.pdf binary

This ensures consistent line endings regardless of platform and prevents unnecessary conversions.
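To confirm which rules Git will actually apply to a given path, git check-attr can be queried. The sketch below uses a throwaway repository so it does not touch a real project; the file name deploy.sh and the two-line .gitattributes are assumptions for the example.

```shell
#!/bin/bash
# Throwaway repository with a minimal .gitattributes.
repo=$(mktemp -d)
git init -q "$repo"
printf '*.sh text eol=lf\n*.png binary\n' > "$repo/.gitattributes"

# check-attr reports the attributes Git would apply to a path
# (the file itself does not need to exist).
git -C "$repo" check-attr text eol -- deploy.sh
# deploy.sh: text: set
# deploy.sh: eol: lf

rm -rf "$repo"
```

In a real project, run git check-attr text eol -- <path> from inside the repository.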

Option 2: Global User Configuration

Configure Git behavior for all repositories:

# Linux/macOS: Convert CRLF to LF on commit, leave LF unchanged
git config --global core.autocrlf input

# Windows: Convert LF to CRLF on checkout, CRLF to LF on commit
git config --global core.autocrlf true

# Disable auto-conversion (rely on .gitattributes only)
git config --global core.autocrlf false

Recommended settings:

  • Linux/macOS developers: core.autocrlf input
  • Windows developers: core.autocrlf true
  • All projects: Use .gitattributes for explicit control

Normalizing Existing Repository

If your repository already contains mixed line endings:

# Start from a clean working tree (commit or stash pending changes first)
git status

# Re-apply the line ending rules to all tracked files (requires Git 2.16+)
git add --renormalize .

# Commit the normalized files
git commit -m "Normalize line endings"

Editor Configuration

Configure text editors to use Unix line endings by default.

Visual Studio Code (settings.json)

{
  "files.eol": "\n",
  "files.encoding": "utf8",
  "files.insertFinalNewline": true,
  "files.trimTrailingWhitespace": true
}

Set per-language if needed:

{
  "[markdown]": {
    "files.eol": "\n"
  }
}

Vim/Neovim (.vimrc)

set fileformat=unix
set fileformats=unix,dos

Emacs (.emacs or init.el)

(setq-default buffer-file-coding-system 'utf-8-unix)

Sublime Text (Preferences.sublime-settings)

{
  "default_line_ending": "unix"
}

JetBrains IDEs (Settings → Editor → Code Style)

  • Line separator: Unix and macOS (\n)

EditorConfig

Create .editorconfig in project root for cross-editor compatibility:

root = true

[*]
end_of_line = lf
charset = utf-8
insert_final_newline = true
trim_trailing_whitespace = true

[*.md]
trim_trailing_whitespace = false

[*.{sh,bash}]
end_of_line = lf

[*.bat]
end_of_line = crlf

Many editors honor EditorConfig settings natively, while others, such as Visual Studio Code, need a plugin. Either way, the file keeps line endings consistent across team members using different editors.

Automation and Scripting

Integrate line ending checks into development workflows to catch issues early.

Pre-commit Git Hook

Create .git/hooks/pre-commit:

#!/bin/bash
# Check for files with CRLF line endings

CRLF_FILES=""

# Read staged paths NUL-delimited so file names with spaces stay intact
while IFS= read -r -d '' FILE; do
    if file "$FILE" | grep -q "CRLF"; then
        CRLF_FILES="$CRLF_FILES\n  $FILE"
    fi
done < <(git diff --cached --name-only -z --diff-filter=ACM)

if [ -n "$CRLF_FILES" ]; then
    echo "Error: The following files have Windows line endings (CRLF):"
    echo -e "$CRLF_FILES"
    echo ""
    echo "Convert them using: dos2unix <filename>"
    echo "Or configure your editor to use Unix line endings (LF)"
    exit 1
fi

exit 0

Make executable:

chmod +x .git/hooks/pre-commit

Continuous Integration Check

Add to CI pipeline (GitHub Actions example):

name: Check Line Endings

on: [push, pull_request]

jobs:
  check-line-endings:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Check for CRLF line endings
        run: |
          if git ls-files -z | xargs -0 file | grep CRLF; then
            echo "Error: Files with CRLF line endings detected"
            exit 1
          fi

Bulk Conversion Script

Create convert-line-endings.sh for project maintenance:

#!/bin/bash
# Convert all text files in project to Unix line endings

set -e

EXTENSIONS=("md" "txt" "sh" "py" "go" "js" "json" "yml" "yaml" "toml")

echo "Converting line endings to Unix format..."

for ext in "${EXTENSIONS[@]}"; do
    echo "Processing *.$ext files..."
    find . -name "*.$ext" ! -path "*/node_modules/*" ! -path "*/.git/*" \
        -exec dos2unix {} \; 2>/dev/null || true
done

echo "Conversion complete!"

Troubleshooting Common Issues

Issue 1: Script Still Fails After Conversion

Symptom: Bash script converted with dos2unix still shows interpreter errors.

Solution: Check file encoding and byte order mark (BOM):

# Check encoding
file -i script.sh

# Remove BOM if present
sed -i '1s/^\xEF\xBB\xBF//' script.sh

# Verify shebang line
head -n 1 script.sh | od -c
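
Before stripping anything, it can help to confirm that a BOM is really present. The sketch below checks the first three bytes of a file for the UTF-8 BOM (EF BB BF); it assumes bash and a head that supports -c, and the function name has_utf8_bom is made up for this example.

```shell
#!/bin/bash
# has_utf8_bom (hypothetical helper): succeed if the file starts with EF BB BF.
has_utf8_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Sample script that begins with a BOM (octal 357 273 277) before the shebang.
sample=$(mktemp)
printf '\357\273\277#!/bin/bash\necho ok\n' > "$sample"
if has_utf8_bom "$sample"; then
    echo "BOM present: the kernel will not see '#!' as the first two bytes"
fi
rm -f "$sample"
```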

Issue 2: Mixed Line Endings in Single File

Symptom: File shows both CRLF and LF endings.

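Solution: Run dos2unix normally. It converts every CRLF it finds and leaves existing LF endings alone, so no special mode is needed (-f only forces conversion of files detected as binary):

dos2unix filename.txt

Or use a more aggressive sed:

# Remove every carriage return, including any in the middle of a line
sed -i 's/\r//g' filename.txt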

Issue 3: Git Still Shows File as Modified

Symptom: After converting line endings, Git shows file as modified with no visible changes.

Solution: Refresh Git index:

git add -u
git status

# If still showing, check Git config
git config core.autocrlf

# Temporarily disable autocrlf
git config core.autocrlf false
git add -u

Issue 4: Hugo Build Fails After Conversion

Symptom: Hugo fails to parse frontmatter after line ending conversion.

Solution: Check for Unicode BOM and frontmatter syntax:

# Remove BOM from markdown files
find content -name "*.md" -exec sed -i '1s/^\xEF\xBB\xBF//' {} \;

# Verify YAML frontmatter
hugo --debug

Issue 5: dos2unix Not Available

Symptom: System doesn’t have dos2unix and you can’t install packages.

Solution: Use a portable shell function:

dos2unix_portable() {
    # $'...' makes the shell pass a literal CR, so this also works where sed lacks \r
    sed -i.bak $'s/\r$//' "$1" && rm "${1}.bak"
}

dos2unix_portable filename.txt

Special Cases for Hugo Sites

Hugo static sites have specific considerations for line endings, particularly in content files and configuration.

Converting Hugo Content

# Convert all markdown content files
find content -name "*.md" -exec dos2unix {} \;

# Convert configuration files
dos2unix config.toml config.yaml

# Convert i18n translation files
find i18n -name "*.yaml" -exec dos2unix {} \;

# Convert layout templates
find layouts -name "*.html" -exec dos2unix {} \;

Handling Frontmatter

YAML frontmatter is particularly sensitive to line ending issues. Ensure consistency:

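# Check frontmatter-containing files (bash expands ** only with globstar enabled)
shopt -s globstar
for file in content/post/**/index.md; do
    # Allow an optional CR so CRLF files still match the opening delimiter
    if head -n 1 "$file" | grep -Eq $'^---\r?$'; then
        file "$file"
    fi
done | grep CRLF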

Hugo Build Scripts

Ensure build and deployment scripts use Unix line endings:

dos2unix deploy.sh build.sh
chmod +x deploy.sh build.sh

Performance Considerations

For large projects with thousands of files, conversion performance matters.

Benchmark Comparison

Converting 1000 markdown files:

# dos2unix: ~2 seconds
time find . -name "*.md" -exec dos2unix {} \;

# sed: ~8 seconds
time find . -name "*.md" -exec sed -i 's/\r$//' {} \;

# Parallel dos2unix: ~0.5 seconds
time find . -name "*.md" -print0 | xargs -0 -P 4 dos2unix

Parallel Processing

Use GNU Parallel or xargs for faster batch conversion:

# Using xargs with parallel execution
find . -name "*.md" -print0 | xargs -0 -P 8 dos2unix

# Using GNU Parallel
find . -name "*.md" | parallel -j 8 dos2unix {}
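
On large trees it can also pay to skip files that are already clean, so that only affected files are rewritten and the rest keep their timestamps. The sketch below pre-filters with grep -l and strips carriage returns in parallel. It swaps in sed for dos2unix to avoid the extra dependency and assumes GNU grep, xargs, and sed; the function name convert_changed_only is hypothetical.

```shell
#!/bin/bash
# convert_changed_only (hypothetical helper): rewrite only the *.md files
# under a directory that actually contain a carriage return.
# Usage: convert_changed_only [root] [parallel jobs]
convert_changed_only() {
    local root=${1:-.} jobs=${2:-4}
    # -l lists matching files, -Z separates names with NUL for xargs -0;
    # -r keeps xargs from running sed when nothing matched
    grep -rlZ --include='*.md' $'\r' "$root" |
        xargs -0 -r -P "$jobs" sed -i 's/\r$//'
}

# Example: convert_changed_only content/post 8
```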

Cross-Platform Development Best Practices

Establish team conventions to prevent line ending issues from the start.

1. Repository Setup Checklist

  • Add .gitattributes with text file declarations
  • Set core.autocrlf in team documentation
  • Include .editorconfig in repository
  • Add pre-commit hooks for validation
  • Document line ending policy in README

2. Team Onboarding

New team members should configure:

# Clone repository
git clone <repository>
cd <repository>

# Configure Git
git config core.autocrlf input  # Linux/macOS
git config core.autocrlf true   # Windows

# Verify setup
git config --list | grep autocrlf
cat .gitattributes

3. Code Review Guidelines

  • Reject PRs with line ending-only changes
  • Use git diff --ignore-cr-at-eol for reviews
  • Enable line ending checks in CI/CD

4. Documentation

Include in project README:

## Line Ending Convention

This project uses Unix line endings (LF) for all text files.

**Setup:**

- Linux/macOS: git config core.autocrlf input
- Windows: git config core.autocrlf true

**Converting Files:**
dos2unix filename.txt

See .gitattributes for file-specific configurations.
